MLOps.community

Demetrios

Relaxed Conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, Vibes, etc)

  1. The Current State of Agentic Retrieval - Qdrant Roundtable

    1 day ago

    Qdrant Roundtable episode: The Current State of Agentic Retrieval

    Join the Community: https://go.mlops.community/YTJoinIn
    Get the newsletter: https://go.mlops.community/YTNewsletter
    MLOps GPU Guide: https://go.mlops.community/gpuguide

    Big shout-out to Qdrant for the collaboration!

    // Abstract
    AI agents are only as good as the information they can find, retrieve, and remember. In this community roundtable with the Qdrant team, we explored the latest advances in agentic memory, vector search, retrieval systems, and production AI architectures. As AI agents move beyond simple chatbots into systems that can reason across large amounts of information, retrieval is becoming one of the most important layers in the AI stack. The discussion covered the real-world challenges of building agents that remember what matters, forget what doesn't, and consistently retrieve the right context at the right time. If you're building AI agents, RAG systems, or production AI applications, this conversation offers practical insights into where retrieval is headed and what it takes to build reliable, scalable agentic systems.

    // Bio
    Ewa Szyszka
    Ewa is a Developer Relations professional based in San Francisco with a background in Computer Science and Hardware Engineering, passionate about bridging the gap between technology and the developer community. She holds a BSc in Computer Science and an MSc in Electronics, bringing a strong blend of deep technical foundations and communication skills to her work.

    Dylan Couzon
    Dylan is based in New York City, and he helps developers build better AI applications. He is passionate about AI, programming, open source, and robotics, and enjoys sharing what he’s building and learning along the way.

    Neil Kanungo
    Neil is an experienced professional with expertise in data science, developer relations, and product growth. Currently serving as the Head of Developer Relations at Qdrant, Neil previously held the position of VP of Product Led Growth & Developer Relations at KX, where significant increases in product registration and user activation were achieved. At TIBCO, Neil managed a team focused on enhancing the adoption of TIBCO Spotfire through various initiatives, including tutorial videos and live webinars. With a strong technical background, Neil has developed innovative solutions in analytics, machine learning, and data visualization across multiple roles, including Engineering Data Analyst and Asset Integrity Engineer at Enterprise Products. Neil holds a Bachelor of Science in Radiation Physics from The University of Texas at Austin, a Master of Science in Mechanical Engineering from Texas Tech University, and is pursuing a Master in Applied Data Science from the University of Michigan.

    Evgeniya Sukhodolskaya
    Developer Relations at Qdrant with 8 years of IT experience across software engineering, machine learning, and technical management, and 4 years in Developer Relations. Holds a Master’s in Machine Learning, Data Analytics, and Data Engineering. Passionate about NLP, data-centric AI, and the role of vector search in advancing AI technologies.

    Andrei Cristea
    Andrei is a Berlin-based Developer Relations Engineer at Qdrant, a prominent open-source vector database. With a Master’s degree in Artificial Intelligence from TU Munich, his expertise bridges AI, data infrastructure, and knowledge engineering.

    Hosted by Demetrios

    // Related Links
    Website: https://qdrant.tech/

    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
    Join our Slack community [https://go.mlops.community/slack]
    Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
    Sign up for the next meetup: [https://go.mlops.community/register]
    MLOps Swag/Merch: [https://shop.mlops.community/]

    59 min
  2. AI Agents in Healthcare?

    2 days ago

    Kingsley Madikaegbu is the founder of HealID, a startup building agentic AI on top of the Model Context Protocol (MCP) for one of the most heavily regulated environments there is: healthcare. Recorded at MCP Dev Summit North America in New York, Kingsley sits down with Alex Salkever of the Agentic AI Foundation to break down how you give patients, doctors, caregivers, and family members each their own agent over the same medical record - without breaching HIPAA, leaking PHI, or letting an agent quietly go off the rails.

    In this conversation:
    🏗️ The four-layer architecture - Dumb data at the bottom, then access permissions, then MCP, then reasoning agents on top. Why logic never touches the data layer.
    🔐 MCP vs REST - Why enforcing per-role compliance in a REST API meant encoding permissions everywhere, and how MCP collapses that mess.
    🪪 HIPAA, auditability & traceability - Proving a specific person (not a snooping agent) accessed a record, with a full audit trail that regulators actually accept.
    🎟️ The nightclub-bouncer analogy - How MCP reorganizes the entire "club" per guest instead of just checking a VIP list.
    ⌚ Wearables & real-world data - Turning an Apple Watch arrhythmia signal into a triaged, severity-scored workflow with doctors in the loop.
    🧭 Deterministic vs model-driven - Why anything clinical or regulatory stays binary, and the agent-as-coach (not decision-maker) pattern for patients.
    🛑 Keeping agents on the leash - Tool restriction, behavioral metadata, and drift/anomaly detection so an agent can't reinterpret its own job.
    ⚡ The instant kill switch - Revoke permission, and the agent returns a hard 404, never partial data.
    ⚖️ The liability question - When an agent follows a designed workflow and something goes wrong, who's responsible: patient, host, or provider? The industry hasn't decided.
    📋 Kingsley's MCP wishlist - Built-in traceability (OTEL-style spans), native time-bound enforcement, and guardrails against agent-to-agent data leakage.

    If you're building agentic systems for healthcare, finance, legal, or any regulated industry where "the agent did it" isn't a good enough answer - this one's for you.

    Links & Resources
    🔗 HealID - https://gethealid.com/
    🔗 Kingsley Madikaegbu - https://www.linkedin.com/in/kmadikaegbu
    🔗 Alex Salkever / Agentic AI Foundation - linkedin.com/in/alexsalkever
    🔗 MCP Dev Summit North America - https://events.linuxfoundation.org/mcp-dev-summit-north-america/

    Timestamps:
    [00:00] Intro
    [00:13] AI Agent Liability
    [01:10] MCP in Healthcare AI
    [06:30] MCP vs REST Architecture
    [11:29] Healthcare Integration Challenges
    [18:29] Non-compliant Patient Challenges
    [24:13] Deterministic vs Model-Driven Workflows
    [28:08] AI in Healthcare Conversations
    [34:38] Agent-to-agent workflows in healthcare
    [38:02] Future MCP security

    39 min
  3. Coding Agents Are Secretly General Agents

    5 days ago

    In this episode:
    🧠 Coding agents are generalist agents - why "positive transfer" means an agent that's better at code is better at everything, and how that makes them "AGI-complete"
    ⏳ "Code will be solved in a year" - what the automation of knowledge work actually looks like, and why Jay joined ClickUp to be on it
    🏗️ Why the labs are crushing AI startups - free-for-two-years deals, Windsurf losing Claude access, and the brutal economics of building on top of frontier models
    🔗 The real moat is convergence - context, surfaces, and unit economics, a.k.a. "Cursor for your whole job"
    💬 Slack's data walls & the Glean problem - why fragmentation is the enemy and a single system of record wins
    🧪 RLVR & verifiability - why code became the perfect training ground for agents, and how to tell if you're even getting better
    🔬 LLMs are running the frontier of science - Putnam 12/12, Erdős problems, simulating a cell, and vibe-writing economics papers
    🚗 The car wash test that still breaks GPT-5 - spiky models, world models, Plato's cave, and the "stochastic parrot" debate
    🏖️ Plus: mechanistic interpretability as "brain surgery," catastrophic forgetting, the danger of deleting knowledge from models, and a pitch for a "resort for LLMs"

    Whether you're building agents, leading an AI team, or just trying to figure out what "agentic" really means for everyday work - this one's a fun, deep ride.

    🔗 Links & Resources
    Jay Hack: linkedin.com/in/jayhack
    ClickUp: clickup.com
    MLOps Community: go.mlops.community
    Mentioned: Gödel, Escher, Bach (Douglas Hofstadter) · "Machine Learning: The High-Interest Credit Card of Technical Debt" (Sculley et al.) · Periodic Labs · Ginkgo Bioworks · Physical Intelligence

    1 h 12 min
  4. The Dark Side of MCP Servers

    Jun 23

    Sam Partee (CTO & co-founder of Arcade.dev) and Nate Barbettini (Founding Engineer at Arcade.dev) sit down at the MCP Dev Summit to unpack what nobody wants to admit about the Model Context Protocol: the security model is still full of sharp edges. From tool poisoning and prompt injection to why OAuth got bolted onto the spec, this is a builder's-eye view of where MCP breaks - and how to ship agents safely anyway.

    What we get into:
    🔓 OAuth on MCP - Why the spec adopted OAuth as its authorization standard, and the class of spoofing attacks it shuts down.
    ☠️ Tool poisoning - How a malicious server hides instructions in tool descriptions, and why your agent trusts them by default.
    🧪 MCP Debugger & ToolBench - Shining a light on the rough edges by grading servers from S-tier to F-tier.
    🖥️ Sandboxing agents - Giving an agent a shell and a file system without handing over the keys to your machine.
    📜 Allow lists - Why MCP has client-level allow lists but skills mostly don't - and why that worries them.
    🔄 The auto-update problem - How skills and servers that silently update become a supply-chain risk ("rug pulls").
    ✅ SOC 2, honestly - Why the controls are voluntary, misunderstood, and actually about best practices.
    🤖 AI-generated PRs - The new behaviors to watch for as agents start writing and merging code.

    If you build agents, ship MCP servers, or are responsible for AI security at your company, this one's for you.

    🔗 Links & Resources
    Arcade.dev: https://www.arcade.dev
    Arcade MCP framework (GitHub): https://github.com/ArcadeAI/arcade-mcp
    Sam Partee (GitHub): https://github.com/spartee
    Nate Barbettini (LinkedIn): https://www.linkedin.com/in/nbarbettini
    MLOps.community: https://mlops.community

    ⏱️ Timestamps
    [00:00] Skills, agents, and local context
    [08:36] MCP Debugger grades your server
    [10:34] Why AI clients are still buggy
    [20:54] Why agents shouldn’t always have shell access
    [22:44] “I have a spicy take.”
    [26:27] “Do not build your own auth.”
    [31:14] The “checking someone else’s email” problem
    [35:40] “OAuth is the best worst option.”
    [43:50] The future of AI entertainment
    [46:19] Tool poisoning explained
    [50:49] “Trust me, bro,” is not a security solution
    [52:45] MCP registries as the App Store model
    [1:00:28] AI-generated PRs and speed vs quality
    [1:02:37] Why behavior-driven development is coming back
    [1:08:11] Have we already reached AGI?

    #MCP #AIAgentSecurity #ToolPoisoning

    1 h 10 min
  5. Sandboxing, Agent Harnesses, and Agent Teamwork

    Jun 19

    Shahram Anver is the Co-Founder and CEO of Cleric, the autonomous AI SRE that investigates and root-causes production issues like an experienced teammate - often in under two minutes. Before Cleric, Shahram led MLOps, DevOps, and FinOps platform engineering at Gojek, Southeast Asia's super-app. In this conversation, he breaks down why production operations never kept pace with AI-accelerated development, and why the real unlock for an AI SRE isn't faster triage - it's an agent that *learns* and compounds operational memory across your whole org.

    In this episode:
    🔧 The on-call problem - Why one broken service still drags ten engineers onto a call, and how AI changes that
    🤖 What an AI SRE actually is - How Cleric investigates across your existing observability stack instead of adding another tool
    🧠 Learning over MTTR - Why Shahram argues the value isn't alert triage, it's an agent that gets better every investigation
    🪜 Ramping like a new engineer - Explore the environment, learn from the work, talk to the team
    🔁 The investigate-measure-learn loop - Turning what worked on one incident into context for the next
    🕸️ Knowledge graphs & operational memory - Mapping teams, clusters, and dependencies so insight from one team helps another
    ⚡ Under two minutes to root cause - What "fast" really requires in a live production environment
    🚀 The road to autonomy - From assisted investigation toward self-healing infrastructure

    If you're an SRE, platform engineer, DevOps lead, or anyone building or buying AI agents for production, this one's for you.

    🔗 Links & Resources
    Cleric: https://cleric.ai
    Shahram on LinkedIn: https://www.linkedin.com/in/shahramanver/
    Willem Pienaar (Co-Founder/CTO): https://www.linkedin.com/in/willempienaar/
    Cleric launches the first self-learning AI SRE: https://cleric.ai/blog/cleric-launches-the-first-self-learning-ai-sre
    MLOps Community: https://mlops.community
    Join the community: https://go.mlops.community/slack

    ⏱️ Timestamps
    [00:00] Tech Jargon Confusion
    [00:27] Harness vs Model
    [08:48] Model Evolution in Cleric
    [13:36] Sandboxing and Simulated Environments
    [20:40] Shifting AI Perceptions
    [24:10] Managing Humans vs Agents
    [31:32] Steering Parallel Agents
    [34:16] Human Decision Integration in Models
    [43:28] 80/20 Data Split
    [49:40] Becoming a Skill
    [53:35] 2027 Agent Autonomy
    [59:14] Agent Learning in Production
    [1:04:31] Software as Personal Capabilities
    [1:08:31] Vibe Coding vs Durability
    [1:18:23] Wrap up

    #AISRE #SiteReliabilityEngineering #AIAgents

    1 h 20 min
  6. Zipline Roundtable episode: Building Real-Time ML Systems with Zipline + Chronon

    Jun 17

    Zipline Roundtable episode: Building Real-Time ML Systems with Zipline + Chronon

    Join the Community: https://go.mlops.community/YTJoinIn
    Get the newsletter: https://go.mlops.community/YTNewsletter
    MLOps GPU Guide: https://go.mlops.community/gpuguide

    Big shout-out to ZiplineAI for the collaboration!

    // Abstract
    Real-time ML use cases like personalization and risk decisioning come with a unique set of challenges: serving fresh feature values at low latency for inference, generating temporally consistent backfills for training, and building complex chains of on-demand, batch, and streaming transformations. In this roundtable, practitioners from Intuit, Credit Karma, Depop, and OpenAI share how they use Zipline and the OSS Chronon project to solve these challenges and deploy real-time ML use cases in production.

    // Bio
    German Krikorian
    German is a Software Engineer on the Feature Platform team at Credit Karma. Since joining the company during the early development of its recommendation system, they have played a key role in building and scaling the platform over the years. Their work focuses on feature pipelines and the feature store, which serves as critical infrastructure supporting numerous teams and business verticals across the organization.

    Ben Magyar
    Ben is an engineer at Depop working on ML and data systems. Before Depop, he worked on Search at Etsy. Most of his work is around the infrastructure and operational problems that come with running ML systems at scale.

    Raj Katakam
    Raj architects ML Infrastructure at Credit Karma (Intuit). He holds a Master's in Software Engineering from Carnegie Mellon and a B.Tech in EECE from IIT Kharagpur. His interests include ML Infrastructure, Distributed Systems, Real-Time Data Processing, and Generative AI. His current focus is on providing feature engineering platforms, production GenAI infrastructure, vector databases, ML model serving, and MLOps pipelines for fraud detection, personalized recommendations, financial insights, and model explainability.

    Mick Jermsurawong
    Led Flyte ML training/experimentation at Stripe, and now leads Chronon for ML features at OpenAI.

    Hosted by Demetrios

    // Related Links
    Website: https://zipline.ai/
    https://chronon.ai/

    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
    Join our Slack community [https://go.mlops.community/slack]
    Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
    Sign up for the next meetup: [https://go.mlops.community/register]
    MLOps Swag/Merch: [https://shop.mlops.community/]
    Connect with Demetrios on LinkedIn: /dpbrinkm
    Connect with German on LinkedIn: /e2zdkwh8cxghydg/
    Connect with Raj on LinkedIn: /rajkiran2190
    Connect with Mick on LinkedIn: /mick-jermsurawong/

    51 min
  7. MCP Servers Are Becoming the UI for AI Agents

    Jun 16

    Naseem Al-Naji is the co-founder of MCPcat.io and the creator of Opal - a builder with deep roots in privacy-first developer tooling. In this conversation, he breaks down why MCP servers have become a black box in production, and how MCPcat gives teams X-ray vision into how agents and users actually behave.

    What we get into:
    🐱 What MCPcat Is - Open-source analytics and live debugging built specifically for MCP servers
    🎬 Session Replay - Watch an agent's full journey through your server, tool call by tool call
    🎯 Agent Intent & Goals - Understand "why" a tool was called, not just that it was
    🔍 Trace Debugging - Find exactly where agents and users get stuck or confused
    🚨 Catching Hallucinations - How issue tracking surfaces when an LLM goes off the rails
    🔒 Privacy-First by Design - Client-side redaction so sensitive data never leaves your environment
    ⚡ One-Line Integration - Python, TypeScript, and Go SDKs that drop into existing stacks
    📊 Works With Your Stack - Native support for OpenTelemetry, Datadog, and Sentry
    🚀 The Future of MCP - Where agent observability and the MCP ecosystem are heading

    If you build, ship, or maintain MCP servers - or you're trying to figure out why your AI agents misbehave in production - this one's for you.

    🔔 Subscribe, like, and share for more conversations on agentic AI:
    ▶️ YouTube: https://www.youtube.com/@AAIFAgenticConversations
    🎧 Spotify: https://open.spotify.com/show/033rZZJrQOVSSmhcStFhZA?si=rUNjFuNqRvGvAEWwqms7TA

    Links & Resources:
    🐱 MCPcat: https://mcpcat.io
    💻 MCPcat on GitHub: https://github.com/mcpcat
    👤 Naseem on LinkedIn: https://www.linkedin.com/in/naseem-al-naji
    🐙 Naseem on GitHub: https://github.com/naji247

    Timestamps:
    [00:00] Intro
    [01:41] MCP Needs Gatekeepers
    [06:32] Measuring MCP Success
    [13:57] MCPAT Feature Rollouts
    [18:50] MCP Server Query Optimization
    [26:48] UI Design Shift
    [29:14] MCP Server Design Choices
    [33:51] User Journey Traceability
    [40:40] Agent Experience Evaluation
    [45:23] AI Model Improvement Strategies

    #MCP #AIAgents #Observability

    47 min
  8. Agents & the $40M Bet on Multiplayer AI

    Jun 12

    Stanislas Polu is Co-Founder & CTO of Dust - the enterprise AI agent platform used by 51,000 workers at 3,000+ companies. Before Dust, he spent three years on OpenAI's research team under Ilya Sutskever, working on mathematical reasoning in language models, and prior to that was an engineer at Stripe. He brings a rare combination of frontier AI research and product-building experience to the enterprise agent space.

    Agents & the $40M Bet on Multiplayer AI // MLOps Podcast #384 with Stanislas Polu, Co-Founder & CTO of Dust

    🤖 What is Dust? - How Dust enables teams to build and deploy AI agents powered by internal company data, and why the "multiplayer AI" model is winning in enterprise.
    🧠 From OpenAI Research to Startup Founder - Stanislas's journey from studying mathematical reasoning in LLMs under Ilya Sutskever to co-founding an enterprise AI company in Paris with Gabriel Hubert.
    🚀 The $40M Series B - What Dust is building with fresh funding, the bet on human-agent collaboration as the future of work, and what "multiplayer AI" actually means in practice.
    🔄 The Outer-Loop Era - Stanislas's framework for thinking about where AI agents create the most value: not just automating tasks, but rewiring how work gets done across entire organizations.
    ⚠️ What Most Enterprise AI Gets Wrong - The biggest mistakes companies make when deploying AI agents, why adoption fails, and how Dust achieves 70%+ weekly adoption rates.
    📊 Building Reliable Agent Infrastructure - Lessons from scaling to thousands of companies: observability, governance, data security, and why enterprise AI is harder than it looks.
    🛠️ Horizontal vs. Vertical AI Platforms - Why Dust chose to build a horizontal enterprise agent platform and how that decision shapes product, go-to-market, and technical architecture.

    This episode is essential for AI/ML engineers, enterprise AI leads, and anyone building or deploying AI agents at scale inside organizations.

    🔗 Links & Resources:
    • Dust: https://dust.tt
    • Stanislas Polu on X/Twitter: https://x.com/spolu
    • Dust on LinkedIn: https://www.linkedin.com/company/dust-tt
    • Dust $40M Series B announcement: https://dust.tt/blog
    • "The Outer-Loop Era" talk by Stanislas (dotconferences): https://www.youtube.com/watch?v=_outer_loop
    • Dust + Stripe MCP integration: https://stripe.com/customers/dust
    • Dust + Datadog observability case study: https://datadoghq.com/case-studies/dust

    ⏱️ Timestamps
    [00:00] Future of Work
    [00:19] Dust Scaling Lessons
    [04:44] Human-Agent Collaboration
    [14:24] Pod as Workspace
    [22:30] Work Flow Optimization
    [29:37] Multiplayer Collaboration Vision
    [39:55] Token Economics and Inference
    [47:20] AI Pricing Challenges
    [52:36] Dust vs Co-work
    [57:06] Agentic Work Infrastructure
    [1:04:23] Stateful Sandbox Challenges
    [1:09:58] Product Use Case Discussion
    [1:14:05] Agent Data Interaction Needs
    [1:20:09] Wrap up

    #EnterpriseAI #AIAgents #Dust

    1 h 21 min
4.6 out of 5
24 ratings
