MLOps.community

Demetrios

Relaxed Conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, Vibes, etc)

  1. How Sierra AI Does Context Engineering

    1 day ago

    Zack Reneau-Wedeen is the Head of Product at Sierra, leading the development of enterprise-ready AI agents, from Agent Studio 2.0 to the Agent Data Platform, with a focus on richer workflows, persistent memory, and high-quality voice interactions.

    How Sierra Does Context Engineering, Zack Reneau-Wedeen // MLOps Podcast #350

    Join the Community: https://go.mlops.community/YTJoinIn
    Get the newsletter: https://go.mlops.community/YTNewsletter

    // Abstract
    Sierra’s Zack Reneau-Wedeen claims we’re building AI all wrong and that “context engineering,” not bigger models, is where the real breakthroughs will come from. In this episode, he and Demetrios Brinkmann unpack why AI behaves more like a moody coworker than traditional software, why testing it with real-world chaos (noise, accents, abuse, even bad mics) matters, and how Sierra’s simulations and model “constellations” aim to fix the industry’s reliability problems. They even argue that decision trees are dead, replaced by goals, guardrails, and speculative execution tricks that make voice AI actually usable. Plus: how Sierra trains grads to become product-engineering hybrids, and why obsessing over customers might be the only way AI agents stop disappointing everyone.

    // Related Links
    Website: https://www.zackrw.com/

    ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
    Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
    Join our Slack community [https://go.mlops.community/slack]
    Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
    Sign up for the next meetup: [https://go.mlops.community/register]
    MLOps Swag/Merch: [https://shop.mlops.community/]
    Connect with Demetrios on LinkedIn: /dpbrinkm
    Connect with Zack on LinkedIn: /zackrw/

    Timestamps:
    [00:00] Electron cloud vs energy levels
    [03:47] Simulation vs red teaming
    [06:51] Access control in models
    [10:12] Voice vs text simulations
    [13:12] Speaker-adaptive turn-taking
    [18:26] Accents and model behavior
    [23:52] Outcome-based pricing risks
    [31:40] AI cross-pollination strategies
    [41:26] Ensemble of models explanation
    [46:47] Real-time agents vs decision trees
    [50:15] Code and no-code mix
    [54:04] Goals and guardrails explained
    [56:23] Wrap up
    [57:31] APX program!

    1 hr 4 min
  2. Overcoming Challenges in AI Agent Deployment: The Sweet Spot for Governance and Security // Spencer Reagan // #349

    6 days ago

    Spencer Reagan leads R&D at Airia, working on secure AI-agent orchestration, data governance systems, and real-time signal fusion technologies for regulated and defense environments.

    Overcoming Challenges in AI Agent Deployment: The Sweet Spot for Governance and Security // MLOps Podcast #349 with Spencer Reagan, R&D at Airia.

    Shoutout to Airia for powering this MLOps Podcast episode.

    // Abstract
    Spencer Reagan thinks it might be, and he’s not shy about saying so. In this episode, he and Demetrios Brinkmann get real about the messy, over-engineered state of agent systems, why LLMs still struggle in the wild, and how enterprises keep tripping over their own data chaos. They unpack red-teaming, security headaches, and the uncomfortable truth that most “AI platforms” still don’t scale. If you want a sharp, no-fluff take on where agents are actually headed, this one’s worth a listen.

    // Bio
    Passionate about technology, software, and building products that improve people's lives.

    // Related Links
    Website: https://airia.com/
    Machine Learning, AI Agents, and Autonomy // Egor Kraev // MLOps Podcast #282 - https://youtu.be/zte3QDbQSek
    Re-Platforming Your Tech Stack // Michelle Marie Conway & Andrew Baker // MLOps Podcast #281 - https://youtu.be/1ouSuBETkdA

    Connect with Spencer on LinkedIn: /spencerreagan/

    Timestamps:
    [00:00] AI industry future
    [00:55] Use cases in software
    [05:44] LLMs for data normalization
    [11:02] ROI and overengineering
    [15:58] Street width history
    [20:58] High ROI examples
    [25:16] AI building challenges
    [33:37] Budget control challenges
    [39:30] Airia Orchestration platform
    [46:25] Agent evaluation breakdown
    [53:48] Wrap up

    54 min
  3. Hardening Agents for E-commerce Scale: From RL Alignment to Reliability // Panel 2

    DEC 2

    Thanks to Prosus Group for collaborating on the Agents in Production Virtual Conference 2025.

    Abstract //
    The discussion centers on highly technical yet practical themes, such as the use of advanced post-training techniques like Direct Preference Optimization (DPO) and Parameter-Efficient Fine-Tuning (PEFT) to ensure LLMs maintain stability while specializing for e-commerce domains. We compare the implementation challenges of Computer-Using Agents in automating legacy enterprise systems versus the stability issues faced by conversational agents when inputs become unpredictable in production. We will analyze the role of cloud infrastructure in supporting the continuous, iterative training loops required by Reinforcement Learning-based agents for e-commerce!

    Bio //
    Paul van der Boor (Panel Host) // Paul van der Boor is a Senior Director of Data Science at Prosus and a member of its internal AI group.

    Arushi Jain (Panelist) // Arushi is a Senior Applied Scientist at Microsoft, working on LLM post-training for Computer-Using Agent (CUA) through Reinforcement Learning. She previously completed Microsoft’s competitive 2-year AI Rotational Program (MAIDAP), building and shipping AI-powered features across four product teams. She holds a Master’s in Machine Learning from the University of Michigan, Ann Arbor, and a Dual Degree in Economics from IIT Kanpur. At Michigan, she led the NLG efforts for the Alexa Prize Team, securing a $250K research grant to develop a personalized, active-listening socialbot. Her research spans collaborations with Rutgers School of Information, Virginia Tech’s Economics Department, and UCLA’s Center for Digital Behavior. Beyond her technical work, Arushi is a passionate advocate for gender equity in AI. She leads the Women in Data Science (WiDS) Cambridge community, scaling participation in her ML workshops from 25 women in 2020 to 100+ in 2025, empowering women and non-binary technologists through education and mentorship.

    Swati Bhatia // Passionate about building and investing in cutting-edge technology to drive positive impact. Currently shaping the future of AI/ML at Google Cloud. 10+ years of global experience across the U.S., EMEA, and India in product, strategy & venture capital (Google, Uber, BCG, Morpheus Ventures).

    Audi Liu // I’m passionate about making AI more useful and safe. Why? Because AI will be ubiquitous in every workflow, powering our lives just like how electricity revolutionized our society. It’s pivotal we get it right. At Inworld AI, we believe all future software will be powered by voice. As a Sr Product Manager at Inworld, I'm focused on building a real-time voice API that empowers developers to create engaging, human-like experiences. Inworld offers state-of-the-art voice AI at a radically accessible price: No. 1 on Hugging Face and Artificial Analysis, instant voice cloning, rich multilingual support, real-time streaming, and emotion plus non-verbal control, all for just $5 per million characters.

    Isabella Piratininga // Experienced Product Leader with over 10 years in the tech industry, shaping impactful solutions across micro-mobility, e-commerce, and leading organizations in the new economy, such as OLX, iFood, and now Nubank. I began my journey as a Product Owner during the early days of modern product management, contributing to pivotal moments like scaling startups, mergers of major tech companies, and driving innovation in digital banking. My passion lies in solving complex challenges through user-centered product strategies. I believe in creating products that serve as a bridge between user needs and business goals, fostering value and driving growth. At Nubank, I focus on redefining financial experiences and empowering users with accessible and innovative solutions.

    29 min
  4. Building Cursor: A Fireside Chat with VP Solutions Ricky Doar

    NOV 27

    Ricky Doar is the VP of Solutions at Cursor, where he leads forward-deployed engineers. A seasoned product and technical leader with over a decade of experience in developer tools and data platforms, Ricky previously served as VP of Field Engineering at Vercel, where he led global technical solutions for the company's next-generation frontend platform. Prior to Vercel, Ricky held multiple leadership roles at Segment (acquired by Twilio), including Director of Product Management for Twilio Engage, Group Product Manager for Personas, and RVP of Solutions Engineering for the West and APAC regions. He also worked as a Product Engineer and Senior Sales Engineer at Mixpanel, bringing deep technical expertise to customer-facing roles.

    Thanks to Prosus Group for collaborating on the Agents in Production Virtual Conference 2025.

    In this session, Ricky Doar, VP of Solutions at Cursor, shares actionable insights from leading large-scale AI developer tool implementations at the world’s top enterprises. Drawing on field experience with organizations at the forefront of transformation, Ricky highlights key best practices, observed power-user patterns, and deployment strategies that maximize value and ensure smooth rollout. Learn what distinguishes high-performing teams, how tailored onboarding accelerates adoption, and which support resources matter most for driving enterprise-wide success.

    A Prosus | MLOps Community Production

    27 min
  5. Relational Foundation Models: Unlocking the Next Frontier of Enterprise AI // Jure Leskovec // #348

    NOV 25

    Dr. Jure Leskovec is the Chief Scientist at Kumo.AI and a Stanford professor, working on relational foundation models and graph-transformer systems that bring enterprise databases into the foundation-model era.

    Relational Foundation Models: Unlocking the Next Frontier of Enterprise AI // MLOps Podcast #348 with Jure Leskovec, Professor and Chief Scientist, Stanford University and Kumo.AI.

    // Abstract
    Today’s foundation models excel at text and images, but they miss the relationships that define how the world works. In every enterprise, value emerges from connections: customers to products, suppliers to shipments, molecules to targets. This talk introduces Relational Foundation Models (RFMs), a new class of models that reason over interactions, not just data points. Drawing on advances in graph neural networks and large-scale ML systems, I’ll show how RFMs capture structure, enable richer reasoning, and deliver measurable business impact. The audience will learn where relational modeling drives the biggest wins, how to build the data backbone for it, and how to operationalize these models responsibly and at scale.

    // Bio
    Jure Leskovec is the co-founder of Kumo.AI, an enterprise AI company pioneering AI foundation models that can reason over structured business data. He is also a Professor of Computer Science at Stanford University and a leading researcher in artificial intelligence, best known for pioneering Graph Neural Networks and creating PyG, the most widely used graph learning toolkit. Previously, Jure served as Chief Scientist at Pinterest and as an investigator at the Chan Zuckerberg BioHub. His research has been widely adopted in industry and government, powering applications at companies such as Meta, Uber, YouTube, Amazon, and more. He has received top awards in AI and data science, including the ACM KDD Innovation Award.

    // Related Links
    Website: https://cs.stanford.edu/people/jure/
    https://www.youtube.com/results?search_query=jure+leskovec
    Please watch Jure's keynote: https://www.youtube.com/watch?v=Rcfhh-V7x2U

    Connect with Jure on LinkedIn: /leskovec

    Timestamps:
    [00:00] Structured data value
    [00:26] Breakdown of ML Claims
    [05:04] LLMs vs recommender systems
    [10:09] Building a relational model
    [15:47] Feature engineering impact
    [20:42] Knowledge graph inference
    [26:45] Advertising models scale
    [32:57] Feature stores evolution
    [38:00] Training model compute needs
    [42:34] Predictive AI for agents
    [45:32] Leveraging faster predictive models
    [48:00] Wrap up

    49 min
  6. Context Engineering, Context Rot, & Agentic Search with the CEO of Chroma, Jeff Huber

    NOV 21

    Jeff Huber is the CEO of Chroma, working on context engineering and building reliable retrieval infrastructure for AI systems.

    Context Engineering, Context Rot, & Agentic Search with the CEO of Chroma, Jeff Huber // MLOps Podcast #348.

    // Abstract
    Jeff Huber drops some hard truths about “context rot”, the slow decay of AI memory that’s quietly breaking your favorite models. From retrieval chaos to the hidden limits of context windows, he and Demetrios Brinkmann unpack why most AI systems forget what matters and how Chroma is rethinking the entire retrieval stack. It’s a bold look at whether smarter AI means cleaner context, or just better ways to hide the mess.

    // Bio
    Jeff Huber is the CEO and cofounder of Chroma. Chroma has raised $20M from top investors in Silicon Valley and builds modern search infrastructure for AI.

    // Related Links
    Website: https://www.trychroma.com/

    Connect with Jeff on LinkedIn: /jeffchuber/

    Timestamps:
    [00:00] AI intelligence context clarity
    [00:37] Context rot explanation
    [03:02] Benchmarking context windows
    [05:09] Breaking down search eras
    [10:50] Agent task memory issues
    [17:21] Semantic search limitations
    [22:54] Context hygiene in AI
    [30:15] Chroma on-device functionality
    [38:23] Vision for precision systems
    [43:07] ML model deployment challenges
    [44:17] Wrap up

    45 min
  7. Reliable Voice Agents

    NOV 18

    Brooke Hopkins is the CEO of Coval, a company making voice agents more reliable.

    Reliable Voice Agents // MLOps Podcast #347 with Brooke Hopkins, Founder of Coval.

    // Abstract
    Voice AI is finally growing up, but not without drama. Brooke Hopkins joins Demetrios Brinkmann to unpack why most “smart” voice systems still feel dumb, what it actually takes to make them reliable, and how startups are quietly outpacing big tech in building the next generation of voice agents.

    // Bio
    Brooke Hopkins is the founder of Coval, a simulation and evaluation platform for AI agents. She previously led the evaluation job infrastructure at Waymo. There, her team was responsible for the developer tools for launching and running simulations, and she engineered many of the core simulation systems from the ground up.

    // Related Links
    Website: https://www.coval.dev/

    Connect with Brooke on LinkedIn: /bnhop/

    Timestamps:
    [00:00] Workshop feedback
    [02:21] IVR frustration and transition
    [05:06] Voice use cases in business
    [11:00] Voice AI reliability challenge
    [18:46] Voice AI reliability issues
    [24:35] Injecting context
    [27:16] Conversation flow analysis
    [34:52] AI overgeneralization and confidence
    [37:41] Wrap up

    38 min
  8. The Future of AI Operations: Insights from PwC AI Managed Services

    NOV 14

    Rani Radhakrishnan is a Principal at PwC US, leading work on AI-managed services, autonomous agents, and data-driven transformation for enterprises.

    The Future of AI Operations: Insights from PwC AI Managed Services // MLOps Podcast #345 with Rani Radhakrishnan, Principal, Technology Managed Services - AI, Data Analytics and Insights at PwC US.

    Huge thanks to PwC for supporting this episode!

    // Abstract
    In today’s data-driven IT landscape, ML lifecycle management and IT operations are converging. On this podcast, we’ll explore how end-to-end ML lifecycle practices extend to proactive, automation-driven IT operations. We'll discuss key MLOps concepts (CI/CD pipelines, feature stores, model monitoring) and how they power anomaly detection, event correlation, and automated remediation.

    // Bio
    Rani Radhakrishnan, a Principal at PwC, currently leads the AI Managed Services and Data & Insight teams in PwC US Technology Managed Services. Rani excels at transforming data into strategic insights, driving informed decision-making, and delivering innovative solutions. Her leadership is marked by a deep understanding of emerging technologies and a commitment to leveraging them for business growth. Rani’s ability to align and deliver AI solutions with organizational outcomes has established her as a thought leader in the industry. Her passion for applying technology to solve tough business challenges and dedication to excellence continue to inspire her teams and help drive success for her clients in the rapidly evolving AI landscape.

    // Related Links
    Website: https://www.pwc.com/us/managedservices
    https://www.pwc.com/us/en/tech-effect.html

    Connect with Rani on LinkedIn: /rani-radhakrishnan-163615

    Timestamps:
    [00:00] Getting to Know Rani
    [01:54] Managed services
    [03:50] AI usage reflection
    [06:21] IT operations and MLOps
    [11:23] MLOps and agent deployment
    [14:35] Startup challenges in managed services
    [16:50] Lift vs practicality in ML
    [23:45] Scaling in production
    [27:13] Data labeling effectiveness
    [29:40] Sustainability considerations
    [37:00] Product engineer roles
    [40:21] Wrap up

    41 min
4.6
out of 5
23 ratings
