Vanishing Gradients

Hugo Bowne-Anderson

A podcast for people who build with AI. Long-format conversations with people shaping the field about agents, evals, multimodal systems, data infrastructure, and the tools behind them. Guests include Jeremy Howard (fast.ai), Hamel Husain (Parlance Labs), Shreya Shankar (UC Berkeley), Wes McKinney (creator of pandas), Samuel Colvin (Pydantic) and more. hugobowne.substack.com

  1. 10 h ago

    Episode 70: 1,400 Production AI Deployments

    There’s a company who spent almost $50,000 because an agent went into an infinite loop and they forgot about it for a month. It had no failures and I guess no one was monitoring these costs. It’s nice that people do write about that in the database as well. After it happened, they said: watch out for infinite loops. Watch out for cascading tool failures. Watch out for silent failures where the agent reports it has succeeded when it didn’t! We Discuss: * Why the most successful teams are ripping out and rebuilding their agent systems every few weeks as models improve, and why over-engineering now creates technical debt you can’t afford later; * The $50,000 infinite loop disaster and why “silent failures” are the biggest risk in production: agents confidently report success while spiraling into expensive mistakes; * How ELIOS built emergency voice agents with sub-400ms response times by aggressively throwing away context every few seconds, and why these extreme patterns are becoming standard practice; * Why DoorDash uses a three-tier agent architecture (manager, progress tracker, and specialists) with a persistent workspace that lets agents collaborate across hours or days; * Why simple text files and markdown are emerging as the best “continual learning” layer: human-readable memory that persists across sessions without fine-tuning models; * The 100-to-1 problem: for every useful output, tool-calling agents generate 100 tokens of noise, and the three tactics (reduce, offload, isolate) teams use to manage it; * Why companies are choosing Gemini Flash for document processing and Opus for long reasoning chains, and how to match models to your actual usage patterns; * The debate over vector databases versus simple grep and cat, and why giving agents standard command-line tools often beats complex APIs; * What “re-architect” as a job title reveals about the shift from 70% scaffolding / 30% model to 90% model / 10% scaffolding, and why knowing when to rip things out 
may be the most important skill today. You can also find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments! 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort starts March 10, 2026. Here is a 25% discount code for readers. 👈 Show Notes Links * Alex Strick van Linschoten on LinkedIn * Alex Strick van Linschoten on Twitter/X * LLMOps Database * LLMOps Database Dataset on Hugging Face * Hugo’s MCP Server for LLMOps Database * Alex’s Blog: What 1,200+ Production Deployments Reveal About LLMOps in 2025 * Previous Episode: Practical Lessons from 750 Real-World LLM Deployments * Previous Episode: Tales from 400 LLM Deployments * Context Rot Research by Chroma * Hugo’s Post: AI Agent Harness - 3 Principles for Context Engineering * Hugo’s Post: The Rise of Agentic Search * Episode with Nick Alice Moy: The Post-Coding Era * Hugo’s Personal Podcast Prep Skill Gist * Claude Tool Search Documentation * Gastown on GitHub (Steve Yegge) * Welcome to Gastown by Steve Yegge * ZenML - Open Source MLOps & LLMOps Framework * Upcoming Events on Luma * Vanishing Gradients on YouTube * Watch the podcast livestream on YouTube * Join the final cohort of our Building AI Applications course in March, 2026 (25% off for listeners) This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com
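The infinite-loop story in this episode comes down to a loop with no ceiling on steps or spend. As a rough illustration only (not something from the episode or the LLMOps Database), here is a minimal Python sketch of a guard that stops an agent loop once it crosses a step or cost limit. The names `LoopGuard`, `BudgetExceeded`, `run_agent`, and the default limits are all hypothetical, and `step_fn` stands in for whatever one agent iteration looks like in your system.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the loop crosses its step or cost ceiling."""


@dataclass
class LoopGuard:
    max_steps: int = 20        # hypothetical ceiling on agent iterations
    max_cost_usd: float = 5.0  # hypothetical spend ceiling for one run
    steps: int = 0
    cost_usd: float = 0.0

    def charge(self, step_cost_usd: float) -> None:
        """Record one step and fail loudly once a limit is crossed."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit ${self.max_cost_usd:.2f} exceeded")


def run_agent(step_fn, guard: LoopGuard) -> LoopGuard:
    """Call step_fn until it reports done; step_fn returns (done, cost_usd)."""
    while True:
        done, cost = step_fn()
        guard.charge(cost)
        if done:
            return guard
```

Raising an exception rather than returning quietly is deliberate here: a run that hits its ceiling shows up as a failure in logs and alerts instead of looking like a success.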

    1 h 10 min
  2. Feb 3

    Episode 69: Python is Dead. Long Live Python! With the Creators of pandas & Parquet

    > It’s the agent writing the code. And it’s the development loop of writing the code, building testing, write the code, build test and iterating. And so I do think we’ll see for many types of software, a shift away from Python towards other programming languages. I think Go is probably the best language for those like other types of software projects. And like I said, I haven’t written a line of Go code in my life. – Wes McKinney (creator of pandas, Principal Architect at Posit). Wes McKinney, Marcel Kornacker, and Alison Hill join Hugo to talk about the architectural shift for multimodal AI, the rise of “agent ergonomics,” and the evolving role of developers in an AI-generated future. We Discuss: * Agent Ergonomics: Optimize for agent iteration speed, shifting from human coding to fast test environments, potentially favoring languages like Go; * Adversarial Code Review: Deploy diverse AI models to peer-review agent-generated code, catching subtle bugs humans miss; * Multimodal Data Verbs: Make operations like resizing and rotating native to your database to eliminate data-plumbing bottlenecks; * Taste as Differentiator: Value “taste”, the ability to curate and refine the best output from countless AI-generated options, over sheer execution speed; * 100x Software Volume: Embrace ephemeral, just-in-time software; prioritize aggressive generation and adversarial testing over careful planning for quality. You can also find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript of the workshop & fireside chat here in NotebookLM. If you do so, let us know anything you find in the comments! 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Here is a discount code for readers. 👈 This was a fireside chat at the end of a livestreamed workshop we did on building multimodal AI systems with Pixeltable. 
Check out the full workshop below (all code here on GitHub): Links and Resources * Wes McKinney on LinkedIn * Marcel Kornacker on LinkedIn * Alison Hill on LinkedIn * Spicy Takes * Palmer Penguins * Pixeltable * Posit * Positron * Building Multimodal AI Systems Workshop Repository * Pixeltable Docs: LLM Tool Calling with MCP Servers * Pixeltable Docs: Working with Pydantic * Upcoming Events on Luma * Vanishing Gradients on YouTube * Watch the podcast video on YouTube * Join the final cohort of our Building AI Applications course in March, 2026 (25% off for listeners) https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs What people said during the workshop “I think the interface looks amazing/simple. Strong work! 🦾” – @goldentribe “This is quite amazing. Watching this I felt the same way when I first leant pandas, NumPy and scikit and how well i was able to manipulate and wrangle data. PixelTable feels seamless and looks as good as those legendary frameworks but for Multimodal Data.” – @vinod7 “This is all extremely cool to see, I love the API and the approach.” – @steveb4191 “Thanks so much, Hugo! That was very insightful! Great work Alison and Marcel!” – @vinod7 “Just wrapped up watching a replay of the Pixeltable workshop. So cool!! Love the notebooks and working examples. The important parts were covered and worked beautifully 🕺” – @therobbrennan This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com

    55 min
  3. Jan 23

    Episode 68: A Builder’s Guide to Agentic Search & Retrieval with Doug Turnbull & John Berryman

    The best way to build a horrible search product? Don’t ever measure anything against what a user wants. Search veterans Doug Turnbull (Led Search at Reddit + Shopify; Wrote Relevant Search + AI Powered Search) and John Berryman (Early Engineer on GitHub Copilot; Author of Relevant Search + Prompt Engineering for LLMs) join Hugo to talk about how to build Agentic Search Applications. We Discuss: * The evolution of information retrieval as it moves from traditional keyword search toward “agentic search” and what this means for builders. * John’s five-level maturity model (you can prototype today!) for AI adoption, moving from Trad Search to conversational AI to asynchronous research assistants that reason about result quality. * The Agentic Search Builders Playbook, including why and how you should “hand-roll” your own agentic loops to maintain control; * The importance of “revealed preferences” that LLM-judges often miss (evaluations must use real clickstream data to capture “revealed preferences” that semantic relevance alone cannot infer) * Patterns and Anti-Patterns for Agentic Search Applications * Learning and teaching Search in the Age of Agents You can find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments! 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Here is a discount code for readers. 👈 Doug and Hugo are also doing a free lightning lesson on Feb 20 about How To Build Your First Agentic Search Application! You’ll walk away with a framework & code to build your first agentic search app. Register here to join live or get the recording after. 
Links and Resources Guests * Arcturus Labs (John’s website) * Software Doug (Doug’s website) * John Berryman on LinkedIn * Doug Turnbull on LinkedIn Books * Relevant Search by Doug Turnbull & John Berryman (Manning) * AI-Powered Search by Doug Turnbull (Manning) * Prompt Engineering for LLMs by John Berryman (O’Reilly) Blog Posts * Incremental AI Adoption for E-commerce by John Berryman * Roaming RAG – RAG without the Vector Database by John Berryman * Agents Turn Simple Keyword Search into Compelling Search Experiences by Doug Turnbull * A Simple Agentic Loop with Just Python Functions by Doug Turnbull * Agentic Code Generation to Optimize a Search Reranker by Doug Turnbull * LLM Judges Aren’t the Shortcut You Think by Doug Turnbull (Hugo’s 5 minute video below) * Malleable Software by Ink & Switch (inc. Geoffrey Lit) * Patterns and Anti-Patterns for Building with AI by Hugo Bowne-Anderson Other Resources * The Rise of Agentic Search, a recent VG Podcast with Jeff Huber * Karpathy on Cognitive Core LLMs * Cheat at Search with Agents course by Doug Turnbull (use code: vanishinggradients for $200 off) * Upcoming Events on Luma * Vanishing Gradients on YouTube * Watch the podcast video on YouTube * Join the final cohort of our Building AI Applications course in Q1, 2026 (25% off for listeners) Timestamps (for YouTube livestream) 00:00 How to Build Agentic Search & Retrieval Systems 02:48 Defining Search and AI 03:26 Evolution of Search Technologies 08:46 Search in E-commerce and Other Domains 12:15 Combining Search and AI: RAG and LLMs 23:50 User Intent and Search Optimization 29:47 Levels of AI Integration in Search 32:25 Exploring the Complexity of Search in Various Domains 33:49 The Evolution and Impact of Agentic Search 34:07 Defining Terms: RAG and Agentic Search 34:52 The Research Loop and Tool Interaction 35:55 Formal Protocols and Structured Outputs 38:39 Building Agentic Search Experiences: Tips and Advice 41:50 The Importance of Empathy in AI and Search 
Development 54:30 The Role of UX in Search Applications 01:01:15 Future of Search: Malleable User Interfaces 01:02:38 Exploring Malleable Software 01:04:20 The Coordination Challenge in Software Development 01:05:23 The Impact of Claude Code & Claude Cowork 01:06:22 The Future of Knowledge Work with AI 01:12:39 Evaluating Search Algorithms with AI 01:15:15 The Role of Agents in Search Optimization 01:29:55 Teaching AI and Search Techniques 01:34:25 Final Thoughts and Farewell 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Here is a discount code for readers. 👈 https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgpod This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com

    1 h 29 min
  4. Jan 14

    Episode 67: Saving Hundreds of Hours of Dev Time with AI Agents That Learn

    This is continual learning, right? Everyone has been talking about continual learning as the next challenge in AI. Actually, it’s solved. Just tell it to keep some notes somewhere. Sure, it’s not, it’s not machine learning, but in some ways it is because when it will load this text file again, it will influence what it does … And it works so well: it’s easy to understand. It’s easy to inspect, it’s easy to evolve and modify! Eleanor Berger and Isaac Flaath, the minds behind Elite AI Assisted Coding, join Hugo to talk about how to redefine software development through effective AI-assisted coding, leveraging “specification-first” approaches and advanced agentic workflows. We Discuss: * Markdown learning loops: Use simple agents.md files for agents to self-update rules and persist context, creating inspectable, low-cost learning; * Intent-first development: As AI commoditizes syntax, defining clear specs and what makes a result “good” becomes the core, durable developer skill; * Effortless documentation: Leverage LLMs to distill messy “brain dumps” or walks-and-talks into structured project specifications, offloading context faster; * Modular agent skills: Transition from MCP servers to simple markdown-based “skills” with YAML and scripts, allowing progressive disclosure of tool details; * Scheduled async agents: Break the chat-based productivity ceiling by using GitHub Actions or Cron jobs for agents to work on issues, shifting humans to reviewers; * Automated tech debt audits: Deploy background agents to identify duplicate code, architectural drift, or missing test coverage, leveraging AI to police AI-induced messiness; * Explicit knowledge culture: AI agents eliminate “cafeteria chat” by forcing explicit, machine-readable documentation, solving the perennial problem of lost institutional knowledge; * Tiered model strategy: Optimize token spend by using high-tier “reasoning” models (e.g., Opus) for planning and low-cost, high-speed models (e.g., Flash) for 
execution; * Ephemeral software specs: With near-zero generation costs, software shifts from static products to dynamic, regenerated code based on a permanent, underlying specification. You can also find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments! 👉 Eleanor & Isaac are teaching their next cohort of their Elite AI Assisted Coding course starting this week. They’re kindly giving readers of Vanishing Gradients 25% off. Use this link. 👈 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Here is a discount code for readers. 👈 Show Notes * Elite AI Assisted Coding Substack * Eleanor Berger on LinkedIn * Isaac Flaath on LinkedIn * Elite AI Assisted Coding Course (Use the code HUGO for 25% off) * How to Build an AI Agent with AI-Assisted Coding * Eleanor/Isaac’s blog post “The SpecFlow Process for AI Coding” * Eleanor’s growing list of (free) tutorials on Agent Skills * Eleanor’s YouTube playlist on agent skills * Eleanor’s blog post “Are (Agent) Skills the New Apps” * Simon Willison’s blog post on skills/general computer automation/data journalism agents * Eleanor/Isaac’s blog post about asynchronous client agents in GitHub actions * Eleanor/Isaac’s blog post on agentic coding workflows with Hang Yu, Product Lead for Qoder @ Alibaba * Upcoming Events on Luma * Vanishing Gradients on YouTube * Watch the podcast video on YouTube * Join the final cohort of our Building AI Applications course in Q1, 2026 (25% off for listeners) Timestamps (for YouTube livestream) 00:00 Introduction to Elite AI Assisted Coding 02:24 Starting a New AI Project: Best Practices 03:19 The Importance of Context in AI Projects 07:19 Specification-First Planning 12:01 Sharing Intent and Documentation 18:27 Living Documentation and Continual 
Learning 24:36 Choosing the Right Tools and Models 29:18 Managing Costs and Token Usage 40:16 Using Different Models for Different Tasks 43:41 Mastering One Model for Better Results 44:54 The Rise of Agent Skills in 2026 45:34 Understanding the Importance of Skills 47:18 Practical Applications of Agent Skills 01:11:43 Security Concerns with AI Agents 01:15:02 Collaborative AI-Assisted Coding 01:18:59 Future of AI-Assisted Coding 01:22:27 Key Takeaways for Effective AI-Assisted Coding Live workshop with Eleanor, Isaac, & Hugo We also recently did a 90-minute workshop on How to Build an AI Agent with AI-Assisted Coding. We wrote a blog post on it for those who don’t have 90 minutes right now. Check it out here. I then made a 4 min video about it all for those who don’t have time to read the blog post. 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Here is a discount code for readers. 👈 https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vg-ei This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com
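The “markdown learning loop” described in this episode (the agent keeps notes in a plain text file and reloads them next session) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `load_notes`, `append_note`, and `build_prompt` are hypothetical, the file name is whatever you choose (the episode mentions agents.md files), and nothing here reflects how the guests actually implement it.

```python
from pathlib import Path


def load_notes(path: Path) -> str:
    """Return the notes file contents, or an empty string on the first run."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def append_note(path: Path, lesson: str) -> None:
    """Append one lesson as a markdown bullet, skipping exact duplicates."""
    bullet = f"- {lesson.strip()}"
    if bullet in load_notes(path).splitlines():
        return
    with path.open("a", encoding="utf-8") as f:
        f.write(bullet + "\n")


def build_prompt(task: str, notes_path: Path) -> str:
    """Prepend the persisted notes to the task so they shape the next run."""
    notes = load_notes(notes_path)
    header = f"# Notes from earlier sessions\n{notes}\n" if notes else ""
    return f"{header}# Task\n{task}"
```

Because the memory is just a text file, a person can open it, read it, and edit or delete a rule, which is the inspectability the guests point to.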

    1 h 18 min
  5. Jan 8

    Episode 66: The Agent Paradox - Why Moderna's Most Productive AI Systems Aren't Agents

    Surprise. We don’t have agents. I actually went in and did an audit of all the LLM applications that we’ve developed internally. And if you were to take Anthropic’s definition of workflow versus agent, we don’t have agents. I would not classify any of our applications as agents. Eric Ma, who leads Research Data Science in the Data Science and AI group at Moderna, joins Hugo on moving past the hype of autonomous agents to build reliable, high-value workflows. We discuss: * Reliable Workflows: Prioritize rigid workflows over dynamic AI agents to ensure reliability and minimize stochasticity in production environments; * Permission Mapping: The true challenge in regulated environments is security, specifically mapping permissions across source documents, vector stores, and model weights; * Trace Log Risk: LLM execution traces pose a regulatory risk, inadvertently leaking restricted data like trade secrets or personal information; * High-Value Data Work: LLMs excel at transforming archived documents and freeform forms into required formats, offloading significant “janitorial” work from scientists; * “Non-LLM” First: Solve problems with simpler tools like Python or ML models before LLMs to ensure robustness and eliminate generative AI stochasticity; * Contextual Evaluation: Tailor evaluation rigor to consequences; low-stakes tools can be “vibe-checked,” while patient safety outputs demand exhaustive error characterization; * Serverless Biotech Backbone: Serverless infrastructure like Modal and reactive notebooks such as Marimo empowers biotech data scientists for rapid deployment without heavy infrastructure overhead. You can also find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments! 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. 
It’s a live cohort with hands-on exercises and office hours. Our final cohort is in Q1, 2026. Here is a 35% discount code for readers. 👈 https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgch 👉 Eric & Hugo have a free upcoming livestream workshop: Building Tools for Thinking with AI (register to join live or get the recording afterwards) 👈 Show notes * Eric’s website * Eric Ma on LinkedIn * Eric’s blog * Eric’s data science newsletter * Building Effective AI Agents by the Anthropic team * Wow, Marimo from Eric’s blog * Wow, Modal from Eric’s blog * Upcoming Events on Luma * Watch the podcast video on YouTube * Join the final cohort of our Building AI Applications course in Q1, 2026 (35% off for listeners) Timestamps 00:00 Defining Agents and Workflows 02:04 Challenges in Regulated Environments 04:24 Eric Ma's Role at Moderna, Leading Research Data Science in the Data Science and AI Group 12:37 Document Reformatting and Automation 15:42 Data Security and Permission Mapping 20:05 Choosing the Right Model for Production 20:41 Evaluating Model Changes with Benchmarks 23:10 Vibe-Based Evaluation vs. Formal Testing 27:22 Security and Fine-Tuning in LLMs 28:45 Challenges and Future of Fine-Tuning 34:00 Security Layers and Information Leakage 37:48 Wrap-Up and Final Remarks This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com

    43 min
  6. 2025-12-19

    Episode 65: The Rise of Agentic Search

    We’re really moving from a world where humans are authoring search queries and humans are executing those queries and humans are digesting the results to a world where AI is doing that for us. Jeff Huber, CEO and co-founder of Chroma, joins Hugo to talk about how agentic search and retrieval are changing the very nature of search and software for builders and users alike. We Discuss: * “Context engineering”, the strategic design and engineering of what context gets fed to the LLM (data, tools, memory, and more), which is now essential for building reliable, agentic AI systems; * Why simply stuffing large context windows is no longer feasible due to “context rot” as AI applications become more goal-oriented and capable of multi-step tasks; * A framework for precisely curating and providing only the most relevant, high-precision information to ensure accurate and dependable AI systems; * The “agent harness”, the collection of tools and capabilities an agent can access, and how to construct these advanced systems; * Emerging best practices for builders, including hybrid search as a robust default, creating “golden datasets” for evaluation, and leveraging sub-agents to break down complex tasks; * The major unsolved challenge of agent evaluation, emphasizing a shift towards iterative, data-centric approaches. You can also find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments! 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort is in Q1, 2026. Here is a 35% discount code for readers. 👈 Oh! 
One more thing: we’ve just announced a Vanishing Gradients livestream for January 21 that you may dig: * A Builder’s Guide to Agentic Search & Retrieval with Doug Turnbull and John Berryman (register to join live or get the recording afterwards). Show notes * Jeff Huber on Twitter * Jeff Huber on LinkedIn * Try Chroma! * Context Rot: How Increasing Input Tokens Impacts LLM Performance by The Chroma Team * AI Agent Harness, 3 Principles for Context Engineering, and the Bitter Lesson Revisited * From Context Engineering to AI Agent Harnesses: The New Software Discipline * Generative Benchmarking by The Chroma Team * Effective context engineering for AI agents by The Anthropic Team * Making Sense of Millions of Conversations for AI Agents by Ivan Leo (Manus) and Hugo * How we built our multi-agent research system by The Anthropic Team * Upcoming Events on Luma * Watch the podcast video on YouTube 👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort is in Q1, 2026. Here is a 35% discount code for readers. 👈 https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgch This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com

    52 min
  7. 2025-12-03

    Episode 64: Data Science Meets Agentic AI with Michael Kennedy (Talk Python)

    We have been sold a story of complexity. Michael Kennedy (Talk Python) argues we can escape this by relentlessly focusing on the problem at hand, reducing costs by orders of magnitude in software, data, and AI. In this episode, Michael joins Hugo to dig into the practical side of running Python systems at scale. They connect these ideas to the data science workflow, exploring which software engineering practices allow AI teams to ship faster and with more confidence. They also detail how to deploy systems without unnecessary complexity and how Agentic AI is fundamentally reshaping development workflows. We talk through: - Escaping complexity hell to reduce costs and gain autonomy - The specific software practices, like the "Docker Barrier", that matter most for data scientists - How to replace complex cloud services with a simple, robust $30/month stack - The shift from writing code to "systems thinking" in the age of Agentic AI - How to manage the people-pleasing psychology of AI agents to prevent broken code - Why struggle is still essential for learning, even when AI can do the work for you LINKS Talk Python In Production, the Book! 
(https://talkpython.fm/books/python-in-production) Just Enough Python for Data Scientists Course (https://training.talkpython.fm/courses/just-enough-python-for-data-scientists) Agentic AI Programming for Python Course (https://training.talkpython.fm/courses/agentic-ai-programming-for-python) Talk Python To Me (https://talkpython.fm/) and a recent episode with Hugo as guest: Building Data Science with Foundation LLM Models (https://talkpython.fm/episodes/show/526/building-data-science-with-foundation-llm-models) Python Bytes podcast (https://pythonbytes.fm/) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) Watch the podcast video on YouTube (https://youtube.com/live/jfSRxxO3aRo?feature=share) Join the final cohort of our Building AI Applications course starting Jan 12, 2026 (35% off for listeners) (https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgrav) This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com

    1 h 3 min
  8. 2025-11-22

    Episode 63: Why Gemini 3 Will Change How You Build AI Agents with Ravin Kumar (Google DeepMind)

    Gemini 3 is a few days old and the massive leap in performance and model reasoning has big implications for builders: as models begin to self-heal, builders are literally tearing out the functionality they built just months ago... ripping out the defensive coding and reshipping their agent harnesses entirely. Ravin Kumar (Google DeepMind) joins Hugo to break down exactly why the rapid evolution of models like Gemini 3 is changing how we build software. They detail the shift from simple tool calling to building reliable "Agent Harnesses", and explore the architectural tradeoffs between deterministic workflows and high-agency systems, the nuance of preventing context rot in massive windows, and why proper evaluation infrastructure is the only way to manage the chaos of autonomous loops. They talk through: - The implications of models that can "self-heal" and fix their own code - The two cultures of agents: LLM workflows with a few tools versus when you should unleash high-agency, autonomous systems. 
- Inside NotebookLM: moving from prototypes to viral production features like Audio Overviews - Why Needle in a Haystack benchmarks often fail to predict real-world performance - How to build agent harnesses that turn model capabilities into product velocity - The shift from measuring latency to managing time-to-compute for reasoning tasks LINKS From Context Engineering to AI Agent Harnesses: The New Software Discipline, a podcast Hugo did with Lance Martin, LangChain (https://high-signal.delphina.ai/episode/context-engineering-to-ai-agent-harnesses-the-new-software-discipline) Context Rot: How Increasing Input Tokens Impacts LLM Performance (https://research.trychroma.com/context-rot) Effective context engineering for AI agents by Anthropic (https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents) Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) Watch the podcast video on YouTube (https://youtu.be/CloimQsQuJM) Join the final cohort of our Building AI Applications course starting Jan 12, 2026 (https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgrav) This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit hugobowne.substack.com

    1 h
