Vanishing Gradients

Hugo Bowne-Anderson

A podcast for people who build with AI. Long-format conversations with people shaping the field about agents, evals, multimodal systems, data infrastructure, and the tools behind them. Guests include Jeremy Howard (fast.ai), Hamel Husain (Parlance Labs), Shreya Shankar (UC Berkeley), Wes McKinney (creator of pandas), Samuel Colvin (Pydantic) and more. hugobowne.substack.com

  1. 2 HR AGO

    Agentic Engineering and the Lost Art of Verification

    > I almost don’t read code now. My approach with RoboRev is it’s like my code reader. The mantra is: RoboRev reads every line of code that is generated. It gets read multiple times. And so, whenever I push up a pull request, the branch gets re-reviewed. And so by the time I’m merging a pull request into a repository, the code has all been read by agents four or five times minimum. I look at the code in terms of structural detail: does it look right? — Wes McKinney (creator of pandas, Posit)

    Wes, Jeremiah Lowin (Prefect), and Randy Olson (Good Eye Labs) join Hugo and his cohost Thomas Wiecki (PyMC Labs) for the premiere of Show Us Your Agent Skills, a live session where guests walk us through the exact skills, workflows, and setups they use to work with agents every day.

    We Discuss:

    * Wes McKinney on why he barely writes, or even reads, code anymore; his “software factory” of parallel agents; and RoboRev, the background reviewer that reads every line four or five times before he merges;
    * The shift from “vibe coding” to agentic engineering, and why verification, not reading, is the part that actually matters;
    * Jeremiah Lowin on years of context engineering: trickling voice memos, recorded meetings, and morning briefs into his agent’s memory substrate as a true “second brain”;
    * Why Jeremiah picked OpenCode specifically for how deeply he can customize its memory, and what he’s building with FastMCP, Prefab, and Cardboard;
    * Randy Olson on encoding human judgment, like Tufte’s rules for data visualization, directly into agent skills, so the agents themselves perform the verification;
    * The “digital twin” Randy loads into his agents as a thought partner that pushes back instead of agreeing;
    * Skills as thin drivers, progressive disclosure, and managing context rot across extended sessions;
    * The rise of ephemeral, “just for me” software that agents finally make viable.
    Skills and workflows discussed and shown in the episode:

    * Wes’s RoboRev background code reviewer, his “software factory” dashboard, and his agentic engineering setup built on the Superpowers skills framework;
    * Jeremiah’s “explain” skill (which anchors every other skill he has), his voice memo memory pipeline, his FastMCP and Prefab projects, and Cardboard, his ephemeral presentation tool;
    * Randy’s data visualization verifier skills, his digital twin thought partner prompt, his cron job reports for colleagues, and his reflect-and-improve skill design pattern.

    Check out the GitHub repo where we’re starting to drop some of these skills and workflows for you to grab and try yourself. You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript in NotebookLM. If you do, let us know anything you find in the comments!

    Up next on Show Us Your Agent Skills: Hilary Mason (CEO, HiddenDoor), Bryan Bischof (Theory Ventures), Eric Ma (Research DS lead, Moderna Therapeutics), and Tomasz Tunguz (Theory Ventures). Register on lu.ma to join live, or catch the recording afterwards.

    👉 Want to learn how to apply agentic engineering to the world of data science? Come build the future of Agentic Data Science with us in our upcoming course. It’s a live cohort with hands-on exercises, capstones, reusable agent skills, OSS code, and notebooks that will 10x your data science projects. Sign up here. 👈

    LINKS

    * spicytakes.org, Wes McKinney’s website
    * RoboRev, Wes’s background code reviewer
    * Agents View, Wes’s agent session database
    * Middleman, Wes’s local GitHub dashboard
    * Superpowers, Jesse Vincent’s skills framework that Wes builds on
    * An Open Source Maintainer’s Guide to Saying No, by Jeremiah Lowin
    * FastMCP
    * Prefab, Jeremiah’s Python DSL for generative UIs
    * Beautiful Charts with AI, by Randy Olson
    * The Coding Agent is Dead, by Amp
    * Building Effective Agents, by the Anthropic team
    * Show Us Your Agent Skills, the GitHub repo where we are dropping skills and workflows from the show
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
    * Come build the future of Agentic Data Science with us in our upcoming course.

    How You Can Support Vanishing Gradients

    Vanishing Gradients is a podcast, workshop series, blog, and newsletter focused on what you can build with AI right now. Over 70 episodes with expert practitioners from Google DeepMind, Netflix, Stanford, and elsewhere. Hundreds of hours of free, hands-on workshops. All independent, all free. If you want to help keep it going:

    * Become a paid subscriber, from $8/month
    * Share this with a builder who’d find it useful
    * Subscribe to our YouTube channel.

    Get full access to Vanishing Gradients at hugobowne.substack.com/subscribe

    1hr 32min
  2. 23 APR

    Next Level AI Evals for 2026

    There are a lot of reasons why we should do AI evals. For many companies, doing AI evals is the way to build the feedback loop into the product development lifecycle. So it is like your compass. We’re using AI evals as a compass to guide product development and also product iteration. And also, many times we need evals to function as the pass or fail gate in release decisions. Whether this product is good enough for release or whether it is good enough for experiment, evals are also used in that.

    Stella Wenxing Liu, Head of Applied Science at ASU, and Eddie Landesberg, Staff Data Scientist at Google, join Hugo to talk about why AI evaluation is evolving from “vibe checks” into a rigorous, multi-disciplinary science and how causal inference will take AI evals to the next level in 2026.

    They Discuss:

    * Team-Centric AI Evals: integrating product managers, data scientists, and SMEs under a “benevolent dictator” (or not!) to ensure comprehensive and effective evaluation;
    * Custom Evaluation Metrics: moving beyond generic vendor metrics to analyze raw data and identify specific failure modes, avoiding generic product outcomes;
    * AI as Policy Evaluation: framing AI evaluation as a causal inference problem to estimate counterfactual performance of new “policies” (prompts, models) and predict online A/B test outcomes;
    * Clear Product Constraints: defining what an AI product should not do, with strict guardrails to prevent misuse, control costs, and avoid brand dilution;
    * Calibrated LLM Judges: statistically aligning LLM-as-a-judge with human experts using causal inference to ensure valid proxies for human welfare and business objectives;
    * Essential Data Curiosity: fostering a culture of manual data inspection to build intuition before relying on automated error analysis or agents, ensuring effective system design;
    * Statistical AI Evaluation: shifting from unit-test thinking to non-deterministic distributions, using confidence intervals and power analysis to discern genuine improvements from statistical noise;
    * Proactive Regulatory Compliance: developing rigorous, defensible internal evaluation standards now to gain a competitive advantage as vague AI regulations move towards enforced compliance;
    * Human-Centric Benchmarking: grounding AI systems in human judgment and user values, moving beyond automated scores to build resilient and differentiated AI.

    You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript in NotebookLM. If you do, let us know anything you find in the comments!

    👉 Stella has just started teaching a cohort of her AI Evals and Analytics Playbook course this week. She’s kindly giving listeners of Vanishing Gradients 30% off with this link. 👈

    Our flagship course Building AI Applications just wrapped its final cohort, but we’re cooking up something new. If you want to be first to hear about it (and help shape what we build), drop your thoughts here.

    LINKS

    * Stella Wenxing Liu on LinkedIn
    * Eddie Landesberg on LinkedIn
    * Stella’s AI Evals & Analytics Playbook course on Maven (30% community discount)
    * CJE (Causal Judge Evaluation) package by Eddie
    * Trillion Dollar Coach
    * Goodhart’s Law
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
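    The “Statistical AI Evaluation” point can be made concrete. A minimal sketch (not from the episode; the numbers and function name are illustrative) of putting a normal-approximation confidence interval on the difference between two eval pass rates, to check whether an “improvement” clears statistical noise:

```python
import math

def diff_ci(p_a, n_a, p_b, n_b, z=1.96):
    """Approximate 95% CI on the difference in pass rates (system B - system A)."""
    diff = p_b - p_a
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff - z * se, diff + z * se

# Hypothetical runs: system A passes 72/100 evals, system B passes 78/100.
lo, hi = diff_ci(0.72, 100, 0.78, 100)
# If the interval contains 0, the apparent improvement may just be noise.
print(f"diff in pass rate: [{lo:.3f}, {hi:.3f}]")
```

    With 100 samples per run, a 6-point gap is well inside the noise band; this is the unit-test-to-distribution shift the episode describes.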

    54 min
  3. 15 APR

    Privacy Theater Is Not Privacy Engineering: What It Actually Takes to Ship Safe AI

    Katharine Jarmul, Privacy in ML/AI Expert and Author of Practical Data Privacy, joins Hugo to unpack why most AI privacy advice is theater, and what technical privacy actually looks like when you’re shipping LLMs, agents, and multimodal systems into the real world. In this episode, we dig into how to build defensible systems in an era of AI agents and multimodal models: why system prompts (and your entire agent harness!) should be considered public by default, and why “privacy observability” is as critical as data observability for anyone building with LLMs today. Multimodal is what changes the threat model: identifiers hide in images, audio, and metadata, not just text, and the old anonymization playbook doesn’t cover it.

    We Discuss:

    * No Convenience Tax: you don’t have to trade privacy for utility; high-utility AI products can be privacy-preserving through technical controls like privacy routing and input sanitization;
    * Public Prompts and Harnesses: assume any instruction or secret in a system prompt or agent harness will be exfiltrated; don’t put sensitive info there in the first place;
    * Privacy Observability: tag and track data flows so information is used only for its original intended purpose, and catch design flaws before they become legal problems;
    * Technical Privacy: implement mathematical and statistical constraints directly into ML systems and data flows so privacy is measurable and enforceable, not aspirational;
    * Tiered Guardrails: a three-layer approach of deterministic filters for hard rules, algorithmic models for nuanced classification, and internal alignment training for behavioral baselines;
    * Federated Learning Is Not Privacy: model updates in FL leak sensitive data on their own; you must layer differential privacy or encrypted computation on top, or you’re reverse-engineerable;
    * Anonymization Spectrum: navigate the “grayscale” of privacy in multimodal AI, balancing data utility and individual risk as identifiers hide in non-obvious places;
    * Privacy Champions: embed privacy accountability directly into development by training and incentivizing engineers inside product teams;
    * Red Teaming as Ritual: your goal is to attack yourself; practice thinking like an attacker, and turn privacy testing into an organization-wide creative ritual rather than a siloed security task.

    You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript in NotebookLM. If you do, let us know anything you find in the comments!

    👉 Katharine is teaching her next cohort of Practical AI Privacy starting April 20. She’s kindly giving readers of Vanishing Gradients 10% off. Use this link. I’ll be taking it, so hope to see you there! 👈

    Our flagship course Building AI Applications just wrapped its final cohort, but we’re cooking up something new. If you want to be first to hear about it (and help shape what we build), drop your thoughts here.

    LINKS

    * Practical AI Privacy course on Maven (10% off with code build-with-privacy)
    * Katharine Jarmul on LinkedIn
    * Probably Private — Katharine’s website & newsletter
    * Practical Data Privacy (Katharine’s book)
    * Let’s Build an AI Privacy Router — Lightning Lesson
    * Practical AI Privacy: Agents & Local LLMs (newsletter issue)
    * A Deep Dive into Memorization in Deep Learning (kjamistan blog)
    * Microsoft Presidio
    * Llama Guard 3 8B on Hugging Face
    * Nicholas Carlini
    * From Magic to Malware: How OpenClaws Agent Skills Become an Attack Surface (1Password)
    * Owning Ethics (Metcalf, Moss, boyd — Data & Society)
    * Hugo on guardrails in LLM applications
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
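    To make the first of the three guardrail layers concrete: a minimal, hypothetical sketch of a tier-1 deterministic filter that runs before any model sees the input. The patterns and function name are illustrative, not from the episode; production systems would use a dedicated tool such as Microsoft Presidio (linked above) and put classifier models behind this layer.

```python
import re

# Tier 1: deterministic filters for hard rules (here, two toy PII shapes).
# Tiers 2-3 (algorithmic classifiers, alignment training) are not shown.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US-SSN-shaped numbers
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email-shaped strings
]

def tier1_filter(text: str) -> tuple[bool, str]:
    """Return (matched, redacted_text); deterministic, auditable, and cheap."""
    matched = False
    for pat in PII_PATTERNS:
        if pat.search(text):
            matched = True
            text = pat.sub("[REDACTED]", text)
    return matched, text

matched, clean = tier1_filter("My SSN is 123-45-6789, email me at a@b.com")
print(matched, clean)  # True, with both identifiers redacted
```

    The point of the tiering is that hard rules like these stay deterministic and testable, rather than being delegated to a stochastic model.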

    1hr 7min
  4. 13 APR

    LLM Architecture in 2026: What You Need to Know with Sebastian Raschka

    If you take a model release as an anchor point, let’s say Nemotron 3 or Qwen 3.5, you can go in both directions: You can either plug them into an agent and play around with that, or you can look, okay, what does the model look like under the hood? What are the ingredients? What type of attention mechanism do they use? What are current research techniques that could make that even better in the next generation of models? What can we swap out, basically? And I’m interested in both of these!

    Sebastian Raschka, Independent AI Researcher and author of Build a Large Language Model from Scratch, joins Hugo to talk about what’s changed in AI architecture, from post-training to hybrid models, and why understanding what’s under the hood matters more than ever for developers building in the agentic era. Sebastian’s upcoming book, Build a Reasoning Model from Scratch, is currently available for pre-order on Amazon and in early access on Manning!

    We Discuss:

    * Ed Tech for Agents: should we design educational content specifically for agentic systems, or is there a better approach?
    * Inference Scaling is the new frontier, driving “gold-level” performance during generation via parallel sampling and internal meta-judges;
    * Hybrid Architectures from Qwen 3.5 and Nemotron 3 scale almost linearly, making long-context agentic workflows significantly more affordable and performant;
    * Multi-head Latent Attention (MLA), developed by DeepSeek, wins the KV cache war by drastically reducing memory overhead without performance hits;
    * Agent Harnesses need to be continuously simplified as frontier models are post-trained on agent trajectories; teams that don’t strip back their scaffolding risk the harness getting in the way of a more capable model;
    * “AI Psychosis”: the cognitive load of supervising self-supervising agents, and why we’re all conducting an orchestra we were never trained to conduct;
    * Sebastian’s AI Stack: a surprisingly simple setup (Mac mini, Codex, Ollama) with a ~20-item QA checklist, delegating the boring work to preserve energy for creative development;
    * Fine-tuning is now an economic decision, optimizing costs and latency for high-volume tasks where long system prompts outweigh a one-time training run;
    * Process Reward Models (PRMs) are the next frontier, verifying intermediate reasoning steps to solve “hallucination in the middle” for complex math and code tasks;
    * “Implementation Does Not Lie”: Sebastian’s layer-by-layer verification philosophy, comparing from-scratch builds against Hugging Face references to catch details invisible in papers;
    * Architecture Details dictate inference stack choices; nuances like RMSNorm stability or RoPE flavors are critical for optimal performance and troubleshooting;
    * The Distillation Loop drives open-weight parity, enabling specialized, “frontier-class” models by “pre-digesting” frontier outputs without multi-million-dollar training risks.

    You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript in NotebookLM. If you do, let us know anything you find in the comments!

    Our flagship course Building AI Applications just wrapped its final cohort, but we’re cooking up something new. If you want to be first to hear about it (and help shape what we build), drop your thoughts here.

    Links and Resources

    * Build a Reasoning Model (From Scratch): Sebastian’s new book, currently available for pre-order on Amazon and in early access on Manning. You’ll learn how reasoning LLMs actually work by starting with a pre-trained base LLM and adding reasoning capabilities step by step in code. A hands-on follow-up to Build a Large Language Model from Scratch.
    * LLM Architecture Gallery: Sebastian’s collection of architecture figures and fact sheets from his blog posts, updated with each major model release. A go-to visual reference for comparing what’s changed under the hood across model generations.
    * Sebastian Raschka on LinkedIn
    * Sebastian’s website
    * Ahead of AI (Sebastian’s Substack)
    * Build a Large Language Model from Scratch
    * PinchBench: OpenClaw Benchmark Leaderboard
    * DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
    * Gated Delta Networks: Improving Mamba2 with Delta Rule (ICLR 2025)
    * DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
    * Hugging Face Model Hub
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube

    A Bit More on Agent Harnesses

    * Components of A Coding Agent by Sebastian
    * How To Build An Agent that Builds its own Harness by Hugo and Ivan Leo (DeepMind, ex-Manus)
    * Build Your Own Deep Research Agent with Hugo & Ivan Leo (Google DeepMind, ex-Manus): in this livestream, you’ll learn how to build a production-grade agent harness from scratch in pure Python;
    * AI Agent Harness, 3 Principles for Context Engineering, and the Bitter Lesson Revisited with Lance Martin (Anthropic), Duncan Gilchrist (Delphina), and Hugo
    * The Post-Coding Era: What Happens When AI Writes the System? with Nicholas Moy (Google DeepMind), Duncan Gilchrist (Delphina), and Hugo
    * What is an Agent Harness? from What 300+ Engineers from Netflix, Amazon, and Instacart Asked About AI Engineering.
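    The inference-scaling idea mentioned in the episode, spending more compute at generation time via parallel sampling, can be sketched in a few lines. This is a generic self-consistency/majority-vote sketch, not Sebastian’s code; the sampler below is a deterministic stand-in for a stochastic model call, and a meta-judge model could replace the vote.

```python
from collections import Counter
from itertools import cycle

def best_of_n(sample_fn, prompt, n=5):
    """Draw n candidate answers and keep the majority vote."""
    answers = [sample_fn(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Deterministic stand-in for a stochastic model: its answers cycle
# through three right guesses and two wrong ones.
_canned = cycle(["42", "41", "42", "42", "40"])
def fake_model(prompt):
    return next(_canned)

print(best_of_n(fake_model, "What is 6 * 7?", n=5))  # → 42 (3 of 5 votes)
```

    With a real model the individual samples disagree stochastically, and the aggregate answer is more reliable than any single sample, which is the “performance during generation” trade the episode describes.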

    1hr 18min
  5. 20 MAR

    Episode 72: Why Agents Solve the Wrong Problem (and What Data Scientists Do Instead)

    I often see what I would consider to be b******t evals, especially in data, like write this dumb SQL. Almost every one of these dumb SQL questions that I’ve seen for benchmarks is either obviously easy or overwhelmingly adversarial. They just don’t feel valuable as a data scientist; it’s something that you probably would never ask a real data scientist to do. So I went out of my way to create real ones. Let me read one to you.

    Bryan Bischof, Head of AI at Theory Ventures, joins Hugo to talk about what happened when 150 people spent six hours using AI agents to answer real data science questions across SQL tables, log files, and 750,000 PDFs.

    They Discuss:

    * Failure Funnels: pinpoint where agent reasoning breaks down using causal-chain binary evaluations instead of vague 1-5 scales;
    * Median Score, 23 out of 65: what happened when world-class engineers turned agents loose on real data work, and why general-purpose coding agents with human prodding beat fancy frameworks;
    * Zero-Cost Submissions Kill Trust: without a penalty for wrong answers, agents hill-climb to correct submissions through brute force instead of building confidence;
    * Data Science is “Zooming”: moving beyond binary decisions to iterative problem framing, refining “does our inventory suck?” into a tractable hypothesis;
    * MCP as Semantic Layer: model your organization’s proprietary knowledge once and distribute it to whatever LLM interface your team prefers;
    * The Subagent vs. Tool Debate: a distinction that adds cognitive load without hiding complexity;
    * Self-Orchestration Gap: agents don’t yet realize they should trigger specialized extraction frameworks like DocETL instead of reading 750K PDFs one by one;
    * The Future of Evals: from vibe checks to objective functions and continuous user feedback that lets systems converge on reliability.

    You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript in NotebookLM. If you do, let us know anything you find in the comments!

    👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort has started, and registration is still open. All sessions are recorded, so don’t worry about having missed any. Here is a 25% discount code for readers. 👈

    LINKS

    * Bryan Bischof on Twitter/X
    * Bryan Bischof on LinkedIn
    * Theory Ventures
    * The Hunt for a Trustworthy Data Agent (blog post)
    * America’s Next Top Modeler GitHub repo
    * Hamel’s evals FAQ: How do I evaluate agentic workflows?
    * DocETL
    * LLM Judges and AI Agents at Scale (Hugo’s podcast with Shreya Shankar)
    * When Your Metrics Are Lying (Cimo Labs)
    * Lessons from a Year of Building with LLMs (livestream on YouTube)
    * Bryan Bischof: The Map is Not the Territory (YouTube)
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
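    The “failure funnel” idea, ordered binary checks that localize where an agent trace breaks instead of assigning a vague 1-5 score, can be sketched as follows. The trace fields and check names here are hypothetical, not Bryan’s actual rubric.

```python
def failure_funnel(trace, checks):
    """Run ordered binary checks over an agent trace and report the
    first stage that fails: the earliest broken link in the causal chain."""
    for name, check in checks:
        if not check(trace):
            return name
    return None  # every stage passed

# Hypothetical trace and checks for a data-agent SQL task.
trace = {"found_table": True, "sql_ran": True, "answer_correct": False}
checks = [
    ("retrieval", lambda t: t["found_table"]),
    ("execution", lambda t: t["sql_ran"]),
    ("answer",    lambda t: t["answer_correct"]),
]
print(failure_funnel(trace, checks))  # → answer
```

    Because each check is binary and ordered by causal dependence, aggregating funnel results over many traces tells you which stage to fix first.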

    1hr 34min
  6. 18 FEB

    Episode 71: Durable Agents - How to Build AI Systems That Survive a Crash with Samuel Colvin

    Our thesis is that AI is still just engineering… those people who tell us, for fun and profit, that somehow AI is so, so profound, so new, so different from anything that’s gone before that it somehow eclipses the need for good engineering practice are wrong. We need that good engineering practice still, and for the most part, most things are not new. But there are some things that have become more important with AI. One of those is durability.

    Samuel Colvin, Creator of Pydantic AI, joins Hugo to talk about applying battle-tested software engineering principles to build durable and reliable AI agents.

    They Discuss:

    * Production agents require engineering-grade reliability: unlike messy coding agents, production agents need high constraint, reliability, and the ability to perform hundreds of tasks without drifting into unusual behavior;
    * Agents are the new “quantum” of AI software: modern architecture uses discrete “agentlets”, small, specialized building blocks stitched together for sub-tasks within larger, durable systems;
    * Stop building “chocolate teapot” execution frameworks: ditch rudimentary snapshotting; use battle-tested durable execution engines like Temporal for robust retry logic and state management;
    * AI observability will be a native feature: in five years, AI observability will be integrated, with token counts and prompt traces becoming standard features of all observability platforms;
    * Split agents into deterministic workflows and stochastic activities: ensure true durability by isolating deterministic workflow logic from stochastic activities (IO, LLM calls) to cache results and prevent redundant model calls;
    * Type safety is essential for enterprise agents: sacrificing type safety for flexible graphs leads to unmaintainable software; professional AI engineering demands strict type definitions for parallel node execution and state recovery;
    * Standardize on OpenTelemetry for portability: use OpenTelemetry (OTel) to ensure agent traces and logs are portable, preventing vendor lock-in and integrating seamlessly into existing enterprise monitoring.

    You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript in NotebookLM. If you do, let us know anything you find in the comments!

    👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort starts March 10, 2026. Here is a 25% discount code for listeners. 👈

    LINKS

    * Samuel Colvin on LinkedIn
    * Pydantic
    * Pydantic Stack Demo repo
    * Deep research example code
    * Temporal
    * DBOS (Postgres alternative to Temporal)
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
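    The workflow/activity split can be sketched without any particular engine. This toy, library-agnostic cache stands in for what durable execution engines like Temporal provide for real: deterministic workflow code can be replayed after a crash, while stochastic activities (LLM calls, IO) have their results recorded so replay never re-issues a call that already succeeded. All function names here are illustrative.

```python
import hashlib
import json

_activity_log = {}  # stands in for the engine's durable event history

def run_activity(name, fn, *args):
    """Record-and-replay a stochastic activity keyed by name + arguments."""
    key = hashlib.sha256(json.dumps([name, args]).encode()).hexdigest()
    if key not in _activity_log:       # first run: actually call out
        _activity_log[key] = fn(*args)
    return _activity_log[key]          # replay: serve the recorded result

calls = []
def fake_llm(prompt):                  # stand-in stochastic activity
    calls.append(prompt)
    return f"summary of {prompt!r}"

def workflow(doc):                     # deterministic orchestration only
    summary = run_activity("summarize", fake_llm, doc)
    return summary.upper()

workflow("q3 report")
workflow("q3 report")                  # crash-replay: no second model call
print(len(calls))  # → 1
```

    The design point is that only the activity results need durable storage; everything else is re-derivable by replaying the deterministic workflow.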

    51 min
  7. 12 FEB

    Episode 70: 1,400 Production AI Deployments

    There’s a company that spent almost $50,000 because an agent went into an infinite loop and they forgot about it for a month. It had no failures, and I guess no one was monitoring these costs. It’s nice that people do write about that in the database as well. After it happened, they said: watch out for infinite loops. Watch out for cascading tool failures. Watch out for silent failures where the agent reports it has succeeded when it didn’t!

    We Discuss:

    * Why the most successful teams are ripping out and rebuilding their agent systems every few weeks as models improve, and why over-engineering now creates technical debt you can’t afford later;
    * The $50,000 infinite loop disaster and why “silent failures” are the biggest risk in production: agents confidently report success while spiraling into expensive mistakes;
    * How ELIOS built emergency voice agents with sub-400ms response times by aggressively throwing away context every few seconds, and why these extreme patterns are becoming standard practice;
    * Why DoorDash uses a three-tier agent architecture (manager, progress tracker, and specialists) with a persistent workspace that lets agents collaborate across hours or days;
    * Why simple text files and markdown are emerging as the best “continual learning” layer: human-readable memory that persists across sessions without fine-tuning models;
    * The 100-to-1 problem: for every useful output, tool-calling agents generate 100 tokens of noise, and the three tactics (reduce, offload, isolate) teams use to manage it;
    * Why companies are choosing Gemini Flash for document processing and Opus for long reasoning chains, and how to match models to your actual usage patterns;
    * The debate over vector databases versus simple grep and cat, and why giving agents standard command-line tools often beats complex APIs;
    * What “re-architect” as a job title reveals about the shift from 70% scaffolding / 30% model to 90% model / 10% scaffolding, and why knowing when to rip things out may be the most important skill today.

    You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript in NotebookLM. If you do, let us know anything you find in the comments!

    👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort starts March 10, 2026. Here is a 25% discount code for readers. 👈

    Show Notes Links

    * Alex Strick van Linschoten on LinkedIn
    * Alex Strick van Linschoten on Twitter/X
    * LLMOps Database
    * LLMOps Database Dataset on Hugging Face
    * Hugo’s MCP Server for LLMOps Database
    * Alex’s Blog: What 1,200+ Production Deployments Reveal About LLMOps in 2025
    * Previous Episode: Practical Lessons from 750 Real-World LLM Deployments
    * Previous Episode: Tales from 400 LLM Deployments
    * Context Rot Research by Chroma
    * Hugo’s Post: AI Agent Harness - 3 Principles for Context Engineering
    * Hugo’s Post: The Rise of Agentic Search
    * Episode with Nick Moy: The Post-Coding Era
    * Hugo’s Personal Podcast Prep Skill Gist
    * Claude Tool Search Documentation
    * Gastown on GitHub (Steve Yegge)
    * Welcome to Gastown by Steve Yegge
    * ZenML - Open Source MLOps & LLMOps Framework
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast livestream on YouTube
    * Join the final cohort of our Building AI Applications course in March, 2026 (25% off for listeners)
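    Of the three tactics for the 100-to-1 problem, “offload” is the easiest to sketch: keep a short head of a noisy tool result in context and park the full text in a file the agent can grep later, rather than spending tokens on all of it. A minimal, hypothetical sketch (the function name and truncation size are illustrative, not from the episode):

```python
import os
import tempfile

def offload_tool_output(output: str, keep_chars: int = 200) -> str:
    """Keep a short prefix of a noisy tool result in context; write the
    full text to a file the agent can read or grep on demand."""
    if len(output) <= keep_chars:
        return output  # small enough to keep inline
    path = os.path.join(tempfile.mkdtemp(), "tool_output.txt")
    with open(path, "w") as f:
        f.write(output)
    return f"{output[:keep_chars]}... [truncated; full output at {path}]"

noisy = "line\n" * 1000
short = offload_tool_output(noisy)
print(len(short) < len(noisy))  # → True
```

    This pairs naturally with the episode’s grep-and-cat point: once the full output lives in a file, standard command-line tools become the retrieval layer.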

    1hr 10min
  8. 3 FEB

    Episode 69: Python is Dead. Long Live Python! With the Creators of pandas & Parquet

    > It’s the agent writing the code. And it’s the development loop of writing the code, building, testing, and iterating. And so I do think we’ll see, for many types of software, a shift away from Python towards other programming languages. I think Go is probably the best language for those other types of software projects. And like I said, I haven’t written a line of Go code in my life. — Wes McKinney (creator of pandas, Principal Architect at Posit)

    Wes McKinney, Marcel Kornacker, and Alison Hill join Hugo to talk about the architectural shift for multimodal AI, the rise of “agent ergonomics,” and the evolving role of developers in an AI-generated future.

    We Discuss:

    * Agent Ergonomics: optimize for agent iteration speed, shifting from human coding to fast test environments, potentially favoring languages like Go;
    * Adversarial Code Review: deploy diverse AI models to peer-review agent-generated code, catching subtle bugs humans miss;
    * Multimodal Data Verbs: make operations like resizing and rotating native to your database to eliminate data-plumbing bottlenecks;
    * Taste as Differentiator: value “taste”, the ability to curate and refine the best output from countless AI-generated options, over sheer execution speed;
    * 100x Software Volume: embrace ephemeral, just-in-time software; prioritize aggressive generation and adversarial testing over careful planning for quality.

    You can also find the full episode on Spotify, Apple Podcasts, and YouTube, and interact directly with the transcript of the workshop & fireside chat in NotebookLM. If you do, let us know anything you find in the comments!

    👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Here is a discount code for readers. 👈

    This was a fireside chat at the end of a livestreamed workshop we did on building multimodal AI systems with Pixeltable. Check out the full workshop below (all code here on GitHub):

    Links and Resources

    * Wes McKinney on LinkedIn
    * Marcel Kornacker on LinkedIn
    * Alison Hill on LinkedIn
    * Spicy Takes
    * Palmer Penguins
    * Pixeltable
    * Posit
    * Positron
    * Building Multimodal AI Systems Workshop Repository
    * Pixeltable Docs: LLM Tool Calling with MCP Servers
    * Pixeltable Docs: Working with Pydantic
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
    * Join the final cohort of our Building AI Applications course in March, 2026 (25% off for listeners)

    What people said during the workshop

    * “I think the interface looks amazing/simple. Strong work! 🦾” — @goldentribe
    * “This is quite amazing. Watching this I felt the same way when I first leant pandas, NumPy and scikit and how well i was able to manipulate and wrangle data. PixelTable feels seamless and looks as good as those legendary frameworks but for Multimodal Data.” — @vinod7
    * “This is all extremely cool to see, I love the API and the approach.” — @steveb4191
    * “Thanks so much, Hugo! That was very insightful! Great work Alison and Marcel!” — @vinod7
    * “Just wrapped up watching a replay of the Pixeltable workshop. So cool!! Love the notebooks and working examples. The important parts were covered and worked beautifully 🕺” — @therobbrennan

    55 min
