Max Agency

LangChain

Welcome to Max Agency, a podcast about how the best AI agents are actually being built. Hosted by Harrison Chase, CEO of LangChain, each episode goes deep with the builders designing, deploying, and learning from real agent systems in the wild. From architecture decisions to evals, tooling, and failure modes, Max Agency is for people who want to understand what it really takes to build useful agents.

Episodes

  1. How Listen builds a system of AI Agents & subagents for specialized tasks | Florian Juengermann, CTO

    APR 23

    How Listen builds a system of AI Agents & subagents for specialized tasks | Florian Juengermann, CTO

    Florian Juengermann is the co-founder and CTO of Listen, an AI startup that turns qualitative research across hundreds of interviews, surveys, and focus groups into structured, traceable insights. Listen's agents analyze responses at scale, and Florian has rearchitected the system multiple times to get there. In this conversation, he walks through the virtual table architecture at the core of their Research Agent, how small models run map-reduce classification across thousands of open-ended responses, and the self-reviewing feedback subagent that catches errors during long async runs.

    We also discuss:
    - The three agents inside Listen's platform
    - How Listen rearchitected from a simple RAG bot to a multi-agent system multiple times
    - Why the PowerPoint subagent was completely rebuilt using the Claude Code SDK
    - Contextual prompt engineering as an alternative to skills
    - How Listen keeps report numbers live as new interview responses come in
    - When to trigger the long-running agent vs. showing early results
    - What Florian looks for when hiring agent engineers

    References: Anthropic, ChatGPT, Claude, Claude Code SDK, E2B, Emotional Intelligence, GPT Mini, Haiku, Listen, OpenAI, Pandas, Postgres, Python, Research Agent, Render, Zoom

    Where to find Florian: LinkedIn, Twitter/X
    Where to find Harrison: LinkedIn, Twitter/X
    Where to find LangChain: Website, Docs

    Send feedback or questions to maxagency@langchain.dev

    Timestamps:
    00:00 Introduction
    01:25 The three agents inside Listen's platform
    03:15 Live chat vs. long async runs, and how Listen tunes for each
    05:33 Under the hood of the Research Agent
    06:37 Listen's virtual table architecture
    07:34 How small models classify thousands of open-ended responses
    10:05 Running code in a sandbox: how E2B fits in
    11:52 Why Listen rebuilt the PowerPoint subagent from scratch
    14:11 Contextual prompt engineering instead of skills
    16:32 The feedback subagent that reviews its own reports
    18:14 How Listen runs evals in production
    19:47 Unexpected ways users push the agent to its limits
    21:42 How many times Listen has rearchitected, and why
    24:59 Trace observability: depth over breadth
    26:10 Lessons from running Claude Code SDK inside E2B
    27:42 Memory: what's solved and what isn't
    29:10 The Composer agent UX: co-editing a document with AI
    35:50 How Listen keeps report numbers live as new responses come in
    43:47 What Listen looks for when hiring agent engineers

    48 min
  2. How Hex Builds AI Agents: Making Agents Reason Like Human Data Analysts | Izzy Miller, AI Engineer

    APR 9

    How Hex Builds AI Agents: Making Agents Reason Like Human Data Analysts | Izzy Miller, AI Engineer

    Izzy Miller is an AI engineer at Hex, an AI analytics platform that was one of the first companies to ship data agents to real paying users. Today, Hex runs a multi-agent system with nearly 100K tokens of tools, and Izzy is building a 90-day simulation to evaluate whether those agents actually get smarter over time. In this conversation, he walks through the harness decisions that shaped their architecture, the failure modes Hex is seeing at scale, and what it takes to build an eval that no current model can pass.

    We also discuss:
    - Why data agents are harder to verify than coding agents
    - Under the hood of Hex's agents
    - How Hex is unifying separate agents
    - Why most eval sets are bad
    - The 90-day simulation for long-horizon evals
    - How Izzy went from marketing to AI engineer

    References: Andon Labs, Anthropic, Barry McCardel, ChatGPT, Claude Code, Claude Sonnet 4.6, DBT, GPT-3.5 Turbo, GPT-5.3 Codex Spark, GPT-5.4, Hex, LangChain, LangSmith, Looker, OpenAI, Opus 4.6, Satya Nadella, Snowflake, Vending Machine

    Where to find Izzy: LinkedIn, Twitter/X
    Where to find Harrison: LinkedIn, Twitter/X
    Where to find LangChain: Website, Docs

    Send feedback or questions to maxagency@langchain.dev

    Timestamps:
    01:35 Where Hex's notebook agent started
    03:46 The moment Hex knew it was time for agents
    07:36 Why data agents are harder to verify than coding agents
    09:30 How Hex is unifying separate agents
    13:28 Under the hood of the notebook agent
    15:41 The harness features that are now holding the agent back
    17:41 Why Hex built their own orchestrator
    18:59 Managing nearly 100K tokens of tools
    20:49 Ephemeral queries and agent behavior trade-offs
    24:46 The UX problem with showing agents' thinking
    27:28 Why verification is harder than transparency for data agents
    31:00 Memory, context conflicts, and collapse modes
    34:38 How Hex built their internal eval system
    39:29 Why most eval sets are bad
    44:30 The 900% quota eval that every model fails
    46:55 Model upgrades and the "in distribution" debate
    51:34 How Izzy went from marketer to AI engineer
    59:59 The 90-day simulation for long-horizon evals

    1 hr 8 min

