Agents and Engineers

Dan Gerlanc

The podcast about Agentic AI and Software Engineering. Each episode is a conversation with people whose daily lives most intersect with AI and agentic systems. Join me as I follow the stories, the behind-the-scenes, and the real people behind the code.

Episodes

  1. 2d ago

    When Software Gets Cheap, Focus Gets Expensive

    Dan and Greg open with how agentic development has changed since the early days of Copilot. At the time, Greg was at GitHub, and he saw AI mostly help with boilerplate and editor completions. Cursor-style agents were the next widely used advancement, bringing session history and integrated team-wide practices. By June 2026, capable models and harnesses are common inside engineering teams, so the gap between teams increasingly comes from context engineering, repository structure, and whether old team shapes still align with the new ways of building software.

    For small teams and startups, the leverage of AI is a double-edged sword. Greg describes how SpecStory's original extensions required real sweat equity to reverse engineer chat-log formats across Cursor, Copilot, Claude Code, Amp, and other tools. Now, much of that surface can be maintained with a fraction of one person's time. The danger is that easy MVPs can trick founders into believing they have validated a market. When the marginal cost of software falls, founders have to spend more of their scarce attention on demand, willingness to pay, distribution, and the routes to customers.

    The conversation turns to Greg's book, 25 Patterns in Agentic Engineering. He explains how he mined roughly 1,300 preserved SpecStory sessions and nearly 5,000 commits to extract durable patterns from his own agentic practice. Two patterns stand out. First, when code becomes free, verification becomes the bottleneck. Second, between agent turns, docs are the persistent API of the system. For Greg, as-built architecture documents are practical maps that let both humans and agents recover the shape of a subsystem without re-reading the entire codebase every time.

    Greg's development practice has changed accordingly. He favors trunk-based development and says his team uses almost no pull requests for everyday development, partly because agent-generated diffs arrive at a volume he does not want to review line by line. He prefers local agents over cloud agents that containerize the repo and open PRs later, because steering an agent while it runs keeps his mental model intact. Long unattended runs still make sense to him, but only when they start from a clear goal and a more detailed rider, with phased commits and verification points he can inspect after a walk or a night away.

    Dan and Greg also dig into coordination at larger scale. Greg is skeptical that issue trackers were ever clean or current enough to describe day-to-day engineering, but he sees issues becoming useful as specs with provenance and evidence that can be handed to agents. Personally, he runs several projects at once, usually three to five, with local agents in permissive modes, and rotates attention while long runs execute. That power is not free. He describes the dopamine loop of watching ideas come to life, the temptation to keep agents busy overnight, and the scarcity mindset created by subsidized access to frontier models.

    The episode closes with where Greg still does not trust the tools. Copywriting and visual design still require heavy human intervention because the models can blur rather than sharpen the message. He frames taste less as a mystical trait and more as a selection among trade-offs and the ability to connect ideas in understandable ways. Coding has benefited from benchmarks and verifiable answers; much of the rest of the world is less tractable because there is no single ground truth for what "good" means.

    Chapters
    (00:00) - Introduction and guest background
    (00:55) - What agentic teams are running into
    (06:56) - Startup leverage, MVP traps, and maintaining SpecStory
    (09:26) - When software gets cheaper, distribution matters more
    (12:31) - Hand-written code, craft, and code as liability
    (16:21) - Mining 1,300 sessions into 25 patterns
    (19:01) - Verification and as-built architecture docs
    (23:55) - Co-writing docs with LLMs
    (25:15) - Keeping docs fresh through skills, Git, and verbose commits
    (27:50) - Trunk-based development for agentic teams
    (30:26) - Local steering versus cloud-agent pull requests
    (32:14) - Goal and rider plans, long runs, and Gas Town
    (35:52) - Replacing issue trackers with weekly docs
    (38:19) - Larger teams and issues as agent-ready specs
    (42:45) - Parallel projects and concentration limits
    (44:47) - Local agents, permissions, and risk judgment
    (46:57) - The cognitive pull of managing agents
    (51:58) - Scarcity, token costs, and model choice
    (58:57) - Copy, design, naming, and taste
    (01:05:04) - Why creative output resists verification
    (01:07:12) - Closing

    Links from the show
    --------------------
    Hardcore Agentic Engineering for builders who ship SpecStory Stoa 25 Patterns in Agentic Engineering AI Essentials for Tech Executives Meditations on Tech Beyond Code-Centric Goal Engineering WebRTC CRDT Trunk-based development Steve Yegge's Gas Town Dead Reckon Devin DORA Bear DeepSeek Qwen Yann LeCun

    Guests
    -------
    Greg Ceccarelli, Co-Founder & CPO, SpecStory

    Follow the podcast
    -------------------
    LinkedIn, Threads, Instagram, TikTok

    Follow Dan Gerlanc
    -------------------
    X, LinkedIn, Threads, Bluesky

    1h 8m
  2. Jun 18

    From Supervising AI to Building Systems for It

    Dan and Eleanor open by discussing how fast software engineering has changed. In the last six months, Eleanor's practice flipped from treating AI as a messy assistant that needs close supervision to building systems that put the agents on the path to success. She now writes essentially no code herself, arguing that the models have become good enough that her involvement mostly makes the results worse. Her journey runs from babysitting agents locally to delegating to async, cloud-based agents like GitHub Copilot, Cursor, Devin, OpenHands, or Factory. Eleanor warns that the home-grown terminal "loops" everyone is building right now are great for learning but too brittle to scale.

    Next up, what does an agent engineering system actually need? Eleanor recommends starting with a sandboxed execution environment (usually containers), careful configuration of how the agent reaches the outside world (MCP servers and selective network access), a way to see across multiple repositories, and layered rules via AGENTS.md and skills. Eleanor makes the case that async delegation is a forcing function for better specifications. Deterministic feedback, such as static analysis and test suites, is the single biggest factor in work quality because "you can't control AI with AI." She has moved to fully test-driven development and notes that current-generation models no longer find unintended workarounds to tests (e.g., deleting them) the way Claude 4 and early GPT-5 once did.

    Dan and Eleanor turn to adoption and skills, including how to get better at using AI with deliberate practice. Eleanor explains why she moved from Python, the language she was most familiar with over her career, to statically typed languages like TypeScript and Go for agent work, why supply chain risk at her healthcare company has her questioning every dependency, and why she dislikes the term "junior developer." Curiosity and systems thinking, not tenure, are what matter now.

    The episode closes on verification and scale. Eleanor distrusts any output she can't verify, doesn't miss hand-writing code, and argues that inventing new ways to verify, including more formal methods, is the real bottleneck now that models are cheap and strong. On team size, she pushes back on the "small teams" consensus, pointing to the success of large open-source communities. Eleanor remarks that software development has become a sub-branch of systems engineering, and anyone not practicing this now will be shocked in a matter of months.

    Chapters
    (00:00) - Introduction
    (00:58) - The flip: from supervising AI to getting out of its way
    (03:14) - Cloud-based agents vs. rolling your own
    (06:08) - The primitives every agent system needs
    (07:43) - Why async delegation beats local babysitting
    (11:02) - Writing specs: Codex, Repo Prompt, and markdown
    (12:21) - Guardrails: AGENTS.md, skills, and deterministic checks
    (14:19) - Going fully test-driven
    (17:19) - How engineers really adopt (and hide) AI
    (19:38) - Getting better through deliberate practice
    (21:12) - From experiment to reusable skill to library
    (24:26) - Choosing a language: Python, TypeScript, Go
    (26:56) - Supply chain risk and distributing specs
    (29:06) - Beyond 'junior': curiosity over tenure
    (31:03) - Systems thinking as the durable skill
    (38:02) - Where Eleanor still doesn't trust AI
    (39:40) - Not missing the keyboard
    (42:43) - Keeping up with a fast-moving field
    (44:50) - What teaching reveals
    (48:27) - Verification as the real bottleneck
    (50:41) - Team size and open source at scale
    (55:48) - Closing: take agents seriously

    Links from the show
    --------------------
    GitHub Copilot coding agent Devin OpenHands Factory Codex Repo Prompt AGENTS.md Model Context Protocol (MCP) Anthropic 'when AI builds itself' Lovable Vercel Formal verification UML Jimini Health

    Guests
    -------
    Eleanor Berger, Member of the Technical Staff, Jimini Health

    1 hr
  3. Jun 4

    Are We All Managers Now?

    Dan, Angie Jones, and Demetrios Brinkman open with a discussion of the Agentic AI Foundation ("AAIF"), founded by Anthropic, OpenAI, and Block in December 2025 and now home to roughly 180 member companies. AAIF recently launched an ambassador program (apply [here](https://aaif.io/ambassadors/)) and has upcoming events across the globe, from [AGNTCon](https://events.linuxfoundation.org/agntcon-mcpcon-north-america/) in San Jose to gatherings in Amsterdam, India, Tokyo, and Seoul. A recurring theme is that the whole industry is learning agentic engineering together. So get out of your "lab" and compare notes! You don't have to do all of this R&D on your own (well, maybe some of it, but it doesn't hurt to collaborate).

    Everything is changing. And quickly. Angie marks the release of Claude Opus 4.5 as the point when agentic engineering became viable. Where engineers once obsessed over context engineering and priming a repo so an agent had a chance, the latest frontier models often just need to be pointed at a codebase and told the problem. Drawing on her time leading agentic AI at Block, Angie describes the agent they built that can hold a world model across 25,000 codebases. They paired this agent with cloud workstations where an agent picks up a Jira ticket, clones the repo, and opens a PR without anyone babysitting a terminal.

    With this kind of firepower come new problems that look less like coding and more like management. Demetrios argues that the unglamorous topic of governance (keeping teams aligned, codifying security practices, deciding what belongs in "the harness") is the new challenge companies are grappling with. Sandboxes and cloud workers have gone mainstream. The group pushes back on the wave of AI-justified layoffs, worrying that companies are cutting the very mentorship and middle-layer "glue" needed to steer agents.

    They also dig into tokenomics: budgets blown by mid-year, tools that can cost more than the engineer using them, and Angie's hard-won lesson at Block that getting 95% of engineers onto coding agents produced no velocity until she funded a small group of "AI champions" to learn the tools properly. Tokens, everyone agrees, are not the same as value.

    As to what the group has found effective for agentic engineering, Angie makes the case for RPI (Research, Plan, Implement) from HumanLayer and for adversarial review. A 32-file refactor that earned a clean pass from Codex made her a believer. Also discussed are review skills, the [Council of Mine MCP server](https://github.com/block/mcp-council-of-mine), and Jesse Vincent's [Superpowers](https://github.com/obra/superpowers) skill pack; Dan adds Wes McKinney's [RoboRev](https://github.com/wesm/roborev) for continuous background review.

    The episode closes on the human side: whether "we're all managers now," the identity crisis facing engineers who loved the craft, how Angie found the same flow state building agents that she once found writing code, and how all of this democratizes building for non-engineers. Along the way, there are a few quick stops to discuss the token-saving Caveman skill, naming your agents, and a duck-themed calendar app. There's still no free lunch, Dan notes, but the price has come down. At least until the next model drops.

    Chapters
    (00:00) - Welcome and introductions
    (01:49) - Inside the Agentic AI Foundation and the ambassador program
    (03:38) - A global slate of events and meetups
    (07:13) - What engineers are doing differently than six months ago
    (10:16) - Agentic engineering at enterprise scale and cloud workers
    (12:32) - Governance, the harness, and sandboxes
    (15:10) - Do we still need managers and the human 'glue'?
    (21:14) - The bill comes due: AI tool budgets
    (23:39) - Tokens aren't velocity and the 'AI champions' experiment
    (28:25) - Front-loading design versus vibe coding
    (30:15) - RPI and Codex as co-reviewer
    (34:15) - Adversarial review, Council of Mine, and Superpowers
    (39:34) - Robo Rev and the QA-agent pattern
    (42:58) - Agents, data analysis, and specifying the problem
    (46:33) - Are we all managers now?
    (48:00) - The Caveman skill and the limits of saving tokens
    (51:49) - Naming agents, Codex pets, and Quakpit
    (56:16) - Managing agents versus the joy of writing code
    (01:02:07) - Democratizing building and the falling price of software

    Links from the show
    --------------------
    Agentic AI Foundation RPI (Research, Plan, Implement) Superpowers roborev Council of Mine Caveman cmux context rot LLM Council (Andrej Karpathy) MLOps Community Davis Treybig Quakpit Dask Flying Toasters (After Dark) Broomy

    Guests
    -------
    Angie Jones, VP, Agentic AI Foundation
    Demetrios Brinkman, Founder, MLOps Community

    1h 5m
  4. May 21

    Claude-maxxing: Burning $10K in tokens for only $50 with a custom software factory

    Dan and Ian Stokes-Rees, founder and CEO of PNI AI Studio, open by discussing the thesis of Ian's company: an opinionated stack of open-source tools wrapped in agentic AI so business analysts, managers, and finance teams can get the capabilities of a senior data scientist without learning Python, SQL, or R. Ian's primary target is financial services, where an estimated 200+ million weekly Excel users still run human-driven, tacit-knowledge processes. He frames a second opportunity, capturing the "AI exhaust" coming out of those workflows, as the seed for a follow-on product.

    The conversation turns to how Ian actually builds his product. Ian walks through a three-phase evolution: Cursor as a coding assistant, prompt-based Claude Code generation, and finally a full agentic team modeled on Steve Yegge's "Gas Town" post. Today he runs six to ten Claude Code agents in named roles. Xavier and Yasmin are Agile Process Managers; Anne is the Principal Architect. Now add in software engineers, QA engineers, a test engineer, and a release manager. The agents' operating manual is a roughly 5,000-line AGENTS.md tree spread across about 45 markdown files and served via MkDocs. The Kanban lives in GitHub Projects, milestones serve as sprints, story points and labels drive the workflow, and a "kaizen accumulator" task captures learnings each sprint that get translated into process changes at the start of the next one.

    Next up: token-maxxing. Ian explains why he keeps hitting Claude Max 20x weekly limits on day three of a sprint (five software engineering agents plus two QA agents burning tokens in parallel) and the management tricks he's adopted: Caveman to enforce terse prompts, templated processes, a catalog of deterministic scripts behind self-documenting skills, pre-commit hooks, and roughly a dozen CI gates that run Claude and Codex reviews against PR templates.

    Still, not everything is perfect in agent-land. Ian describes his agents as "solid second quartile" engineers. They're fast, pleasant, and (currently) inexpensive, but wrong in meaningful ways on one PR in five. Vibe coding works for prototypes and small reports, but serious systems still need human-driven design thinking, separation of concerns, and testing discipline. Perhaps the current moment is an "interregnum" between 25 years of established software practice and an agent-native future. Could this one day be a software factory with human "forepersons" running follow-the-sun shifts over agents that never sleep?

    The episode closes with a warning about the "AI brain fry" that comes from work products arriving ten times faster than humans produce them.

    Chapters
    (00:00) - Welcome and Ian's background
    (01:09) - PNI AI Studio: enterprise AI analytics for Excel users
    (04:10) - Financial services, Excel jockeys, and AI exhaust
    (08:26) - Three phases from Cursor to an agent team
    (10:28) - Building from Steve Yegge's Gas Town
    (12:21) - A six-to-ten agent Scrum team
    (15:38) - Paperclip and migrating to a server
    (17:35) - Hitting Claude Max 20x weekly limits
    (19:47) - Token management: Caveman, skills, scripts
    (21:13) - Mining AI session exhaust and Diane.ai
    (28:37) - Agentic development as engineering craft
    (31:12) - Sprint planning in the interregnum
    (36:50) - GitHub Projects as Kanban, milestones as sprints
    (42:20) - The agents.md file tree and skills
    (50:23) - CI gates and code review
    (53:06) - What you still don't trust from agents
    (57:51) - Why vibe coding doesn't scale yet
    (59:41) - AI brain fry: managing agents vs. humans
    (01:02:06) - A software factory with human foremen
    (01:05:43) - Wrap up and how to reach Ian

    Links from the show
    --------------------
    PNI AI Studio Anaconda Steve Yegge's Gas Town Paperclip GitHub Projects MkDocs Caveman When Using AI Leads to "Brain Fry" (HBR) Jensen Huang GTC 2026 keynote Kaizen Extreme Programming (XP)

    Guests
    -------
    Ian Stokes-Rees, Founder and CEO, PNI AI Studio

    1h 8m
  5. May 9

    Vibe Coding in the Physical World: Robotics, Circuits, and Dangerous Permissions

    Dan and Greg discuss Revise Robotics, where Greg serves as founding engineer building robotic systems that refurbish discarded corporate laptops for donation. The episode opens with a description of how AI vision models allow robots to navigate unfamiliar BIOS screens and unpredictable laptop states dynamically, a capability that wasn't feasible a few years ago. Greg reflects on how LLM-powered vision surprised even him as a "second gift," enabling a kind of general adaptability that previously would have required exhaustively pre-coded state machines.

    The conversation digs into Greg's hands-on experience using Claude for hardware projects, most vividly illustrated by an Arduino RPC library he built on a Raspberry Pi in under five minutes, a task he estimates would have taken a full day by hand. Greg draws a sharp distinction between projects where AI delivers near-100x speedup (well-defined problems with existing patterns and a testable harness) and cases where it gets confidently stuck in loops. His Minivac 601 circuit simulator project becomes the central cautionary example: months of fruitless AI-assisted attempts to simulate relay circuits collapsed once he realized he needed a real physics engine rather than asking the AI to re-derive Kirchhoff's laws from scratch.

    A recurring theme is the tension between speed and trust. Greg describes his journey from clicking "yes" to every Claude permission prompt, to briefly trying sandboxing tools like Nono, to ultimately running Claude with dangerously-skip-permissions locally, partly out of pragmatism and partly because he concluded the permission theater wasn't actually catching anything. He shares his "committee of elders" technique, routing important decisions through Claude, Gemini, and ChatGPT simultaneously and only proceeding when all three agree. Dan shares his MMI hook tool, which intercepts Claude's bash calls to enforce conventions like always using uv instead of raw Python.

    The episode closes with a candid discussion of the emotional and societal costs of this pace. Greg describes a new kind of frustration (distinct from normal debugging) when an AI tool fails after drawing you deep into a rabbit hole. He and Dan also address broader concerns: the acceleration of security vulnerabilities, the environmental cost of GPU compute, and AI-driven job displacement. Both acknowledge they can't stop using these tools even as they see the harms compounding, and end on a cautiously hopeful note about open-source and local models eventually offering more control.

    Chapters
    (00:00) - Introduction and guest background
    (01:20) - Revise Robotics: refurbishing laptops with robots
    (04:16) - AI vision models navigating unpredictable hardware
    (07:36) - LLMs as a force multiplier for small teams
    (11:42) - Who gets the most out of working with LLMs?
    (15:34) - Claude hooks and the MMI permission tool
    (17:51) - Going dangerously: skipping Claude permissions
    (22:19) - Hardware with Claude: the Arduino library story
    (27:23) - Estimating the 100x speedup
    (30:23) - Vibe-coding the office network with MikroTik
    (35:57) - The committee of elders: multi-model verification
    (44:07) - Where AI fails: the Minivac 601 circuit simulator
    (54:27) - 3D and CAD as another AI blind spot
    (55:54) - Closed loops, tests, and why they make AI coding work
    (59:47) - Mental fatigue from AI-assisted development
    (01:05:29) - Security risks and societal costs of AI acceleration
    (01:10:36) - Open-source and local models as a path forward

    1h 12m
5 out of 5 (7 Ratings)

