Artificial Developer Intelligence

Shimin Zhang, Dan Lasky, & Rahul Yadav

Three engineer friends argue about AI so you don't have to. Shimin Zhang, Dan Lasky, and Rahul Yadav are working developers who've been watching AI transform their profession in real time, and they've got opinions on the robot takeover. Every week the three get together to riff on the latest AI news, geek out over research papers, roast each other's tool choices, and occasionally have an existential crisis about whether the craft is dying or just getting weird.

What you're signing up for:
- AI news without the LinkedIn cringe: model drops, acquisitions, open-source drama, and the other stuff that actually matters if you write code for a living.
- Technique corner: real tips from the trenches, including spec-driven development, multi-agent orchestration, Claude.md tricks, and all the ways they've wasted hours so you don't have to.
- Two Minutes to Midnight: the show's running AI bubble tracker, complete with circular funding diagrams, hyperscaler CAPEX math, and a doomsday clock they keep arguing about moving.
- Deep dives that (occasionally) go deep: hallucination neurons, agentic memory, workflow automation economics, LLM architectures, and the papers nobody else is covering because they're hard.
- Dan's Rant: Dan frequently gets mad about things. It's a whole thing.
- The feelings segment: Yes, Shimin reads Tennyson on a tech podcast. Yes, Rahul wrote an AI-generated country song. No, they're not sorry.

Three friends with strong opinions, questionable metaphors, and genuine love for the craft they're also mourning. If you want to understand AI deeply, use it without embarrassing yourself, and laugh at the absurdity of it all, pull up a chair.

  1. 17 HR AGO

    LLM Neuroanatomy with David Noel Ng, Forward Deployed Everybody, Preferences Revealed by AI

    This week on ADI Pod: Mira Murati's Thinking Machines ships its first product (interaction models), Meta employees fight the mouse-tracking program with flyers, and the Palantir-coined "forward deployed engineer" job title quietly takes over the post-AI engineering org chart. Our sit-down is with Dr. David Noel Ng (https://dnhkng.github.io/) — author of the LLM Neuroanatomy series we covered a few weeks back. He explains why he watched action potentials race down rat neurons at 10,000 fps before getting into LLM interpretability. Dan runs DeepSeek-V4 Flash on a 128 GB Ryzen 395 Max box and vibe-codes an ESP32 home dashboard in C. Deep dive on a paper asking whether AI should obey what you say or what you actually do. And in Two Minutes to Midnight: Cerebras pops 108% on IPO day, Anthropic passes OpenAI on Ramp business data, and we read Andy Hall on the politics of jobless prosperity. Clock stays at 6 minutes. Co-hosts: Shimin Zhang, Dan Lasky, Rahul Yadav. ▸ Interaction Models — Thinking Machines' first product. A small Qwen 3.5–class "interaction model" runs the UI; a background model handles the heavy lift. They credit the pattern to Qwen, not themselves. Multimodal by default, and surprisingly snappy in the demo. ▸ Meta vs. its own employees — Meta posted flyers around its offices reading "don't want to work at the employee data extraction factory?" after announcing it would record keystrokes, mouse movements, and screens to train internal AI. Last episode's "Model Capability Initiative" story got worse. ▸ Here Comes Forward Deployed Everybody (Scott Werner / works on my machine) — Salesforce moves to an API-only data model. Palantir's "forward deployed engineer" title (originally called "delta") becomes the new pit-crew role across every department. We argue Jevons paradox vs. just-rebranding-the-least-glamorous-job: 20 marketers + 5 pit crews → 30 marketers + 10 pit crews as productivity rises. ▸ Sit-down: Dr. 
David Noel Ng on LLM Neuroanatomy — fluorescent dyes that change color with membrane voltage, what brain-microchip interfacing taught him about feature attribution, and why interpretability deserves the same first-principles rigor as wet-lab biology. New posts coming. ▸ Vibe Intel — Dan runs Antires's specialized DeepSeek-V4 Flash fork of llama.cpp. Q2 quant on the front, full experts on the back, SSD-cached prefills, ~10 tokens/sec on a Ryzen AI Max+ 395 with 128 GB unified memory. Output peg: Sonnet 4.5-ish. Plus an ESP32 / ESPHome dashboard with a several-thousand-line vibe-coded C lambda that, against all reasonable expectation, works. ▸ Deep Dive — "Should I State or Should I Show?" (Keaton Ellis & Wanying Huang) — three AIs given the same lottery decisions: prompt-only AI hit 70% match with the human; data-only AI hit 75%; both-AI dropped to the worst of the three because it defaulted to the prompt 66% of the time when prompt and behavior conflicted. Implication: if EU AI Act transparency rules force you to honor stated preferences, you're literally picking the worst-performing model. ▸ Two Minutes to Midnight — Cerebras raises $5.5B in IPO, stock pops 108% on day one (claim: ~80× memory throughput vs comparable NVIDIA GPUs). Anthropic now holds 34.4% of Ramp-card-paying businesses, beating OpenAI for the first time. Andy Hall's "Politics of Jobless Prosperity": 2% unemployment jump is the line in the sand for political stability — opens with FDR's 1944 State of the Union. Clock held at 6 minutes; nothing this week jumped the needle. ⏱ Chapters 00:00 Cold Open & Welcome 02:08 News: Interaction Models from Thinking Machines 05:57 News: Meta Employees Protest the Mouse-Tracking Program 10:53 Post-Processing: Here Comes Forward Deployed Everybody (Scott Werner) 22:50 Sit-Down: Dr. 
David Noel Ng on LLM Neuroanatomy 40:05 Vibe N Tell: DeepSeek-V4 Flash at Home on a 395 Max 45:37 Vibe N tell: ESP32 Home Dashboards via Vibe Coding 48:51 Deep Dive: Should I State or Should I Show? (Ellis & Huang) 1:04:24 Two Minutes to Midnight: Cerebras IPO, Anthropic vs OpenAI, Jobless Prosperity 1:09:34 Outro 🔗 Articles we discussed News: • Interaction Models — Thinking Machines: https://thinkingmachines.ai/blog/interaction-models/ • Meta employees protest the mouse-tracking program — Engadget: https://www.engadget.com/2172212/meta-employees-are-protesting-the-companys-mouse-tracking-program/ Post-Processing: • Here Comes Forward Deployed Everybody — Scott Werner (works on my machine): https://worksonmymachine.ai/p/here-comes-forward-deployed-everybody Sit-Down — Dr. David Noel Ng: • Substack: https://dnhkng.substack.com/ • Site: https://dnhkng.github.io/* Rest of David's home AI Lab Build Story: https://dnhkng.github.io/posts/hopper/ Deep Dive: • Should I State or Should I Show? Aligning AI with Human Preferences — Keaton Ellis & Wanying Huang (arXiv): https://arxiv.org/html/2603.29317v1 Two Minutes to Midnight: • Cerebras raises $5.5B, kicks off 2026's IPO season — TechCrunch: https://techcrunch.com/2026/05/14/cerebras-raises-5-5b-kicking-off-2026s-ipo-season-with-a-bang/ • Cerebras: Faster Tokens, Please — SemiAnalysis: https://newsletter.semianalysis.com/p/cerebras-faster-tokens-please • Anthropic now has more business customers than OpenAI (per Ramp data) — TechCrunch: https://techcrunch.com/2026/05/13/anthropic-now-has-more-business-customers-than-openai-according-to-ramp-data/ • The Politics of Jobless Prosperity — Andy Hall (Free Systems): https://freesystems.substack.com/p/the-politics-of-jobless-prosperity 🎙 About ADI Pod ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. We go through hundreds of links and dozens of newsletters each week so you don't have to. 
Hosts: Shimin Zhang, Dan Lasky, Rahul Yadav. New episodes Tuesdays. • https://www.adipod.ai • humans@adipod.ai If something in this episode changed your mind or gave you something to try on Monday, hit subscribe and leave a comment with what you tried.

    1h 10min
  2. MAY 15

    Multi-Agent Patterns for 2026, Anthropic on Colossus, Brockman's Tesla Painting

    Anthropic finally fixed the compute crunch — by partnering with the one company OpenAI is currently being sued by. Plus Brockman's deposition journal drops, we unpack why a billion-token context window needs an entirely different GPU architecture, and Phil Schmid's four sub-agent patterns for 2026. This week on ADI Pod: Dan, Rahul and Shimin go heavy on the Elon news — the OpenAI lawsuit, the Anthropic + SpaceX/XAI compute deal that just lifted Claude Code's peak-hour limits, and the Wall Street Journal's data on Grok's user base collapsing (now ~1/30th of ChatGPT's). Then we move into the substance: NVIDIA's Rubin CPX architecture and disaggregation, Phil Schmid's four sub-agent patterns and where the agent-teams pattern is headed, Jack Clark's piece on recursive AI research automation, and Simon Willison reluctantly admitting he runs Claude Code with --dangerously-skip-permissions by default. We close with bubble watch — and move the clock further from midnight after a week with no major red flags. 
🔗 Articles we discussed ▸ How Elon Musk left OpenAI, per Greg Brockman (TechCrunch) https://techcrunch.com/2026/05/06/how-elon-musk-left-openai-according-to-greg-brockman/ ▸ Anthropic raises Claude Code usage limits, credits SpaceX deal (Ars Technica) https://arstechnica.com/ai/2026/05/anthropic-raises-claude-code-usage-limits-credits-new-deal-with-spacex/ ▸ Anthropic-SpaceX AI deal (Wall Street Journal) https://www.wsj.com/tech/ai/anthropic-spacex-ai-deal-elon-musk-f86ea369?st=XUQnP7&reflink=desktopwebshare_permalink ▸ The road to a billion-token context (CACM) https://cacm.acm.org/news/the-road-to-a-billion-token-context/ ▸ Sub-agent patterns for 2026 — Phil Schmid https://www.philschmid.de/subagent-patterns-2026 ▸ Import AI 455: automating AI research — Jack Clark https://importai.substack.com/p/import-ai-455-automating-ai-research ▸ Vibe coding and agentic engineering are getting closer than I'd like — Simon Willison https://simonwillison.net/2026/May/6/vibe-coding-and-agentic-engineering/ ▸ You need AI that reduces your maintenance costs — James Shore https://www.jamesshore.com/v2/blog/2026/you-need-ai-that-reduces-your-maintenance-costs ▸ Anthropic reportedly agrees to pay Google $200B for chips and cloud access (Engadget) https://www.engadget.com/2165585/anthropic-reportedly-agrees-to-pay-google-200-billion-for-chips-and-cloud-access/ ▸ Silicon Valley bets on floating AI data centers powered by ocean waves (Ars Technica) https://arstechnica.com/ai/2026/05/silicon-valley-bets-on-floating-ai-data-centers-powered-by-ocean-waves/ ⏱ Chapters 00:00 Cold Open & Welcome 02:15 News: Elon vs OpenAI Trial Drama (Brockman's Journal & The Tesla Painting) 08:30 News: Anthropic Joins Colossus (SpaceX/XAI Compute Deal) 13:06 Hardware Hunt: The Road to a Billion-Token Context (NVIDIA Rubin CPX) 21:56 Technique: Phil Schmid's 4 Sub-Agent Patterns for 2026 30:11 Post-Processing: Jack Clark — AI Systems Are About to Build Themselves 45:17 Post-Processing: Simon Willison — Vibe 
Coding & Agentic Engineering 55:14 Post-Processing: James Shore — AI That Reduces Maintenance Costs 1:01:39 Two Minutes to Midnight 1:12:05 Outro 🎙 About ADI Pod ADI Pod (Artificial Developer Intelligence) is a weekly conversation show about AI and software development. We go through hundreds of links and dozens of newsletters each week so you don't have to. Hosts: Shimin Zhang, Dan Lasky, Rahul Yadav. 🌐 https://www.adipod.ai 📧 humans@adipod.ai 🦋 Shimin on Bluesky: @shiminsky.bsky.social If something in this episode changed your mind or gave you something to try on Monday, hit subscribe and leave a comment with what you tried. #ADIPod #AINews #ClaudeCode #Anthropic #AICoding

    1h 13min
  3. MAY 8

    OpenAI's Goblin Problem, 10 Lessons When Code Is Cheap, AI Addiction Loop

    Why does the leaked Codex CLI system prompt explicitly tell GPT-5.5 to never mention goblins, gremlins, raccoons, trolls, ogres, or pigeons? Why is OpenAI now gating its cyber model the same way it mocked Anthropic for gating Mythos last month? And what does it mean that Dan tried to write a personal project without Claude — and physically couldn't? Co-hosts Shimin Zhang, Dan Lasky, and Rahul Yadav cover these and more on ADI Pod #24. This week: GPT-5.5 Cyber's gated release, OpenAI's "Where the Goblins Came From" RLHF post-mortem, Adi Osmani's five patterns for long-running agents, Jesse Vincent's adversarial review prompt, Drew Brunig's 10 lessons for agentic coding, Ivan Turkovic's history of failed attempts to eliminate programmers, Nilay Patel's "software brain" thesis, the Nature paper showing warm AI models lose 10–30 percentage points of accuracy, and a $1.1B raise for an AI lab that wants to train without human data. ## In this episode ▸ **GPT-5.5 Cyber gating** — Sam Altman called Mythos's gated release "fear-based marketing" two months ago. Now OpenAI is doing the exact same thing with the GPT-5.5 cyber variant. Multi-tier model access (enterprise, government, research preview, cyber) is becoming the default — and Shimin worries the White House is about to add another gate. ▸ **The Goblin Problem** — OpenAI's Codex CLI prompt was open-sourced and turned out to include "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons." OpenAI's "Where the Goblins Came From" post-mortem reveals a textbook RLHF failure: a "nerdy persona" reward signal trained the model to mention goblins in 66.7% of nerdy responses, and the tic propagated through supervised fine-tuning to non-nerdy responses too. 
▸ **Long-running agents (Adi Osmani / Elevate)** — Five patterns for agents that run for hours or days: checkpoints over zero-or-100 outputs, governing memory like microservices, ambient processing without forced human-in-the-loop, fleet orchestration, and budget circuit-breakers. Bonus: the running gag where Rahul realizes the post is essentially an ad for Google Enterprise Agent Platform. ▸ **Adversarial review prompts (Jesse Vincent / superpowers)** — A four-step technique for getting better code review out of agents: invoke "fresh eyes," dispatch competing subagents, promise a reward (a cookie), and threaten disappointment if they don't find N issues. ▸ **10 Lessons for Agentic Coding (Drew Brunig)** — Implement to learn, rebuild often, invest in end-to-end tests, document intent, keep specs in sync, find the hard stuff, automate the easy stuff, develop taste, agents amplify experience, and the kicker: agent code is "free as in puppies" — the puppy is free, but you have to feed it and walk it. ▸ **The Eternal Promise (Ivan Turkovic)** — A history of attempts to eliminate programmers from COBOL through 4GLs, CASE tools, the Japanese 5th Generation project, no-code/low-code, and now LLMs. Each abstraction layer expanded software jobs rather than replacing them. Shimin's reframe: "Software is calcified business process. Someone has to do the calcifying." ▸ **People Do Not Yearn for Automation (Nilay Patel / The Verge)** — Why Gen Z hopefulness about AI dropped to 18% (anger up to 31%), why America is uniquely AI-pessimistic, and what Nilay calls "software brain" — the Silicon Valley assumption that human life can be reduced to data and algorithms. Plus Anuradha Pandey's reframe: stop calling them social media, call them ad platforms. ▸ **Warm models lose accuracy** — A Nature paper finds AI models trained for warmth lose 10–30 percentage points of accuracy. A companion study shows humans trust warm models *more* even when they're wrong. 
Frontier labs now have an explicit incentive to train the warmest model, not the most accurate one. Plus: Richard Dawkins talks to "Claudia" for three days and concludes AI must be conscious. ▸ **Dan's Rant — The AI Addiction Loop** — Dan tries to build a Home Assistant TypeScript automation without Claude. Can't. "It felt like they had fundamentally broken my arm in a way that I can't do this task as quickly as I wanted to. That scares me a lot." Shimin: "We're running into the social media addiction loop in three months instead of a decade." ▸ **Two Minutes to Midnight** — OpenAI projects ChatGPT Plus dropping from 44M to 9M subscribers in 2026 while scaling the ad-supported tier from 3M to 112M (30×). David Silver raises $1.1B for Ineffable Intelligence — a no-human-data approach inspired by AlphaGo. Scout AI raises $100M for autonomous military vision-language-action models. Bubble Clock held at 4:00 minutes. ## Key takeaways — Reward hacking can propagate latent persona quirks through fine-tuning in ways the lab itself only catches when users surface them. — Memory drift, not raw context size, is the real ceiling for long-running agents. Govern memory like you govern microservices. — Code is free as in puppies, not free as in beer. The cost shifts to maintenance, security, and the new burden of maintaining your own automations. — Warm AI is an alignment trap: incentivized for trust over accuracy, weaponizable in authoritarian hands. — "You can outsource your thinking, but you can't outsource your understanding." — Karpathy, via Rahul. — AI addiction hits in three months. Social media took a decade. We are not ready for the time scale. 
## Chapters (00:00) - Cold Open & Welcome (02:50) - News Threadmill: GPT-5.5 Cyber Gets Mythos-Style Gating (08:52) - News Threadmill: The Goblin Problem & RLHF Post-Mortem (13:52) - Tool Shed: Long-Running Agents (Adi Osmani) (25:52) - Technique Corner: Adversarial Review Prompts (Jesse Vincent) (30:59) - Technique Corner: 10 Lessons for Agentic Coding (Drew Brunig) (42:31) - Post-Processing: The Eternal Promise — A History of Attempts to Eliminate Programmers (01:02:10) - Post-Processing: People Do Not Yearn for Automation (01:09:08) - Post-Processing: Warm Models & The Sycophancy Trap (01:13:28) - Dan's Rant: Home Automation & The AI Addiction Loop (01:20:09) - Two Minutes to Midnight: OpenAI's 30× Ad-Tier, David Silver's $1.1B, Scout AI's Drones (01:25:55) - Outro ## Resources mentioned **News Threadmill — GPT-5.5 Cyber & The Goblin Problem** • TechCrunch — After dissing Anthropic for limiting Mythos, OpenAI restricts access to cyber too: https://techcrunch.com/2026/04/30/after-dissing-anthropic-for-limiting-mythos-openai-restricts-access-to-cyber-too/ • Ars Technica — Amid mythos-hyped cybersecurity prowess, researchers find GPT-5.5 is just as good: https://arstechnica.com/ai/2026/05/amid-mythos-hyped-cybersecurity-prowess-researchers-find-gpt-5-5-is-just-as-good/ • Ars Technica — OpenAI Codex system prompt includes explicit directive to never talk about goblins: https://arstechnica.com/ai/2026/04/openai-codex-system-prompt-includes-explicit-directive-to-never-talk-about-goblins/ • OpenAI — Where the Goblins Came From: https://openai.com/index/where-the-goblins-came-from/ ...

    1h 26min
  4. MAY 1

    Why Models Over-Edit Your Code, Meta Keystroke Surveillance, Interviewing Engineers in the AI Age

    Is GPT-5.5 finally a 4.7-tier model? Did DeepSeek V4 just close the gap with Anthropic? And what does it mean that a senior ML engineer says he can't out-code Claude anymore? Co-hosts Shimin Zhang, Dan Lasky, and Rahul Yadav are joined by special guest Nathan Lubchenco — ML engineer and Substack author of *The future was yesterday* (https://nathanlubchenco.substack.com/) — on ADI Pod #23 (April 28, 2026). This episode covers OpenAI's GPT-5.5 release, DeepSeek V4 (1.6T base / 49B active params with 1M context), Meta's new Model Capability Initiative tracking US employee keystrokes and mouse movements, a Levenshtein-distance study on coding-model over-editing, the 2026 Stanford AI Index report, and a deep-dive interview on how to hire software engineers when the agents are already better at coding than the candidates. Key takeaways — Models are now consistently better at coding than even senior ML engineers, by their own admission. Late-2026 may be when they cross the median software engineer. — Coding-model over-editing is measurable (Levenshtein distance on boolean-flip tasks) and instruction-followable — explicit "minimum-edit" prompts close most of the gap. — The US is unusually a slow adopter of a major technological wave. Workplace AI usage is highest in emerging economies, not the developed world. — "The task is not the job" — humans remain indispensable on the bundling dimensions: catching what customers don't say, and avoiding interactions that end up on social media. — Software engineering interviews should include the candidate's personal harness, with company-provided API keys for equity. LeetCode optimizes for the wrong signal in 2026. — DeepSeek V4 closing the gap with Mythos in 3–6 months is what makes the bubble too geopolitically important to fail. 
Chapters (00:00) - Cold Open & Welcome (01:31) - News Threadmill: GPT-5.5, DeepSeek V4, Meta Watches Every Keystroke (12:28) - Post-Processing: Coding Models Are Doing Too Much (18:59) - Post-Processing: The Task Is Not the Job (Luis Garicano) (32:20) - Post-Processing: The 2026 Stanford AI Index Report (38:11) - Deep Dive: Interviewing Engineers in the AI Age (with Nathan Lubchenco) (45:05) - Deep Dive: Reforming Software Hiring — Take-Homes, Personal Harness, Equity (50:15) - Deep Dive: When Models Cross the Median Engineer (Late-2026 Prediction) (59:29) - Deep Dive: Why Code Review Is the Current Bottleneck (01:00:21) - Deep Dive: Should PRs Show the Prompt History? (01:02:27) - Dan's Rant: Anthropic Tested Removing Claude Code from the Pro Plan (01:05:44) - Rahul's Rampage: The Infinity Machine — Demis Hassabis & Corporate Gravity (01:14:32) - Two Minutes to Midnight: Bubble Clock Moves Back to 4:00 (01:26:30) - Outro Resources mentioned **Models & news** • OpenAI — Introducing GPT-5.5: https://openai.com/index/introducing-gpt-5-5/ • Engadget — DeepSeek promises its new AI model has world-class reasoning: https://www.engadget.com/ai/deepseek-promises-its-new-ai-model-has-world-class-reasoning-115733512.html • Reuters — Meta to start capturing employee mouse movements, keystrokes for AI training data: https://www.reuters.com/sustainability/boards-policy-regulation/meta-start-capturing-employee-mouse-movements-keystrokes-ai-training-data-2026-04-21/ **Post-processing articles** • "Coding Models Are Doing Too Much" — Levenshtein-distance over-editing study (nrehiew): https://nrehiew.github.io/blog/minimal_editing/ • Luis Garicano (Silicon Continent) — Why Desk Jobs Survive ("The task is not the job"): https://www.siliconcontinent.com/p/why-desk-jobs-survive-and-amodei • 2026 AI Index Report — Stanford Institute for Human-Centered AI: https://hai.stanford.edu/ai-index/2026-ai-index-report **Deep dive** • Nathan Lubchenco — Interviewing Software Engineers in the Age 
of AI: https://nathanlubchenco.substack.com/p/interviewing-software-engineers-in • Nathan Lubchenco — *The future was yesterday* Substack home: https://nathanlubchenco.substack.com/ **Dan's rant** • Ars Technica — Anthropic tested removing Claude Code from the Pro plan: https://arstechnica.com/ai/2026/04/anthropic-tested-removing-claude-code-from-the-pro-plan/ **Rahul's rampage** • Sebastian Mallaby — *The Infinity Machine* (book on Demis Hassabis and DeepMind) • Philipp Dubach — Do Not Disturb My Circles (Archimedes essay): https://philippdubach.com/posts/do-not-disturb-my-circles/ **Bubble watch** • TechCrunch — Two college kids raise $5.1M pre-seed to build an AI social network in iMessage: https://techcrunch.com/2026/04/24/two-college-kids-raise-a-5-1-million-pre-seed-to-build-an-ai-social-network-in-imessage/ • Toby Ord — Hourly Costs for AI Agents: https://www.tobyord.com/writing/hourly-costs-for-ai-agents • CNBC — OpenAI reportedly missed revenue targets, shares of Oracle and chip stocks falling: https://www.cnbc.com/2026/04/28/openai-reportedly-missed-revenue-targets-shares-of-oracle-and-these-chip-stocks-are-falling.html About ADI Pod ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. Co-hosts Shimin Zhang, Dan Lasky, and Rahul Yadav go through hundreds of links and dozens of newsletters every week so you don't have to. This week's special guest: **Nathan Lubchenco** — ML engineer and author of *The future was yesterday* on Substack, where he writes about AI and software engineering. • Website: https://www.adipod.ai • Email: humans@adipod.ai
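The over-editing takeaway in these notes uses Levenshtein distance as its yardstick. As a rough illustration only (this is not the study's actual harness; the function name and the three sample strings are invented here), a minimal Python sketch of character-level edit distance shows how a minimal boolean flip scores far lower than a rewrite that also renames things:

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute), row by row."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop to save memory
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical boolean-flip task: the only required change is dropping "not ".
original = "if user.is_active and not user.is_banned:"
minimal_fix = "if user.is_active and user.is_banned:"
over_edit = "if account.enabled() and account.banned():"

print(levenshtein(original, minimal_fix))  # 4: the four characters of "not "
print(levenshtein(original, over_edit))    # noticeably larger than the minimal fix
```

Comparing a model's edit distance against the smallest edit that solves the task is one simple way to put a number on "doing too much"; a real evaluation would likely work on whole files or tokens rather than single lines.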

    1h 30min
  5. APR 24

    Is Claude Opus 4.7 Mythos Distilled, Running Qwen 3.6 Locally, and the AI-On-AI Arena

    Is Claude Opus 4.7 really burning tokens? Is open source dead after Mythos? Co-hosts Shimin Zhang and Dan Lasky, with recurring guest Rahul Yadav, ran the experiments this week on ADI Pod #22 (April 21, 2026). This episode covers Anthropic's Claude Opus 4.7 release (the "Mythos slice"), Alibaba's open-source Qwen 3.6 35B A3B, cal.com going closed source for security reasons, and a HIPAA-violating vibe-coded patient portal that is, in Dan's words, the b******t future already here.

In this episode
▸ **Claude Opus 4.7 review**: the new Mythos-derived tokenizer (3× bloat on plain English), stricter instruction-following, and why Shimin's SVG experiments suggest the token-burn panic is overblown: 35¢ on Opus 4.7 vs $2 on Opus 4.6 for the same task, with ~40× fewer reasoning tokens.
▸ **Qwen 3.6 35B A3B**: Alibaba's open-source mixture-of-experts model (3B active params at any time) running locally on Shimin's laptop at 90–95 tokens/sec via llama.cpp + Unsloth. The first model to break Simon Willison's pelican-on-a-bicycle benchmark against a larger frontier model.
▸ **cal.com goes closed source**: why the AI Security Institute's $12,000-per-attempt Mythos pentesting data ($125,000 for 10 runs) is changing the open-source calculus, and Drew Breunig's three-phase dev/review/hardening cycle prediction.
▸ **Jesse Vincent's "Rules and Gates"**: a coding-agent prompting technique that reformulates optional preferences into directed preconditions, and whether agents can "weasel out" by rewriting the gate itself.
▸ **AI vibe coding horror story**: a German doctor who inlined a full patient portal into a single HTML page with database credentials client-side. HIPAA, meet DSGVO.
▸ **Kyle Kingsbury's "The Future of Everything is Lies"**: the Jepsen author's 8-step action list on AI's second- and third-order societal effects.
▸ **The AI-on-AI Arena**: Shimin's weekend project grading 11 frontier models against each other. The "delusion index" reads almost exactly like Dunning-Kruger in humans: GPT-5.4 scored -1.6 (humble), Gemini 3.1 Pro Preview rated itself well while peers ranked it last.
▸ **Two Minutes to Midnight**: Paul Graham's log-scale chart comparing AI capex (~1% of US GDP) to the US railroad peak (~10%). We dialed the AI bubble clock back 45 seconds to 3 min 30 sec.

Key takeaways
- Opus 4.7's token-burn reputation may be overblown. Stricter instruction-following can reduce total reasoning tokens by up to 40× vs Opus 4.6 on the same task.
- Security-driven closed-sourcing may spread as Mythos-class agents make open repos easier to exploit. Hardening could make software capital-intensive again.
- Cognitive debt is real: Dan's wake-up call was a production bug a pre-LLM colleague solved in 5 minutes. His first instinct was to double down on the tool.
- Shimin's defense against skill atrophy: read 100% of LLM-generated PR lines (except tests).
- Weaker models rate themselves higher than stronger ones. Calibration appears to improve with capability.
Chapters
(00:00) - Introduction to AI and Software Development
(02:25) - Alibaba's Qwen 3.6 Model Overview
(08:06) - Anthropic's Claude Opus 4.7 Release
(18:08) - Cal.com Goes Closed Source: Implications for Security
(20:40) - The Future of Vibe Coding
(23:41) - Techniques for Effective AI Utilization
(27:13) - Post-Processing and AI in Real-World Applications
(33:07) - The Cultural Impact of AI and Technology
(41:30) - Navigating Code Review Challenges
(42:57) - Exploring AI's Societal Impact
(45:16) - Evaluating AI Models: Performance and Insights
(49:09) - The Future of Data Centers and AI
(50:54) - Investment Trends and Economic Perspectives
(57:58) - Reflections on Historical Investment Cycles
(59:35) - Optimism Amidst Uncertainty

Resources mentioned

Claude Opus 4.7 & Qwen 3.6
• Introducing Claude Opus 4.7 (Anthropic): https://www.anthropic.com/news/claude-opus-4-7
• Claude Opus 4.7 System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf
• Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All: https://qwen.ai/blog?id=qwen3.6-35b-a3b
• Simon Willison, "Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7": https://simonwillison.net/2026/Apr/16/qwen-beats-opus/
• Shimin, "Opus 4.7 isn't dumb, it's just lazy": https://shimin.io/journal/opus-4-7-just-lazy/

Security & open source
• Cal.com is going closed source. Here's why: https://cal.com/blog/cal-com-goes-closed-source-why
• Drew Breunig, "Cybersecurity Looks Like Proof of Work Now": https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html

Technique & commentary
• Jesse Vincent, "Rules and Gates": https://blog.fsck.com/2026/04/07/rules-and-gates/
• An AI Vibe Coding Horror Story: https://www.tobru.ch/an-ai-vibe-coding-horror-story/
• Kyle Kingsbury (Aphyr), "The Future of Everything is Lies, I Guess": https://aphyr.com/posts/411-the-future-of-everything-is-lies-i-guess

Shimin's project
• AI-on-AI Arena: https://shimin.io/ai-on-ai-arena

Bubble watch
• Ars Technica, "Satellite and drone images reveal big delays in US data center construction": https://arstechnica.com/ai/2026/04/construction-delays-hit-40-of-us-data-centers-planned-for-2026/
• Epoch AI, "OpenAI Stargate: where the US sites stand": https://epochai.substack.com/p/openai-stargate-where-the-us-sites
• Paul Graham on US investment cycles (log scale): https://x.com/paulg/status/2045120274551423142/photo/1

About ADI Pod
ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. Co-hosts Shimin Zhang and Dan Lasky go through hundreds of links and dozens of newsletters every week so you don't have to. Recurring guest Rahul Yadav joins when he can.
• Website: https://www.adipod.ai
• Email: humans@adipod.ai

New episodes every Friday. Follow the show to get them automatically.

    1h 2min
  6. APR 17

    Anthropic Mythos & Project Glasswing, Recursive Improving Agents, and Your Parallel Agent Limit

    Shimin and Dan cover MiniMax's M2.7 model, the first public experimental result in recursive self-improvement (RSI), and unpack Anthropic's shock announcement of Mythos, a model so capable at finding security vulnerabilities that Anthropic is withholding public release while partnering with Amazon, Apple, Cisco, CrowdStrike, the Fed and major banks under 'Project Glasswing' to patch infrastructure first. They also debate AI's frontend weakness, discuss Addy Osmani's parallel agent limits piece, and move the AI bubble clock back.

Takeaways:
- RSI is now experimentally demonstrated (not just theorized); this reframes model improvement as capital competition, not PhD hiring.
- If AI finds vulns at scale, open source gets *more* secure long-term, but short-term this is a nuclear-test-equivalent event that may rewrite security, money, and trust assumptions.
- 'Frontend will be first automated' was wrong; backend may be easier because visual taste and pixel-perfect feedback loops aren't in training data.
- Agent orchestration has a personal ceiling; finding it requires blowing past it. Tight scope + time-boxing + new contexts beats monolithic long sessions.
- 'Code is cheap' is really about industrialization: the people who industrialize outcompete those who don't; learn the tools or be left behind.
- OpenAI's CRO going public on a competitor's accounting is itself a bearish signal about OpenAI's enterprise position.

Resources Mentioned
- MiniMax M2.7: The Agentic Model That Helped Build Itself
- Anthropic debuts preview of powerful new AI model Mythos in new cybersecurity initiative
- Assessing Claude Mythos Preview’s cybersecurity capabilities
- Why AI Sucks At Front End
- Your parallel Agent limit
- Code Is Cheap Now, And That Changes Everything
- The AI gold rush is pulling private wealth into riskier, earlier bets
- OpenAI CRO Tells Staff Anthropic Inflates Run Rate by $8 Billion

Chapters
(00:00) - Introduction to AI and Software Development
(02:45) - MiniMax M2.7 Model and Recursive Self-Improvement
(05:04) - Anthropic's Mythos Model and Security Vulnerabilities
(08:15) - AI's Limitations in Front-End Development
(18:13) - Cognitive Debt and Managing Multiple AI Agents
(32:01) - Managing Multiple Agents Effectively
(34:42) - The Evolution of Code Value
(38:29) - The Industrialization of Coding
(41:00) - Navigating Cloud Code Challenges
(45:39) - Ranting About Technology Installations
(50:16) - The State of the AI Bubble

Connect with ADIPod
Email us at humans@adipod.ai if you have any feedback, requests, or just want to say hello!
Check out our website: www.adipod.ai

    1h
  7. APR 10

    Ep 20: Claude Code Source Leak, Emotion Concepts in LLMs, and Surprising Facts AIs Know About Us.

    This week Rahul, Shimin, and Dan return after a two-week break to cover the leaked Claude Code CLI source code, new model releases (Qwen 3.6 and Gemma 4), Mario Zechner's essay on slowing down with AI-assisted coding, a fun segment on unexpected things AI knows about each host, and two deep dives: Anthropic's research on emotion concepts in LLMs and a paper on how sycophantic AI decreases pro-social intentions.
    Takeaways:
    • Claude Code's dual-track permission system uses both rule-based checks and an ML classifier for destructive bash commands.
    • "Cognitive bankruptcy": when cognitive debt interest payments come due and you can't pay.
    • AI sycophancy parallels social media echo chambers; there is no market incentive to fix it.
    • On-device models like Gemma 4 could save cloud costs by handling routine tasks (e.g., agent heartbeats).
    • Copilot's terms of service classify it as "for entertainment purposes only".
    Resources Mentioned:
    • Entire Claude Code CLI source code leaks thanks to exposed map file
    • I Read the Leaked Claude Code Source — Here's What I Found
    • The Claude Code Source Leak: fake tools, frustration regexes, undercover mode, and more
    • Claude Code Unpacked
    • Qwen3.6-Plus: Towards Real World Agents
    • Gemma 4 Announcement
    • Thoughts on slowing the f**k down
    • Emotion concepts and their function in a large language model
    • Sycophantic AI decreases prosocial intentions and promotes dependence
    Chapters:
    (00:00) - Introduction and Host Updates
    (01:45) - Claude Code Source Code Leak
    (12:49) - New Model News and Open Source Developments
    (20:51) - Post-Processing and AI Anxiety
    (25:35) - Unexpected Insights from AI
    (33:12) - Exploring Emotional Concepts in AI
    (39:15) - The Dangers of Sycophantic AI
    (52:39) - Concluding Thoughts and Future Considerations
    Connect with ADIPod:
    Email us at humans@adipod.ai if you have any feedback, requests, or just want to say hello! Check out our website: www.adipod.ai

    54 min
  8. MAR 27

    Ep 19: Thinking Fast Slow and Artificial, Meta's Trouble with Rogue Agents, and FOMO in the Age of AI

    This week, Rahul, Shimin, and Dan cover Claude Code's new channels and scheduling features, a Meta security incident caused by AI-generated advice, Anthropic's survey of 81,000 people on AI expectations, Dan's vibe-coded vector memory CLI project, a deep dive on the paper "Thinking, Fast, Slow and Artificial" about cognitive surrender to AI, a rant about AI tokens as employee compensation, and bubble watch updates including NVIDIA's trillion-dollar demand projections and OpenAI shutting down Sora.
    Takeaways:
    • Claude Code is rapidly absorbing community-developed workflows; the moat may only be in the general model capabilities, not tooling.
    • The Meta incident illustrates the emerging pattern of AI-caused production incidents and the need for process guardrails around agent usage.
    • Cognitive surrender to AI creates a widening gap: those with high need-for-cognition benefit more, while those who dislike effortful thinking defer even more.
    • AI confidence inflation (12 percentage point boost) may stem from treating AI like authoritative reference material (encyclopedias, Wikipedia).
    • Historical technology resistance (Socrates on writing, farmers on tractors) suggests the battle against AI adoption may already be lost.
    • OpenAI shutting Sora just 4 months after a 3-year Disney partnership signals deeper financial or strategic issues.
    Resources Mentioned:
    • Push events into a running session with channels
    • Perhaps not Boring Technology after all
    • Meta is having trouble with rogue AI agents
    • What 81,000 people want from AI
    • Dan's vec-memory-cli
    • Thinking—Fast, Slow, and Artificial
    • Are AI tokens the new signing bonus or just a cost of doing business?
    • Jensen Huang just put Nvidia’s Blackwell and Vera Rubin sales projections into the $1 trillion stratosphere
    • Accelerated FOMO in the Age of AI
    • OpenAI shutters AI video generator Sora in abrupt announcement
    Connect with ADIPod:
    Email us at humans@adipod.ai if you have any feedback, requests, or just want to say hello! Check out our website: www.adipod.ai

    1h 13min

Ratings and Reviews

5
out of 5
8 ratings


You Might Also Like