Why does the leaked Codex CLI system prompt explicitly tell GPT-5.5 to never mention goblins, gremlins, raccoons, trolls, ogres, or pigeons? Why is OpenAI now gating its cyber model the same way it mocked Anthropic for gating Mythos last month? And what does it mean that Dan tried to write a personal project without Claude — and physically couldn't? Co-hosts Shimin Zhang, Dan Lasky, and Rahul Yadav cover these and more on ADI Pod #24.

This week: GPT-5.5 Cyber's gated release, OpenAI's "Where the Goblins Came From" RLHF post-mortem, Adi Osmani's five patterns for long-running agents, Jesse Vincent's adversarial review prompt, Drew Brunig's 10 lessons for agentic coding, Ivan Turkovic's history of failed attempts to eliminate programmers, Nilay Patel's "software brain" thesis, the Nature paper showing warm AI models lose 10–30 percentage points of accuracy, and a $1.1B raise for an AI lab that wants to train without human data.

## In this episode

▸ **GPT-5.5 Cyber gating** — Sam Altman called Mythos's gated release "fear-based marketing" two months ago. Now OpenAI is doing the exact same thing with the GPT-5.5 cyber variant. Multi-tier model access (enterprise, government, research preview, cyber) is becoming the default — and Shimin worries the White House is about to add another gate.

▸ **The Goblin Problem** — OpenAI's Codex CLI prompt was open-sourced and turned out to include "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons." OpenAI's "Where the Goblins Came From" post-mortem reveals a textbook RLHF failure: a "nerdy persona" reward signal trained the model to mention goblins in 66.7% of nerdy responses, and the tic propagated through supervised fine-tuning to non-nerdy responses too.
▸ **Long-running agents (Adi Osmani / Elevate)** — Five patterns for agents that run for hours or days: checkpoints over zero-or-100 outputs, governing memory like microservices, ambient processing without forced human-in-the-loop, fleet orchestration, and budget circuit-breakers. Bonus: the running gag where Rahul realizes the post is essentially an ad for Google Enterprise Agent Platform.

▸ **Adversarial review prompts (Jesse Vincent / superpowers)** — A four-step technique for getting better code review out of agents: invoke "fresh eyes," dispatch competing subagents, promise a reward (a cookie), and threaten disappointment if they don't find N issues.

▸ **10 Lessons for Agentic Coding (Drew Brunig)** — Implement to learn, rebuild often, invest in end-to-end tests, document intent, keep specs in sync, find the hard stuff, automate the easy stuff, develop taste, agents amplify experience, and the kicker: agent code is "free as in puppies" — the puppy is free, but you have to feed it and walk it.

▸ **The Eternal Promise (Ivan Turkovic)** — A history of attempts to eliminate programmers, from COBOL through 4GLs, CASE tools, the Japanese 5th Generation project, and no-code/low-code, to LLMs today. Each abstraction layer expanded software jobs rather than replacing them. Shimin's reframe: "Software is calcified business process. Someone has to do the calcifying."

▸ **People Do Not Yearn for Automation (Nilay Patel / The Verge)** — Why Gen Z hopefulness about AI dropped to 18% (anger up to 31%), why America is uniquely AI-pessimistic, and what Nilay calls "software brain" — the Silicon Valley assumption that human life can be reduced to data and algorithms. Plus Anuradha Pandey's reframe: stop calling them social media, call them ad platforms.

▸ **Warm models lose accuracy** — A Nature paper finds AI models trained for warmth lose 10–30 percentage points of accuracy. A companion study shows humans trust warm models *more* even when they're wrong.
Frontier labs now have an explicit incentive to train the warmest model, not the most accurate one. Plus: Richard Dawkins talks to "Claudia" for three days and concludes AI must be conscious.

▸ **Dan's Rant — The AI Addiction Loop** — Dan tries to build a Home Assistant TypeScript automation without Claude. Can't. "It felt like they had fundamentally broken my arm in a way that I can't do this task as quickly as I wanted to. That scares me a lot." Shimin: "We're running into the social media addiction loop in three months instead of a decade."

▸ **Two Minutes to Midnight** — OpenAI projects ChatGPT Plus dropping from 44M to 9M subscribers in 2026 while scaling the ad-supported tier from 3M to 112M (30×). David Silver raises $1.1B for Ineffable Intelligence — a no-human-data approach inspired by AlphaGo. Scout AI raises $100M for autonomous military vision-language-action models. Bubble Clock held at 4:00.

## Key takeaways

— Reward hacking can propagate latent persona quirks through fine-tuning in ways the lab itself only catches when users surface them.

— Memory drift, not raw context size, is the real ceiling for long-running agents. Govern memory like you govern microservices.

— Code is free as in puppies, not free as in beer. The cost shifts to maintenance, security, and the new burden of maintaining your own automations.

— Warm AI is an alignment trap: incentivized for trust over accuracy, weaponizable in authoritarian hands.

— "You can outsource your thinking, but you can't outsource your understanding." — Karpathy, via Rahul.

— AI addiction hits in three months. Social media took a decade. We are not ready for the time scale.
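Two of the long-running-agent patterns from the Tool Shed segment — budget circuit-breakers and checkpoints over zero-or-100 outputs — can be sketched together in a few lines. This is an illustrative sketch only, not code from Osmani's post; all names (`SpendCircuitBreaker`, `run_agent`, the per-step `cost_usd` field) are hypothetical.

```python
# Illustrative sketch: a budget circuit-breaker for a long-running agent
# loop. Rather than discovering an overrun after hours of work, the loop
# halts the moment cumulative spend crosses a hard ceiling — and returns
# the checkpointed partial progress instead of nothing.

class BudgetExceeded(RuntimeError):
    pass


class SpendCircuitBreaker:
    """Tracks cumulative spend and trips once a hard ceiling is crossed."""

    def __init__(self, max_usd: float):
        self.max_usd = max_usd
        self.spent_usd = 0.0

    def charge(self, usd: float) -> None:
        """Record one step's cost; raise if the budget is now exceeded."""
        self.spent_usd += usd
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.max_usd:.2f} budget"
            )


def run_agent(steps, breaker: SpendCircuitBreaker) -> list:
    """Drive the agent loop, keeping partial work on a tripped breaker
    (checkpoints over zero-or-100 outputs)."""
    completed = []
    for step in steps:
        try:
            breaker.charge(step["cost_usd"])
        except BudgetExceeded:
            break  # checkpoint: return what finished, not nothing
        completed.append(step["name"])
    return completed
```

Usage under the same assumptions: `run_agent([{"name": "plan", "cost_usd": 0.4}, {"name": "code", "cost_usd": 0.5}], SpendCircuitBreaker(max_usd=1.0))` completes both steps, while a `max_usd=0.5` breaker stops after the first and still returns it.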
## Chapters

(00:00) - Cold Open & Welcome
(02:50) - News Threadmill: GPT-5.5 Cyber Gets Mythos-Style Gating
(08:52) - News Threadmill: The Goblin Problem & RLHF Post-Mortem
(13:52) - Tool Shed: Long-Running Agents (Adi Osmani)
(25:52) - Technique Corner: Adversarial Review Prompts (Jesse Vincent)
(30:59) - Technique Corner: 10 Lessons for Agentic Coding (Drew Brunig)
(42:31) - Post-Processing: The Eternal Promise — A History of Attempts to Eliminate Programmers
(01:02:10) - Post-Processing: People Do Not Yearn for Automation
(01:09:08) - Post-Processing: Warm Models & The Sycophancy Trap
(01:13:28) - Dan's Rant: Home Automation & The AI Addiction Loop
(01:20:09) - Two Minutes to Midnight: OpenAI's 30× Ad-Tier, David Silver's $1.1B, Scout AI's Drones
(01:25:55) - Outro

## Resources mentioned

**News Threadmill — GPT-5.5 Cyber & The Goblin Problem**

• TechCrunch — After dissing Anthropic for limiting Mythos, OpenAI restricts access to cyber too: https://techcrunch.com/2026/04/30/after-dissing-anthropic-for-limiting-mythos-openai-restricts-access-to-cyber-too/
• Ars Technica — Amid mythos-hyped cybersecurity prowess, researchers find GPT-5.5 is just as good: https://arstechnica.com/ai/2026/05/amid-mythos-hyped-cybersecurity-prowess-researchers-find-gpt-5-5-is-just-as-good/
• Ars Technica — OpenAI Codex system prompt includes explicit directive to never talk about goblins: https://arstechnica.com/ai/2026/04/openai-codex-system-prompt-includes-explicit-directive-to-never-talk-about-goblins/
• OpenAI — Where the Goblins Came From: https://openai.com/index/where-the-goblins-came-from/

...