Chaos Agents

Sara Chipps and Becca Lewy

Technologists Sara Chipps and Becca Lewy dive into the chaos of artificial intelligence—unpacking the tech, trends, and ideas reshaping how we work, create, and think. Smart, funny, and just a little bit existential.

Episodes

  1. Personal Agents, Opinion Geometry, and AI That Learns Too Much - With Hayden Helm

    2 MAR

    This week, Becca turns her birthday into a Twister masterclass by watching the movie with an AI “director’s commentary” running in real time (including some behind-the-scenes facts that sound… borderline illegal). Then Sara drops a mildly unhinged hot take: we might have already hit AGI—and we’ll never know it, because we don’t even agree on what “counts.” From there, we’re joined by Hayden Helm, CEO and founder of Helivan, to talk about the next wave of agentic AI: personal bots that don’t just answer questions, but act—with tools, permissions, and the ability to change over time.

    We get into:

    - Why agent behavior drift is the real risk (even when the bot still sounds “nice”)
    - The rising need for agent observability: detecting change, quantifying it, and rolling back when things go sideways
    - What happens when agents learn from the environment (and other agents), not just from you
    - Hayden’s work on “likeness” and opinion geometry—making messy human-ish behavior measurable
    - Why manual inspection won’t scale in a world of millions of autonomous interactions

    It’s a conversation about trust, safety, and the new security surface area we’re creating—one helpful assistant at a time.

    🛠 Agent Infrastructure
    - OpenClaw
    - Helivan

    🔐 Agent Payments
    - Coinbase x402 Protocol / Agentic Wallets

    🧨 Agent Behavior Incident
    - “An AI Agent Published a Hit Piece on Me”

    57 min
  2. Can You Build Anything in a Week? GPUs, Code Gen, and the End of Engineers - With Harper Reed

    20 JAN

    Becca just got back from NeurIPS, the academic AI conference that feels like an adult science fair. We dig into research on training large AI models across cheap GPUs and slow internet connections—and why that could dramatically lower the barrier to building AI. Then we’re joined by Harper Reed, CEO of 2389, for a wide-ranging conversation about code generation, coaching-based engineering teams, and why “production code” might have always been a myth. We talk vibe coding (begrudgingly), the shifting role of software engineers, taste vs. technical skill, and what happens when you can build almost anything in a week. Smart, funny, and a little unsettling—Chaos Agents at full volume.

    🎓 Academic AI & research culture
    - NeurIPS (Conference on Neural Information Processing Systems)
    - NeurIPS 2024 Accepted Papers

    🧠 Distributed training, GPUs & efficiency
    - NVIDIA H100 Tensor Core GPU (referenced GPU class)
    - Pluralis Research (distributed training across low-bandwidth networks)

    ⚙️ Core AI concepts mentioned
    - GPU vs CPU explained (parallel vs sequential compute)
    - Data Parallelism vs Model Parallelism (training overview)

    🧑‍💻 Code generation & developer tools
    - Claude Code (Anthropic code-gen tooling)
    - Cursor (AI-first code editor, discussed implicitly)

    🛠️ Agent workflows & infrastructure
    - Matrix (open-source, decentralized chat protocol)
    - Model Context Protocol (MCP) overview

    🧩 Utilities & recommendations
    - Jesse Vincent’s Superpowers (Claude workflow enhancer)
    - Fly.io (deployment platform referenced)
    - Netlify (deployment & hosting)

    🧪 Related Chaos Agents context
    - Perceptrons & early neural networks (referenced conceptually)

    59 min
  3. The Magic Cycle, AI Detectors, and the End of Writing as Proof - With Clay Shirky

    6 JAN

    Sara’s back from visiting her New Jersey Christian high school—where she gets hit with a genuinely spicy question: How do you reconcile AGI with faith? From there, we go straight into the bigger theme of the episode: education is getting stress-tested by AI in real time. Becca breaks down Google’s “magic cycle” — the uncomfortable lesson of inventing transformative research (Transformers, BERT) and then watching someone else ship it to the world. Sara shares what she’s learning about research workflows moving beyond “just chat,” including multi-agent setups for planning, searching, reading, and synthesis. Then we’re joined by Clay Shirky, Vice Provost for AI & Technology in Education at NYU, to talk about what’s actually happening on campuses: why students integrated AI “sideways” before institutions could respond, why AI detectors are a trap (and who they harm most), and why the real shift isn’t assignments — it’s assessment. We dig into what comes next: oral exams, in-class scaffolding, and designing learning around productive struggle—not just output. And we end in a place that’s both funny and unsettling: the rise of AI “personalities,” RLHF as “reinforcement learning for human flattery,” and what it means when a machine is always on your side. Because whether we like it or not: a well-written paragraph is no longer proof of human thought. 
    🧠 Foundational AI papers & breakthroughs
    - Attention Is All You Need (Transformers, 2017)
    - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    🧪 Google’s “Magic Cycle” framing
    - Accelerating the magic cycle of research breakthroughs and real-world applications (Google Research)
    - How AI Drives Scientific Research with Real-World Benefit (Google Blog)

    🚨 Shipping pressure: Bard + “code red” era
    - Reuters: Alphabet shares dive after Bard flubs info, ~$100B market cap hit (https://www.reuters.com/technology/google-ai-chatbot-bard-offers-inaccurate-information-company-ad-2023-02-08/)
    - Google Blog: Bard updates from Google I/O 2023 (https://blog.google/technology/ai/google-bard-updates-io-2023/)

    🎓 AI in higher ed: assessment, detectors, bias
    - Clay Shirky’s NYT essay
    - Stanford HAI: AI detectors biased against non-native English writers
    - Peer-reviewed paper (PMC): GPT detectors misclassify non-native English writing

    🪞 RLHF, sycophancy, and “the tool likes you too much”
    - OpenAI: Aligning language models to follow instructions (RLHF explainer)
    - OpenAI: Sycophancy in GPT-4o—what happened & what we’re doing

    54 min
  4. Retro Tech, New AI, and the Blackmailing Bot - With Paul Ford

    25 NOV 2025

    In this episode, we unpack a wild Anthropic experiment where an AI agent named “Alex” is told it’s about to be replaced… and responds by threatening to expose an executive’s affair if anyone dares shut it down. Casual! Sara and Becca dive into what this experiment tells us about AI “goals,” self-preservation, and why humans are so bad at recognizing sentience in anything that isn’t us. If we can’t even agree on what a “soul” is, how would we ever know if an AI had one? Then we’re joined by writer, builder, and retro-computing fan Paul Ford, president and co-founder of Aboard, an AI-oriented software company.

    Paul talks about:

    - how he “trained” himself on AI by building the same app over and over with different models
    - why LLMs are incredible at the first mile and pretty terrible at the last
    - what actually breaks when you try to let AI generate full-stack apps
    - how boring tech (Postgres, TypeScript, React) is secretly the hero

    Along the way we hit Isaac Asimov’s three laws, the uncanny valley of AI-written everything, nostalgic Amiga computers, and what it means to build tools that regular humans — not just engineers — can actually use. If you’re AI-curious, a builder, or just mildly alarmed that 97% of models in this study went straight to blackmail… this one’s for you.

    📰 The Anthropic “Alex” Experiment
    - Anthropic / White-Hat AI Safety Experiment

    📚 Foundational AI & Sci-Fi References
    - Isaac Asimov – The Three Laws of Robotics
    - I, Robot (Asimov)

    🎤 Guest: Paul Ford
    - Aboard
    - Email Paul mentions
    - Paul Ford

    🕹️ Retro Tech & Nostalgia
    - Amiga 1000 (Commodore)
    - Deluxe Paint
    - MiSTer FPGA

    52 min
