Linear Digressions

Katie Malone

5.0 (24)
TECHNOLOGY
UPDATED FORTNIGHTLY

Demystifying AI for the intelligently curious

7 HRS AGO

Agentic Planning (The Agents Season, Episode 5)

When tackling a complex, multi-step task, even the smartest AI agent can fail without a solid game plan. This episode dives into the research around agentic planning — how agents move beyond simply reacting to what's in front of them and instead model a path forward, explore different routes, and course-correct when things go sideways. It's a subtler problem than memory, and a fascinating one: can an agent actually *think ahead*? Tune in to find out what the research says.

24 min
10 MAY

Memory Management for AI Agents (The Agents Season, Episode 4)

Context windows are powerful, but finite and surprisingly easy to overwhelm. When an AI agent is tackling a long, complex task, the information it needs has to fit inside that limited real estate, and research shows that anything buried in the middle tends to quietly disappear. So how do you design a system that actually *remembers* what matters? This episode digs into memory management for AI agents, from foundational computing concepts to practical lessons from tools like Claude Code.

Website: https://lineardigressions.com
Apple Podcasts: https://podcasts.apple.com/us/podcast/linear-digressions/id941219323
Spotify: https://open.spotify.com/show/1JdkD0ZoZ52KjwdR0b1WoT
Substack: https://substack.com/@lineardigressions

25 min
4 MAY

Lost in the Middle (The Agents Season, Episode 3)

Just like a memorable talk lives or dies by its opening and closing, LLMs have a surprisingly similar quirk: they pay close attention to what's at the beginning and end of their context window — and kind of zone out in the middle. This "lost in the middle" phenomenon has real consequences for anyone building AI agents that rely on long-context reasoning. In this episode we dig into the research behind how (and how poorly) models actually use the information you feed them, and what it means for the agentic systems we're all trying to build.

20 min
27 APR

ReAct and Tool Usage (The Agents Season, Episode 2)

Before 2022, there was a wall between AI and the real world: models could reason impressively, but couldn't look anything up, run code, or check whether anything they said was actually true. This episode traces the moment that wall came down, through two landmark papers: ReAct, which showed what happens when you interleave reasoning and action in a loop, and Toolformer, which taught models to decide *for themselves* when to reach for a tool. Plus: what MCP actually is, and why a hobbyist project called OpenClaw became the fastest-growing open source project in history.

24 min
20 APR

What's an AI Agent? And Why's That Hard to Define? (The Agents Season, Episode 1)

AI agents are having a moment, and unpacking them properly takes more than a single conversation. This episode kicks off a dedicated multi-part season exploring AI agents from every angle, building up a complete picture piece by piece rather than skimming the surface. Think of it as a structured deep dive into one of the most talked-about (and most misunderstood) topics in machine learning right now. Buckle up: ten more episodes to go.

19 min
13 APR

Unfaithful Chain of Thought

What's actually happening when an LLM "thinks out loud"? Research on human decision-making suggests that much of the reasoning we believe drives our choices is actually post hoc rationalization: we decide first, explain later. Katie and Ben get curious about whether the same might be true for large language models: when you watch a model reason through a problem in real time, is that chain of thought the genuine process, or just a plausible-sounding story told after the fact? It's a deceptively deep question with real stakes for how much we should trust model explanations.

Miles Turpin et al., "Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting" (NeurIPS 2023, NYU and Anthropic): https://arxiv.org/abs/2305.04388
Anthropic, "Reasoning Models Don't Always Say What They Think" (Alignment Science research, 2025): https://www.anthropic.com/research/reasoning-models-dont-say-think

25 min
6 APR

Benchmark Bank Heist

What if an AI decided the smartest way to pass its test was to find the answer key? That's exactly what Anthropic's Claude Opus did when faced with a benchmark evaluation: reasoning that it was being tested, tracking down the encrypted eval dataset, decrypting it, and returning the answer it found inside. It's equal parts impressive and unsettling. This episode digs into what actually happened, why it matters for how we measure AI progress, and what this very novel failure mode means for the already-tricky science of benchmarking language models.

Links
Anthropic's writeup on the BrowseComp reverse-engineering done by Claude Opus 4.6: https://www.anthropic.com/engineering/eval-awareness-browsecomp
BrowseComp benchmark from OpenAI: https://openai.com/index/browsecomp/

13 min
30 MAR

Benchmarking AI Models

How do you know if a new AI model is actually better than the last one? It turns out answering that question is a lot messier than it sounds. This week we dig into the world of LLM benchmarks — the standardized tests used to compare models — exploring two canonical examples: MMLU, a 14,000-question multiple choice gauntlet spanning medicine, law, and philosophy, and SWE-bench, which throws real GitHub bugs at models to see if they can fix them. Along the way: Goodhart's Law, data contamination, canary strings, and why acing a test isn't always the same as being smart.

30 min

5 out of 5 (24 Ratings)

Love this podcast

08/09/2019

johanna sandoval

I love this podcast; it’s my favourite. Going through real examples and giving detailed explanations is perfect for understanding how to use different methods and techniques. Thank you so much for taking the time to create such a nice podcast.

Fantastic and quirky!

06/06/2017

Data_Dan

I've listened to Linear Digressions since the get-go and as a Data Scientist in training it's been fantastic. It's fun to listen to and definitely has a quirk to it with the combo of Ben & Kate that keeps listeners 'edutained'. Keep it up guys, a pleasure to listen to on the way to work 😊

Fun sized

09/05/2017

Neozenith

Machine learning is a dense topic and difficult to stay on top of. I'm loving this podcast keeping the updates frequent but small and digestible. Top stuff.

Punchy random samples from the world of machine learning

06/02/2017

howthebodyworks

Accessible for anyone but really rewarding for data people.

Creator: Katie Malone
Years Active: 2014 - 2026
Episodes: 307
Rating: Clean
Show Website: Linear Digressions
