The Information Bottleneck

Ravid Shwartz-Ziv & Allen Roush

Two AI researchers, Ravid Shwartz-Ziv and Allen Roush, discuss the latest trends, news, and research in generative AI, LLMs, GPUs, and cloud systems.

  1. After Math Falls, What's Next? with Julia Kempe (NYU/Meta)

    17H AGO

    Julia Kempe on Why Math Will Fall Next, Superhuman Provers, and the Return of the Renaissance Researcher

    In this episode, we sit down with Julia Kempe, a Professor at NYU's Center for Data Science and researcher at Meta FAIR's Foundations of Reasoning team, for a wide-ranging conversation on the future of AI research. We dig into why verifiable domains like mathematics may be on track to "fall" the way Go did. With formal verification through Lean and the Mathlib infrastructure, LLM agents can now generate and check proofs at scale, and Julia makes the case that a new industry of automated mathematical discovery is closer than most mathematicians believe. We explore why Erdős problems are already falling, what's still missing for harder fields like analysis and physics, and how synthetic data, curation, and verification fit together. From there we get into the energy and scaling limits of frontier models, the case for academic research that big labs can't pursue, how to advise PhD students when Claude can already do their first-year work, the rise of AI safety and security as research priorities, and Julia's optimistic argument that AI tools are bringing back the Renaissance generalist: the researcher who can finally work fluently across math, biology, and beyond.

    Timeline
    00:00 - Introductions
    01:00 - Defining reasoning and verifiable domains
    04:00 - Lean, Mathlib, and the formalization of mathematics
    10:00 - Constructive proofs, Erdős problems, and the new wave of "AI mathematicians"
    14:00 - Will math be "solved"? Art, photography, and the changing nature of creative work
    18:00 - Why physics is harder than math
    22:00 - Moravec's paradox, evolution, and why robotics lags behind language
    27:00 - The Renaissance is back: generalist researchers in the age of AI
    29:00 - Advising students: math, programming, and what core education still matters
    32:00 - Teaching and assessment when GPT can do the homework
    35:00 - Anti-AI backlash, energy costs, and the security threat
    40:00 - Scaling vs. efficiency
    42:00 - Model collapse, synthetic data, and what's left to squeeze from the internet
    44:00 - What's exciting next: AI for science, safety, robotics, memory, and planning
    47:00 - Annotation costs as a proxy
    50:00 - Superhuman models and what security even means against them
    52:00 - AlphaGo as precedent for verifiable superhuman performance
    54:00 - Hallucination, the Mirage paper, and whether these are solvable problems
    56:00 - Why coding isn't fully solved yet
    58:00 - Agent security, prompt injection, and the Wild West of deployed agents
    1:01:00 - Regulation: what's needed and what's possible
    1:04:00 - Advice for PhD students and what research academia should pursue
    1:09:00 - Startup opportunities: robotics, security, and AI for finance
    1:12:00 - Closing thoughts: use the tools, and build grassroots AI for good

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. Changes: trimmed

    About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.

    1h 15m
  2. Intelligence in an Open World - with Mengye Ren (NYU)

    5D AGO

    We talk with Mengye Ren, Assistant Professor at NYU's Center for Data Science, about what intelligence actually means once you step outside a benchmark, and why scaling a single centralized model isn't the whole story. We get into why intelligence has to be defined in open environments, not closed ones, and what that means for how we measure progress. We push on the creativity question: today's models sample bottom-up from a softmax or a Gaussian, with no internal loop of consideration, and as Mengye puts it, we haven't understood creativity yet and we're already prepared to hand it over. We also talk about what's missing for the next paradigm: continual learning, memory, embodied grounding, and smaller models that actually accumulate experience instead of re-deriving everything from scratch each call. Along the way, we get into JEPA and latent variables, biology as inspiration vs. blueprint, why frontier labs don't lean on explicit latents, the limits of synthetic data and world models, agent-to-agent communication, model uncertainty and forecasting, and whether ML education still matters when AI writes the experiments. A grounded, contrarian conversation about where AI research should be looking next, beyond benchmarks, beyond scale.

    Timeline
    00:00 - Intro and welcome
    01:24 - What is intelligence? Defining it relative to objectives and open environments
    04:19 - Is intelligence really the path to human flourishing, or is it productivity?
    04:57 - Safety, scalable oversight, and whether stronger models help or hurt
    06:09 - What does "alignment" actually mean?
    07:18 - Centralized vs. decentralized models: objectivity vs. personal meaning
    08:50 - Hinton vs. LeCun: where Mengye stands on AI risk
    10:29 - Bottom-up vs. top-down architectures and feedback loops
    21:28 - Biology and AI: inspiration, not blueprint
    24:14 - Biological plausibility, spiking nets, and where the analogy breaks
    25:39 - JEPA, Mamba, and architectures beyond the transformer
    27:31 - Language as a special modality: abstraction built for communication
    29:04 - Are we too locked into the current paradigm? Risk of creativity collapse
    30:09 - Synthetic data, simulation, and the brain's own generative models
    31:43 - World models and physical AI: how babies actually learn
    33:03 - The case for smaller, continually learning models
    37:02 - The role of academic research in a frontier-lab world
    39:47 - Why LLMs aren't funny: the creativity gap
    40:35 - What research areas matter most: embodiment, continual learning, creativity
    42:05 - Creativity is bounded by experience, and why bottom-up sampling isn't enough
    45:35 - Agent-to-agent communication and the limits of sub-agents
    46:39 - Model confidence, epistemic uncertainty, and forecasting
    49:44 - Tokenization, static vs. dynamic worlds, and always-learning systems
    52:20 - Latent variables, JEPA, and why frontier models skip them
    53:40 - The future of ML education when AI writes the experiments

    59 min
  3. Language, Cognition, and the Limits of LLMs - with Tal Linzen (NYU/Google)

    MAY 17

    We host Tal Linzen, Associate Professor at NYU and Research Scientist at Google, for a conversation on the intersection of cognitive science and large language models. We discussed why children can learn language from around 100 million words while LLMs need trillions, and the surprising finding that as models get better at predicting the next word, they become worse models of how humans actually process language. Tal walked us through how his lab uses eye-tracking and reading-time data to compare model behavior to human behavior, and what that reveals about prediction, working memory, and the limits of current architectures. We also got into nature versus nurture and how inductive biases can be instilled by pre-training on synthetic languages, world models and whether transformers actually use the geometric structure they encode, the BabyLM challenge and data-efficient language learning, and what mechanistic interpretability can offer cognitive science beyond just fixing model bugs. The conversation closed on academia versus industry, the role of PhDs in the current AI moment, and how AI coding tools are changing the way Tal teaches and evaluates students at NYU.

    Timeline
    00:13 - Intro and what cognitive science means
    02:16 - Using computational simulations to understand how humans learn language
    05:26 - How children learn language vs. how LLMs are pre-trained
    07:53 - Why mainstream LLMs are not good models of humans
    10:07 - Comparing humans and models with eye-tracking and reading behavior
    13:52 - Sensory modalities, smell, and how much you can learn from language alone
    16:03 - Animal cognition and decoding animal communication
    17:00 - Nature vs. nurture, inductive biases, and what transformers can and can't learn
    21:21 - Instilling inductive biases through synthetic languages
    27:34 - The bouba/kiki effect and cross-linguistic sound symbolism
    28:33 - Latent causal structure in language and whether models discover it
    31:13 - Does knowing linguistics help build better models?
    35:07 - World models: what they mean, and why transformers encode geometry but don't use it
    39:13 - Tokenization, and why Tal doesn't like it
    41:35 - Scaling laws and the inverse-U curve of model quality vs. human fit
    44:34 - Where the human–model mismatch comes from: architecture, memory, and data
    47:08 - Diffusion language models and sentence planning
    48:21 - Data quality, synthetic data, and curriculum effects
    50:54 - Comparing models at different training stages to human development; BabyLM
    54:40 - What level of the model should we actually probe? Representations vs. behavior
    1:01:04 - Mechanistic interpretability, Deep Dream, and human dreaming
    1:02:11 - Cognitive neuroscience, intracranial recordings, and working memory
    1:10:31 - Should you still do a PhD in 2026?
    1:12:31 - Will software engineers lose their jobs to AI?
    1:17:43 - Teaching in the age of coding agents: what changes in the classroom
    1:20:54 - What's next: human-like LLMs as user simulators, and recruiting

    1h 23m
  4. The Principles of Diffusion Models - with Jesse Lai (Sony AI)

    MAY 10

    We host Chieh-Hsin (Jesse) Lai, Staff Research Scientist at Sony AI and visiting professor at National Yang Ming Chiao Tung University, Taiwan, for a conversation about diffusion models, the technology behind tools like Stable Diffusion, and most of the AI image and video generators you've seen in the last few years. Jesse recently co-authored The Principles of Diffusion Models with Stefano Ermon, and the book is quickly becoming a go-to reference in the field.

    We start with what a generative model actually is, and what it means to "generate" an image or a sound. Jesse explains the core idea behind diffusion in plain terms. You start with pure noise, and a neural network gradually cleans it up, step by step, until a realistic image emerges. From there, we talk about why diffusion has come to dominate so much of generative AI. Because the model builds an image gradually, you can guide it along the way, nudging the output toward what you actually want, refining details, or combining it with other controls. We also discuss the common critique that diffusion is slow and how the field has largely addressed it through new techniques.

    We zoom out to the bigger picture, too. Jesse shares his view on world models and whether diffusion is the right foundation for them. We talk about what makes a generative model genuinely good versus just good at gaming benchmarks, and why evaluating creativity and realism is so much harder than scoring a multiple-choice test.

    Timeline
    00:12 - Intro and welcoming Jesse
    00:47 - Why Jesse wrote the book, and who it's for
    03:29 - The three families of diffusion models, and why they're really one idea
    05:14 - What makes a good generative model
    07:39 - How do you even measure if a generated image is good
    08:59 - Why diffusion beats autoregressive models for images
    10:33 - Is diffusion still slow? How fast generation got fast
    11:12 - A simple intuition for what a "score" is
    14:12 - How the different flavors of diffusion connect under the hood
    14:42 - Diffusion for text and proteins
    17:12 - Consistency models and the push for one-step generation
    22:12 - Diffusion for world models: simulating reality in real time
    26:12 - Do world models need to understand language
    35:12 - Is diffusion the right tool, or just a convenient one
    38:12 - What benchmarks actually tell us, and what they miss
    46:12 - Closing thoughts and where to find the book

    56 min
  5. Inside xAI, and the Bet on AI Math - with Christian Szegedy (Math Inc)

    MAY 4

    We talked with Christian Szegedy, co-inventor of Inception and Batch Normalization, founding scientist at xAI, now at Math Inc, about what it takes to build a frontier lab, and why he left xAI to work on formal mathematics. Christian thinks Lean and auto-formalization are the missing piece for trustworthy AI: a machine-checkable layer underneath all reasoning, where proofs are guaranteed correct without anyone having to read them. We got into his bet with François Chollet that AI will hit superhuman mathematician level by 2026, and what that actually unlocks beyond math itself: verified software instead of vibe-coded apps that break when you refactor, AI systems you can actually trust because their reasoning is checkable, and a path to handling protein folding, chemistry, and parts of biology with real guarantees instead of hand-waving. Christian also walked us through how Math Inc's Gauss system pulled off a proof in two weeks that human experts had estimated would take another year. We also covered xAI's first 12-person year, why Christian no longer buys the original batch normalization story, why he's sure transformers won't be the dominant architecture in five years, what mathematicians do in a world of cheap proofs, and his take on whether humanity will handle AI well. He distrusts humanity more than he distrusts AI.

    Timeline
    00:12 - Intros: Christian's background (Inception, Batch Norm, xAI, Math Inc)
    01:29 - Building a frontier lab from scratch: the first 12 people at xAI
    04:15 - Hiring for proven track records when 200K GPUs are at stake
    06:07 - Elon's "dependency graph" and balancing long-term vision with investor demos
    07:28 - Gauss formalizes the strong prime number theorem in 2 weeks
    12:25 - What "formalization" actually means (and why it's not what most people think)
    14:39 - Why Lean gives 100% certainty and why that matters for RL
    15:26 - ProofBridge and joint embeddings across mathematical subfields
    18:07 - Does math formalization transfer to coding and other fields?
    21:44 - Can every domain be mathematized?
    23:14 - Verified software, chip design, and why vibe-coded apps are dangerous
    26:35 - Scaling Mathlib by 100–1000x
    28:27 - Artisan formalizers vs. invisible machine-language formalists
    33:26 - Can verification generalize?
    45:19 - Revisiting Batch Norm: covariate shift, loss landscape, and what really happens
    48:22 - Is normalization even necessary?
    50:10 - What's actually fundamental in modern AI architectures
    51:41 - Why Christian thinks transformers won't last 5 years
    52:38 - The 2026 superhuman AI mathematician bet
    55:15 - What's missing: better verification + a much larger formalized math repository
    56:13 - Lean vs. Coq vs. HOL Light: does the proof assistant actually matter?
    59:26 - The role of mathematicians in 5–10 years
    1:02:00 - A human element to mathematics: Newton, Leibniz, and competitive proving
    1:03:25 - The telescope analogy: AI as the instrument that lets us see the math universe
    1:05:19 - Job apocalypse or Jevons paradox?
    1:08:41 - Advice for students
    1:09:50 - Can we formally verify AI alignment?
    1:11:52 - Closing thanks

    1h 13m
  6. Reasoning Models and Planning - with Rao Kambhampati (Arizona State)

    APR 29

    We sat down with Rao Kambhampati, a Professor of CS at Arizona State University and former President of AAAI, to talk about reasoning models: what they are, when they work, and when they break. Rao has been working on planning and decision-making since long before deep learning, which makes him one of the most grounded voices on what today's reasoning systems actually do. We start with definitions of what reasoning is, why planning is the hard subset of it, and what changed when systems like o1 and DeepSeek R1 moved the verifier from inference into post-training. From there we get into where these models generalize, where they don't, and why benchmarks can be misleading about both. A big chunk of the conversation is on chain-of-thought: what intermediate tokens are actually doing, why they help the model more than they help the reader, and what outcome-based RL does to whatever semantic content was there to begin with. We also cover world models and why Rao thinks the video-only framing is the wrong bet, the difference between agentic safety and existential risk, and what the planning community figured out decades ago that the LLM community keeps rediscovering.

    Timeline
    (00:12) Intros
    (01:32) Defining "reasoning" and the System 1 / System 2 framing
    (04:12) Blocksworld vs Sokoban, and non-ergodicity
    (06:42) Pre-o1: PlanBench and "LLMs are zero-shot X" papers
    (07:42) LLM-Modulo and moving the verifier into post-training
    (10:12) Is RL post-training reasoning, or case-based retrieval?
    (13:12) τ-Bench and benchmarks that avoid action interactions
    (14:12) OOD generalization and what we don't know about post-training data
    (19:02) Does it matter how they work if they answer the questions we care about?
    (21:27) Architecture lotteries and why no one tries different designs
    (23:42) Intermediate tokens and the "reduce thinking effort" cottage industry
    (26:12) The 30×30 maze experiment
    (27:42) Sokoban, NetHack, and Mystery Blocksworld
    (34:58) Stop Anthropomorphizing Intermediate Tokens: the swapped-trace experiment
    (46:12) Latent reasoning, Coconut, and why R0 beat R1
    (50:12) How outcome-based RL erodes CoT semantics
    (52:12) Dot-dot-dot and Anthropic's CoT monitoring paper
    (53:42) Safety: Hinton, Bengio, LeCun
    (57:12) Existential risk vs real safety work
    (59:42) World models, transition models, and video-only approaches
    (1:03:12) Why linguistic abstractions matter: pick and roll
    (1:05:42) What the planning community knew in 2005
    (1:08:12) Multi-agent LLMs
    (1:09:57) Closing thoughts: the bridge analogy

    1h 12m
  7. What Actually Matters in AI? - with Zhuang Liu (Princeton)

    APR 24

    In this episode, we hosted Zhuang Liu, Assistant Professor at Princeton and former researcher at Meta, for a conversation about what actually matters in modern AI and what turns out to be a historical accident. Zhuang is behind some of the most important papers in recent years (with more than 100k citations): ConvNeXt (showing ConvNets can match Transformers if you get the details right), Transformers Without Normalization (replacing LayerNorm with dynamic tanh), ImageBind, Eyes Wide Shut on CLIP's blind spots, the dataset bias work showing that even our biggest "diverse" datasets are still distinguishable from each other, and more. We got into whether architecture research is even worth doing anymore, what "good data" actually means, why vision is the natural bridge across modalities but language drove the adoption wave, whether we need per-lab RL environments or better continual learning, whether LLMs have world models (and for which tasks you'd need one), why LLM outputs carry fingerprints that survive paraphrasing, and where coding agents like Claude Code fit into research workflows today and where they still fall short.

    Timeline
    00:13 - Intro
    01:15 - ConvNeXt and whether architecture still matters
    06:35 - What actually drove the jump from GPT-1 to GPT-3
    08:24 - Setting the bar for architecture papers today
    11:14 - Dataset bias: why "diverse" datasets still aren't
    22:52 - What good data actually looks like
    26:49 - ImageBind and vision as the bridge across modalities
    29:09 - Why language drove the adoption wave, not vision
    32:24 - Eyes Wide Shut: CLIP's blind spots
    34:57 - RL environments, continual learning, and memory as the real bottleneck
    43:06 - Are inductive biases just historical accidents?
    44:30 - Do LLMs have world models?
    48:15 - Which tasks actually need a vision world model
    50:14 - Idiosyncrasy in LLMs: pre-training vs post-training fingerprints
    53:39 - The future of pre-training, mid-training, and post-training
    57:57 - Claude Code, Codex, and coding agents in research
    59:11 - Do we still need students in the age of autonomous research?
    1:04:19 - Transformers Without Normalization and the four pillars that survived
    1:06:53 - MetaMorph: Does generation help understanding, or the other way around?
    1:09:17 - Wrap

    1h 10m
  8. The Future of Coding Agents with Sasha Rush (Cursor/Cornell)

    APR 15

    We talked with Sasha Rush, researcher at Cursor and professor at Cornell, about what it actually feels like to be in the heart of the AI revolution and build coding agents right now. Sasha shared how these systems are changing day-to-day work and what it is like to develop them.

    A big part of the conversation was about why coding has become such a powerful setting for these tools. We discussed what makes code different from other domains, why agents seem to work especially well there, and how much of today’s progress comes not just from better models, but from better ways of using them. Sasha also gave an inside look at how Cursor thinks about training coding models, long-running agents, context limits, bug finding, and the balance between autonomy and human oversight.

    We also talked about the broader shift happening in software engineering. Are developers moving to a higher level of abstraction? Is this just a phase where we “babysit” models, or the beginning of a deeper change in how software gets built? Sasha had a very thoughtful perspective here, including what he’s seeing from students, researchers, and engineers who are growing up native to these tools.

    More broadly, this episode is about what it means to do serious technical work in a moment when the tools are changing incredibly fast. Sasha brought both optimism and skepticism to the discussion, and that made this a really grounded conversation about where coding agents are today, what they are already surprisingly good at, and where all of this might be going next.

    Timeline
    00:00 Intro and Sasha joins us
    01:11 What “coding agents” actually mean
    02:34 Why coding became the breakout use case
    08:56 Long-running agents and autonomous workflows
    15:08 How these tools are changing the work of engineers
    17:15 Are people just babysitting models right now?
    22:11 How Cursor builds its coding models
    26:29 Rewards, training, and what makes agents work
    34:53 Memory, continual learning, and agent communication
    38:00 How context compaction works in practice
    41:29 Why coding agents recently got much better
    50:31 Refactoring, maintenance, and self-improving codebases
    52:16 Bug finding, oversight, and verification
    54:43 Will this pace of progress continue?
    56:42 Can this spread beyond coding?
    58:27 The future of Cursor and coding agents
    1:03:08 Model architectures beyond standard transformers
    1:05:37 World models, diffusion, and what may come next

    1h 25m
