The Information Bottleneck

Ravid Shwartz-Ziv & Allen Roush

Two AI researchers, Ravid Shwartz-Ziv and Allen Roush, discuss the latest trends, news, and research in generative AI, LLMs, GPUs, and cloud systems.

  1. AI for Science with Qichao Hu (Molecular Universe / SES AI)

    1 day ago

    Most AI-for-science companies are selling shovels. Qichao Hu wants the gold. In this episode, we talk with Qichao, the founder and CEO of Molecular Universe, the AI-for-science platform that grew out of SES AI, a high-energy-density battery developer he's run for fourteen years. His core distinction is that companies from the AI world build tools, such as foundation models that predict properties, while companies from the science world care about the final product, such as the new battery or material that actually ships. Molecular Universe sits firmly on the science side, and the difference shows up everywhere from what they publish to what they refuse to. We get into the actual workflow of materials discovery and where AI compresses it. A single trial in a traditional lab can take a year with maybe a 40% success rate; the goal is to run a thousand candidates in parallel and turn that year into a week. Qichao walks through improving low-temperature fast-charging for EV batteries: from hypothesis generation through molecule-, material-, and device-level property prediction, down to autonomous labs that synthesize and test the top candidates without a human touching a pipette. The hardest problem, it turns out, isn't predicting molecular properties or measuring device performance; it's the black box connecting the two. In batteries, that's the solid-electrolyte interface, which the field has been hand-waving about since the seventies. And the thing standing in the way of cracking it isn't a clever training trick but data: companies sitting on twenty years of records are finding it too messy, incomplete, and poorly labeled to train on, and are having to start collecting from scratch with new protocols and robots.

    Timeline
    00:13 - Intro and welcome
    01:19 - Shovel vs. gold
    05:18 - Why the world's smartest scientist doesn't automatically give you a better battery
    07:25 - The discovery workflow
    09:37 - Exploration vs. exploitation
    11:54 - Safety and filtering: screening novel molecules against banned and toxic-substance lists
    17:55 - How hypotheses get generated, and where frontier LLMs help
    20:29 - From hypothesis to ~400 formulations: property prediction, ranking, and handing off to autonomous labs
    26:37 - "A foundation model for everything," and the black box between molecular properties and device performance
    30:01 - World models and physics
    33:09 - The great unknown in batteries
    37:08 - Simulation vs. reality: calibrating massive simulated datasets with a sliver of experimental data
    41:47 - Lab robotics: how fast the hardware has caught up, and what a floor of autonomous labs looks like
    43:50 - The real bottlenecks
    50:21 - Pre-training from scratch vs. post-training LLMs, and why training tricks haven't reduced the need for good data
    52:42 - Evaluation
    55:42 - Publish the B+ model, keep the A model
    58:05 - Five years out
    1:00:37 - Closing thoughts and wrap

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
    About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.

    1 hr 1 min
  2. Infrastructure for AI at Scale - With Benny Chen (Fireworks AI)

    6 days ago

    We talk a lot on this show about RL, agents, and the move between pre-training and post-training, but not enough about the layer everything actually runs on. Benny Chen, co-founder of Fireworks AI, one of the largest inference platforms around, walks us through what it takes to serve models at scale: sourcing GPUs, writing the kernels, the runtime, and the routing layer that lets a customer hit one endpoint and forget the rest. We talk about why the real bottleneck is power, not chips, and why that favors Nvidia and Google; why MoE keeps winning even when dense models look better on paper; and why he'd rather run fungible capacity at 95% than specialized chips at 60%. We also talk about quantization limits, where RL efficiency has to go next, and his case that AI is still under-hyped. We also get into cross-region training, sparse autoencoders and why interpretability hasn't taken off in open source, whether open models can close the gap, and a frank read on Anthropic's go-to-market.

    Timeline
    00:00 - Intro: the part of AI nobody talks about
    01:20 - What "infrastructure for AI" actually means: the layers, from GPUs up to routing
    02:59 - Why not just buy your own GPUs and do it yourself?
    05:17 - The scale Fireworks runs at
    06:35 - Hardware inflation, GPU costs, and the real risk hiding in commit duration
    10:14 - Nvidia vs AMD vs TPUs, and why power is the bottleneck
    11:57 - Mixing GPU types and generations; fungibility vs. specialization
    14:22 - Once you have the GPUs, what's the next layer to build?
    17:04 - Dense vs. MoE, and why the hardware picks the winner
    21:07 - Quantization: is FP4 the floor? TurboQuant and INT vs. FP
    24:28 - How tied are the algorithms to the hardware?
    25:12 - DeepSeek, DeepGEMM, and next-token prediction as reconstruction loss
    28:50 - Why RL is still wildly inefficient compared to pre-training
    30:08 - Speculative decoding, AI-generated kernels, and auto-research
    34:00 - The AGI question: why text gets automated but vision may stay expensive
    37:07 - Hype check: why Benny thinks AI is still under-hyped
    41:28 - Training vs. inference at the infrastructure level
    44:12 - Scaling across data centers: cross-region training with Cursor
    45:40 - Sparse autoencoders, interpretability, and why open source is human-constrained
    49:04 - Will open models catch up, on quality and on compute?
    51:41 - Are we plateauing? Opus 4.7 vs. 4.6 and the coming data wars
    54:41 - Physical limits, HBM, and whether chips keep getting faster
    58:17 - The belief about inference everyone gets wrong
    59:31 - Anthropic, mythos, and a frank take on go-to-market
    1:04:41 - Wrap-up

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.

    1 hr 6 min
  3. Broken Peer Review, AI, and Worms — with Oded Rechavi

    June 21

    Oded Rechavi is a biologist at Tel Aviv University and the co-founder of QED, a company building AI to review scientific work. He's also spent years studying worms. We start with what's wrong with peer review and grant funding: why it takes years to publish, why reviewers are often your own competitors, and why the whole thing is locked to an economic model that rewards publishing more papers, not better ones. Oded explains why he doesn't call QED "peer review" at all, and what it would take to actually validate science instead of just stamping it. Then we get into the biology. C. elegans has exactly 959 cells, every one of them named, and a fully mapped brain. Oded's lab studies how a worm's experiences get passed to its offspring through RNA rather than DNA, meaning what happens to a worm in its lifetime can change its descendants. We also talk about using ancient DNA to reassemble the Dead Sea Scrolls, what AI can and can't do for biology, and why he wants to build an "Ironman suit" for researchers rather than replace them.

    Timeline
    00:00 Intro
    01:35 Why scientific publishing is broken
    04:02 Years to publish, and what it costs science
    07:20 Bad reviewers, conflicts of interest, and the money
    10:47 Why preprints don't fix it
    15:37 How AI conferences handle review
    22:07 Conferences vs. journals: does slow review help?
    25:22 Building QED: review, not peer review
    30:02 Tracking a paper from idea to submission
    33:11 What writing a grant actually involves
    35:00 The ERC reviewer crisis
    37:06 Tailoring feedback to your field
    41:48 Switching to biology
    44:30 Every cell has a name: inside C. elegans
    46:28 Inheritance without DNA
    48:16 What the worm "thinks" changes its offspring
    51:58 Reassembling the Dead Sea Scrolls with ancient DNA
    56:07 Psychedelics and worms
    58:36 Can AI run the research itself?
    1:04:49 Automation vs. validation
    1:07:12 The origin of life
    1:08:49 Why people reject AI-written work
    1:16:18 Will humans still have a role?
    1:17:39 Wrap-up

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.

    1 hr 18 min
  4. Will AI Take Our Jobs? With Alex Imas (Google/University of Chicago)

    June 16

    Will AI take our jobs? We put the question to Alex Imas, the new Director of AGI Economics at Google DeepMind and a professor at Chicago Booth, whose entire job now is studying how frontier AI reshapes the economy. His short answer: probably some of them, but the popular story is mostly wrong about which jobs and how fast. Alex makes the case that a job is a bundle of tasks, not a single thing AI either does or doesn't do, and that the number people should actually care about is how much consumer demand responds to falling prices. Get that wrong and you predict mass layoffs. Get it right and you sometimes predict more hiring. We get into why the automation panic is two centuries old, why he thinks blue-collar work is in more danger than white-collar, and why the people already winning are the ones adopting AI fastest. We also cover the AGI versus ASI distinction and why it changes everything for the economy, what happens when there's no moat and open models stay six to eight months behind, the three-tier pricing future he sees coming after the 2026 compute crunch, and what any of this means if you're deciding whether to send your kids to college. The episode was recorded before Alex joined Google.

    Timestamps
    00:00 Meeting Alex Imas
    00:44 Will AI take our jobs?
    03:35 Is this an AI question or an economics question?
    06:18 The economy is already behind the AI we have
    07:43 Why AI adoption is K-shaped
    12:51 Was Andrew Yang right?
    13:45 The automation panic is 200 years old
    16:46 Dario's six-month claim, and why we don't see it yet
    17:22 A job is not a task
    22:38 The three numbers that actually predict the labor market
    22:42 The chess engine analogy and the centaur phase
    25:45 Recursive self-improvement and the hamburger problem
    30:06 Should AI labs be the ones answering alignment questions?
    31:17 The "invisible hand wave" and why nobody wants fully autonomous AI
    33:27 AGI vs ASI, and why the difference is everything
    35:28 Commodities vs relational goods
    41:14 Star Trek, replicators, and predicting with sci-fi
    45:20 Inequality and the Upper West Side VCs
    46:21 Your money manager was automated in the 1960s
    50:47 Are OpenAI and Anthropic overvalued? The moat problem
    54:29 What has to be true for the losses to make sense
    55:43 Cognitive atrophy and monopoly fears
    57:00 The 2026 compute crunch and the three-tier pricing future
    1:01:52 The Apple vs Android analogy
    1:03:54 A rich-country perspective
    1:04:16 Protecting the skills that actually matter
    1:07:02 Will not using AI become a status symbol?
    1:08:53 Does capitalism even survive?
    1:13:44 Redistribution becomes the political battleground
    1:18:16 Blue collar vs white collar: who's really at risk
    1:21:18 Advice for parents in an AI world
    1:22:43 Saving for retirement when the Valley says don't
    1:25:06 Will non-elite colleges survive?

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.

    1 hr 29 min
  5. Why AI Benchmarks Are Lying to You - with Wenhu Chen (Meta/University of Waterloo)

    June 13

    In this episode, we sit down with Wenhu Chen, research scientist at Meta MSL, assistant professor at the University of Waterloo, and the person behind MMLU-Pro and MMMU. If you've read a frontier model release in the last two years, you've seen his benchmarks. That makes him one of the best people to answer the question everyone dances around: when a model jumps from 40% to 90% on your benchmark, how much of that is real? We dig into why benchmarks have become the loss function of the entire field: design a bad one, and thousands of brilliant researchers will spend months hill-climbing in the wrong direction. Wenhu is surprisingly candid about the limits of his own creations: contamination is everywhere, saturation turns frontier benchmarks into unit tests, and popular alternatives, such as LM Arena, mostly measure tone and length rather than capability. His answer is to evaluate models where they've never been: private codebases, hospital data, and the messy, live internet. We also talk about ClawBench, his new benchmark that deploys agents to over 140 real production websites to do things people actually want done, such as ordering food, booking tickets, and applying for jobs. The best model in the world completes about a third of these tasks. We unpack why: bot detection, models that refuse to click "pay," agents that give up the moment an environment doesn't match their training, and harnesses that can swing results by 20% without changing the model at all. Along the way, we cover the overlooked science of evaluating pre-training, data flywheels, and synthetic environments for agent training, and whether RL teaches models to reason or just surfaces what's already there. We close with Wenhu's predictions: exploration and adaptability will improve rapidly, but security will become the field's hardest problem as agents gain real permissions in the real world.

    Timestamps
    00:00 - Intro
    00:55 - What good evaluation means, and how it's changed since the early GPT days
    03:35 - Benchmarks as the field's loss function
    05:50 - Contamination: the problem nobody fully solves
    08:08 - MMLU-Pro scores: real progress or training on the test set?
    11:05 - Can you measure creativity?
    12:34 - Why human judges and arenas are unreliable, and what to use instead
    19:22 - What a good benchmark actually looks like
    22:34 - Chain of thought: signal or scratchpad?
    26:01 - Auto-research and hill-climbing agents
    28:52 - Harnesses: 20% swings without touching the model
    32:28 - Safety, model release, and an "FDA for models"
    36:53 - The overlooked science of pre-training evaluation
    43:49 - Designing pre-training benchmarks when one run costs a billion dollars
    49:45 - ClawBench: agents on 140+ live websites, and why the best model gets 33%
    54:42 - How MMLU-Pro and MMMU-Pro were born from public complaints
    59:16 - Pixel agents vs. APIs: will MCP kill computer use?
    1:02:11 - Training agents: data flywheels and synthetic environments
    1:05:43 - SFT vs. RL, and does RL teach reasoning or reveal it?
    1:09:21 - What gets solved next year, and what doesn't
    1:14:32 - Undervalued ideas, and what's next for ClawBench

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.

    1 hr 19 min
  6. Jürgen Schmidhuber - Part 2: JEPA, the Road to AGI, and Who Really Invented Modern AI

    June 7

    In the second half of our conversation with Jürgen Schmidhuber, we focus on the key ideas he's pursued since the early 1990s and discuss why he believes these concepts are only now being rediscovered. We start with JEPA. Jürgen argues that the method LeCun named in 2022 is the same family he published in 1992 as Predictability Maximization. From there he traces the adversarial lineage back further still, to his 1990 world-model paper and 1991 Predictability Minimization, the curiosity-driven minimax games he sees as the real origins of GANs. We also talk about why these ideas took thirty years to land, why today's trillion-dollar data-center buildout is driven by AGI fear, and why he thinks Apple may come out ahead. The back half turns to what he sees as the real frontier: physical AI. Today's systems are superhuman behind the screen but helpless at a leaky pipe, and until a robot can use human tools, there's no AGI. He discusses self-replicating, self-improving machines as "a new kind of life," reframes continual learning and test-time training as ideas from his 1991 fast-weight work, and detours through Solomonoff's universal prior, Hutter's AIXI, and the Gödel machine. We close on the subject Jürgen is famous for: scientific credit. He makes his case for rigorous attribution, casts himself as a "speaker for the dead" championing forgotten pioneers like Ivakhnenko, and reflects candidly on whether the fights are personal.

    Timeline
    00:30 - What JEPA is, and the 1992 Predictability Maximization story
    04:54 - Implementing PMAX: autoencoders, Siamese networks, Infomax
    09:10 - Predictability Minimization, factorial codes, and the roots of GANs
    16:00 - Why it took 30 years: the economics of compute
    20:52 - Data, the web, and 1990 as the origin point
    23:09 - Hardware inflation, the trillion-dollar buildout, and the coming crash
    34:05 - Physical AI: the plumber problem and self-replicating machines
    41:14 - Which 90s ideas are being scaled right now
    45:26 - Continual learning and test-time training as "old hats"
    55:19 - Measuring intelligence: Solomonoff, AIXI, and the Gödel machine
    1:05:26 - Self-replication and von Neumann
    1:09:51 - Will he see AGI in his lifetime?
    1:10:42 - Credit, integrity, and being a "speaker for the dead"

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. Changes: trimmed

    1 hr 29 min
  7. Jürgen Schmidhuber - World Models, RL, and the Year that changed AI (Part 1)

    June 4

    In this episode, we host Jürgen Schmidhuber - the man, the legend, one of the godfathers of modern AI. His lab worked out many ideas behind today’s systems (LSTM, world models, artificial curiosity, Transformer variants, and even GAN-style setups) decades before they became fashionable, and he’s just as well known for making sure people remember who did what first. This is the first of two conversations with him. We go back to his lab in the early 90s and ask how one small group came up with so many of the ideas that are now being scaled to a thousand billion dollars, back when compute was ten million times more expensive. A lot of the episode comes down to one distinction he keeps making: prediction vs. decision-making. His take is that LLMs are very good prediction machines that imitate the web, but that’s only half the problem. To actually act in the world, you need a controller that uses a world model to plan. He talks about his 1990 work on world models and artificial curiosity, where the controller gets rewarded for running experiments that improve its own model (an adversarial setup years before GANs), why planning millisecond by millisecond doesn’t scale, and why you need sub-goals instead. We also talk about compression as the core of understanding, from falling apples to Kepler to Einstein, and why we still don’t have a robot that can do what a plumber does, even though the AI behind the screen keeps getting better. Then the conversation moves to credit assignment: how “to Schmidhuber” became a verb, what he thinks is broken about the award system, and a long exchange on PMAX vs. JEPA. He ends on the real origins of deep learning and a prediction about self-replicating machines in space.

    Timeline
    00:00 Intro
    00:55 1991 in Munich, and why that lab mattered
    02:38 "I'm not very smart," and why compute getting 10× cheaper every 5 years changed everything
    04:25 Chess as an AI proxy
    08:27 Artificial curiosity in the 90s vs. today's RL exploration
    09:10 Why RL is harder than supervised learning
    20:48 Coding agents vs. robots, and how a baby learns its own hands
    26:20 Compression as understanding
    33:40 What's actually missing on the road to AGI
    37:30 Why millisecond-by-millisecond planning is stupid
    47:44 Convergence to LLMs, GPUs, and how far we still are from the Bremermann limit
    51:49 Unsupervised learning, factorial codes, and predictability minimization
    58:12 Credit assignment: the fights with LeCun and the Nobel critique
    1:02:13 On his last name becoming a verb
    1:05:17 The award system's missing peer review
    1:07:03 Closed labs and the decline of open research
    1:13:23 Audience questions
    1:34:02 Closing: who really invented deep learning?

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. Changes: trimmed

    1 hr 38 min
  8. AI for Science and the Thermodynamics of Generative AI - with Max Welling (UvA, CuspAI)

    May 29

    In this episode, we sit down with Max Welling, Professor of Machine Learning at the University of Amsterdam, co-founder and CTO of CuspAI, and a foundational figure behind variational autoencoders (VAEs), equivariant networks, and Bayesian deep learning. We talk about AI for science, the physics underneath generative models, and what's still missing on the road to real intelligence. Max starts with what impresses him and what worries him about the LLM era, then makes the case that the next leaps will come from physical AI and from science itself. We dig into how machine learning actually works in the lab, world models and whether priors like geometry and symmetry should be built in or simply learned, and whether transformers will still rule a decade from now. At the end, we talk about CuspAI's climate mission, AI risk and regulation, Max’s new book, and where neuroscience might inspire the next wave of ML.

    Timeline
    00:00 - Intro
    00:47 - Are we happy with the LLM era?
    03:14 - Embodiment and physical AI
    08:05 - Does "AGI" even matter as a term?
    11:34 - Verifiers, RL, and why math/coding are tractable
    13:17 - What actually shifted to make materials discovery work
    14:42 - From molecules to biology and wet labs
    16:26 - Working with real labs: timescales, friction, and the "Mira" agent
    20:29 - Balancing simulators vs. experiments: the exploration–exploitation trade-off
    23:44 - Active learning for experimental design
    24:23 - Why active learning hasn't been central to LLMs
    25:24 - A general loop for ML-for-science across domains
    27:10 - Foundation models for chemistry: a "mother ship" plus a zoo of fine-tuned models
    30:04 - Quantum mechanics, interpretation, and AI as a creative theorist
    31:54 - World models and Yann LeCun's view; priors vs. learning
    34:57 - Should world knowledge be explicit? (responding to Stefano Ermon)
    36:41 - Vision: equivariance vs. transformers, and the role of optimization
    40:32 - Best model for molecular properties in 10 years? Will transformers survive?
    43:16 - CuspAI's climate focus and what motivated it
    47:10 - One platform for every material class: what transfers and what doesn't
    48:42 - Where does the risk of human extinction really come from?
    51:06 - The "pause AI" debate and the arms-race reality
    52:40 - Regulating powerful models: government vs. self-regulation
    55:16 - Who should design AI regulation?
    56:29 - The new book
    1:00:31 - Compression, the information bottleneck, and renormalization
    1:03:30 - The role of foundational principles in modern AI
    1:04:06 - Waves in computing, the brain, and the next wave of innovation
    1:07:11 - Neuroscience and ML: are we in a better position now?
    1:09:17 - Conferences, the ICLR keynote, and finding the right people

    Music: "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. Changes: trimmed

    1 hr 14 min

