Seb Krier on AGI, Scaffolding, and Coasean Bargaining at Scale

In this episode of Justified Posteriors, we welcome Seb Krier, policy lead for AGI at Google DeepMind and excellent Twitter poster. Speaking in his personal capacity, Seb walks us through his understanding of AGI, why AI alignment has gone better than expected, the potential and limitations of a world where agents constantly barter on our behalf, and, of course, electronic music. We also cover AI in London vs. New York, how Seb went from reading Marginal Revolution for 15 years to becoming a recurring character on it, and Seb’s side-splitting humor on mediocre AI conferences.

Related Links
* Seb Krier on X: @sebkrier
* Seb’s Substack, Technologik
* “Coasean Bargaining at Scale”: Seb’s essay at the Cosmos Institute (also republished here)
* “Musings on Recursive Self-Improvement”: Seb’s essay separating model-side RSI from societal-side
* “The Cyborg Era: What AI Means for Jobs”: Seb’s guest essay on Alex Imas’s Substack, defending the scaffolding view
* Anthropic’s Project Deal: the agent-bargaining experiment among Anthropic employees
* Fradkin & Krishnan, “MarketBench”: Andrey and Rohit’s experiment with LLMs bidding in procurement auctions, an investigation of the future of AI marketplaces, plus the companion writeup: Rohit Krishnan, “Agent, Know Thyself! (and bid accordingly)”
* Edge Esmeralda: Devon Zuegel’s pop-up village in Healdsburg, CA
* MATS: for junior economists looking to skill up on AI safety/governance
* Cosmos Institute and FIRE
* bianjie.systems: the art platform Seb is co-organizing a dinner with in NY (Seb’s announcement)
* Drexciya: James Stinson, Gerald Donald, and the Detroit electro-afrofuturism canon

Timestamps
(00:00) Intro
(01:16) What is AGI?
(07:30) In defense of scaffolding: Hayek, division of labor, and why one giant model won’t do it
(13:00) Markets for cognition: will agents bid in procurement auctions?
(18:40) Recursive self-improvement: separating the model side from the societal side
(24:44) Alignment has gone better than 2017-Seb expected; prefer “intent following”
(31:14) What economists should actually work on to inform AI labs
(33:32) What does a DeepMind policy lead’s day look like?
(38:20) AI Conferences
(41:52) Coasean bargaining at scale: the positive vision
(55:00) Inequality, property rights, and who gets the initial allocation
(01:03:00) The Helldivers 2 “Managed Democracy” dystopia as Coasean bargaining gone wrong
(01:09:00) Sponsor: Revelio Labs
(01:09:30) Lightning round

You’re also invited to our Discord community at: https://discord.gg/b8VpPbBUt

Transcript
00:00:00,100 --> 00:00:20,480 [Seth] [upbeat music] Welcome to the Justified Posteriors podcast, the podcast that updates beliefs about the economics of AI and technology. I’m Seth Benzell, the number two biggest fan, after Tyler Cowen, in the Seb Krier fan club.
00:00:20,480 --> 00:00:20,740 [Andrey] [laughs]
00:00:20,740 --> 00:00:24,660 [Seth] Coming to you from Chapman University in sunny southern California.
00:00:24,660 --> 00:00:34,120 [Andrey] And I’m Andrey Fradkin, coming to you from San Francisco, California. And Justified Posteriors is sponsored by the fine folks at Revelio Labs.
00:00:35,560 --> 00:00:45,600 [Andrey] We’re very excited to have Seb Krier here with us today. He is the policy lead for AGI at Google DeepMind, and is,
00:00:46,840 --> 00:00:52,400 [Andrey] dare I say, a thought leader in this space. Welcome to the show, Seb.
00:00:52,400 --> 00:00:54,200 [Seb Krier] Thank you very much. It’s great to be here.
00:00:55,380 --> 00:00:58,160 [Seb Krier] Yeah, I’m Seb, calling in from New York.
00:00:58,160 --> 00:01:00,320 [Andrey] And we should remind our listeners that
00:01:01,340 --> 00:01:08,410 [Andrey] Seb is, during this podcast, expressing his personal opinions, and is not speaking on behalf of DeepMind. All right.
00:01:08,410 --> 00:01:09,740 [Seb Krier] Indeed. [laughs]
00:01:09,740 --> 00:01:11,060 [Andrey] [laughs]
00:01:12,780 --> 00:01:13,900 [Andrey] The usual caveat.
00:01:15,260 --> 00:01:16,760 [Andrey] Seb, what is AGI?
00:01:18,080 --> 00:01:19,450 [Seb Krier] What is AGI? [laughs]
00:01:19,450 --> 00:01:19,570 [Andrey] [laughs]
00:01:19,570 --> 00:01:19,580 [Seth] [laughs]
00:01:19,580 --> 00:01:19,780 [Seb Krier] Great question.
00:01:19,780 --> 00:01:21,900 [Andrey] We’re going to start with the big questions.
00:01:21,900 --> 00:01:22,880 [Seb Krier] Yeah, might as well.
00:01:24,259 --> 00:01:54,840 [Seb Krier] [sighs] I think there’s so many definitions out there of what AGI is, and I think most of them are kind of unsatisfactory in one way or another. I’ve seen stuff like many definitions are indexed on the societal transformations or economic impacts of the technology, which I don’t really like very much because it makes it very dependent on external factors whether or not we have AGI. If it’s banned, we don’t have AGI, and if it’s not banned, we have AGI. Is it?
00:01:54,840 --> 00:01:55,480 [Andrey] [laughs]
00:01:55,480 --> 00:02:04,670 [Seb Krier] And there are other tests, like if an AI makes $1 million or something, which I find is very weird because most humans do not make $1 million in the first place.
00:02:04,670 --> 00:02:05,080 [Andrey] [laughs]
00:02:05,080 --> 00:02:11,359 [Seb Krier] So the one I kind of like is actually Shane Legg’s definition-
00:02:11,360 --> 00:02:11,620 [Andrey] Mm
00:02:11,620 --> 00:02:12,420 [Seb Krier] ...
who’s at DeepMind, which is
00:02:13,640 --> 00:02:16,980 [Seb Krier] more of a capability-based definition, which is something along the lines of
00:02:18,420 --> 00:02:20,960 [Seb Krier] an AI or a system that does most
00:02:22,380 --> 00:02:30,360 [Seb Krier] standard cognitive tasks that people typically do. [lips smack] So it’s kind of the bar isn’t too low, and it’s also not too high either.
00:02:32,220 --> 00:02:35,480 [Seb Krier] And so I think he’s got this definition of a minimal AGI,
00:02:36,580 --> 00:02:43,020 [Seb Krier] and I think that we’re not exactly there yet. I would disagree with people saying that we have AGI today because I think
00:02:44,220 --> 00:02:48,900 [Seb Krier] a lot of the systems we have, there’s many things that a human can do that they don’t really do very well.
00:02:48,900 --> 00:02:50,360 [Seth] What’s the biggest gap that we’re missing?
00:02:52,020 --> 00:03:47,740 [Seb Krier] I’d say there’s a few. One of them might be continual learning, or at least the ability to adapt and learn over time, and in different contexts and situations, just kind of update your own world model or whatever. If I think of a new joiner in a company, they’re not super useful the first day, but their value goes up over time because they learn all sorts of things. And so [lips smack] that might be one of them. A lot of the systems we have today, I think, are not very good at software, at using graphical user interfaces and software and whatnot. If I ask an agent right now to go and use a music production software and make a track, I think they’d generally struggle. That doesn’t mean it’s impossible to solve or anything like that, but I think, in many respects, they’re not as general as you’d want them to be. And then the other bit also is, [lips smack] and of course they still make some silly mistakes here and there, but I think that’s getting fixed.
But the creativity point is one that I’m really interested in as well, in that I think they’re really good at kind of
00:03:48,780 --> 00:04:02,700 [Seb Krier] exploiting maybe an existing paradigm or an existing knowledge and so on, and recombining knowledge and whatnot. But I think really coming up with new concepts and abstractions entirely is something I think humans can do, but I don’t see our current systems really doing either.
00:04:02,700 --> 00:04:10,060 [Andrey] How do you measure whether humans can do creative tasks? One of the things that
00:04:11,200 --> 00:04:15,940 [Andrey] strikes me as a bit of an unfair test in that,
00:04:17,060 --> 00:04:23,290 [Andrey] let’s say you ask an LLM to write a poem or to write a story. It’s very-
00:04:23,290 --> 00:04:23,290 [Seth] [laughs]
00:04:23,290 --> 00:04:32,050 [Andrey] ... times more entertaining than what a random human would write. So, do you have a benchmark for creativity?
00:04:32,050 --> 00:04:35,390 [Seth] This is the meme where the robot asks Will Smith if he can compose an opera.
00:04:35,390 --> 00:05:14,700 [Seb Krier] [laughs] Can you? Yeah, exactly. It depends, and you’re right. Obviously, most people aren’t creating new abstraction and concepts on a day-to-day level. But I imagine there’s still something qualitative about that kind of creativity that I think does get applied in everyone’s day-to-day life in various kind of ways. Maybe they’re not as big or significant as creating a symphony. But I don’t really have a strong test. There’s actually an interesting podcast that had Ben Goertzel and Yoshua, I think a few years ago, where they were saying something like, if you had a model that was trained knowing only classical music and West African drumming, could it come up with jazz in the first place, or recreate jazz?
00:05:16,460 --> 00:05:27,880 [Seb Krier] And I quite like that test. And in principle, I can imagine it being possible.
You could kind of decompose all sorts of different kind of elements and variables here and just get something jazz-like. But it still feels a bit...
00:05:29,580 --> 00:05:40,580 [Seb Krier] It’s not the same as just coming up with the idea of jazz in the first place and saying, oh, I’m goi