Dwarkesh Podcast

Dwarkesh Patel

Deeply researched interviews www.dwarkeshpatel.com

  1. 3 DAYS AGO

    Satya Nadella – Microsoft’s AGI Plan & Quantum Breakthrough

    Satya Nadella on:
    - Why he doesn’t believe in AGI but does believe in 10% economic growth
    - Microsoft’s new topological qubit breakthrough and gaming world models
    - Whether Office commoditizes LLMs or the other way around

    Watch on YouTube; listen on Apple Podcasts or Spotify.

    Sponsors

    Scale partners with major AI labs like Meta, Google DeepMind, and OpenAI. Through Scale’s Data Foundry, labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you’re an AI researcher or engineer, learn about how Scale’s Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh

    Linear's project management tools have become the default choice for product teams at companies like Ramp, CashApp, OpenAI, and Scale. These teams use Linear so they can stay close to their products and move fast. If you’re curious why so many companies are making the switch, visit linear.app/dwarkesh

    To sponsor a future episode, visit dwarkeshpatel.com/p/advertise.

    Timestamps

    (0:00:00) - Intro
    (0:05:04) - AI won't be winner-take-all
    (0:15:18) - World economy growing by 10%
    (0:21:39) - Decreasing price of intelligence
    (0:30:19) - Quantum breakthrough
    (0:42:51) - How Muse will change gaming
    (0:49:51) - Legal barriers to AI
    (0:55:46) - Getting AGI safety right
    (1:04:59) - 34 years at Microsoft
    (1:10:46) - Does Satya Nadella believe in AGI?

    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    1h 16m
  2. 12 FEB

    Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI

    This week I welcome on the show two of the most important technologists ever, in any field. Jeff Dean is Google's Chief Scientist, and through 25 years at the company, has worked on basically the most transformative systems in modern computing: from MapReduce, BigTable, TensorFlow, AlphaChip, to Gemini. Noam Shazeer invented or co-invented all the main architectures and techniques that are used for modern LLMs: from the Transformer itself, to Mixture of Experts, to Mesh TensorFlow, to Gemini and many other things.

    We talk about their 25 years at Google, going from PageRank to MapReduce to the Transformer to MoEs to AlphaChip – and maybe soon to ASI. My favorite part was Jeff's vision for Pathways, Google’s grand plan for a mutually-reinforcing loop of hardware and algorithmic design and for going past autoregression. That culminates in us imagining *all* of Google-the-company, going through one huge MoE model. And Noam just bites every bullet: 100x world GDP soon; let’s get a million automated researchers running in the Google datacenter; living to see the year 3000.

    Watch on YouTube; listen on Apple Podcasts or Spotify.

    Sponsors

    Scale partners with major AI labs like Meta, Google DeepMind, and OpenAI. Through Scale’s Data Foundry, labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you’re an AI researcher or engineer, learn about how Scale’s Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh

    Curious how Jane Street teaches their new traders? They use Figgie, a rapid-fire card game that simulates the most exciting parts of markets and trading. It’s become so popular that Jane Street hosts an inter-office Figgie championship every year. Download from the app store or play on your desktop at figgie.com

    Meter wants to radically improve the digital world we take for granted. They’re developing a foundation model that automates network management end-to-end. To do this, they just announced a long-term partnership with Microsoft for tens of thousands of GPUs, and they’re recruiting a world class AI research team. To learn more, go to meter.com/dwarkesh

    To sponsor a future episode, visit dwarkeshpatel.com/p/advertise

    Timestamps

    00:00:00 - Intro
    00:02:44 - Joining Google in 1999
    00:05:36 - Future of Moore's Law
    00:10:21 - Future TPUs
    00:13:13 - Jeff’s undergrad thesis: parallel backprop
    00:15:10 - LLMs in 2007
    00:23:07 - “Holy s**t” moments
    00:29:46 - AI fulfills Google’s original mission
    00:34:19 - Doing Search in-context
    00:38:32 - The internal coding model
    00:39:49 - What will 2027 models do?
    00:46:00 - A new architecture every day?
    00:49:21 - Automated chip design and intelligence explosion
    00:57:31 - Future of inference scaling
    01:03:56 - Already doing multi-datacenter runs
    01:22:33 - Debugging at scale
    01:26:05 - Fast takeoff and superalignment
    01:34:40 - A million evil Jeff Deans
    01:38:16 - Fun times at Google
    01:41:50 - World compute demand in 2030
    01:48:21 - Getting back to modularity
    01:59:13 - Keeping a giga-MoE in-memory
    02:04:09 - All of Google in one model
    02:12:43 - What’s missing from distillation
    02:18:03 - Open research, pros and cons
    02:24:54 - Going the distance

    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    2h 15m
  3. 30 JAN

    Sarah Paine Episode 3: How Mao Conquered China

    Third and final episode in the Paine trilogy! Chinese history is full of warlords constantly challenging the capital. How could Mao not only stay in power for decades, but not even face any insurgency? And how did Mao go from military genius to peacetime disaster - the patriotic hero who inflicted history’s worst human catastrophe on China? How can someone shrewd enough to win a civil war outnumbered 5 to 1 decide "let's have peasants make iron in their backyards" and "let's kill all the birds"?

    In her lecture and our Q&A, we cover the first nationwide famine in Chinese history; Mao's lasting influence on other insurgents; broken promises to minorities and peasantry; and what Taiwan means.

    Thanks so much to @Substack for running this in-person event! Note that Sarah is doing an AMA over the next couple days on YouTube; see the pinned comment.

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.

    Sponsor

    Today’s episode is brought to you by Scale AI. Scale partners with the U.S. government to fuel America’s AI advantage through their data foundry. Scale recently introduced Defense Llama, Scale's latest solution available for military personnel. With Defense Llama, military personnel can harness the power of AI to plan military or intelligence operations and understand adversary vulnerabilities. If you’re interested in learning more on how Scale powers frontier AI capabilities, go to https://scale.com/dwarkesh.

    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    1h 48m
  4. 23 JAN

    Sarah Paine Episode 2: Why Japan Lost (Lecture & Interview)

    This is the second episode in the trilogy of lectures by Professor Sarah Paine of the Naval War College. In this second episode, Prof Paine dissects the ideas and economics behind Japanese imperialism before and during WWII. We get into the oil shortage which caused the war; the unique culture of honor and death; the surprisingly chaotic chain of command. This is followed by a Q&A with me.

    Huge thanks to Substack for hosting this event!

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.

    Sponsor

    Today’s episode is brought to you by Scale AI. Scale partners with the U.S. government to fuel America’s AI advantage through their data foundry. Scale recently introduced Defense Llama, Scale's latest solution available for military personnel. With Defense Llama, military personnel can harness the power of AI to plan military or intelligence operations and understand adversary vulnerabilities. If you’re interested in learning more on how Scale powers frontier AI capabilities, go to scale.com/dwarkesh.

    Buy Sarah's Books!

    I highly, highly recommend both "The Wars for Asia, 1911–1949" and "The Japanese Empire: Grand Strategy from the Meiji Restoration to the Pacific War".

    Timestamps

    (0:00:00) - Lecture begins
    (0:06:58) - The code of the samurai
    (0:10:45) - Buddhism, Shinto, Confucianism
    (0:16:52) - Bushido as bad strategy
    (0:23:34) - Military theorists
    (0:33:42) - Strategic sins of omission
    (0:38:10) - Crippled logistics
    (0:40:58) - the Kwantung Army
    (0:43:31) - Inter-service communication
    (0:51:15) - Shattering Japanese morale
    (0:57:35) - Q&A begins
    (01:05:02) - Unusual brutality of WWII
    (01:11:30) - Embargo caused the war
    (01:16:48) - The liberation of China
    (01:22:02) - Could US have prevented war?
    (01:25:30) - Counterfactuals in history
    (01:27:46) - Japanese optimism
    (01:30:46) - Tech change and social change
    (01:38:22) - Hamming questions
    (01:44:31) - Do sanctions work?
    (01:50:07) - Backloaded mass death
    (01:54:09) - demilitarizing Japan
    (01:57:30) - Post-war alliances
    (02:03:46) - Inter-service rivalry

    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    2h 8m
  5. 16 JAN

    Sarah Paine Episode 1: The War For India (Lecture & Interview)

    I’m thrilled to launch a new trilogy of double episodes: a lecture series by Professor Sarah Paine of the Naval War College, each followed by a deep Q&A. In this first episode, Prof Paine talks about key decisions by Khrushchev, Mao, Nehru, Bhutto, & Lyndon Johnson that shaped the whole dynamic of South Asia today. This is followed by a Q&A.

    Come for the spy bases, shoestring nukes, and insight about how great power politics impacts every region.

    Huge thanks to Substack for hosting this!

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.

    Sponsors

    Today’s episode is brought to you by Scale AI. Scale partners with the U.S. government to fuel America’s AI advantage through their data foundry. The Air Force, Army, Defense Innovation Unit, and Chief Digital and Artificial Intelligence Office all trust Scale to equip their teams with AI-ready data and the technology to build powerful applications. Scale recently introduced Defense Llama, Scale's latest solution available for military personnel. With Defense Llama, military personnel can harness the power of AI to plan military or intelligence operations and understand adversary vulnerabilities. If you’re interested in learning more on how Scale powers frontier AI capabilities, go to scale.com/dwarkesh.

    Timestamps

    (00:00) - Intro
    (02:11) - Mao at war, 1949-51
    (05:40) - Pactomania and Sino-Soviet conflicts
    (14:42) - The Sino-Indian War
    (20:00) - Soviet peace in India-Pakistan
    (22:00) - US Aid and Alliances
    (26:14) - The difference with WWII
    (30:09) - The geopolitical map in 1904
    (35:10) - The US alienates Indira Gandhi
    (42:58) - Instruments of US power
    (53:41) - Carrier battle groups
    (1:02:41) - Q&A begins
    (1:04:31) - The appeal of the USSR
    (1:09:36) - The last communist premier
    (1:15:42) - India and China's lost opportunity
    (1:58:04) - Bismarck's cunning
    (2:03:05) - Training US officers
    (2:07:03) - Cruelty in Russian history

    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    2h 13m
  6. 26/12/2024

    Adam Brown – How Future Civilizations Could Change The Laws of Physics

    Adam Brown is a founder and lead of BlueShift, which is cracking maths and reasoning at Google DeepMind, and a theoretical physicist at Stanford.

    We discuss: destroying the light cone with vacuum decay, holographic principle, mining black holes, & what it would take to train LLMs that can make Einstein level conceptual breakthroughs.

    Stupefying, entertaining, & terrifying. Enjoy!

    Watch on YouTube, read the transcript, listen on Apple Podcasts, Spotify, or your favorite platform.

    Sponsors

    - DeepMind, Meta, Anthropic, and OpenAI partner with Scale for high quality data to fuel post-training. Publicly available data is running out - to keep developing smarter and smarter models, labs will need to rely on Scale’s data foundry, which combines subject matter experts with AI models to generate fresh data and break through the data wall. Learn more at scale.ai/dwarkesh.
    - Jane Street is looking to hire their next generation of leaders. Their deep learning team is looking for ML researchers, FPGA programmers, and CUDA programmers. Summer internships are open for just a few more weeks. If you want to stand out, take a crack at their new Kaggle competition. To learn more, go to janestreet.com/dwarkesh.
    - This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

    Timestamps

    (00:00:00) - Changing the laws of physics
    (00:26:05) - Why is our universe the way it is
    (00:37:30) - Making Einstein level AGI
    (01:00:31) - Physics stagnation and particle colliders
    (01:11:10) - Hitchhiking
    (01:29:00) - Nagasaki
    (01:36:19) - Adam’s career
    (01:43:25) - Mining black holes
    (01:59:42) - The holographic principle
    (02:23:25) - Philosophy of infinities
    (02:31:42) - Engineering constraints for future civilizations

    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    2h 44m
  7. 13/11/2024

    Gwern Branwen - How an Anonymous Researcher Predicted AI's Trajectory

    Gwern is a pseudonymous researcher and writer. He was one of the first people to see LLM scaling coming. If you've read his blog, you know he's one of the most interesting polymathic thinkers alive.

    In order to protect Gwern's anonymity, I proposed interviewing him in person, and having my friend Chris Painter voice over his words after. This amused him enough that he agreed.

    After the episode, I convinced Gwern to create a donation page where people can help sustain what he's up to. Please go here to contribute.

    Read the full transcript here.

    Sponsors:

    * Jane Street is looking to hire their next generation of leaders. Their deep learning team is looking for ML researchers, FPGA programmers, and CUDA programmers. Summer internships are open - if you want to stand out, take a crack at their new Kaggle competition. To learn more, go to janestreet.com/dwarkesh.
    * Turing provides complete post-training services for leading AI labs like OpenAI, Anthropic, Meta, and Gemini. They specialize in model evaluation, SFT, RLHF, and DPO to enhance models’ reasoning, coding, and multimodal capabilities. Learn more at turing.com/dwarkesh.
    * This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

    If you’re interested in advertising on the podcast, check out this page.

    Timestamps

    00:00:00 - Anonymity
    00:01:09 - Automating Steve Jobs
    00:04:38 - Isaac Newton's theory of progress
    00:06:36 - Grand theory of intelligence
    00:10:39 - Seeing scaling early
    00:21:04 - AGI Timelines
    00:22:54 - What to do in remaining 3 years until AGI
    00:26:29 - Influencing the shoggoth with writing
    00:30:50 - Human vs artificial intelligence
    00:33:52 - Rabbit holes
    00:38:48 - Hearing impairment
    00:43:00 - Wikipedia editing
    00:47:43 - Gwern.net
    00:50:20 - Counterfactual careers
    00:54:30 - Borges & literature
    01:01:32 - Gwern's intelligence and process
    01:11:03 - A day in the life of Gwern
    01:19:16 - Gwern's finances
    01:25:05 - The diversity of AI minds
    01:27:24 - GLP drugs and obesity
    01:31:08 - Drug experimentation
    01:33:40 - Parasocial relationships
    01:35:23 - Open rabbit holes

    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    1h 37m
4.6 out of 5 (9 Ratings)
