Episodes

  • Satya Nadella – Microsoft’s AGI Plan & Quantum Breakthrough
    Feb 19 2025

    Satya Nadella on:

    - Why he doesn’t believe in AGI but does believe in 10% economic growth,

    - Microsoft’s new topological qubit breakthrough and gaming world models,

    - Whether Office commoditizes LLMs or the other way around.

    Listen on Apple Podcasts or Spotify; watch on YouTube.

    Sponsors

    Scale partners with major AI labs like Meta, Google DeepMind, and OpenAI. Through Scale’s Data Foundry, labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you’re an AI researcher or engineer, learn about how Scale’s Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh.

    Linear's project management tools have become the default choice for product teams at companies like Ramp, CashApp, OpenAI, and Scale. These teams use Linear so they can stay close to their products and move fast. If you’re curious why so many companies are making the switch, visit linear.app/dwarkesh to try Linear for free.

    To sponsor a future episode, visit dwarkeshpatel.com/p/advertise.

    Timestamps

    (0:00:00) - Intro

    (0:05:04) - AI won't be winner-take-all

    (0:15:18) - World economy growing by 10%

    (0:21:39) - Decreasing price of intelligence

    (0:30:19) - Quantum breakthrough

    (0:42:51) - How Muse will change gaming

    (0:49:51) - Legal barriers to AI

    (0:55:46) - Getting AGI safety right

    (1:04:59) - 34 years at Microsoft

    (1:10:46) - Does Satya Nadella believe in AGI?



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
    1 hr 16 min
  • Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI
    Feb 12 2025

    This week I welcome on the show two of the most important technologists ever, in any field.

    Jeff Dean is Google's Chief Scientist and, through 25 years at the company, has worked on basically the most transformative systems in modern computing: from MapReduce, Bigtable, TensorFlow, and AlphaChip to Gemini.

    Noam Shazeer invented or co-invented all the main architectures and techniques that are used for modern LLMs: from the Transformer itself, to Mixture of Experts, to Mesh TensorFlow, to Gemini and many other things.

    We talk about their 25 years at Google, going from PageRank to MapReduce to the Transformer to MoEs to AlphaChip – and maybe soon to ASI.

    My favorite part was Jeff's vision for Pathways, Google’s grand plan for a mutually reinforcing loop of hardware and algorithmic design and for going past autoregression. That culminates in us imagining *all* of Google-the-company going through one huge MoE model.

    And Noam just bites every bullet: 100x world GDP soon; let’s get a million automated researchers running in the Google datacenter; living to see the year 3000.

    Sponsors

    Scale partners with major AI labs like Meta, Google DeepMind, and OpenAI. Through Scale’s Data Foundry, labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you’re an AI researcher or engineer, learn about how Scale’s Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh.

    Curious how Jane Street teaches their new traders? They use Figgie, a rapid-fire card game that simulates the most exciting parts of markets and trading. It’s become so popular that Jane Street hosts an inter-office Figgie championship every year. Download from the app store or play on your desktop at figgie.com

    Meter wants to radically improve the digital world we take for granted. They’re developing a foundation model that automates network management end-to-end. To do this, they just announced a long-term partnership with Microsoft for tens of thousands of GPUs, and they’re recruiting a world class AI research team. To learn more, go to meter.com/dwarkesh

    To sponsor a future episode, visit dwarkeshpatel.com/p/advertise

    Timestamps

    00:00:00 - Intro

    00:02:44 - Joining Google in 1999

    00:05:36 - Future of Moore's Law

    00:10:21 - Future TPUs

    00:13:13 - Jeff’s undergrad thesis: parallel backprop

    00:15:10 - LLMs in 2007

    00:23:07 - “Holy s**t” moments

    00:29:46 - AI fulfills Google’s original mission

    00:34:19 - Doing Search in-context

    00:38:32 - The internal coding model

    00:39:49 - What will 2027 models do?

    00:46:00 - A new architecture every day?

    00:49:21 - Automated chip design and intelligence explosion

    00:57:31 - Future of inference scaling

    01:03:56 - Already doing multi-datacenter runs

    01:22:33 - Debugging at scale

    01:26:05 - Fast takeoff and superalignment

    01:34:40 - A million evil Jeff Deans

    01:38:16 - Fun times at Google

    01:41:50 - World compute demand in 2030

    01:48:21 - Getting back to modularity

    01:59:13 - Keeping a giga-MoE in-memory

    02:04:09 - All of Google in one model

    02:12:43 - What’s missing from distillation

    02:18:03 - Open research, pros and cons

    02:24:54 - Going the distance



    2 hr 15 min
  • Sarah Paine Episode 3: How Mao Conquered China
    Jan 30 2025

    Third and final episode in the Paine trilogy!

    Chinese history is full of warlords constantly challenging the capital. How could Mao not only stay in power for decades, but not even face any insurgency?

    And how did Mao go from military genius to peacetime disaster - the patriotic hero who inflicted history’s worst human catastrophe on China? How can someone shrewd enough to win a civil war outnumbered 5 to 1 decide "let's have peasants make iron in their backyards" and "let's kill all the birds"?

    In her lecture and our Q&A, we cover the first nationwide famine in Chinese history; Mao's lasting influence on other insurgents; broken promises to minorities and peasantry; and what Taiwan means.

    Thanks so much to @Substack for running this in-person event!

    Note that Sarah is doing an AMA over the next couple of days on YouTube; see the pinned comment.

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.

    Sponsor

    Today’s episode is brought to you by Scale AI. Scale partners with the U.S. government to fuel America’s AI advantage through their data foundry. Scale recently introduced Defense Llama, Scale's latest solution available for military personnel. With Defense Llama, military personnel can harness the power of AI to plan military or intelligence operations and understand adversary vulnerabilities.

    If you’re interested in learning more on how Scale powers frontier AI capabilities, go to https://scale.com/dwarkesh.



    1 hr 48 min
  • Sarah Paine Episode 2: Why Japan Lost (Lecture & Interview)
    Jan 23 2025

    This is the second episode in the trilogy of lectures by Professor Sarah Paine of the Naval War College.

    In this second episode, Prof Paine dissects the ideas and economics behind Japanese imperialism before and during WWII. We get into the oil shortage which caused the war; the unique culture of honor and death; the surprisingly chaotic chain of command. This is followed by a Q&A with me.

    Huge thanks to Substack for hosting this event!

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.

    Sponsor

    Today’s episode is brought to you by Scale AI. Scale partners with the U.S. government to fuel America’s AI advantage through their data foundry. Scale recently introduced Defense Llama, Scale's latest solution available for military personnel. With Defense Llama, military personnel can harness the power of AI to plan military or intelligence operations and understand adversary vulnerabilities.

    If you’re interested in learning more on how Scale powers frontier AI capabilities, go to scale.com/dwarkesh.

    Buy Sarah's Books!

    I highly, highly recommend both "The Wars for Asia, 1911–1949" and "The Japanese Empire: Grand Strategy from the Meiji Restoration to the Pacific War".

    Timestamps

    (0:00:00) - Lecture begins

    (0:06:58) - The code of the samurai

    (0:10:45) - Buddhism, Shinto, Confucianism

    (0:16:52) - Bushido as bad strategy

    (0:23:34) - Military theorists

    (0:33:42) - Strategic sins of omission

    (0:38:10) - Crippled logistics

    (0:40:58) - The Kwantung Army

    (0:43:31) - Inter-service communication

    (0:51:15) - Shattering Japanese morale

    (0:57:35) - Q&A begins

    (1:05:02) - Unusual brutality of WWII

    (1:11:30) - Embargo caused the war

    (1:16:48) - The liberation of China

    (1:22:02) - Could US have prevented war?

    (1:25:30) - Counterfactuals in history

    (1:27:46) - Japanese optimism

    (1:30:46) - Tech change and social change

    (1:38:22) - Hamming questions

    (1:44:31) - Do sanctions work?

    (1:50:07) - Backloaded mass death

    (1:54:09) - Demilitarizing Japan

    (1:57:30) - Post-war alliances

    (2:03:46) - Inter-service rivalry



    2 hr 8 min
  • Sarah Paine Episode 1: The War For India (Lecture & Interview)
    Jan 16 2025

    I’m thrilled to launch a new trilogy of double episodes: a lecture series by Professor Sarah Paine of the Naval War College, each followed by a deep Q&A.

    In this first episode, Prof Paine talks about key decisions by Khrushchev, Mao, Nehru, Bhutto, & Lyndon Johnson that shaped the whole dynamic of South Asia today. This is followed by a Q&A.

    Come for the spy bases, shoestring nukes, and insight about how great power politics impacts every region.

    Huge thanks to Substack for hosting this!

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.

    Sponsors

    Today’s episode is brought to you by Scale AI. Scale partners with the U.S. government to fuel America’s AI advantage through their data foundry. The Air Force, Army, Defense Innovation Unit, and Chief Digital and Artificial Intelligence Office all trust Scale to equip their teams with AI-ready data and the technology to build powerful applications.

    Scale recently introduced Defense Llama, Scale's latest solution available for military personnel. With Defense Llama, military personnel can harness the power of AI to plan military or intelligence operations and understand adversary vulnerabilities.

    If you’re interested in learning more on how Scale powers frontier AI capabilities, go to scale.com/dwarkesh.

    Timestamps

    (00:00) - Intro

    (02:11) - Mao at war, 1949-51

    (05:40) - Pactomania and Sino-Soviet conflicts

    (14:42) - The Sino-Indian War

    (20:00) - Soviet peace in India-Pakistan

    (22:00) - US Aid and Alliances

    (26:14) - The difference with WWII

    (30:09) - The geopolitical map in 1904

    (35:10) - The US alienates Indira Gandhi

    (42:58) - Instruments of US power

    (53:41) - Carrier battle groups

    (1:02:41) - Q&A begins

    (1:04:31) - The appeal of the USSR

    (1:09:36) - The last communist premier

    (1:15:42) - India and China's lost opportunity

    (1:58:04) - Bismarck's cunning

    (2:03:05) - Training US officers

    (2:07:03) - Cruelty in Russian history



    2 hr 13 min
  • Tyler Cowen - the #1 bottleneck to AI progress is humans
    Jan 9 2025

    I interviewed Tyler Cowen at the Progress Conference 2024. As always, I had a blast. This is my fourth interview with him – and yet I’m always hearing new stuff.

    We talked about why he thinks AI won't drive explosive economic growth, the real bottlenecks on world progress, him now writing for AIs instead of humans, and the difficult relationship between being cultured and fostering growth – among many other things in the full episode.

    Thanks to the Roots of Progress Institute (with special thanks to Jason Crawford and Heike Larson) for such a wonderful conference, and to FreeThink for the videography.

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here.

    Sponsors

    I’m grateful to Tyler for volunteering to say a few words about Jane Street. It's the first time that a guest has participated in the sponsorship. I hope you can see why Tyler and I think so highly of Jane Street. To learn more about their open roles, go to janestreet.com/dwarkesh.

    Timestamps

    (00:00:00) Economic Growth and AI

    (00:14:57) Founder Mode and increasing variance

    (00:29:31) Effective Altruism and Progress Studies

    (00:33:05) What AI changes for Tyler

    (00:44:57) The slow diffusion of innovation

    (00:49:53) Stalin's library

    (00:52:19) DC vs SF vs EU



    1 hr
  • Adam Brown – How Future Civilizations Could Change The Laws of Physics
    Dec 26 2024

    Adam Brown is a founder and lead of BlueShift, which is cracking maths and reasoning at Google DeepMind, and a theoretical physicist at Stanford.

    We discuss destroying the light cone with vacuum decay, the holographic principle, mining black holes, & what it would take to train LLMs that can make Einstein-level conceptual breakthroughs.

    Stupefying, entertaining, & terrifying.

    Enjoy!

    Watch on YouTube, read the transcript, listen on Apple Podcasts, Spotify, or your favorite platform.

    Sponsors

    - DeepMind, Meta, Anthropic, and OpenAI partner with Scale for high-quality data to fuel post-training. Publicly available data is running out; to keep developing smarter and smarter models, labs will need to rely on Scale’s data foundry, which combines subject matter experts with AI models to generate fresh data and break through the data wall. Learn more at scale.ai/dwarkesh.

    - Jane Street is looking to hire their next generation of leaders. Their deep learning team is looking for ML researchers, FPGA programmers, and CUDA programmers. Summer internships are open for just a few more weeks. If you want to stand out, take a crack at their new Kaggle competition. To learn more, go to janestreet.com/dwarkesh.

    - This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

    Timestamps

    (00:00:00) - Changing the laws of physics

    (00:26:05) - Why is our universe the way it is

    (00:37:30) - Making Einstein-level AGI

    (01:00:31) - Physics stagnation and particle colliders

    (01:11:10) - Hitchhiking

    (01:29:00) - Nagasaki

    (01:36:19) - Adam’s career

    (01:43:25) - Mining black holes

    (01:59:42) - The holographic principle

    (02:23:25) - Philosophy of infinities

    (02:31:42) - Engineering constraints for future civilizations



    2 hr 44 min
  • Gwern Branwen - How an Anonymous Researcher Predicted AI's Trajectory
    Nov 13 2024

    Gwern is a pseudonymous researcher and writer. He was one of the first people to see LLM scaling coming. If you've read his blog, you know he's one of the most interesting polymathic thinkers alive.

    In order to protect Gwern's anonymity, I proposed interviewing him in person, and having my friend Chris Painter voice over his words after. This amused him enough that he agreed.

    After the episode, I convinced Gwern to create a donation page where people can help sustain what he's up to. Please go here to contribute.

    Read the full transcript here.

    Sponsors:

    * Jane Street is looking to hire their next generation of leaders. Their deep learning team is looking for ML researchers, FPGA programmers, and CUDA programmers. Summer internships are open - if you want to stand out, take a crack at their new Kaggle competition. To learn more, go to janestreet.com/dwarkesh.

    * Turing provides complete post-training services for leading AI labs like OpenAI, Anthropic, Meta, and Gemini. They specialize in model evaluation, SFT, RLHF, and DPO to enhance models’ reasoning, coding, and multimodal capabilities. Learn more at turing.com/dwarkesh.

    * This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

    If you’re interested in advertising on the podcast, check out this page.

    Timestamps

    00:00:00 - Anonymity

    00:01:09 - Automating Steve Jobs

    00:04:38 - Isaac Newton's theory of progress

    00:06:36 - Grand theory of intelligence

    00:10:39 - Seeing scaling early

    00:21:04 - AGI Timelines

    00:22:54 - What to do in remaining 3 years until AGI

    00:26:29 - Influencing the shoggoth with writing

    00:30:50 - Human vs artificial intelligence

    00:33:52 - Rabbit holes

    00:38:48 - Hearing impairment

    00:43:00 - Wikipedia editing

    00:47:43 - Gwern.net

    00:50:20 - Counterfactual careers

    00:54:30 - Borges & literature

    01:01:32 - Gwern's intelligence and process

    01:11:03 - A day in the life of Gwern

    01:19:16 - Gwern's finances

    01:25:05 - The diversity of AI minds

    01:27:24 - GLP drugs and obesity

    01:31:08 - Drug experimentation

    01:33:40 - Parasocial relationships

    01:35:23 - Open rabbit holes



    1 hr 37 min