AI Explained Official Podcast

Philip - Host of AI Explained YT

Tech News
Weekly

Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.

14 Jun

Claude Fable Blocked - 11 Quiet Details on What’s Next

Claude Fable 5 banned, but what’s the bigger story? We go through 11 under-reported details, so you have the context to see what’s coming next for your use of AI. We cover whether the ban will last, what the possible motives are, what the model can actually do, and some wild over-extrapolations going on. Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00:51 - Came from an Anthropic Investor ‘and other tech leaders’ 01:47 - Govt pressured by CEOs like Jamie Dimon 03:01 - ‘Already decided’ 04:02 - Prompt Injection Robustness Comparison 05:15 - Wellness? 06:36 - “Overreach” 08:17 - Anthropic Did Admit it would cause Difficulty 09:32 - 90 Minutes 10:02 - Equity Absence 10:31 - Lobbying and OpenAI ‘Already Decided’: https://www.theinformation.com/articles/amazons-jassy-raised-concerns-anthropic-model-trump-crackdown?rc=sy0ihq Not for Other Models: https://www.theinformation.com/briefings/u-s-government-unlikely-extend-anthropic-export-control-ai-companies?rc=sy0ihq 90 Minutes: https://archive.fo/20260614001605/https://www.politico.com/news/2026/06/13/inside-the-whirlwind-24-hours-that-led-the-white-house-to-slap-export-controls-on-anthropic-00961519#selection-807.1-807.219 Anthropic Statement: https://www.anthropic.com/news/fable-mythos-access Life Comes at you Fast: https://x.com/etbrooking/status/2065638276388495742 Anthropic Deputy CISO: https://x.com/TheTranscript_/status/2065883670053847324 Hegseth Gloat: https://x.com/PeteHegseth/status/2065897156226015690 Roon Speculation: https://x.com/tszzl/status/2065939227167392147 Mythos System Card: https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf Sacks Statement: https://x.com/DavidSacks/status/2065853007619588171 OpenAI Lobbying: https://thehill.com/policy/technology/5912720-altman-openai-get-bogged-down-in-political-spending-fight/ Absent from Equity Talks:
https://finance.yahoo.com/sectors/technology/articles/trump-ai-ownership-plan-could-131053732.html Pliny Jailbreak: https://x.com/elder_plinius/status/2064776322979676227 Fusion: https://x.com/OpenRouter/status/2065856871215329545 https://lmcouncil.ai Non-hype Newsletter: https://signaltonoise.beehiiv.com/ Podcast: https://aiexplainedopodcast.buzzsprout.com/

13 min
10 Jun

Claude Fable 5 - Full 319 page Breakdown

Fable 5 is out - and it’s good, very good. But beyond the splashy demos, I want to bring you the 20+ nuggets from the 319 page system card, which I read in full, all day, plus benchmarks you may not have noticed. https://assemblyai.com/aiexplained Plus two worrying trends inside the ‘mind’ of Claude, how OpenAI counter, and the transformer inventor’s warning. Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 01:06 - Blocks + Better Models 02:42 - Fable 5 Upgrade over Mythos Preview 04:49 - ML Acceleration Bombshell 07:11 - No RSI yet 07:41 - Bio-capable 14:51 - Creative Writing … no 17:23 - Does need bug-checks 18:57 - OpenAI Response 19:23 - Benchmark Bonanza 28:06 - Chain of Thought worrying trend Fable 5 Release: https://www.anthropic.com/news/claude-fable-5-mythos-5 System Card: https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf Intelligence Explosion: https://www.patreon.com/posts/anthropic-charts-160231656 Annotated: https://x.com/Miles_Brundage/status/2064500190523113816/photo/1 OpenAI Counter: https://x.com/thsottiaux/status/2064572118264913923 https://x.com/thsottiaux/status/2043177597434306699 Double Lifespan: https://darioamodei.com/essay/machines-of-loving-grace AutomationBench: https://zapier.com/benchmarks Vending Bench: https://x.com/andonlabs/status/2064429817530085804 CritPt: https://critpt.com/ Riemann Bench: https://surgehq.ai/leaderboards/riemann-bench GDPVal: https://artificialanalysis.ai/evaluations/gdpval-aa BluePrint Bench 2: https://andonlabs.com/evals/blueprint-bench-2 MCP Atlas: https://labs.scale.com/leaderboard/mcp_atlas FutureSim: https://x.com/nikhilchandak29/status/2064676801440358774 Roon Stun Lock: https://x.com/tszzl/status/2064454617568874669 Noam Brown Inference Ceiling: https://x.com/polynoamial/status/2064210146558136827 Isochronic Chart: 
https://isochronic-passage-chart.netlify.app/#nyc Rose Tavern: https://claude.ai/public/artifacts/2295bebe-77e6-43e2-ae94-0fe49e9a776b Redwall Game: https://redwall-mossflower.surge.sh/ Risk Report: https://www-cdn.anthropic.com/097c63b5fe7dd8b14866e1f15bb1910ec713658a.pdf Transformer Inventor Warning: https://x.com/tszzl/status/2064563986914554125 Non-hype Newsletter: https://signaltonoise.beehiiv.com/ Podcast: https://aiexplainedopodcast.buzzsprout.com/

34 min
29 May

New Claude - 244 page breakdown

The ‘best’ generally available AI model just dropped, but there is plenty I bet you missed about what it is, how it performs, and what the release tells us. 15 highlights from the 244 page system card, plus private testing, leader interview and more. AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00:49 - Mythos in Weeks 01:49 - Adaptive not necessary 02:26 - Honesty? 04:37 - Flagging Uncertainty 04:57 - Benchmarks 08:54 - Mythos will be even better 10:30 - Business skillz 11:15 - Model Welfare 12:16 - Cyber Comparable 13:10 - Misalignment Concerns 16:22 - Meta Inabilities 17:58 - Code flagging 18:34 - Go to sleep 18:50 - Fast Mode 20:21 - Dynamic Workflows Opus 4.8 Paper: https://cdn.sanity.io/files/4zrzovbb/website/c886650a2e96fc0925c805a1a7ca77314ccbf4a6.pdf Release: https://www.anthropic.com/news/claude-opus-4-8 Chips: https://www.theinformation.com/articles/anthropic-talks-use-microsofts-ai-chips?rc=sy0ihq https://www.anthropic.com/news/expanding-our-use-of-google-cloud-tpus-and-services https://www.anthropic.com/news/higher-limits-spacex Patreon Vid: https://www.patreon.com/posts/re-up-anthropics-159289449 GDPVal: https://artificialanalysis.ai/evaluations/omniscience https://arxiv.org/abs/2510.04374 Amodei Technical Debt: https://www.youtube.com/watch?v=7xco5Qd2Oo8 Dynamic Workflows: https://x.com/ClaudeDevs/status/2060044853279617150 https://x.com/_catwu/status/2060054180379689074/photo/1 https://claude.com/blog/introducing-dynamic-workflows-in-claude-code https://simple-bench.com/ Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai Non-hype Newsletter: https://signaltonoise.beehiiv.com/ Podcast: https://aiexplainedopodcast.buzzsprout.com/

22 min
20 May

Two Rival Bets on AGI: Google I/O Highlights

The biggest Google AI push of the year, but what is the bigger story? Why is Google pursuing a different fork in the road than OpenAI or Anthropic? What does Gemini 3.5 Flash mean for the near-term future of AI? https://assemblyai.com/aiexplained Plus the highlights from a provocative new paper on AI, 8 key moments you may have missed, and the signal from 5+ hours of AI lab interviews. Check out my free-to-use app, code INSIDER15 for paid tiers: https://lmcouncil.ai AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00:38 - Vibes and Google Goal 02:18 - Omni, again? 06:57 - Taking the same road 07:44 - Gemini 3 Flash 12:37 - Pitching on Cost? 13:55 - Agentic Task Search 14:30 - 1-shot OS but jagged, negation paper 20:02 - The Karpathy Moonshot Mostafa Deghani Interview: https://www.youtube.com/watch?v=Bo19sXssYXI Negation Neglect Paper: https://arxiv.org/pdf/2605.13829 Gemini 3.5 Flash Headline Scores: https://deepmind.google/models/model-cards/gemini-3-5-flash/ Sora's original AGI Path: https://www.theguardian.com/commentisfree/2024/feb/24/openai-video-generation-tool-sora-babies-ai-artificial-intelligence Hassabis Helped Set Up Anthropic: https://archive.fo/20260519070857/https://www.ft.com/content/8f2a529e-7a1b-4d8e-95be-338d0c4c98f5 Intelligence to Output Speed: https://artificialanalysis.ai/models?intelligence-comparison=intelligence-vs-output-speed#intelligence VibeCodeBench + Finance Agent: https://www.vals.ai/home OpenAI Needs Ads: https://archive.ph/20260409123153/https://www.reuters.com/business/media-telecom/openai-projects-25-billion-ad-revenue-this-year-100-billion-by-2030-axios-2026-04-09/ Anthropic Core Views: https://www.anthropic.com/news/core-views-on-ai-safety Karpathy Move: https://x.com/karpathy/status/2056753169888334312 https://www.axios.com/2026/05/19/anthropic-openai-karpathy-andrej-claude Recursive Self-Improvement: https://www.patreon.com/posts/ineffably-smart-156866417 Non-hype Newsletter:
https://signaltonoise.beehiiv.com/ Podcast: https://aiexplainedopodcast.buzzsprout.com/

22 min
24 Apr

GPT 5.5 Arrives, DeepSeek V4 Drops, and the Compute War Intensifies

GPT 5.5 full analysis, plus DeepSeek V4 paper highlights, comparisons with Mythos, a vibe-coded game w/ GPT Image 2, and 50 data-points you wouldn’t get from just reading the headlines. Chapters: 01:11 - GPT 5.5 Comparison 06:04 - Mythos Marketing 11:50 - Recursive Self-Improvement? 14:11 - Deepseek V4 18:03 - VibeCode Experiment Extravaganza 21:44 - The Scarce Compute Era https://80000hours.org/aiexplained OpenAI Benchmarks: https://openai.com/index/introducing-gpt-5-5/ 5.5 System Card: https://deploymentsafety.openai.com/gpt-5-5/gpt-5-5.pdf Direct Comparison: https://pbs.twimg.com/media/HGnNm5GWEAAJ1Ob?format=jpg&name=4096x4096 DeepSeek Paper: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro SWE Bench Pro - benchmark of choice? https://x.com/ChowdhuryNeil/status/2047416077622395025 AA Omniscience: https://artificialanalysis.ai/evaluations/omniscience Vending Bench: https://x.com/andonlabs/status/2047377260412649967 Opus 4.7 System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf Sam Altman Drunk Phase: https://x.com/sama/with_replies Noam Brown: https://x.com/polynoamial/status/2047387675762802998 DeepSeek Compute Crunch: https://www.bloomberg.com/news/articles/2026-04-24/deepseek-unveils-newest-flagship-a-year-after-ai-breakthrough?srnd=phx-ai Spreadsheet Bench: https://x.com/nicochristie/status/2047476237464211721 Pattern Recognition: https://arcprize.org/leaderboard Leader Interviews: Core Memory: https://www.youtube.com/watch?v=NCKQL0op30E Knowledge Podcast: https://www.youtube.com/watch?v=6JoUcQ1qmAc Big Tech Round 1: https://www.youtube.com/watch?v=J6vYvk7R190&t=1116s Big Tech Round 2: https://www.youtube.com/watch?v=YnoQ8RJbALw&t=8s Claude Code Limitations: https://x.com/TheAmolAvasare/status/2046724659039932830 ChatGPT 5.4 for Clinicians: https://openai.com/index/making-chatgpt-better-for-clinicians/ Image Arena: https://x.com/arena/status/2046670703311884548 VibeCode Bench: 
https://www.vals.ai/benchmarks/vibe-code 5.5-made Game +Seedance 2.0: https://rosemere-quest.pages.dev/

25 min
17 Apr

Claude Opus 4.7 - A New Frontier, in Performance … and Drama

Claude Opus 4.7 just dropped, but behind every headline lies a deeper story. From a bonanza of benchmarks, to seeing the fruits of one of the biggest mega-projects in US history, to sneaky Mythos disclaimers, to Anthropic admitting compute constraints and forcing lower capability of Opus 4.7. Plus where the new model falls behind Gemini but stays ahead of GPT 5.4, and why some users are furious at Anthropic. Ending with a 9-year animus that still affects AI today… https://assemblyai.com/aiexplained Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00:58 - Benchmarks 05:21 - Market Share + Compute Problems 08:12 - Mythos Exclusives 12:56 - User Frustration + Claude Code Updates 14:03 - Brockman Amodei Rivalry 17:40 - OpenAI vs Anthropic Approach to Code Claude 4.7 Opus Release Notes: https://www.anthropic.com/news/claude-opus-4-7 vs Mythos: https://pbs.twimg.com/media/HGCGugrXUAAKcHp?format=jpg&name=medium 232-page System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf ARC-AGI 2: https://x.com/arcprize/status/2044834615417053305/photo/1 ParseBench: https://x.com/jerryjliu0/status/2044902620746363016/photo/1 GDPVal: https://artificialanalysis.ai/evaluations/gdpval-aa Vidoc Security Replication: https://blog.vidocsecurity.com/blog/we-reproduced-anthropics-mythos-findings-with-public-models Boris Cherny Settings: https://x.com/Hesamation/status/2043016923961577516/photo/2 User Frustration: https://x.com/RileyRalmuto/status/2044836116189069660 VibeCode Bench: https://x.com/ValsAI/status/2044791415524471099/photo/1 Verge Memo: https://www.theverge.com/ai-artificial-intelligence/911118/openai-memo-cro-ai-competition-anthropic 5.4 Cyber: https://openai.com/index/scaling-trusted-access-for-cyber-defense/ Data Centers in Absolute $: https://x.com/finmoorhouse/status/2044933442236776794/photo/1 …in %
of GDP: https://pbs.twimg.com/media/HGEN8FGWQAAN7Np?format=jpg&name=4096x4096 WSJ Exclusive: https://www.wsj.com/tech/ai/the-decadelong-feud-shaping-the-future-of-ai-7075acde Brockman Interview: https://www.youtube.com/watch?v=J6vYvk7R190 $1T Valuation: https://x.com/StefanFSchubert/status/2045039686997967082 Emotions: https://www.patreon.com/c/aiexplained/posts https://lmcouncil.ai/benchmarks Non-hype Newsletter: https://signaltonoise.beehiiv.com/

20 min
8 Apr

Claude Mythos: Highlights from 244-page Release

The model, the mythos, the legend. We have a new best AI model, but not all of us. How good is it, and what do its new offensive capabilities mean? Why does its 244 page report card remind me of Her, and why did the creator of Claude Code call it ‘terrifying’? 30+ highlights sourced by reading the paper in full, old-school, no AI summary. https://80000hours.org/aiexplained Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00:56 - Internal Release + Availability 02:37 - General Capabilities 05:12 - Self-improvement? 06:15 - ‘Terrifying’ Landscape 11:07 - Safety Decision 13:22 - Coding 14:49 - Alignment, Awareness 19:52 - GUI for Agents/Claws + Hallucinations 21:34 - …Emotions? 25:29 - Her connection 244-page System Card: https://www-cdn.anthropic.com/8b8380204f74670be75e81c820ca8dda846ab289.pdf Project Glasswing: https://www.anthropic.com/glasswing Zero-Day Details: https://red.anthropic.com/2026/mythos-preview/ Mythos ‘terrifying’: https://x.com/bcherny/status/2041605852382351666 New Yorker Altman/Amodei: https://archive.fo/20260406100412/https://www.newyorker.com/magazine/2026/04/13/sam-altman-may-control-our-future-can-he-be-trusted Alignment Risk Update: https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de43218158e5f25c.pdf In a Park: https://x.com/sleepinyourhat/status/2041584808514744742 “Uhm” - https://x.com/thsottiaux/status/2041749947385815109 Non-hype Newsletter: https://signaltonoise.beehiiv.com/ Podcast: https://aiexplainedopodcast.buzzsprout.com/

28 min
26 Mar

OpenAI Spud, a Claude Model set to ‘stir governments’, Beast Mode ARC-AGI-3

First look at exclusive reports about OpenAI's new Spud model, and the model Anthropic think will stir governments to urgency, all in the context of the newly-launched ARC-AGI-3. What does the extreme difficulty of that benchmark, and its quirky scoring metrics, mean for AI in 2026? https://assemblyai.com/aiexplained Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00:55 - OpenAI Side Quests 01:58 - Claude New Model Coming + Universal Equity? 03:13 - ARC-AGI 3 05:00 - Intentional or Unintentional Gaming? 07:11 - But is it AGI Harbinger? No Harness 09:41 - Not the First 12:32 - Automated Researcher 15:00 - Claw Caveat Spud: https://www.theinformation.com/articles/openai-ceo-shifts-responsibilities-preps-spud-ai-model?utm_campaign=Editorial&utm_content=Article&utm_medium=organic_social&utm_source=bluesky%2Cfacebook%2Clinkedin%2Cthreads%2Ctwitter&rc=sy0ihq FT: OpenAI Special Model: https://www.ft.com/content/de9bf0af-b241-424f-8229-5870b1c0d93d?syn-25a6b1a6=1 Jensen Huang: https://www.forbes.com/sites/antoniopequenoiv/2026/03/23/nvidias-jensen-huang-says-he-thinks-weve-achieved-agi/ Axios Article: https://archive.fo/20260326100140/https://www.axios.com/2026/03/26/anthropic-pentagon-ai-deal#selection-827.0-829.257 https://arcprize.org/arc-agi/3 ARC AGI 3 Paper: https://arcprize.org/media/ARC_AGI_3_Technical_Report.pdf NetHack Leaderboard: https://balrogai.com/ Paper: https://ai.meta.com/research/publications/the-nethack-learning-environment/ https://x.com/_rockt/status/2036864121585438995 Claw Shells: https://x.com/DrJimFan/status/2036494601750716711 OpenAI Automated Researcher: https://www.technologyreview.com/2026/03/20/1134438/openai-is-throwing-everything-into-building-a-fully-automated-researcher/ Patreon Post: https://www.patreon.com/c/aiexplained/posts Eng Jobs: https://x.com/lennysan/status/2036535460726767793 Non-hype
Newsletter: https://signaltonoise.beehiiv.com/ Podcast: https://aiexplainedopodcast.buzzsprout.com/

16 min


Creator: Philip - Host of AI Explained YT
Years active: 2024 - 2026
Episodes: 59
Rating: Suitable for all audiences
Podcast website: AI Explained Official Podcast
