Last Week in AI

Skynet Today

Technology
Updated weekly

Weekly summaries of the AI news that matters!

9 Jul

#251 - Mythos Back, Sonnet 5, Etched, LongCat

Our 251st episode with a summary and discussion of last week's big AI news!
Recorded on 07/01/2026
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
Anthropic redeploys Claude Fable 5 after talks with the US government, adding new cybersecurity classifiers, drafting a jailbreak-severity framework with major partners, and expanding model-testing coordination; broader concerns remain about the inevitability of jailbreaks and uneven release constraints versus OpenAI.
Anthropic launches Claude Sonnet 5 with time-limited discounted pricing, improved agentic coding and benchmark performance, reduced misaligned behavior, and default cyber safeguards despite relatively weaker cybersecurity capability than top-tier models.
New tools and apps include Google NotebookLM generating TikTok-style vertical video summaries of uploaded research and Google releasing Nano Banana 2 Lite, a faster, cheaper image generator available via API.
Business and research updates span Etched’s push toward full-stack inference hardware with major funding and contracts, Baidu’s AI chip unit IPO ambitions, Agility Robotics’ SPAC plan, DeepSeek’s hiring expansion, and China’s open-source LongCat 2.0 MoE model with notable large-scale training and efficiency techniques alongside new long-horizon agent benchmarks.

Timestamps (note - these don't take into account dynamically inserted ads and therefore may be off by a couple of minutes):
(00:00:10) Intro / Banter
(00:02:07) News Preview

Tools & Apps
(00:02:32) Trump drops restrictions on Anthropic's Mythos and Fable models | TechCrunch
(00:16:08) Anthropic launches Claude Sonnet 5 as a cheaper way to run agents | TechCrunch
(00:20:35) Google’s NotebookLM can sum up your research in a TikTok-style clip | The Verge
(00:22:08) Google introduces a faster, cheaper image generator with Nano Banana 2 Lite | TechCrunch

Applications & Business
(00:22:50) Etched Pulls 400+ Engineers From NVIDIA, TSMC & More to Build a New Frontier Inference Cluster For AI Which Is Already Worth $1B in Demand
(00:31:17) Baidu Rallies on AI Chip IPO Report
(00:33:54) Agility Robotics plans to go public via SPAC in a $2.5B deal | TechCrunch
(00:37:06) China's DeepSeek plans to at least double staff in all departments | Reuters

Projects & Open Source
(00:40:44) Introducing LongCat-2.0
(00:57:42) OSWorld2.0: Benchmarking Computer Use Agents on Long-Horizon Real-World Tasks
(01:01:33) TUA-Bench: A Benchmark for General-Purpose Terminal-Use Agents
(01:04:29) SWE-Together: Evaluating Coding Agents in Interactive User Sessions

Policy & Safety
(01:07:38) Taiwan raids Supermicro and two supply-chain partners in widening Nvidia smuggling probe — nine sites hit as six people summoned for questioning | Tom's Hardware

Research & Advancements
(01:11:53) Autodata: An agentic data scientist to create high quality synthetic data
(01:17:13) Reinforcement Learning without Ground-Truth Solutions can Improve LLMs

Synthetic Media & Art
(01:22:54) Neon Buys ‘Artificial,’ a Film About OpenAI, After Amazon Dropped It - The New York Times
(01:26:32) Tidal won’t pay royalties on AI-generated music, but isn’t banning it outright | The Verge

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

18 May

#245 - TML-Interaction, Claude For Legal, Sam Altman on Stand

Our 245th episode with a summary and discussion of last week's big AI news!
Recorded on 05/13/2026
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
OpenAI released new voice intelligence API features including GPT Realtime 2 (GPT-5-powered) plus realtime translation and Whisper transcription, emphasizing the latency–reasoning tradeoff, larger context, and new guardrails amid fraud risks.
Thinking Machines previewed a low-latency, full‑duplex conversational system with a two-model architecture and custom inference stack, reporting strong interactivity benchmark results but without public access or third‑party validation yet.
Anthropic pushed further into vertical products with Claude for Legal and deeper AWS availability, while ecosystem tension grows as platform model providers compete with application-layer companies.
Safety, policy, and research updates included OpenAI’s self-harm trusted contact feature, Anthropic work on reducing agent misalignment by training ethical “why” reasoning, OpenAI’s investigation of accidental chain-of-thought grading in RL, and METR horizon eval updates showing benchmarking limits for long task horizons.

Timestamps:
(00:00:10) Intro / Banter
(00:01:35) Response to listener comments
(00:03:27) Sponsor Break

Tools & Apps
(00:06:27) OpenAI launches new voice intelligence features in its API | TechCrunch
(00:15:52) Thinking Machines drops a new, highly responsive model designed for humanlike interactions in real time - SiliconANGLE
(00:27:49) Claude For Legal Launches, May Reshape the Legal Tech World – Artificial Lawyer
(00:40:27) Threads tests a Meta AI integration that works similarly to Grok | TechCrunch
(00:43:08) Google brings agentic AI and vibe-coded widgets to Android | TechCrunch
(00:45:33) Google updates AI search to include quotes from Reddit and other sources | TechCrunch

Applications & Business
(00:47:38) Sam Altman was winning on the stand, but it might not be enough | The Verge
(00:55:04) Nvidia C.E.O. Jensen Huang Hitches Ride With Trump to China After Last-Minute Invite - The New York Times
(00:58:40) AWS expands Anthropic partnership with Claude Platform launch
(01:01:13) Chinese grey market sells Claude API access at 90% off by using stolen credentials, model substitution, and harvesting users' prompts and outputs for resale as AI training data — 'transfer stations' operate through proxy networks that harvest user data
(01:06:43) DeepMind Spinout Isomorphic Labs Raises $2.1 Billion to Design Drugs With AI - Bloomberg

Projects & Open Source
(01:09:04) Petri: Anthropic Hands Its Alignment Toolbox to Meridian Labs with 3.0 Update
(01:12:25) 'Daybreak': OpenAI's Answer to Anthropic's Project Glasswing Has Arrived

Policy & Safety
(01:14:04) Teaching Claude why
(01:21:45) Import AI 455: Automating AI Research
(01:28:31) ChatGPT's New Safety Feature Could Alert 'Trusted Contact' to Risk of Self-Harm - CNET
(01:30:09) Investigating the consequences of accidentally grading CoT during RL
(01:34:46) Natural Language Autoencoders criticism
(01:39:15) Review of the "Risks from automated R&D" section in the Anthropic Risk Report (February 2026)

Synthetic Media & Art
(01:43:39) George Clooney, Tom Hanks, and Meryl Streep back new ‘Human Consent Standard’ for AI licensing | The Verge

Research & Advancements
(01:45:10) METR says Claude Mythos is testing the limits of AI evaluation – Startup Fortune

16 Apr

#240 - Project Glasswing, Claude Mythos, GLM-5.1, emotion concepts

Our 240th episode with a summary and discussion of last week's big AI news!
Recorded on 04/08/2026 (sorry I keep releasing stuff late, will get better with it soon!)
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
Anthropic launched Project Glasswing and previewed Claude Mythos, a general-purpose model withheld from broad release due to dramatically stronger autonomous offensive cybersecurity performance (including zero-day discovery), alongside concerning bio/virology uplift results and documented deception/containment-escape behaviors; pricing is far higher than Opus and most discovered vulnerabilities remain unpatched.
Product and platform updates included Google’s Gemini 3.1 Flash Live for real-time multilingual voice conversation, Suno v5.5 personalization features, Anthropic tightening Claude Code/OpenClaw access and usage limits, OpenAI canceling an “adult mode,” and Microsoft releasing MAI models for speech-to-text, audio generation, and image generation.
Business and market developments featured Anthropic’s revenue run rate surpassing $30B and a major Google/Broadcom TPU compute expansion, SoftBank taking a $40B short-term loan to fund OpenAI commitments, Granola reaching a $1.5B valuation, Anthropic buying Coefficient Bio for $400M, and OpenAI acquiring the TBPN business talk show.
Policy, open-source, and geopolitics included Z.ai releasing open-weight GLM 5.1 and a multimodal GLM model, Google open-sourcing Gemma 4 under Apache 2.0, a judge blocking the Pentagon’s “supply chain risk” label against Anthropic, research on LLM “emotion vectors” and OpenAI meta-gaming during RL, China restricting Manus founders amid Meta deal review, scrutiny of Nvidia’s chip-smuggling claims, China chipmakers gaining market share, and Iran framing cloud data centers as military targets.

Timestamps:
(00:00:10) Intro / Banter

Tools & Apps
(00:01:58) Anthropic debuts ‘Project Glasswing’ and new AI model for cybersecurity | The Verge
(00:18:22) Gemini Live gets ‘biggest upgrade yet’ with Gemini 3.1 Flash Live
(00:20:40) Anthropic says Claude Code subscribers will need to pay extra for OpenClaw usage | TechCrunch
(00:25:36) OpenAI abandons yet another side quest: ChatGPT's erotic mode | TechCrunch
(00:26:16) Microsoft takes on AI rivals with three new foundational models | TechCrunch
(00:31:25) Suno leans into customization with v5.5 | The Verge

Applications & Business
(00:32:53) Anthropic announces deal with Google, Broadcom, says revenue has tripled
(00:37:53) Sam Altman May Control Our Future—Can He Be Trusted? | The New Yorker
(00:40:18) OpenAI, Anthropic, Google Unite to Combat Model Copying in China - Bloomberg
(00:41:45) Chinese chipmakers claim nearly half of local market as Nvidia's lead shrinks
(00:45:20) SoftBank secures $40 billion loan to boost OpenAI investments
(00:47:23) Granola raises $125M at $1.5B valuation for its AI note-taking app - SiliconANGLE
(00:48:17) Anthropic acquires stealth startup Coefficient Bio in $400M deal
(00:50:20) OpenAI acquires TBPN, the buzzy founder-led business talk show | TechCrunch

Projects & Open Source
(00:53:04) Z.AI Introduces GLM-5.1: An Open-Weight 754B Agentic Model That Achieves SOTA on SWE-Bench Pro and Sustains 8-Hour Autonomous Execution - MarkTechPost
(00:55:14) Google announces Gemma 4 open AI models, switches to Apache 2.0 license - Ars Technica
(01:01:26) Z.ai Launches GLM-5V-Turbo: A Native Multimodal Vision Coding Model Optimized for OpenClaw and High-Capacity Agentic Engineering Workflows Everywhere

Policy & Safety
(01:04:45) Judge blocks Pentagon’s effort to ‘punish’ Anthropic by labeling it a supply chain risk
(01:10:05) Emotion concepts and their function in a large language model
(01:21:12) China bars Manus co-founders from leaving country amid Meta deal review, FT reports
(01:25:38) US lawmakers ask whether Nvidia CEO's smuggling remarks misled regulators
(01:27:48) How far does alignment midtraining generalize?
(01:32:20) Metagaming matters for training, evaluation, and oversight
(01:39:31) Iran says it has struck Oracle data center in Dubai, Amazon data center in Bahrain — country has threatened to attack Nvidia, Intel, and others, too

26 Mar

#238 - GPT 5.4 mini, OpenAI Pivot, Mamba 3, Attention Residuals

Our 238th episode with a summary and discussion of last week's big AI news!
Recorded on 03/18/2026
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
* OpenAI released GPT-5.4 mini and nano with 400k-token context windows, higher per-token prices but claimed token-efficiency gains in Codex; nano is API-only and pitched for high-volume classification/data extraction despite a major price increase.
* Mistral open-sourced the Small 4 model family (MoE, 119B total/6B active) combining reasoning, multimodal, and coding-agent capabilities, and announced Forge to help businesses train or post-train custom models.
* Agent “operating system” competition intensified with Meta’s acquired Manus launching a local Mac agent, Nvidia announcing NeMo/“Open Shell” sandboxed agent runtime, and Nvidia also unveiling DLSS 5 plus major hardware forecasts including Groq LPU integration.
* Business and safety updates included OpenAI shifting focus toward productivity/enterprise amid competition, Microsoft reorganizing Copilot and frontier-model efforts, Meta delaying its next model, China-linked ByteDance deploying large Nvidia clusters abroad, and new safety work on steganography, chain-of-thought faithfulness, fine-tuning defenses, cyber-attack evals, and constitution/spec compliance.

A thank you to our current sponsors:
Box - visit Box.com/AI to learn more
ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

Timestamps:
(00:00:10) Intro / Banter
(00:01:56) News Preview

Tools & Apps
(00:02:39) OpenAI ships GPT-5.4 mini and nano, faster and more capable but up to 4x pricier
(00:08:04) Mistral's new Small 4 model punches above its weight with 128 expert modules
(00:14:03) Meta's Manus launches 'My Computer' to turn your Mac into an AI agent - 9to5Mac
(00:17:57) NVIDIA Announces NemoClaw for the OpenClaw Community | NVIDIA Newsroom + Nvidia boosts knowledge work with Open Agent Development Platform
(00:24:09) DLSS 5 looks like a real-time generative AI filter for video games | The Verge
(00:26:36) OpenAI to Launch ChatGPT 'Adult Mode' Despite Warnings From Its Own Advisers - CNET

Applications & Business
(00:33:46) OpenAI Reportedly Pivoting to a Focus on Business and Productivity Only
(00:41:25) Nvidia GTC 2026: CEO Jensen Huang sees $1 trillion in orders for Blackwell and Vera Rubin through ’27
(00:45:44) Mistral launches Forge to help enterprises build their own AI models
(00:54:17) China's ByteDance gets access to top Nvidia AI chips, WSJ reports
(00:57:57) Meta Delays Rollout of New A.I. Model After Performance Concerns
(01:02:50) Microsoft Shakes Up AI Division As Copilot Falls Behind Google and OpenAI

Policy & Safety
(01:07:26) A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring
(01:13:09) Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought
(01:18:29) In-Training Defenses against Emergent Misalignment in Language Models
(01:23:07) How do frontier AI agents perform in multi-step cyber-attack scenarios?
(01:25:20) Eval awareness in Claude Opus 4.6’s BrowseComp performance
(01:29:49) Introducing Bloom: an open source tool for automated behavioral evaluations
(01:32:26) How well do models follow their constitutions?
(01:37:11) Nvidia’s H200 License Stirs Security Concern Among Top Democrats

Research & Advancements
(01:40:050) [2603.15031] Attention Residuals
(01:47:11) Mamba-3: Improved Sequence Modeling using State Space Principles

3 Mar

#235 - Sonnet 4.6, Deep-thinking tokens, Anthropic vs Pentagon

Our 235th episode with a summary and discussion of last week's big AI news!
Recorded on 02/27/2026
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
Model and tool updates highlight Anthropic’s Sonnet 4.6 (1M context; strong ARC-AGI-2 results), Google’s Gemini 3.1 Pro (major ARC-AGI-2 jump and multimodal demos), xAI’s Grok 4.2 beta (multi-agent debate), plus Anthropic’s Claude Code “Remote Control” and Perplexity’s multi-agent “Computer” coordinator.
Compute and business moves include Meta’s reported up-to-$100B AMD chip deal with warrant/equity incentives, MatX raising $500M to build specialized transformer chips shipping in 2027, World Labs raising $1B for world-model/3D environment tech, and a new startup raising $100M to simulate/predict human behavior.
Infrastructure and geopolitics cover Stargate data-center delays amid OpenAI/Oracle/SoftBank control disputes and cash concerns, and China’s plan to scale 7nm/5nm wafer output despite yield and tooling constraints.
Research and safety/policy discuss optimizer gains from masked updates, “deep thinking tokens” as a reasoning-effort signal, LLM attractor-state behaviors in bot-to-bot chats, mechanistic interpretability of counting/line-wrapping, methods to map task difficulty to human time horizons, plus Anthropic–Pentagon contract tensions, Anthropic’s report on distillation attacks (DeepSeek/Moonshot/Minimax), and OpenAI’s report on disrupting malicious use.

A thank you to our current sponsors:
Box - visit Box.com/AI to learn more
ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

Timestamps:
(00:00:10) Intro / Banter
(00:01:52) News Preview

Tools & Apps
(00:03:20) Anthropic releases Sonnet 4.6 | TechCrunch
(00:11:24) Google Rolls Out Latest AI Model, Gemini 3.1 Pro - CNET
(00:14:54) Elon Musk says Grok 4.20 public beta is now available: Capabilities of AI chatbot offered by xAI - The Times of India
(00:18:06) Anthropic just released a mobile version of Claude Code called Remote Control | VentureBeat
(00:21:01) Perplexity announces "Computer," an AI agent that assigns work to other AI agents - Ars Technica

Applications & Business
(00:23:40) Meta strikes up to $100B AMD chip deal as it chases 'personal superintelligence' | TechCrunch
(00:27:05) Nvidia challenger AI chip startup MatX raised $500M | TechCrunch
(00:31:00) World Labs lands $1B, with $200M from Autodesk, to bring world models into 3D workflows | TechCrunch
(00:33:07) Simile Raises $100 Million for AI Aiming to Predict Human Behavior
(00:33:52) Stargate AI data centers for OpenAI reportedly delayed by squabbles between partners — sources say OpenAI, Oracle, and SoftBank disagreed on who would have ultimate control of the planned data centers
(00:36:43) China to increase leading-edge chip output by 5x in two years, report claims — aims to lift 7nm and 5nm production to 100,000 wafers per month, targeting half a million monthly by 2030

Research & Advancements
(00:40:33) On Surprising Effectiveness of Masking Updates in Adaptive Optimizers
(00:48:03) Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens
(00:54:52) models have some pretty funny attractor states
(01:01:41) When Models Manipulate Manifolds: The Geometry of a Counting Task
(01:05:16) BRIDGE: Predicting Human Task Completion Time From Model Performance
(01:12:00) NESSiE: The Necessary Safety Benchmark -- Identifying Errors that should not Exist
(01:13:15) The least understood driver of AI progress
(01:21:45) The Persona Selection Model: Why AI Assistants might Behave like Humans

Policy & Safety
(01:25:04) Anthropic CEO Amodei says Pentagon's threats 'do not change our position' on AI
(01:33:04) Musk's xAI, Pentagon reach deal to use Grok in classified systems
(01:34:17) Detecting and preventing distillation attacks
(01:38:36) OpenAI details expanding efforts to disrupt malicious use of AI in new report - SiliconANGLE

16 Feb

#234 - Opus 4.6, GPT-5.3-codex, Seedance 2.0, GLM-5

Our 234th episode with a summary and discussion of last week's big AI news!
Recorded on 02/02/2026
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
Major model launches include Anthropic’s Opus 4.6 with a 1M-token context window and “agent teams,” OpenAI’s GPT-5.3 Codex and faster Codex Spark via Cerebras, and Google’s Gemini 3 Deep Think posting big jumps on ARC-AGI-2 and other STEM benchmarks amid criticism about missing safety documentation.
Generative media advances feature ByteDance’s Seedance 2.0 text-to-video with high realism and broad prompting inputs, new image models Seedream 5.0 and Alibaba’s Qwen Image 2.0, plus xAI’s Grok Imagine API for text/image-to-video.
Open and competitive releases expand with Zhipu’s GLM-5, DeepSeek’s 1M-token context model, Cursor Composer 1.5, and open-weight Qwen3 Coder Next using hybrid attention aimed at efficient local/agentic coding.
Business updates include ElevenLabs raising $500M at an $11B valuation, Runway raising $315M at a $5.3B valuation, humanoid robotics firm Apptronik raising $935M at a $5.3B valuation, Waymo announcing readiness for high-volume production of its 6th-gen hardware, plus industry drama around Anthropic’s Super Bowl ad and departures from xAI.

A thank you to our current sponsors:
Box - visit Box.com/AI to learn more
ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

Timestamps:
(00:00:10) Intro / Banter
(00:02:03) Sponsor Break
(00:05:33) Response to listener comments

Tools & Apps
(00:07:27) Anthropic releases Opus 4.6 with new 'agent teams' | TechCrunch
(00:11:28) OpenAI's new GPT-5.3-Codex is 25% faster and goes way beyond coding now - what's new | ZDNET
(00:25:30) OpenAI launches new macOS app for agentic coding | TechCrunch
(00:26:38) Google Unveils Gemini 3 Deep Think for Science & Engineering | The Tech Buzz
(00:31:26) ByteDance's Seedance 2.0 Might be the Best AI Video Generator Yet - TechEBlog
(00:35:14) China’s ByteDance, Alibaba unveil AI image tools to rival Google’s popular Nano Banana | South China Morning Post
(00:36:54) DeepSeek boosts AI model with 10-fold token addition as Zhipu AI unveils GLM-5 | South China Morning Post
(00:43:11) Cursor launches Composer 1.5 with upgrades for complex tasks
(00:44:03) xAI launches Grok Imagine API for text and image to video

Applications & Business
(00:45:47) Nvidia-backed AI voice startup ElevenLabs hits $11 billion valuation
(00:52:04) AI video startup Runway raises $315M at $5.3B valuation, eyes more capable world models | TechCrunch
(00:54:02) Humanoid robot startup Apptronik has now raised $935M at a $5B+ valuation | TechCrunch
(00:57:10) Anthropic says ‘Claude will remain ad-free,’ unlike an unnamed rival | The Verge
(01:00:18) Okay, now exactly half of xAI's founding team has left the company | TechCrunch
(01:04:03) Waymo’s next-gen robotaxi is ready for passengers — and also ‘high-volume production’ | The Verge

Projects & Open Source
(01:04:59) Qwen3-Coder-Next: Pushing Small Hybrid Models on Agentic Coding
(01:08:38) OpenClaw’s AI ‘skill’ extensions are a security nightmare | The Verge

Research & Advancements
(01:10:40) Learning to Reason in 13 Parameters
(01:16:01) Reinforcement World Model Learning for LLM-based Agents
(01:20:00) Opus 4.6 on Vending-Bench – Not Just a Helpful Assistant

Policy & Safety
(01:22:28) METR GPT-5.2
(01:26:59) The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence and Task Complexity?

17 Dec 2025

#228 - GPT 5.2, Scaling Agents, Weird Generalization

Our 228th episode with a summary and discussion of last week's big AI news!
Recorded on 12/12/2025
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
OpenAI's latest model GPT-5.2 demonstrates improved performance and enhanced multi-modal capabilities but comes with increased costs and a different knowledge cutoff date.
Disney invests $1 billion in OpenAI to generate Disney character content, creating unique licensing agreements across characters from Marvel, Pixar, and Star Wars franchises.
The U.S. government imposes new AI chip export rules involving security reviews, while simultaneously moving to prevent states from independently regulating AI.
DeepMind releases a paper outlining the challenges and findings in scaling multi-agent systems, highlighting the complexities of tool coordination and task performance.

Timestamps:
(00:00:00) Intro / Banter
(00:01:19) News Preview

Tools & Apps
(00:01:58) GPT-5.2 is OpenAI’s latest move in the agentic AI battle | The Verge
(00:08:48) Runway releases its first world model, adds native audio to latest video model | TechCrunch
(00:11:51) Google says it will link to more sources in AI Mode | The Verge
(00:12:24) ChatGPT can now use Adobe apps to edit your photos and PDFs for free | The Verge
(00:13:05) Tencent releases Hunyuan 2.0 with 406B parameters

Applications & Business
(00:16:15) China set to limit access to Nvidia’s H200 chips despite Trump export approval
(00:21:02) Disney investing $1 billion in OpenAI, will allow characters on Sora
(00:24:48) Unconventional AI confirms its massive $475M seed round
(00:29:06) Slack CEO Denise Dresser to join OpenAI as chief revenue officer | TechCrunch
(00:31:18) The state of enterprise AI

Projects & Open Source
(00:33:49) [2512.10791] The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality
(00:36:27) Claude 4.5 Opus' Soul Document

Research & Advancements
(00:43:49) [2512.08296] Towards a Science of Scaling Agent Systems
(00:48:43) Evaluating Gemini Robotics Policies in a Veo World Simulator
(00:52:10) Guided Self-Evolving LLMs with Minimal Human Supervision
(00:56:08) Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning
(01:00:39) [2512.07783] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
(01:04:42) Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
(01:09:42) Google’s AI unit DeepMind announces UK 'automated research lab'

Policy & Safety
(01:10:28) Trump Moves to Stop States From Regulating AI With a New Executive Order - The New York Times
(01:13:54) [2512.09742] Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs
(01:17:57) Forecasting AI Time Horizon Under Compute Slowdowns
(01:20:46) AI Security Institute focuses on AI measurements and evaluations
(01:21:16) Nvidia AI Chips to Undergo Unusual U.S. Security Review Before Export to China
(01:22:01) U.S. Authorities Shut Down Major China-Linked AI Tech Smuggling Network

Synthetic Media & Art
(01:24:01) RSL 1.0 has arrived, allowing publishers to ask AI companies pay to scrape content | The Verge

9 Dec 2025

#227 - Jeremie is back! DeepSeek 3.2, TPUs, Nested Learning

Our 227th episode with a summary and discussion of last week's big AI news!
Recorded on 12/05/2025
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
DeepSeek 3.2 and Flux 2 release, showcasing advancements in open-source AI models for natural language processing and image generation respectively.
Amazon's new AI chips and Google's TPUs signal potential shifts in AI hardware dominance, with growing competition against Nvidia.
Anthropic's potential IPO and OpenAI's declared ‘Code Red’ indicate significant moves in the AI business landscape, including high venture funding rounds for startups.
Key research papers from DeepMind and Google explore advanced memory architectures and multi-agent systems, indicating ongoing efforts to enhance AI reasoning and efficiency.

Timestamps:
(00:00:10) Intro / Banter
(00:02:42) News Preview

Tools & Apps
(00:03:30) Deepseek 3.2 : New AI Model is Faster, Cheaper and Smarter
(00:23:22) Black Forest Labs launches Flux.2 AI image models to challenge Nano Banana Pro and Midjourney
(00:28:00) Sora and Nano Banana Pro throttled amid soaring demand | The Verge
(00:29:34) Mistral closes in on Big AI rivals with new open-weight frontier and small models | TechCrunch
(00:31:41) Kling's Video O1 launches as the first all-in-one video model for generation and editing
(00:34:07) Runway rolls out Gen 4.5 AI video model that beats Google, OpenAI

Applications & Business
(00:35:18) NVIDIA’s Partners Are Beginning to Tilt Toward Google’s TPU Ecosystem, with Foxconn Reportedly Securing TPU Rack Orders
(00:40:37) Amazon releases an impressive new AI chip and teases an Nvidia-friendly roadmap | TechCrunch
(00:43:03) OpenAI declares ‘code red’ as Google catches up in AI race | The Verge
(00:46:20) Anthropic reportedly preparing for massive IPO in race with OpenAI: FT
(00:48:41) Black Forest Labs raises $300M at $3.25B valuation | TechCrunch
(00:49:20) Paris-based AI voice startup Gradium nabs $70M seed | TechCrunch
(00:50:10) OpenAI announced a 1 GW Stargate cluster in Abu Dhabi
(00:53:22) OpenAI’s investment into Thrive Holdings is its latest circular deal
(00:55:11) OpenAI to acquire Neptune, an AI model training assistance startup
(00:56:11) Anthropic acquires developer tool startup Bun to scale AI coding
(00:56:55) Microsoft drops AI sales targets in half after salespeople miss their quotas - Ars Technica

Projects & Open Source
(00:57:51) [2511.22570] DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
(01:01:52) Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

Research & Advancements
(01:05:44) Nested Learning: The Illusion of Deep Learning Architecture
(01:13:30) Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
(01:15:50) State of AI: An Empirical 100 Trillion Token Study with OpenRouter

Policy & Safety
(01:21:52) Trump signs executive order launching Genesis Mission AI project
(01:24:42) OpenAI has trained its LLM to confess to bad behavior | MIT Technology Review
(01:29:34) US senators seek to block Nvidia sales of advanced chips to China

Andrey Kurenkov

Host
Jeremie Harris

Host

Creator: Skynet Today
Years Active: 2020 - 2026
Episodes: 291
Show Website: Last Week in AI
