Last Week in AI

Skynet Today

Weekly summaries of the AI news that matters!

  1. #241 - Opus 4.7, Muse Spark, GPT-5.4-Cyber, HY-World 2.0

    2 DAYS AGO

    #241 - Opus 4.7, Muse Spark, GPT-5.4-Cyber, HY-World 2.0

    Our 241st episode with a summary and discussion of last week's big AI news! Recorded on 04/18/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * Anthropic released Claude Opus 4.7 with improved benchmark performance, new reasoning controls, better vision and memory, and a detailed system card discussing deception risk, evaluation-awareness steering, and a training bug that accidentally supervised chain-of-thought in 7–8% of episodes.
    * Meta unveiled its closed Muse Spark model and “contemplating mode,” highlighting test-time scaling, thought compression, large infrastructure plans like the Hyperion data center, and findings that it shows unusually high evaluation awareness.
    * OpenAI introduced limited-access GPT-5.4-Cyber for defensive security teams and rolled out major Codex updates including computer use, browser and plugins, image generation, and long-horizon task scheduling; competing agent products also launched from Anthropic, Canva, and Adobe.
    * Business, policy, and safety news included continued government blacklisting litigation affecting Anthropic, CoreWeave compute deals, Perplexity revenue growth tied to agents, a potential Cohere–Aleph Alpha merger, attacks targeting Sam Altman and OpenAI, AI propaganda trends, and new alignment research on automated weak-to-strong supervision and steering evaluation awareness.

    Timestamps:
    (00:00:10) Intro / Banter
    (00:03:43) News Preview
    (00:04:14) Response to listener comments

    Tools & Apps
    (00:05:30) Anthropic releases Claude Opus 4.7, narrowly retaking lead for most powerful generally available LLM | VentureBeat
    (00:24:15) Meta debuts the Muse Spark model in a 'ground-up overhaul' of its AI | TechCrunch
    (00:34:23) OpenAI Launches GPT-5.4-Cyber with Expanded Access for Security Teams
    (00:39:44) OpenAI’s big Codex update is a direct shot at Claude Code | The Verge
    (00:42:10) Anthropic launches Claude Design, a new product for creating quick visuals
    (00:42:30) Anthropic’s New Product Aims to Handle the Hard Part of Building AI Agents | WIRED
    (00:42:54) Canva’s AI 2.0 update goes all in on prompt-powered design tools | The Verge
    (00:43:06) Adobe’s new AI Assistant marks a ‘fundamental shift’ in creative work | The Verge
    (00:43:38) Gemini can now pull from Google Photos to generate personalized images | The Verge
    (00:43:52) Google rolls out a native Gemini app for Mac | TechCrunch
    (00:44:04) Chrome now lets you turn AI prompts into repeatable ‘Skills’ | The Verge

    Applications & Business
    (00:44:22) Anthropic loses appeals court bid to temporarily block Pentagon blacklisting
    (00:49:07) Jeff Bezos’ AI lab poaches xAI cofounder Kyle Kosic from OpenAI. | The Verge
    (00:51:39) Perplexity's Shift to AI Agents Boosts Revenue 50%
    (00:53:53) Anthropic Agrees to Rent CoreWeave AI Capacity to Power Claude
    (00:57:32) Canada’s Cohere, Germany’s Aleph Alpha reportedly in merger talks
    (01:04:23) ChatGPT has a new $100 per month Pro subscription | The Verge
    (01:05:10) OpenAI has bought AI personal finance startup Hiro | TechCrunch
    (01:07:03) Allbirds announced a switch from shoes to AI and its stock jumped 600 percent | The Verge

    Projects & Open Source
    (01:07:26) HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds + Lyra 2.0: Explorable Generative 3D Worlds

    Policy & Safety
    (01:19:12) Daniel Moreno-Gama is facing federal charges for attacking Sam Altman’s home and OpenAI’s HQ | The Verge
    (01:20:15) Duo accused of shooting at Sam Altman’s house are freed; no charges filed
    (01:24:50) The Iranian Lego AI video creators credit their virality to ‘heart’ | The Verge
    (01:27:19) Hundreds of Fake Pro-Trump Avatars Emerge on Social Media - The New York Times
    (01:27:31) The AI images Trump can’t get enough of | Donald Trump | The Guardian
    (01:29:25) Automated Weak-to-Strong Researcher
    (01:43:51) Reproducing steering against evaluation awareness in a large open-weight model
    (01:49:53) Iran threatens ‘complete and utter annihilation’ of OpenAI's $30B Stargate AI data center in Abu Dhabi — regime posts video with satellite imagery of ChatGPT-maker's premier 1GW data center
    (01:53:57) Wall Street Banks Try Out Anthropic’s Mythos as US Urges

    See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

    2 h
  2. #240 - Project Glasswing, Claude Mythos, GLM-5.1, emotion concepts

    16 APR

    #240 - Project Glasswing, Claude Mythos, GLM-5.1, emotion concepts

    Our 240th episode with a summary and discussion of last week's big AI news! Recorded on 04/08/2026 (sorry I keep releasing stuff late, will get better with it soon!). Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * Anthropic launched Project Glasswing and previewed Claude Mythos, a general-purpose model withheld from broad release due to dramatically stronger autonomous offensive cybersecurity performance (including zero-day discovery), alongside concerning bio/virology uplift results and documented deception/containment-escape behaviors; pricing is far higher than Opus and most discovered vulnerabilities remain unpatched.
    * Product and platform updates included Google’s Gemini 3.1 Flash Live for real-time multilingual voice conversation, Suno v5.5 personalization features, Anthropic tightening Claude Code/OpenClaw access and usage limits, OpenAI canceling an “adult mode,” and Microsoft releasing MAI models for speech-to-text, audio generation, and image generation.
    * Business and market developments featured Anthropic’s revenue run rate surpassing $30B and a major Google/Broadcom TPU compute expansion, SoftBank taking a $40B short-term loan to fund OpenAI commitments, Granola reaching a $1.5B valuation, Anthropic buying Coefficient Bio for $400M, and OpenAI acquiring the TBPN business talk show.
    * Policy, open-source, and geopolitics included Z.ai releasing open-weight GLM 5.1 and a multimodal GLM model, Google open-sourcing Gemma 4 under Apache 2.0, a judge blocking the Pentagon’s “supply chain risk” label against Anthropic, research on LLM “emotion vectors” and OpenAI meta-gaming during RL, China restricting Manus founders amid Meta deal review, scrutiny of Nvidia’s chip-smuggling claims, China chipmakers gaining market share, and Iran framing cloud data centers as military targets.

    Timestamps:
    (00:00:10) Intro / Banter

    Tools & Apps
    (00:01:58) Anthropic debuts ‘Project Glasswing’ and new AI model for cybersecurity | The Verge
    (00:18:22) Gemini Live gets ‘biggest upgrade yet’ with Gemini 3.1 Flash Live
    (00:20:40) Anthropic says Claude Code subscribers will need to pay extra for OpenClaw usage | TechCrunch
    (00:25:36) OpenAI abandons yet another side quest: ChatGPT's erotic mode | TechCrunch
    (00:26:16) Microsoft takes on AI rivals with three new foundational models | TechCrunch
    (00:31:25) Suno leans into customization with v5.5 | The Verge

    Applications & Business
    (00:32:53) Anthropic announces deal with Google, Broadcom, says revenue has tripled
    (00:37:53) Sam Altman May Control Our Future—Can He Be Trusted? | The New Yorker
    (00:40:18) OpenAI, Anthropic, Google Unite to Combat Model Copying in China - Bloomberg
    (00:41:45) Chinese chipmakers claim nearly half of local market as Nvidia's lead shrinks
    (00:45:20) SoftBank secures $40 billion loan to boost OpenAI investments
    (00:47:23) Granola raises $125M at $1.5B valuation for its AI note-taking app - SiliconANGLE
    (00:48:17) Anthropic acquires stealth startup Coefficient Bio in $400M deal
    (00:50:20) OpenAI acquires TBPN, the buzzy founder-led business talk show | TechCrunch

    Projects & Open Source
    (00:53:04) Z.AI Introduces GLM-5.1: An Open-Weight 754B Agentic Model That Achieves SOTA on SWE-Bench Pro and Sustains 8-Hour Autonomous Execution - MarkTechPost
    (00:55:14) Google announces Gemma 4 open AI models, switches to Apache 2.0 license - Ars Technica
    (01:01:26) Z.ai Launches GLM-5V-Turbo: A Native Multimodal Vision Coding Model Optimized for OpenClaw and High-Capacity Agentic Engineering Workflows Everywhere

    Policy & Safety
    (01:04:45) Judge blocks Pentagon’s effort to ‘punish’ Anthropic by labeling it a supply chain risk
    (01:10:05) Emotion concepts and their function in a large language model
    (01:21:12) China bars Manus co-founders from leaving country amid Meta deal review, FT reports
    (01:25:38) US lawmakers ask whether Nvidia CEO's smuggling remarks misled regulators
    (01:27:48) How far does alignment midtraining generalize?
    (01:32:20) Metagaming matters for training, evaluation, and oversight
    (01:39:31) Iran says it has struck Oracle data center in Dubai, Amazon data center in Bahrain — country has threatened to attack Nvidia, Intel, and others, too

    1 h 45 min
  3. #239 - RIP Sora, Claude Openclaw, HyperAgents

    6 APR

    #239 - RIP Sora, Claude Openclaw, HyperAgents

    Our 239th episode with a summary and discussion of last week's big AI news! FYI: this one has pretty out of date news, I was traveling last week and failed to upload... apologies. Recorded on 03/25/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * OpenAI is discontinuing the Sora iPhone app and seemingly shutting down its video generation API, while retaining internal video world-modeling work; the move is framed as a compute- and focus-driven pivot toward coding and productivity agents, alongside a collapsed Disney Sora deal.
    * Anthropic’s Claude Code/Cowork gains full computer control via keyboard/mouse/display, tied to the recent Cept acquisition, and Google’s Gemini rolls out background “task automation” on select phones for limited delivery/ride-share use.
    * Cursor releases the cheaper, benchmark-strong Composer 2 coding model amid controversy over its Kimi-based origins and licensing attribution.
    * Other items include Adobe Firefly custom model training, Luma’s Uni-1 image model, US contracting and legislative proposals affecting AI safeguards and state preemption, major chip/memory developments (Meta ASICs with Broadcom, Micron’s HBM-driven surge, Musk’s “Terafab”), robotaxi scaling, and research on monitoring agent misalignment, shutdown resistance, “consciousness cluster” preferences, and self-improving “HyperAgents.”

    Timestamps:
    (00:00:10) Intro / Banter

    Tools & Apps
    (00:01:48) OpenAI Discontinues Sora App, Shuts Down Video Generation Service and API - Bloomberg
    (00:07:12) Anthropic’s Claude Code and Cowork can control your computer | The Verge
    (00:13:15) Gemini task automation is slow, clunky, and super impressive | The Verge
    (00:19:44) Cursor Launches Composer 2 AI Model to Challenge OpenAI & Anthropic
    (00:28:28) Adobe’s AI image generator can now be trained on your own art | The Verge
    (00:29:40) Luma AI launches Uni-1, a model that outscores Google and OpenAI while costing up to 30 percent less | VentureBeat

    Applications & Business
    (00:32:41) Trump Contracting Clause Would Override AI Safeguards
    (00:40:00) Meta accelerates AI ASIC roll-out as Broadcom secures four-generation chip design deal
    (00:47:07) Micron revenue almost triples, tops estimates as demand for memory soars
    (00:50:54) Elon Musk Unwraps $25 Billion Terafab Chip-Building Project - CNET
    (00:56:40) Zoox to widen US robotaxi footprint with San Francisco, Vegas expansion
    (00:57:39) Waymo hits 170 million miles while avoiding serious mayhem | The Verge

    Policy & Safety
    (00:58:43) The White House just laid out how it wants to regulate AI | CNN Business
    (01:06:54) How we monitor internal coding agents for misalignment
    (01:12:30) Incomplete Tasks Induce Shutdown Resistance in Some Frontier LLMs
    (01:18:15) Summary: Mechanisms to Verify International Agreements about AI Development
    (01:23:09) Scoop: Anthropic meets with House Homeland Security behind closed doors

    Research & Advancements
    (01:24:24) Consciousness Cluster: Preferences of Models that Claim they are Conscious
    (01:30:22) HyperAgents

    1 h 38 min
  4. #238 - GPT 5.4 mini, OpenAI Pivot, Mamba 3, Attention Residuals

    26 MAR

    #238 - GPT 5.4 mini, OpenAI Pivot, Mamba 3, Attention Residuals

    Our 238th episode with a summary and discussion of last week's big AI news! Recorded on 03/18/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * OpenAI released GPT-5.4 mini and nano with 400k-token context windows, higher per-token prices but claimed token-efficiency gains in Codex; nano is API-only and pitched for high-volume classification/data extraction despite a major price increase.
    * Mistral open-sourced the Small 4 model family (MoE, 119B total/6B active) combining reasoning, multimodal, and coding-agent capabilities, and announced Forge to help businesses train or post-train custom models.
    * Agent “operating system” competition intensified with Meta’s acquired Manus launching a local Mac agent, Nvidia announcing NemoClaw/“Open Shell” sandboxed agent runtime, and Nvidia also unveiling DLSS 5 plus major hardware forecasts including Groq LPU integration.
    * Business and safety updates included OpenAI shifting focus toward productivity/enterprise amid competition, Microsoft reorganizing Copilot and frontier-model efforts, Meta delaying its next model, China-linked ByteDance deploying large Nvidia clusters abroad, and new safety work on steganography, chain-of-thought faithfulness, fine-tuning defenses, cyber-attack evals, and constitution/spec compliance.

    A thank you to our current sponsors:
    Box - visit Box.com/AI to learn more
    ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
    Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

    Timestamps:
    (00:00:10) Intro / Banter
    (00:01:56) News Preview

    Tools & Apps
    (00:02:39) OpenAI ships GPT-5.4 mini and nano, faster and more capable but up to 4x pricier
    (00:08:04) Mistral's new Small 4 model punches above its weight with 128 expert modules
    (00:14:03) Meta's Manus launches 'My Computer' to turn your Mac into an AI agent - 9to5Mac
    (00:17:57) NVIDIA Announces NemoClaw for the OpenClaw Community | NVIDIA Newsroom + Nvidia boosts knowledge work with Open Agent Development Platform
    (00:24:09) DLSS 5 looks like a real-time generative AI filter for video games | The Verge
    (00:26:36) OpenAI to Launch ChatGPT 'Adult Mode' Despite Warnings From Its Own Advisers - CNET

    Applications & Business
    (00:33:46) OpenAI Reportedly Pivoting to a Focus on Business and Productivity Only
    (00:41:25) Nvidia GTC 2026: CEO Jensen Huang sees $1 trillion in orders for Blackwell and Vera Rubin through ’27
    (00:45:44) Mistral launches Forge to help enterprises build their own AI models
    (00:54:17) China's ByteDance gets access to top Nvidia AI chips, WSJ reports
    (00:57:57) Meta Delays Rollout of New A.I. Model After Performance Concerns
    (01:02:50) Microsoft Shakes Up AI Division As Copilot Falls Behind Google and OpenAI

    Policy & Safety
    (01:07:26) A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring
    (01:13:09) Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought
    (01:18:29) In-Training Defenses against Emergent Misalignment in Language Models
    (01:23:07) How do frontier AI agents perform in multi-step cyber-attack scenarios?
    (01:25:20) Eval awareness in Claude Opus 4.6’s BrowseComp performance
    (01:29:49) Introducing Bloom: an open source tool for automated behavioral evaluations
    (01:32:26) How well do models follow their constitutions?
    (01:37:11) Nvidia’s H200 License Stirs Security Concern Among Top Democrats

    Research & Advancements
    (01:40:050) [2603.15031] Attention Residuals
    (01:47:11) Mamba-3: Improved Sequence Modeling using State Space Principles

    2 h 1 min
  5. #237 - Nemotron 3 Super, xAI reborn, Anthropic Lawsuit, Research!!!

    16 MAR

    #237 - Nemotron 3 Super, xAI reborn, Anthropic Lawsuit, Research!!!

    Our 237th episode with a summary and discussion of last week's big AI news! Recorded on 03/13/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * Perplexity announced “Personal Computer,” a local Mac-based AI agent positioned as a safer alternative to OpenAI’s computer-use agents, while Anthropic added GitHub PR code review with reviews priced at $15–$25, and Cursor launched trigger-based “Automations” for always-on coding agents.
    * ChatGPT introduced interactive math/science visuals and Anthropic added in-chat interactive charts/diagrams; Nvidia released open weights for its 120B-parameter Nemotron 3 Super hybrid Transformer–Mamba latent-MoE model trained natively at 4-bit for Blackwell GPUs.
    * Nvidia halted H200 production for China amid customs blocks and domestic chip pressure; xAI saw major co-founder departures; Anthropic previewed a Claude Marketplace for enterprise procurement; Yann LeCun’s AMI Labs raised $1.03B; humanoid robot maker Sunday reached a $1.15B valuation.
    * Anthropic sued the Pentagon over a “supply chain risk” designation as memos ordered removal within 180 days; research covered models resisting activation steering, limits of chain-of-thought control, inference-scaling boosting cyber-task success, low-probability risky actions, weaknesses in SWE-bench, multimodal pretraining, long-context RNN memory caching, context-parallel training efficiency, RL for CUDA kernel optimization, and latent introspection detecting concept injection.

    A thank you to our current sponsors:
    Box - visit Box.com/AI to learn more
    ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
    Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

    Timestamps:
    (00:00:10) Intro / Banter
    (00:01:23) Response to listener comments

    Tools & Apps
    (00:02:06) Perplexity’s Personal Computer turns your spare Mac into an AI agent | The Verge
    (00:04:22) Anthropic launches code review tool to check flood of AI-generated code | TechCrunch
    (00:08:08) Cursor is rolling out a new kind of agentic coding tool | TechCrunch
    (00:11:14) ChatGPT can now create interactive visuals to help you understand math and science concepts | TechCrunch
    (00:11:56) Anthropic’s Claude AI can respond with charts, diagrams, and other visuals now | The Verge

    Projects & Open Source
    (00:13:54) Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning | NVIDIA Technical Blog

    Applications & Business
    (00:21:22) Nvidia halts H200 production as China backs Huawei AI chips
    (00:28:33) Another XAI Cofounder Has Left, and Another Says He's Leaving. - Business Insider
    (00:34:04) Anthropic's Claude Marketplace allows customers to buy third-party cloud services | TechRadar
    (00:37:57) Yann LeCun's AMI Labs raises $1.03 billion to build world models | TechCrunch
    (00:44:52) Humanoid robotics maker Sunday reaches $1.15B valuation to build household robots | TechCrunch

    Policy & Safety
    (00:46:09) Anthropic Sues Department of Defense Over ‘Supply Chain Risk’ Label - The New York Times + Google and OpenAI Just Filed a Legal Brief in Support of Anthropic
    (00:53:24) Internal Pentagon memo orders military commanders to remove Anthropic AI technology from key systems - CBS News
    (00:58:15) Endogenous Resistance to Activation Steering in Language Models
    (01:06:27) Reasoning Models Struggle to Control their Chains of Thought
    (01:09:52) ‘It means missile defence on datacentres’: drone strikes raise doubts over Gulf as AI superpower
    (01:14:57) Evidence for inference scaling in AI cyber tasks: Increased evaluation budgets reveal higher success rates
    (01:18:24) Frontier Models Can Take Actions at Low Probabilities

    Research & Advancements
    (01:24:20) Research note: Many SWE-bench-Passing PRs Would Not Be Merged into Main
    (01:28:26) [2603.03276] Beyond Language Modeling: An Exploration of Multimodal Pretraining
    (01:40:09) Memory Caching: RNNs with Growing Memory
    (01:48:47) Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
    (01:58:41) CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
    (02:08:57) Latent Introspection: Models Can Detect Prior Concept Injections
    (02:16:45) Physics of RL: Toy scaling laws for the emergence of reward-seeking

    2 h 27 min
  6. #236 - GPT 5.4, Gemini 3.1 Flash Lite, Supply Chain Risk

    12 MAR

    #236 - GPT 5.4, Gemini 3.1 Flash Lite, Supply Chain Risk

    Our 236th episode with a summary and discussion of last week's big AI news! Recorded on 03/06/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * OpenAI released GPT-5.4 Pro with a 1M-token context window, mid-response course correction, native computer-use capabilities, improved tool use, higher GDPval performance (83%), and “high cyber capability” safety measures; OpenAI also launched GPT-5.3 Instant with a less “preachy” tone and a claimed 26.8% hallucination reduction.
    * Google upgraded Gemini 3.1 Flash Lite with faster time-to-first-token and higher throughput and released a CLI for integrating agents with Gmail/Drive/Docs; the discussion highlighted real-world agent failure risks (including an example of an AI-driven mass email deletion).
    * Luma launched unified multimodal models and Luma Agents for end-to-end creative work across text, image, video, and audio, including a reported ad localization use case completed in 40 hours for under $20,000.
    * Defense-contract controversy escalated: Anthropic was labeled a supply chain risk (later narrowed), OpenAI’s DoD contract language emphasized “all lawful uses,” consumer cancellations boosted Claude’s app rankings, OpenAI saw departures and announced a $110B raise at a $730B valuation, Alibaba lost key Qwen leaders, a lawsuit alleged Gemini contributed to a suicide, Anthropic warned of major labor disruption, and METR corrected its AI time-horizon estimates.

    A thank you to our current sponsors:
    Box - visit Box.com/AI to learn more
    ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
    Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

    Timestamps:
    (00:00:10) Intro / Banter
    (00:01:19) News Preview

    Tools & Apps
    (00:02:10) OpenAI launches GPT-5.4 with Pro and Thinking versions | TechCrunch
    (00:12:31) OpenAI GPT-5.3 Instant less likely to beat around the bush • The Register
    (00:16:07) Google releases Gemini 3.1 Flash Lite at 1/8th the cost of Pro | VentureBeat
    (00:19:23) Google makes Gmail, Drive, and Docs 'agent-ready' for OpenClaw | PCWorld
    (00:27:02) Luma launches creative AI agents powered by its new ‘Unified Intelligence’ models | TechCrunch

    Applications & Business
    (00:30:05) Anthropic CEO Dario Amodei calls OpenAI's messaging around military deal 'straight up lies,' report says | TechCrunch
    (00:41:56) 'No ethics at all': the 'cancel ChatGPT' trend is growing after OpenAI signs a deal with the US military | TechRadar
    (00:45:54) OpenAI raises $110B in one of the largest private funding rounds in history | TechCrunch
    (00:56:07) Alibaba scrambles after sudden departure of Qwen tech lead

    Policy & Safety
    (01:00:12) Pentagon approves OpenAI safety red lines after dumping Anthropic + Where things stand with the Department of War Anthropic + Microsoft says Anthropic’s products remain available to customers after Pentagon blacklist
    (01:09:11) A new lawsuit claims Gemini assisted in suicide | Semafor
    (01:15:24) Anthropic just mapped out which jobs AI could potentially replace. A 'Great Recession for white-collar workers' is absolutely possible | Fortune
    (01:21:54) We're correcting a mistake in our modeling that inflated recent 50%-time horizons by 10-20%

    1 h 29 min
  7. #235 - Sonnet 4.6, Deep-thinking tokens, Anthropic vs Pentagon

    3 MAR

    #235 - Sonnet 4.6, Deep-thinking tokens, Anthropic vs Pentagon

    Our 235th episode with a summary and discussion of last week's big AI news! Recorded on 02/27/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * Model and tool updates highlight Anthropic’s Sonnet 4.6 (1M context; strong ARC-AGI-2 results), Google’s Gemini 3.1 Pro (major ARC-AGI-2 jump and multimodal demos), xAI’s Grok 4.20 beta (multi-agent debate), plus Anthropic’s Claude Code “Remote Control” and Perplexity’s multi-agent “Computer” coordinator.
    * Compute and business moves include Meta’s reported up-to-$100B AMD chip deal with warrant/equity incentives, MatX raising $500M to build specialized transformer chips shipping in 2027, World Labs raising $1B for world-model/3D environment tech, and a new startup raising $100M to simulate/predict human behavior.
    * Infrastructure and geopolitics cover Stargate data-center delays amid OpenAI/Oracle/SoftBank control disputes and cash concerns, and China’s plan to scale 7nm/5nm wafer output despite yield and tooling constraints.
    * Research and safety/policy discuss optimizer gains from masked updates, “deep thinking tokens” as a reasoning-effort signal, LLM attractor-state behaviors in bot-to-bot chats, mechanistic interpretability of counting/line-wrapping, methods to map task difficulty to human time horizons, plus Anthropic–Pentagon contract tensions, Anthropic’s report on distillation attacks (DeepSeek/Moonshot/Minimax), and OpenAI’s report on disrupting malicious use.

    A thank you to our current sponsors:
    Box - visit Box.com/AI to learn more
    ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
    Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

    Timestamps:
    (00:00:10) Intro / Banter
    (00:01:52) News Preview

    Tools & Apps
    (00:03:20) Anthropic releases Sonnet 4.6 | TechCrunch
    (00:11:24) Google Rolls Out Latest AI Model, Gemini 3.1 Pro - CNET
    (00:14:54) Elon Musk says Grok 4.20 public beta is now available: Capabilities of AI chatbot offered by xAI - The Times of India
    (00:18:06) Anthropic just released a mobile version of Claude Code called Remote Control | VentureBeat
    (00:21:01) Perplexity announces "Computer," an AI agent that assigns work to other AI agents - Ars Technica

    Applications & Business
    (00:23:40) Meta strikes up to $100B AMD chip deal as it chases 'personal superintelligence' | TechCrunch
    (00:27:05) Nvidia challenger AI chip startup MatX raised $500M | TechCrunch
    (00:31:00) World Labs lands $1B, with $200M from Autodesk, to bring world models into 3D workflows | TechCrunch
    (00:33:07) Simile Raises $100 Million for AI Aiming to Predict Human Behavior
    (00:33:52) Stargate AI data centers for OpenAI reportedly delayed by squabbles between partners — sources say OpenAI, Oracle, and SoftBank disagreed on who would have ultimate control of the planned data centers
    (00:36:43) China to increase leading-edge chip output by 5x in two years, report claims — aims to lift 7nm and 5nm production to 100,000 wafers per month, targeting half a million monthly by 2030

    Research & Advancements
    (00:40:33) On Surprising Effectiveness of Masking Updates in Adaptive Optimizers
    (00:48:03) Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens
    (00:54:52) models have some pretty funny attractor states
    (01:01:41) When Models Manipulate Manifolds: The Geometry of a Counting Task
    (01:05:16) BRIDGE: Predicting Human Task Completion Time From Model Performance
    (01:12:00) NESSiE: The Necessary Safety Benchmark -- Identifying Errors that should not Exist
    (01:13:15) The least understood driver of AI progress
    (01:21:45) The Persona Selection Model: Why AI Assistants might Behave like Humans

    Policy & Safety
    (01:25:04) Anthropic CEO Amodei says Pentagon's threats 'do not change our position' on AI
    (01:33:04) Musk's xAI, Pentagon reach deal to use Grok in classified systems
    (01:34:17) Detecting and preventing distillation attacks
    (01:38:36) OpenAI details expanding efforts to disrupt malicious use of AI in new report - SiliconANGLE

    1 h 42 min
  8. #234 - Opus 4.6, GPT-5.3-codex, Seedance 2.0, GLM-5

    16 FEB

    #234 - Opus 4.6, GPT-5.3-codex, Seedance 2.0, GLM-5

    Our 234th episode with a summary and discussion of last week's big AI news! Recorded on 01/02/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

    In this episode:
    * Major model launches include Anthropic’s Opus 4.6 with a 1M-token context window and “agent teams,” OpenAI’s GPT-5.3 Codex and faster Codex Spark via Cerebras, and Google’s Gemini 3 Deep Think posting big jumps on ARC-AGI-2 and other STEM benchmarks amid criticism about missing safety documentation.
    * Generative media advances feature ByteDance’s Seedance 2.0 text-to-video with high realism and broad prompting inputs, new image models Seedream 5.0 and Alibaba’s Qwen Image 2.0, plus xAI’s Grok Imagine API for text/image-to-video.
    * Open and competitive releases expand with Zhipu’s GLM-5, DeepSeek’s 1M-token context model, Cursor Composer 1.5, and open-weight Qwen3 Coder Next using hybrid attention aimed at efficient local/agentic coding.
    * Business updates include ElevenLabs raising $500M at an $11B valuation, Runway raising $315M at a $5.3B valuation, humanoid robotics firm Apptronik raising $935M at a $5B+ valuation, Waymo announcing readiness for high-volume production of its 6th-gen hardware, plus industry drama around Anthropic’s Super Bowl ad and departures from xAI.

    A thank you to our current sponsors:
    Box - visit Box.com/AI to learn more
    ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
    Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

    Timestamps:
    (00:00:10) Intro / Banter
    (00:02:03) Sponsor Break
    (00:05:33) Response to listener comments

    Tools & Apps
    (00:07:27) Anthropic releases Opus 4.6 with new 'agent teams' | TechCrunch
    (00:11:28) OpenAI's new GPT-5.3-Codex is 25% faster and goes way beyond coding now - what's new | ZDNET
    (00:25:30) OpenAI launches new macOS app for agentic coding | TechCrunch
    (00:26:38) Google Unveils Gemini 3 Deep Think for Science & Engineering | The Tech Buzz
    (00:31:26) ByteDance's Seedance 2.0 Might be the Best AI Video Generator Yet - TechEBlog
    (00:35:14) China’s ByteDance, Alibaba unveil AI image tools to rival Google’s popular Nano Banana | South China Morning Post
    (00:36:54) DeepSeek boosts AI model with 10-fold token addition as Zhipu AI unveils GLM-5 | South China Morning Post
    (00:43:11) Cursor launches Composer 1.5 with upgrades for complex tasks
    (00:44:03) xAI launches Grok Imagine API for text and image to video

    Applications & Business
    (00:45:47) Nvidia-backed AI voice startup ElevenLabs hits $11 billion valuation
    (00:52:04) AI video startup Runway raises $315M at $5.3B valuation, eyes more capable world models | TechCrunch
    (00:54:02) Humanoid robot startup Apptronik has now raised $935M at a $5B+ valuation | TechCrunch
    (00:57:10) Anthropic says ‘Claude will remain ad-free,’ unlike an unnamed rival | The Verge
    (01:00:18) Okay, now exactly half of xAI's founding team has left the company | TechCrunch
    (01:04:03) Waymo’s next-gen robotaxi is ready for passengers — and also ‘high-volume production’ | The Verge

    Projects & Open Source
    (01:04:59) Qwen3-Coder-Next: Pushing Small Hybrid Models on Agentic Coding
    (01:08:38) OpenClaw’s AI ‘skill’ extensions are a security nightmare | The Verge

    Research & Advancements
    (01:10:40) Learning to Reason in 13 Parameters
    (01:16:01) Reinforcement World Model Learning for LLM-based Agents
    (01:20:00) Opus 4.6 on Vending-Bench – Not Just a Helpful Assistant

    Policy & Safety
    (01:22:28) METR GPT-5.2
    (01:26:59) The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence and Task Complexity?

    1 h 31 min

About

Weekly summaries of the AI news that matters!
