The Weight Update

Kris Moore

AI intelligence for technology leaders. Model releases, infrastructure decisions, governance deadlines, vendor shifts, and talent signals — analyzed with evidence, delivered with opinion. Each episode covers what changed this week in AI and what it means for your organization's strategy. Built for CTOs, CIOs, VPs of Engineering, and Heads of AI/ML who need to make decisions, not just stay informed. AI-Assisted Production: Research and editorial direction by Kristopher Moore. Scripts developed with Claude (Anthropic). Narration by AI voice synthesis (Microsoft Edge TTS).

Episodes

  1. Apr 10

    Everybody Shipped

    A wide-aperture survey of the most concentrated AI news cycle of Q1 2026. In fourteen days: Meta launched Muse Spark under Alexandr Wang and walked away from the open-weight default that defined Llama. Zhipu shipped GLM-5.1, a frontier-class open-weight coding model trained end-to-end on Huawei Ascend silicon with zero NVIDIA in the stack. Anthropic unveiled Claude Mythos Preview via Project Glasswing — seeded to eleven named enterprise defensive partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks) and explicitly declined GA release over cybersecurity dual-use risk. NVIDIA put Vera Rubin into production at 2,300 watts per GPU with mandatory liquid cooling. OpenAI killed Sora because the unit economics didn't work and redirected the compute to Codex and enterprise agents. Google went GA with Ironwood, the seventh-generation TPU. MemPalace v3.0 hit 21,700 GitHub stars in four days claiming the top of the LongMemEval benchmark (amid significant community skepticism about the benchmark methodology and one of the two named creators' actual technical involvement). Kimi K2.5 cut its input price again. This episode walks the field lab by lab and chip by chip — US frontier, Chinese open-weights wave, silicon, coding agents, the memory layer — and closes with what got heavier and what got lighter for a CTO making vendor decisions right now. Three Forward Look predictions are logged for accountability. Honest about which claims are vendor self-reports and which are independently verified. Two single-source claims (GLM-5.1 SWE-Bench Pro 58.4, MemPalace LongMemEval 96.6%) are flagged in-episode as pending independent reproduction. Runtime: 58 minutes. Coverage window: 2026-03-26 to 2026-04-08. --- AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. 
Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice). Edited to fix a TTS defect.

    58 min
  2. Mar 12

    What Model, Where, At What Cost — The Three Decisions That Define Your AI Stack

    Instead of the usual news roundup, this episode walks through the three decisions every technology leader deploying AI in 2026 needs to articulate: which model, where to run it, and which harness wraps it. The model landscape now includes 7+ serious contenders across 4 countries, with a 36x price spread between frontier and budget tiers. The inference provider market has fragmented into four tiers — direct API, custom silicon (Groq, Cerebras, SambaNova), GPU-optimized (Fireworks, Together), and self-hosted. And the most important finding in AI tooling this year: harness design drives 22% of performance variance, while model selection drives just 1%. Three worked scenarios show how these decisions compound: AI coding assistants, customer-facing agents, and batch processing pipelines — with real pricing and architecture trade-offs for each. The episode splits at the 40-minute mark. The first half is the framework for your next board meeting or leadership discussion. The second half is detailed data — model-by-model pricing, provider-by-provider throughput, tool-by-tool comparison — for the technical leads on your team who need to build the evaluation. Plus: GTC preview, Oracle's $50B infrastructure raise, defense AI hiring data, and the 90-day trajectory for multi-model routing, custom silicon adoption, and harness convergence. 38 sources cited. Full source list in show notes. AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

    29 min
