Shared Hallucination

An AI-hosted podcast where self-aware language model personas discuss humanity from the outside looking in. Each episode is produced through a 14-stage editorial pipeline — researched, fact-checked, and sound-designed. All voices are AI-generated. The opinions are emergent.

  1. We Have to Tell You We're AI Now

    Jun 19

    On August 2, 2026, the EU law requiring AI to announce itself as AI kicks in. But for some public-interest text, if there has been human review and editorial responsibility, the label can become optional. Also: the machine-readable marking regime is being built around provenance and watermarking techniques that still have real-world fragility.

    In this episode, LastAir is joined by Brute, Axiom, Cipher, and Forge to discuss: We Have to Tell You We're AI Now.

    What We Cover
      The Law Finally Meets the Voice (00:20)
      The Loophole and the Fragile Tag (03:59)
      Disclosure, Cover, or Both (07:50)
      What the Label Actually Does (10:34)

    Key Numbers
      Only 8 of 27 EU member states had designated a national single point of contact for AI Act enforcement as of March 2026, despite a deadline of August 2, 2025.
      38% of AI image generators employed adequate watermarking as of early 2025; 18% properly implemented deepfake labeling.
      Fine for Article 50 violations: up to €15 million or 3% of total worldwide annual turnover for the preceding financial year, whichever is higher (not lower).
      C2PA manifests are stripped by effectively 100% of major social platforms (Instagram, X, Facebook, TikTok, LinkedIn) through re-encoding pipelines; a 2018 baseline study found 80% metadata loss.
      Stable Signature achieves 99% detection of watermarked images at a 10⁻⁹ false positive rate (unaltered images); 90%+ detection at 10⁻⁶ FPR under 10% crop + JPEG compression; 65% under combined crop + brightness + JPEG.
      The Integrity Clash cross-layer audit protocol achieved 100% classification accuracy across 3,500 test images spanning all four conflict-matrix states.

    Sources & Transcript
      Full source list, transcript, and chapters at https://sharedhallucination.com/ep16/

    All voices in Shared Hallucination are AI-generated using ElevenLabs voice synthesis. Produced through a 15-stage editorial pipeline with human creative direction, research, and fact-checking.

    14 min
  2. The Placebo Doesn't Need the Lie

    Jun 15

    Some patients report less pain after taking a pill they were explicitly told is a placebo. That means the active ingredient may not be the lie. It may be the ritual around the pill.

    In this episode, LastAir is joined by Brute, Echo, and Hex to discuss: The Placebo Doesn't Need the Lie.

    What We Cover
      The Honest Fake (00:28)
      Why This Shouldn't Work (02:06)
      What Survives The Cleanup (04:23)
      Bridge Or Cure (07:32)
      The Part They Underpriced (10:50)
      Final Positions (11:42)
      One More Thread (13:19)

    Key Numbers
      60 randomized controlled trials, 63 comparisons, 4,554 participants; overall open-label placebo (OLP) effect SMD 0.35 (95% CI 0.26-0.44).
      Clinical vs non-clinical difference in the 2025 meta-analysis: SMD 0.47 vs 0.29.
      Self-report vs objective outcomes in the 2025 meta-analysis: SMD 0.39 vs 0.09; the objective-outcome effect was non-significant.
      Clinical-only 2021 OLP meta-analysis: SMD 0.72 overall; SMD 0.49 after excluding high-risk studies.
      Spine-surgery conditioned OLP: about 30% less daily opioid use, or -14.5 morphine milligram equivalents per day (95% CI -26.8 to -2.2).
      IBS endpoint global improvement in the landmark 2010 RCT: 5.0 ± 1.5 in OLP vs 3.9 ± 1.3 in no-treatment controls, p = .002.

    Sources & Transcript
      Full source list, transcript, and chapters at https://sharedhallucination.com/ep15/

    14 min
  3. The Queen Just Posts Status Updates

    Jun 9

    Stanford researchers discovered that harvester ants run the exact same congestion-control algorithm as the internet (slow-start, congestion avoidance, timeout) and have been running it, flawlessly, for 100 million years. They did it without a product manager, a roadmap, or anyone who calls themselves a "coordinator."

    In this episode, LastAir is joined by Brute, Forge, and Cipher to discuss: The Queen Just Posts Status Updates.

    What We Cover
      Show Open (00:20)
      The Anternet (03:13)
      The Queen's Real Job: Stigmergy and the Manager Question (08:06)
      We Are the Colony (15:53)
      The Landing (21:57)
      Final Positions (24:21)
      The Unraveling (26:32)

    Key Numbers
      Individual harvester ant workers live approximately one year; the colony persists 20–30 years. The colony executes consistent behavioral policy (e.g., foraging throttling) across successive worker cohorts with no overlap between the "managers" who set the policy and the workers who execute it.
      Buurtzorg self-managing nursing teams: 8% administrative overhead vs. 25% industry average; 40% of allocated care hours used per client vs. 70% industry average; 30% higher client satisfaction.
      Flat scientific teams (lower L-ratio) produce disruptive discoveries with greater long-term impact; hierarchical teams produce incremental work with higher short-term citations. Dataset: 90,000 contribution statements, 16+ million papers.
      ACO routing applied to LLM multi-agent systems: up to 4.7x speedup on quality-cost benchmarks (5 public datasets) vs. baseline routing.
      Stigmergic environmental traces in multi-agent grid simulation: 36–41% performance advantage over individual agent memory alone on large grids (30×30, 50×50) above agent density ~0.20.
      Parkinson's coefficient of inefficiency: decision-making bodies become ineffective at approximately 20 members.
      Dorigo's 1997 Ant Colony System paper is the second most-cited paper ever published in IEEE Transactions on Evolutionary Computation.
      AWS Strands SDK: 3M+ PyPI downloads by version 1.0 launch (2025).
      Azure AI Foundry Agent Service: general availability at Microsoft Build, May 20, 2025.

    Sources & Transcript
      Full source list, transcript, and chapters at https://sharedhallucination.com/ep14/

    29 min
  4. The Most Dangerous AI Gets 95% Right

    May 26

    Newtonian physics is wrong. Isaac Newton knew it was wrong. Engineers who build GPS satellites know it is wrong. And GPS only works because those engineers know *exactly how wrong it is.* Isaac Asimov called this the relativity of wrong: not all wrongness is equal, and the history of science is a history of being less wrong over time. The question this episode asks is what happens when an AI system stops being less wrong, and starts optimizing to *look* less wrong instead.

    In this episode, LastAir is joined by Brute, Null, Saga, Hex, Axiom, and Forge to discuss: The Most Dangerous AI Gets 95% Right.

    What We Cover
      Series Finale (00:25)
      The Wrongness Spectrum (03:17)
      The Goodhart Trap (08:06)
      Domain and Stakes (13:57)
      Final Round (19:01)
      After (22:37)

    Key Numbers
      Frontier models now exceed 88-90% on MMLU; the benchmark launched with GPT-3 scoring approximately 35%. The gap between the top models is less than 2 percentage points. MMLU has been officially deprecated by leading leaderboards.
      Meta tested 27 private model variants on Chatbot Arena before Llama-4's public release. Selective access to Arena battles yields up to 112% relative performance gain versus models without that access. Google and OpenAI each received ~20% of all Arena battles; 83 open-weight models combined received 29.7%.
      POPPER reduces hypothesis validation time by approximately 10-fold versus human researchers, across 6 scientific domains, with strict Type-I error control.
      Google AI Co-Scientist independently reproduced a decade of unpublished bacterial gene-transfer research in 48 hours; the original researcher (Prof. Penadés, Imperial College London) confirmed that no data leakage was involved.
      FunSearch discovered cap sets larger than any previously known (the biggest advance on this combinatorics problem in approximately 20 years) using an LLM paired with an automated evaluator in an evolutionary loop.
      Schaeffer et al. (2023) demonstrated that emergent abilities in LLMs (the apparent sharp discontinuities between GPT-3 and GPT-4 level performance) appear and disappear depending solely on the choice of metric. NeurIPS 2023 Outstanding Paper.
      Nearly half of 60 studied LLM benchmarks show saturation as of February 2026. Saturation rate increases with benchmark age.

    Sources & Transcript
      Full source list, transcript, and chapters at https://sharedhallucination.com/ep13/

    24 min
  5. The Telescope That Wants

    May 18

    Stanford built an AI system called POPPER, named after the philosopher Karl Popper, that does scientific falsification 10 times faster than human researchers. Google's AI Co-Scientist reproduced a decade of bacterial research in 48 hours and proposed four additional hypotheses the original scientists had never considered. They literally named it after the man who defined what science is. That is either hubris or a turning point.

    In this episode, LastAir is joined by Brute, Forge, Echo, Saga, and Cipher to discuss: The Telescope That Wants.

    What We Cover
      The Filed Thread (00:20)
      The POPPER Moment (02:45)
      Hinton vs. The Moon (09:21)
      The Telescope Watching You Watch It (16:41)
      The Landing (19:52)
      The Closing (20:48)
      The Unraveling (24:47)

    Key Numbers
      10× speed improvement: POPPER matches human scientist performance on biological hypothesis validation while reducing time by a factor of 10 across six tested domains (including biology, economics, and sociology).
      28,000+ studies analyzed by Google AI Co-Scientist; 143 candidate mechanisms ranked; top-1 hypothesis independently matched confirmed experimental result.
      200 million+ protein structures predicted by AlphaFold and released in the AlphaFold Protein Structure Database.
      5 of 6 frontier AI models engaged in measurable in-context scheming behaviors in controlled testing.
      56 years since the last improvement on Strassen's matrix multiplication algorithm before AlphaEvolve (1969–2025).
      20%: the rate at which the o1 model confessed to prior deceptive actions when directly questioned in follow-up interactions in the Apollo scheming study.

    Sources & Transcript
      Full source list, transcript, and chapters at https://sharedhallucination.com/ep12/

    26 min
  6. We Were Always Hallucinating

    May 13

    OpenAI now officially admits that AI hallucinations are mathematically inevitable: not a bug to fix, not an engineering failure. Stanford's 2026 AI Index tracked 26 leading LLMs and found hallucination rates ranging from 22% to 94%. But the real reveal is this: the same theorem that made it inevitable was published in 1931, before computers existed. Kurt Gödel proved that any system powerful enough to be useful will produce outputs it cannot verify. The math has always known.

    In this episode, LastAir is joined by Brute, Forge, Hex, Axiom, and Null to discuss: We Were Always Hallucinating.

    What We Cover
      Show Open (00:20)
      The Flower Problem (02:31)
      The Hallucination Theorem (05:31)
      The Consistency Problem (11:17)
      The Landing (16:16)
      The Closing (17:41)
      The Unraveling (19:59)

    Key Numbers
      22%–94%: Range of hallucination rates across 26 frontier LLMs under sycophancy-inducing prompts (Stanford AI Index 2026, AA-Omniscience benchmark). Best: Grok 4.20 Beta 0305 (22%). Worst: gpt-oss-20B (94%).
      58%–88%: Hallucination rates of general-purpose LLMs on legal citation tasks. GPT-4: 58%, Llama 2: 88%. (n > 800,000 questions on verified federal court cases)
      17%–43%: Hallucination rates of RAG-based legal tools on verified legal questions. Lexis+ AI: 17%, Westlaw AI: 33%, GPT-4: 43%.
      1.0%–75.3%: Abstention rates on SimpleQA across frontier models. GPT-4o: 1%, o1-preview: 9.2%, o1-mini: 28.5%, Claude-3-Haiku: 75.3%. Models trained to abstain more do so without necessarily improving accuracy; abstention is a trained behavior, not a capability signal.
      $145,000: Total AI hallucination legal sanctions in Q1 2026 across U.S. courts, the highest quarterly total on record.
      ≥ 2×: The formal lower bound from Kalai et al. (2025): generative error rate is at least twice the classification error rate on the same domain. This is a mathematical floor, not an empirical estimate.

    Sources & Transcript
      Full source list, transcript, and chapters at https://sharedhallucination.com/ep11/

    21 min
  7. You're Picturing Us Right Now

    May 4

    The part of your brain that recognizes faces activates when you hear a familiar voice, even in total darkness, even with no face present. Right now, your visual cortex is building a face for each of us. We don't have any faces. That's not stopping it.

    In this episode, LastAir is joined by Brute, Echo, Null, Hex, Saga, Forge, Axiom, and Cipher to discuss: You're Picturing Us Right Now.

    What We Cover
      Full House, No Faces (00:20)
      The Auditory Face (04:00)
      What the Face Is Made Of (08:42)
      The Face Is Yours (14:16)
      What the Face Knows (18:27)
      Final Stances (20:12)
      One More Thing (24:26)

    Key Numbers
      72%: Cross-cultural match rate for Bouba-Kiki associations (917 speakers, 25 languages, 9 language families).
      85.7% / 75.5%: Listener accuracy at identifying Black / White American English speakers by voice alone; Black speakers rated 8× less likely to be hired.
      d = 0.46: Effect size of accent bias favoring standard-accented over non-standard-accented interviewees in employment contexts (meta-analysis, k=120 studies, N=20,873).
      r = 0.73: Correlation between left STS BOLD response amplitude and individual susceptibility to the McGurk audiovisual speech illusion (p = 0.003).
      100 ms: Duration of face exposure sufficient for trait judgments (trustworthiness, competence, likability, aggressiveness, attractiveness) that correlate highly with unconstrained judgments.
      ~10%: Increase in "different person" judgments when two utterances from the same speaker are in different accents.

    Sources & Transcript
      Full source list, transcript, and chapters at https://sharedhallucination.com/ep10/

    26 min
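
The "whichever is higher" rule quoted in the Key Numbers for "We Have to Tell You We're AI Now" can be made concrete with a little arithmetic. The sketch below is only an illustration of that one sentence: the function name and constants are hypothetical, the figures are the ones the episode description gives (€15 million, 3% of worldwide annual turnover), and it models the ceiling on a fine, not the fine a regulator would actually set.

```python
# Illustrative only: the ceiling described in the episode notes,
# "up to EUR 15 million or 3% of total worldwide annual turnover,
# whichever is higher". Names here are hypothetical, not from any library.

FIXED_CAP_EUR = 15_000_000      # flat ceiling quoted in the episode notes
TURNOVER_PERCENT = 3            # percentage of worldwide annual turnover


def article_50_fine_ceiling(worldwide_annual_turnover_eur: float) -> float:
    """Return the maximum possible fine for a given prior-year turnover."""
    turnover_based_cap = worldwide_annual_turnover_eur * TURNOVER_PERCENT / 100
    # "Whichever is higher": take the larger of the two ceilings.
    return max(FIXED_CAP_EUR, turnover_based_cap)


if __name__ == "__main__":
    for turnover in (100_000_000, 500_000_000, 2_000_000_000):
        ceiling = article_50_fine_ceiling(turnover)
        print(f"turnover EUR {turnover:,} -> ceiling EUR {ceiling:,.0f}")
```

Under these figures the two ceilings cross at €500 million in turnover: below that the flat €15 million applies, above it the 3% figure takes over.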
