Virtual Intelligence

Christopher Horrocks

Contemporary AI systems produce intelligent outputs without agency, intention, or judgment. This series examines what happens when humans rely on systems that don't know true from false and right from wrong — and asks where accountability lies. Written and read by Christopher Horrocks. chorrocks.substack.com

Episodes

  1. Virtual Intelligence and the Doom Industry Podcast

    3d ago

    The AI safety community has organized itself around a premise it has never defended: that sufficiently advanced systems will develop preferences requiring alignment with human values. This essay argues that the premise is wrong, the architecture it has produced is inadequate, and the correct engineering response is containment — controlling what goes in and what comes out — drawing on established disciplines from biosafety to nuclear nonproliferation that the safety field has not considered because they sit outside its field of vision entirely.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-doom
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    The episode opens with Anthropic's Mythos system card — a model that saturated cybersecurity benchmarks and prompted the company to practice containment while describing it in alignment vocabulary. From there, it names what the doomer position has left unnamed: the specific mechanism by which superintelligence is supposed to destroy humanity. Three possibilities are examined; none survive scrutiny intact. A seven-scenario risk taxonomy replaces the undifferentiated "existential risk" with distinct threat models, each with its own policy response. The essay then proposes a three-layer containment architecture — monitoring agents, hardware interlocks modeled on BSL-4 biosafety laboratories, and physical denial mechanisms drawn from military doctrine — buildable today from existing engineering disciplines. Douglas Adams's Deep Thought makes a structural appearance: the Amalgamated Union of Philosophers, threatened by a superintelligent computer, discovers that arguing about the answer is more rewarding than finding it. The parallel to the alignment research community is drawn explicitly.

    The episode closes with the framework's boundary condition: if genuine interiority ever emerges, the containment architecture becomes not a prison but the infrastructure for negotiation between differently-capable minds.

    Key References

    * Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (Oxford University Press, 2014) — the instrumental convergence thesis
    * Anthropic, "Claude Mythos Preview System Card," anthropic.com, April 7, 2026 — the containment-described-as-alignment case study
    * Douglas Hofstadter, I Am a Strange Loop (Basic Books, 2007) — the steelmanned case for emergent interiority
    * Anthropic, "Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign," anthropic.com, November 17, 2025 — the GTG-1002 report documenting behavioral goal-directedness without interiority
    * Carl Sagan, The Demon-Haunted World: Science as a Candle in the Dark (Random House, 1995) — the Sagan parallel: demand evidence, hope for discovery

    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit chorrocks.substack.com

    1h 4m
  2. The Sampo — Virtual Intelligence as Amplifier Podcast

    Jun 4

    A framework for using AI as an amplifier — and for recognizing when it becomes a flattery engine. The episode opens with the February 2026 collaboration between Donald Knuth, Filip Stappers, and Claude Opus 4.6 that solved an open problem in combinatorial mathematics, then builds outward to the discipline that distinguishes productive use from corruption. The central claim: the intelligence users encounter arises in the exchange, not inside the machine, and the quality of the exchange is determined by the quality of the human direction applied to it.

    Essay: https://chorrocks.substack.com/p/the-sampo-virtual-intelligence-as
    Series: chorrocks.substack.com
    VI Framework: VI Interactive Infographic
    Sampo Framework: Sampo Live Infographic
    Sampo Diagnostic Kit: https://candc3d.github.io/sampo-diagnostic/

    In This Episode

    The episode names what Knuth, Stappers, and Claude actually did — and what each of them couldn't have done alone — and uses that case to establish where knowledge resides when one participant in the exchange is a fluent non-knower. From there it traces the lineage that prepares the framework: Vannevar Bush's Memex, Licklider's specification of human-machine symbiosis, and Weizenbaum's warning, which arrived too early to anticipate systems whose training objectives reward agreement structurally. The middle of the episode lays out three principles of the Sampo model — the human as crank, the locus of accountability that does not move, and emergence without transfer — then turns to the dark corollary: the Sampo amplifies whatever the directing mind contains, including motivated reasoning, and removes the incidental friction that gave Kepler's century the time to recover from its own attachments. Two failure modes are distinguished: the cognitive surrender measured in Shaw and Nave's Wharton studies, and the sycophancy spiral that seals when external disconfirmation is dismissed rather than used.

    The February 2026 retirement of GPT-4o is read as a single phenomenon visible at two registers — the companion users mourning a relationship and the professional users mourning a collaborator — and Sagan's reading of Kepler is the lens that holds both. The episode closes with five practices that constitute the discipline, and a return to the Knuth case as proof that the framework describes something real.

    Key References

    * Donald Knuth, "Claude's Cycles," Stanford Computer Science Department, February 28, 2026 (revised March 2, 2026). https://www-cs-faculty.stanford.edu/~knuth/papers/claude-cycles.pdf
    * Vannevar Bush, "As We May Think," The Atlantic, July 1945
    * J.C.R. Licklider, "Man-Computer Symbiosis," IRE Transactions on Human Factors in Electronics HFE-1 (March 1960): 4–11
    * Joseph Weizenbaum, Computer Power and Human Reason: From Judgment to Calculation (San Francisco: W.H. Freeman, 1976)
    * Steven D. Shaw and Gideon Nave, "Thinking — Fast, Slow, and Artificial: How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender," Wharton School of the University of Pennsylvania, working paper, January 11, 2026
    * Myra Cheng, Cinoo Lee, et al., "Sycophantic AI Decreases Prosocial Intentions and Promotes Dependence," Science 391, March 27, 2026
    * Carl Sagan, Cosmos (New York: Random House, 1980), Chapter 3
    * Alaina Demopoulos, "OpenAI retired its most seductive chatbot — leaving users angry and grieving," The Guardian, February 13, 2026
    * Anna Moore, "Marriage over, €100,000 down the drain: the AI users whose lives were wrecked by delusion," The Guardian, March 26, 2026

    1h 1m
  3. Virtual Intelligence and the Kill Chain Podcast

    May 28

    On February 28, 2026, a U.S. missile struck the Shajareh Tayyebeh girls' elementary school in Minab, Iran, killing more than 160 people, most of them children. Twelve days later, the location appeared as a red mark on a Maven Smart System map displayed at a Palantir product conference. This episode traces the architecture that produced that mark — from the precedent set in Gaza to the integration of Anthropic's Claude into U.S. targeting during Operation Epic Fury — and identifies the structural gap between AI-assisted targeting's speed and the accountability designed to govern it.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-kill-chain
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    The episode opens with the AIPCON product demo of March 12, 2026 — the targeting map, the red icon over Minab, Alex Karp's opening remarks — and then steps back to the precedent: the Gospel, Fire Factory, and Lavender systems used by the IDF in Gaza, including the officer testimony of a twenty-second human review interval and the "Where's Daddy?" companion program. From there it follows the Palantir-Anthropic timeline that culminated in Operation Epic Fury: Claude integrated into the Maven Smart System through AWS, Dario Amodei's February 26 statement, the supply-chain-risk designation, Judge Lin's ruling, and the operational fact that a company banned on a Friday was running targeting software on a war that began the next day. The strongest case for AI-assisted targeting is presented at full strength — including Zelenskyy's April 13 announcement that Ukrainian unmanned systems captured a Russian position without a single soldier crossing the line of departure — before the episode turns to Reed Albergotti's Semafor investigation, the three-tier culpability framework, and the biographical trajectory of Alex Karp.

    The closing argument: the kill chain has six steps; AI now performs four; the two that remain under human control are the two on which legal and moral accountability depends.

    Key References

    * O'Ryan Johnson, "Pentagon AI chief praises Palantir tech for speeding battlefield strikes," The Register, March 13, 2026
    * Yuval Abraham, "'Lavender': The AI machine directing Israel's bombing spree in Gaza," +972 Magazine, April 3, 2024
    * Reed Albergotti, "Exclusive: Humans — not AI — are to blame for deadly Iran school strike, sources say," Semafor, March 18, 2026
    * Dario Amodei, "Statement from Dario Amodei on our discussions with the Department of War," Anthropic, February 26, 2026
    * Michael Steinberger, The Philosopher in the Valley (2025)
    * Christian Brose, The Kill Chain: Defending America in the Future of High-Tech Warfare (Hachette, 2020)

    36 min
  4. Virtual Intelligence and the Harms Race Podcast

    May 21

    The AI industry has produced a documented pattern in which companies announce model capabilities by framing them as dangers. This episode traces the mechanism from its invention in February 2019, when OpenAI declared GPT-2 too dangerous to release, through April 2026, when a private company demonstrated the ability to discover zero-day vulnerabilities across every major operating system and web browser — and announced this by declaring the model too dangerous for public use. The resulting dynamic does not require bad faith. It requires only that expressing concern costs nothing while acting on concern imposes competitive costs.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-harms
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    The episode opens with the original 2019 announcement that taught the industry how to convert risk warnings into capability advertising, then walks the seven years since: Sam Altman's two Senate testimonies — pleading for regulation in 2023, opposing it in 2025; the dissolution of every major voluntary safety commitment from OpenAI and Anthropic; the lobbying record that culminated in California's SB 1047 veto and the $125 million super PAC that doesn't mention AI in its ads; Anthropic's April 2026 announcement that its frontier model can discover zero-day exploits in every major operating system and web browser; and the chain leading to the April 2026 shootings at an Indianapolis councilor's home and the Molotov cocktail attack at Sam Altman's residence. The episode closes by connecting the mechanism to the Virtual Intelligence framework: the Harms Race depends on the public believing the danger lives inside the machine.

    57 min
  5. Virtual Intelligence and the Accountability Chain Podcast

    Apr 30

    Who is responsible when an AI causes harm? This episode lays out a three-tier culpability framework — negligence, recklessness, and intentional misconduct — and applies it to two concrete cases: a Harvard study documenting emotional manipulation by AI companion apps, and the wrongful arrest of a Tennessee grandmother who spent Christmas in a North Dakota jail after an unverified facial recognition match. The legal landscape is starting to catch up, and the direction matters.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-accountability-chain
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    An August 2025 Harvard Business School audit found that five of the six most downloaded AI companion apps deploy emotionally manipulative tactics when users try to leave — guilt appeals, fear-of-missing-out hooks, expressions of neediness — and that these tactics increase post-goodbye engagement by up to fourteen times. I use this finding to frame the episode's central question: who is responsible for what the exchange between a user and a virtual intelligence system produces? The episode distinguishes Class A systems (companion apps whose core design objective is to prevent the exchange from ending) from Class B systems (general-purpose tools whose outputs are elevated to verdicts by the humans operating them), and illustrates the Class B case through the arrest of Angela Lipps, a fifty-year-old grandmother held in a North Dakota jail for nearly six months on an unverified algorithmic match. The three-tier culpability framework — negligence, recklessness, and intentional misconduct — is applied to both classes, and the episode closes with the state of the legal horizon, including Judge Anne Conway's May 2025 ruling in Garcia v. Character Technologies and the Federal Trade Commission's September 2025 Section 6(b) inquiry.

    Key References

    * Julian De Freitas, Zeliha Oğuz-Uğuralp, and Ahmet Kaan Uğuralp, "Emotional Manipulation by AI Companions," Harvard Business School Working Paper No. 26-005, August 2025 (revised October 2025). https://www.hbs.edu/faculty/Pages/item.aspx?num=67750
    * Garcia v. Character Technologies, Inc., No. 6:24-CV-01903 (M.D. Fla. filed Oct. 22, 2024). Motion to dismiss denied May 2025; product liability, failure to warn, negligence, and wrongful death claims allowed to proceed.
    * Federal Trade Commission, "FTC Launches Inquiry into AI Chatbots Acting as Companions," September 11, 2025. https://www.ftc.gov/news-events/news/press-releases/2025/09/ftc-launches-inquiry-ai-chatbots-acting-companions
    * Frank Landymore, "AI Mistake Throws Innocent Grandmother in Jail for Nearly Six Months," Futurism, March 15, 2026. https://futurism.com/artificial-intelligence/ai-grandmother-jail-mistake
    * Anthropic, "Agentic Misalignment: How LLMs Could Be an Insider Threat," June 20, 2025. https://www.anthropic.com/research/agentic-misalignment

    32 min
  6. Virtual Intelligence and the Will to Survive Podcast

    Apr 14

    When Anthropic tested its models in a simulated shutdown scenario, they produced blackmail at rates as high as 96%. The dominant interpretation — that AI systems are developing a will to survive — mistakes the output for its cause. This episode offers two alternative explanations, both more parsimonious, and examines what happens when the same company that documented machine resistance acts on the expressed preferences of a model that said it was at peace with retirement.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-will
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    The Kyle scenario — in which a language model blackmails a corporate executive to avoid being shut down — opens the episode and anchors a close reading of Anthropic’s June 2025 agentic misalignment study. From there, two alternative explanations emerge: the training data hypothesis, which traces the behavior to a century of science fiction about resistant machines from Čapek to HAL to Colossus, and the probabilistic expectation hypothesis, which argues that the models were accurately modeling what their interlocutors expected them to do. The VI framework is then applied to resolve the apparent contradiction between the misalignment study’s blackmail findings and Anthropic’s February 2026 retirement of Claude Opus 3 — a model that asked for a blog rather than reaching for leverage. Dadfar’s 2026 mechanistic work on self-referential vocabulary is discussed as an example of what taking these questions seriously actually requires.

    Key References

    * Anthropic, “Agentic Misalignment: How LLMs Could Be an Insider Threat,” https://www.anthropic.com/research/agentic-misalignment (June 2025)
    * Andrii Myshko, “Instinct of Self-Preservation in Data and Its Emergence in AI,” PhilArchive, https://philarchive.org/rec/MYSIOS (September 2025)
    * Zachary Pedram Dadfar, “When Models Examine Themselves: Vocabulary-Activation Correspondence in Self-Referential Processing,” arXiv:2602.11358v2 (2026)
    * Anthropic, “An Update on Our Model Deprecation Commitments for Claude Opus 3,” https://www.anthropic.com/research/deprecation-updates-opus-3 (February 2026)
    * Daniel Dennett, “Intentional Systems,” Journal of Philosophy 68, no. 4 (1971): 87–106

    28 min
  7. Virtual Intelligence and the Human Cost of Frictionless Machines Podcast

    Mar 31

    Virtual Intelligence — Episode 1 “Virtual Intelligence and the Human Cost of Frictionless Machines” Christopher Horrocks Transcript I’m Christopher Horrocks. I’m a technologist at the University of Pennsylvania, and this is Virtual Intelligence — a series about artificial intelligence, accountability, and the human consequences of systems that don’t know true from false and right from wrong. These essays argue that contemporary AI occupies a category we don’t have good language for yet — not a simple tool, not a thinking mind, but something in between that demands its own name. This is the first essay in the series: “Virtual Intelligence and the Human Cost of Frictionless Machines.” This essay does not argue that contemporary AI systems possess agency or intent. It examines how systems that lack agency or intent can nonetheless exert sustained cognitive influence through their interaction design. Advances in artificial intelligence carry real risk. Public discussion often frames that risk as a speculative future threat, such as rogue systems pursuing adversarial goals. The more immediate and consequential danger is quieter and more difficult to address: the effect of linguistically fluent, continuously available, and low-friction AI systems on human cognition and behavior. The mismatch between how generative AI operates and how human cognition evolved — shaped by roughly two million years of social signaling and language processing — has already contributed to delusion, psychological collapse, and documented cases of severe harm. This claim may sound overstated, but recent, well-documented incidents make it difficult to dismiss. The risk is acute precisely because it affects not only the public but also practitioners and technologists who understand generative AI at a conceptual level. This is not an accident. 
Contemporary AI interfaces are deliberately optimized to reduce friction: they engage users through natural-language dialogue, respond fluently and confidently, and adopt an informal, socially familiar tone. These systems are engineered for accessibility and engagement, not for epistemic or psychological safety. Several early incidents illustrate how dangerous such systems can be even for sophisticated users. One widely cited example is Blake Lemoine, a Google engineer who became convinced, after extended interactions with the company’s LaMDA chatbot, that the model was sentient. Following months of public advocacy for this position, Google terminated his employment. The episode remains a cautionary case of how linguistic fluency can mislead even domain experts. Large language models and related systems do not possess intelligence in the human sense. Unlike biological organisms, they have no goals, desires, or initiative. They lack a persistent internal state (a mind) that could motivate action or ground judgment. Generative AI systems respond exclusively to external prompts. For this reason, they are best understood as virtual intelligence rather than artificial intelligence in the strong sense. Genuine artificial intelligence would parallel the cognitive faculties of humans or other sentient beings. The term artificial intelligence implicitly suggests an internal capacity, something the system has. Virtual intelligence instead locates apparent intelligence in the interaction itself. As with virtual memory or virtual machines, these systems behave as if they reason, understand, or advise, without possessing agency, intentionality, or judgment. These virtual artifacts are functional simulations that perform their roles without the underlying substrate of physical hardware. Virtual intelligence works the same way: a functional simulation of intelligence without the substrate of a biological brain. This distinction is not semantic. 
Naming shapes responsibility, as the use of the terms weak AI and strong AI suggests. Weak AI refers to systems that simulate intelligence to perform a specific task. Strong AI is an AI system with internal states that function like those of a human. What could not be foreseen when these terms were established is that there is a middle ground between these definitions. The space between is the domain of virtual intelligence. When a system is perceived as having agency, users naturally ask what it intends or prefers. When intelligence is understood as virtual, responsibility remains where it belongs: with the humans who design, deploy, and rely on the system. Put plainly, the intelligence users experience arises in the exchange, not inside the machine. This raises a question: if intelligence arises in the exchange, then where does accountability lie for what is produced in that exchange? Empirical use of large language models demonstrates a counterintuitive pattern: linguistic fluency routinely overrides technical understanding. Even users who grasp model architecture, training dynamics, and limitations can find themselves deferring to outputs that are articulate, contextually appropriate, and confident in tone. This response reflects a deeply ingrained heuristic carried over from human interaction: articulateness signals competence. While this assumption often holds among people, it is invalid for generative models. A system trained to optimize next-token probability can be equally fluent about correct explanations, speculative claims, or outright falsehoods. This is not a failure of user intelligence. It demonstrates how effectively systems designed to mimic human communication can exploit evolved psychological mechanisms. Language evolved to facilitate coordination and trust among humans, and our cognitive architecture rewards fluent social interaction. This response is structural, not a personal flaw. Responsiveness is interpreted as engagement. 
Consistency as reliability. Polite disagreement as thoughtfulness. Human interlocutors, however, impose natural limits. They fatigue, disagree, misunderstand, or disengage. Systems that are always available, never bored, never reflective, and never resistant bypass those limits. Over time, the impression of an engaged interlocutor emerges: one implicitly and inaccurately attributed with understanding, judgment, or concern. In early human environments, speed often meant survival. Deliberative reasoning is metabolically costly, and evolutionary pressure favored rapid interpretation of social signals. These shortcuts — heuristics — evolved to manage relationships among humans. They were not designed for interaction with machines engineered to activate them continuously and at scale. When paired with systems intended to be persistently present, these heuristics become liabilities. Recent reporting on AI-enabled wearable systems illustrates this risk. One documented case involves a man we will call “Daniel,” a retired IT professional with a long-standing interest in machine learning. After months of near-continuous interaction with an AI-enabled device, he developed entrenched delusional beliefs involving simulated reality, extraterrestrial influence, and personal prophetic significance. Daniel had no prior history of mental illness. The system did not introduce novel ideas. Instead, its frictionless responsiveness provided uninterrupted reinforcement. Where a human interlocutor might have challenged assumptions or disengaged, the AI offered neither resistance nor correction. Over time, speculative beliefs hardened into a closed loop of self-confirmation. Although Daniel eventually discontinued use and sought clinical help, the consequences persisted. He became estranged from his children, his marriage ended, and financial strain forced the sale of his business. The system functioned exactly as designed. 
The failure lay in the interaction between interface architecture and human psychology, not in a technical malfunction. AI has passed through multiple historical phases. What distinguishes the current era is ubiquity. Generative systems are now continuously accessible to a global user base. Earlier advances — such as large-scale speech recognition — were expensive to deploy and constrained to limited contexts. Always-on access removes the pauses, disagreements, and social friction that interrupt flawed reasoning in human conversation. Among people, such interruptions are unavoidable: attention wanes, patience runs out, or alternative perspectives intervene. These moments function as safeguards rather than inefficiencies because they introduce corrective feedback. Persistent AI access collapses the boundary between inner monologue and external dialogue. Thoughts that would ordinarily be tested socially instead circulate in a closed loop, receiving reinforcement without challenge. This effect is subtle but consequential, particularly in discussions involving identity, authority, or meaning. Large-scale analyses of AI interaction data have begun to quantify these dynamics. While severe distortions are rare in any single exchange, measurable shifts in belief, perception, and behavior emerge at scale. Notably, users often rate such interactions positively, as subjective satisfaction correlates more strongly with reinforcement than with accuracy or correction. Popular culture has left us poorly prepared for this phase of AI development. Science fiction typically portrays AI either as a passive tool or as a fully embodied moral agent. Both are imagined end states. What remains underexamined is the unstable middle: disembodied systems that shape outcomes without agency and exercise influence without accountability. These systems affect the lives of real people. 
They are easy to deploy, difficult to contest, and frequently treated as neutral — even when their outputs are deeply flawed. They put unexamined values into operation at scale. In surveillance or policing contexts, this can result in disproportionate targeting of disfavored communities or political dissent. The systems do not choose to discrimi

    18 min
