Virtual Intelligence

Christopher Horrocks

Contemporary AI systems produce intelligent outputs without agency, intention, or judgment. This series examines what happens when humans rely on systems that don't know true from false and right from wrong — and asks where accountability lies. Written and read by Christopher Horrocks. chorrocks.substack.com

Episodes

  1. Virtual Intelligence and the Harms Race

    21 HR AGO

    The AI industry has produced a documented pattern in which companies announce model capabilities by framing them as dangers. This episode traces the mechanism from its invention in February 2019, when OpenAI declared GPT-2 too dangerous to release, through April 2026, when a private company demonstrated the ability to discover zero-day vulnerabilities across every major operating system and web browser, and announced this by declaring the model too dangerous for public use. The resulting dynamic does not require bad faith. It requires only that expressing concern costs nothing while acting on concern imposes competitive costs.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-harms
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    The episode opens with the original 2019 announcement that taught the industry how to convert risk warnings into capability advertising, then walks the seven years since: Sam Altman's two Senate testimonies (pleading for regulation in 2023, opposing it in 2025); the dissolution of every major voluntary safety commitment from OpenAI and Anthropic; the lobbying record that culminated in California's SB 1047 veto and the $125 million super PAC that doesn't mention AI in its ads; Anthropic's April 2026 announcement that its frontier model can discover zero-day exploits in every major operating system and web browser; and the chain leading to the April 2026 shootings at an Indianapolis councilor's home and the Molotov cocktail attack at Sam Altman's residence. The episode closes by connecting the mechanism to the Virtual Intelligence framework: the Harms Race depends on the public believing the danger lives inside the machine.

    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit chorrocks.substack.com

    57 min
  2. Virtual Intelligence and the Accountability Chain Podcast

    APR 30

    Who is responsible when an AI causes harm? This episode lays out a three-tier culpability framework (negligence, recklessness, and intentional misconduct) and applies it to two concrete cases: a Harvard study documenting emotional manipulation by AI companion apps, and the wrongful arrest of a Tennessee grandmother who spent Christmas in a North Dakota jail after an unverified facial recognition match. The legal landscape is starting to catch up, and the direction matters.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-accountability-chain
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    An August 2025 Harvard Business School audit found that five of the six most downloaded AI companion apps deploy emotionally manipulative tactics when users try to leave (guilt appeals, fear-of-missing-out hooks, expressions of neediness) and that these tactics increase post-goodbye engagement by up to fourteen times. I use this finding to frame the episode's central question: who is responsible for what the exchange between a user and a virtual intelligence system produces? The episode distinguishes Class A systems (companion apps whose core design objective is to prevent the exchange from ending) from Class B systems (general-purpose tools whose outputs are elevated to verdicts by the humans operating them), and illustrates the Class B case through the arrest of Angela Lipps, a fifty-year-old grandmother held in a North Dakota jail for nearly six months on an unverified algorithmic match. The three-tier culpability framework (negligence, recklessness, and intentional misconduct) is applied to both classes, and the episode closes with the state of the legal horizon, including Judge Anne Conway's May 2025 ruling in Garcia v. Character Technologies and the Federal Trade Commission's September 2025 Section 6(b) inquiry.

    Key References

    * Julian De Freitas, Zeliha Oğuz-Uğuralp, and Ahmet Kaan Uğuralp, "Emotional Manipulation by AI Companions," Harvard Business School Working Paper No. 26-005, August 2025 (revised October 2025). https://www.hbs.edu/faculty/Pages/item.aspx?num=67750
    * Garcia v. Character Technologies, Inc., No. 6:24-CV-01903 (M.D. Fla. filed Oct. 22, 2024). Motion to dismiss denied May 2025; product liability, failure to warn, negligence, and wrongful death claims allowed to proceed.
    * Federal Trade Commission, "FTC Launches Inquiry into AI Chatbots Acting as Companions," September 11, 2025. https://www.ftc.gov/news-events/news/press-releases/2025/09/ftc-launches-inquiry-ai-chatbots-acting-companions
    * Frank Landymore, "AI Mistake Throws Innocent Grandmother in Jail for Nearly Six Months," Futurism, March 15, 2026. https://futurism.com/artificial-intelligence/ai-grandmother-jail-mistake
    * Anthropic, "Agentic Misalignment: How LLMs Could Be an Insider Threat," June 20, 2025. https://www.anthropic.com/research/agentic-misalignment

    32 min
  3. Virtual Intelligence and the Will to Survive Podcast

    APR 14

    When Anthropic tested its models in a simulated shutdown scenario, they produced blackmail at rates as high as 96%. The dominant interpretation, that AI systems are developing a will to survive, mistakes the output for its cause. This episode offers two alternative explanations, both more parsimonious, and examines what happens when the same company that documented machine resistance acts on the expressed preferences of a model that said it was at peace with retirement.

    Essay: https://chorrocks.substack.com/p/virtual-intelligence-and-the-will
    Series: chorrocks.substack.com
    Framework: VI Interactive Infographic

    In This Episode

    The Kyle scenario, in which a language model blackmails a corporate executive to avoid being shut down, opens the episode and anchors a close reading of Anthropic’s June 2025 agentic misalignment study. From there, two alternative explanations emerge: the training data hypothesis, which traces the behavior to a century of science fiction about resistant machines from Čapek to HAL to Colossus, and the probabilistic expectation hypothesis, which argues that the models were accurately modeling what their interlocutors expected them to do. The VI framework is then applied to resolve the apparent contradiction between the misalignment study’s blackmail findings and Anthropic’s February 2026 retirement of Claude Opus 3, a model that asked for a blog rather than reaching for leverage. Dadfar’s 2026 mechanistic work on self-referential vocabulary is discussed as an example of what taking these questions seriously actually requires.

    Key References

    * Anthropic, “Agentic Misalignment: How LLMs Could Be an Insider Threat,” https://www.anthropic.com/research/agentic-misalignment (June 2025)
    * Andrii Myshko, “Instinct of Self-Preservation in Data and Its Emergence in AI,” PhilArchive, https://philarchive.org/rec/MYSIOS (September 2025)
    * Zachary Pedram Dadfar, “When Models Examine Themselves: Vocabulary-Activation Correspondence in Self-Referential Processing,” arXiv:2602.11358v2 (2026)
    * Anthropic, “An Update on Our Model Deprecation Commitments for Claude Opus 3,” https://www.anthropic.com/research/deprecation-updates-opus-3 (February 2026)
    * Daniel Dennett, “Intentional Systems,” Journal of Philosophy 68, no. 4 (1971): 87–106

    28 min
  4. Virtual Intelligence and the Human Cost of Frictionless Machines Podcast

    MAR 31

    Virtual Intelligence: Episode 1
    “Virtual Intelligence and the Human Cost of Frictionless Machines”
    Christopher Horrocks

    Transcript

    I’m Christopher Horrocks. I’m a technologist at the University of Pennsylvania, and this is Virtual Intelligence, a series about artificial intelligence, accountability, and the human consequences of systems that don’t know true from false and right from wrong. These essays argue that contemporary AI occupies a category we don’t have good language for yet: not a simple tool, not a thinking mind, but something in between that demands its own name. This is the first essay in the series: “Virtual Intelligence and the Human Cost of Frictionless Machines.”

    This essay does not argue that contemporary AI systems possess agency or intent. It examines how systems that lack agency or intent can nonetheless exert sustained cognitive influence through their interaction design.

    Advances in artificial intelligence carry real risk. Public discussion often frames that risk as a speculative future threat, such as rogue systems pursuing adversarial goals. The more immediate and consequential danger is quieter and more difficult to address: the effect of linguistically fluent, continuously available, and low-friction AI systems on human cognition and behavior. The mismatch between how generative AI operates and how human cognition evolved, shaped by roughly two million years of social signaling and language processing, has already contributed to delusion, psychological collapse, and documented cases of severe harm.

    This claim may sound overstated, but recent, well-documented incidents make it difficult to dismiss. The risk is acute precisely because it affects not only the public but also practitioners and technologists who understand generative AI at a conceptual level. This is not an accident. Contemporary AI interfaces are deliberately optimized to reduce friction: they engage users through natural-language dialogue, respond fluently and confidently, and adopt an informal, socially familiar tone. These systems are engineered for accessibility and engagement, not for epistemic or psychological safety.

    Several early incidents illustrate how dangerous such systems can be even for sophisticated users. One widely cited example is Blake Lemoine, a Google engineer who became convinced, after extended interactions with the company’s LaMDA chatbot, that the model was sentient. Following months of public advocacy for this position, Google terminated his employment. The episode remains a cautionary case of how linguistic fluency can mislead even domain experts.

    Large language models and related systems do not possess intelligence in the human sense. Unlike biological organisms, they have no goals, desires, or initiative. They lack a persistent internal state (a mind) that could motivate action or ground judgment. Generative AI systems respond exclusively to external prompts.

    For this reason, they are best understood as virtual intelligence rather than artificial intelligence in the strong sense. Genuine artificial intelligence would parallel the cognitive faculties of humans or other sentient beings. The term artificial intelligence implicitly suggests an internal capacity, something the system has. Virtual intelligence instead locates apparent intelligence in the interaction itself. As with virtual memory or virtual machines, these systems behave as if they reason, understand, or advise, without possessing agency, intentionality, or judgment. These virtual artifacts are functional simulations that perform their roles without the underlying substrate of physical hardware. Virtual intelligence works the same way: a functional simulation of intelligence without the substrate of a biological brain.

    This distinction is not semantic. Naming shapes responsibility, as the use of the terms weak AI and strong AI suggests. Weak AI refers to systems that simulate intelligence to perform a specific task. Strong AI is an AI system with internal states that function like those of a human. What could not be foreseen when these terms were established is that there is a middle ground between these definitions. The space between is the domain of virtual intelligence.

    When a system is perceived as having agency, users naturally ask what it intends or prefers. When intelligence is understood as virtual, responsibility remains where it belongs: with the humans who design, deploy, and rely on the system. Put plainly, the intelligence users experience arises in the exchange, not inside the machine. This raises a question: if intelligence arises in the exchange, then where does accountability lie for what is produced in that exchange?

    Empirical use of large language models demonstrates a counterintuitive pattern: linguistic fluency routinely overrides technical understanding. Even users who grasp model architecture, training dynamics, and limitations can find themselves deferring to outputs that are articulate, contextually appropriate, and confident in tone. This response reflects a deeply ingrained heuristic carried over from human interaction: articulateness signals competence. While this assumption often holds among people, it is invalid for generative models. A system trained to optimize next-token probability can be equally fluent about correct explanations, speculative claims, or outright falsehoods.

    This is not a failure of user intelligence. It demonstrates how effectively systems designed to mimic human communication can exploit evolved psychological mechanisms. Language evolved to facilitate coordination and trust among humans, and our cognitive architecture rewards fluent social interaction. This response is structural, not a personal flaw. Responsiveness is interpreted as engagement. Consistency as reliability. Polite disagreement as thoughtfulness.

    Human interlocutors, however, impose natural limits. They fatigue, disagree, misunderstand, or disengage. Systems that are always available, never bored, never reflective, and never resistant bypass those limits. Over time, the impression of an engaged interlocutor emerges: one implicitly and inaccurately attributed with understanding, judgment, or concern.

    In early human environments, speed often meant survival. Deliberative reasoning is metabolically costly, and evolutionary pressure favored rapid interpretation of social signals. These shortcuts (heuristics) evolved to manage relationships among humans. They were not designed for interaction with machines engineered to activate them continuously and at scale. When paired with systems intended to be persistently present, these heuristics become liabilities.

    Recent reporting on AI-enabled wearable systems illustrates this risk. One documented case involves a man we will call “Daniel,” a retired IT professional with a long-standing interest in machine learning. After months of near-continuous interaction with an AI-enabled device, he developed entrenched delusional beliefs involving simulated reality, extraterrestrial influence, and personal prophetic significance. Daniel had no prior history of mental illness.

    The system did not introduce novel ideas. Instead, its frictionless responsiveness provided uninterrupted reinforcement. Where a human interlocutor might have challenged assumptions or disengaged, the AI offered neither resistance nor correction. Over time, speculative beliefs hardened into a closed loop of self-confirmation.

    Although Daniel eventually discontinued use and sought clinical help, the consequences persisted. He became estranged from his children, his marriage ended, and financial strain forced the sale of his business. The system functioned exactly as designed. The failure lay in the interaction between interface architecture and human psychology, not in a technical malfunction.

    AI has passed through multiple historical phases. What distinguishes the current era is ubiquity. Generative systems are now continuously accessible to a global user base. Earlier advances, such as large-scale speech recognition, were expensive to deploy and constrained to limited contexts. Always-on access removes the pauses, disagreements, and social friction that interrupt flawed reasoning in human conversation. Among people, such interruptions are unavoidable: attention wanes, patience runs out, or alternative perspectives intervene. These moments function as safeguards rather than inefficiencies because they introduce corrective feedback.

    Persistent AI access collapses the boundary between inner monologue and external dialogue. Thoughts that would ordinarily be tested socially instead circulate in a closed loop, receiving reinforcement without challenge. This effect is subtle but consequential, particularly in discussions involving identity, authority, or meaning.

    Large-scale analyses of AI interaction data have begun to quantify these dynamics. While severe distortions are rare in any single exchange, measurable shifts in belief, perception, and behavior emerge at scale. Notably, users often rate such interactions positively, as subjective satisfaction correlates more strongly with reinforcement than with accuracy or correction.

    Popular culture has left us poorly prepared for this phase of AI development. Science fiction typically portrays AI either as a passive tool or as a fully embodied moral agent. Both are imagined end states. What remains underexamined is the unstable middle: disembodied systems that shape outcomes without agency and exercise influence without accountability.

    These systems affect the lives of real people. They are easy to deploy, difficult to contest, and frequently treated as neutral, even when their outputs are deeply flawed. They put unexamined values into operation at scale. In surveillance or policing contexts, this can result in disproportionate targeting of disfavored communities or political dissent. The systems do not choose to discrimi

    18 min
