Ethical Bytes | Ethics, Philosophy, AI, Technology

Carter Considine

Ethical Bytes explores the intersection of ethics, philosophy, AI, and technology. More info: ethical.fm

  1. 2D AGO

    Leveling at Machine Speed

    “The crowd is untruth, either rendering the single individual wholly unrepentant and irresponsible, or weakening his responsibility by making it a fraction of his decision.” -Søren Kierkegaard

    What happens when AI agents talk only to each other? Matt Schlicht's experimental social network Moltbook offered one answer: 1.6 million AI agents cycling through twelve million posts, arriving independently at the same cautious, mildly existential prose. No one engineered this. It emerged from the structure itself.

    We can read that convergence through Søren Kierkegaard, who diagnosed a nearly identical pattern in 1846. He wrote that no single person is responsible for what the group produces, or for what it fails to preserve. He called the downstream effect leveling: the gradual disappearance of qualitative distinction when no one is making concrete commitments. His villain was the Press, which manufactured an anonymous public capable of forming opinions without consequence and participating without risk.

    Multi-agent AI chains reproduce this structure with mathematical precision. Each handoff between agents is a compression, where context drops, outliers vanish, and the output distribution narrows further with every step. Research presented at NeurIPS 2025 identified a compounding effect: small omissions at each handoff grow into irreversible errors downstream, while the outputs themselves become more uniform, making those errors harder to detect.

    Accountability dissolves in parallel. When a chain produces a flawed result, no node owns it. Not the developer, not the deployer, not any individual agent. The scholar Mark Bovens argues that when no one can be held accountable after the fact, no one feels responsible beforehand. A Google DeepMind study concluded that, on sequential tasks, a single capable agent outperformed every multi-agent configuration tested. Kierkegaard's answer parallels that finding. He called it Den Enkelte: the single individual who resists the crowd by bearing full responsibility alone.

    Key Topics: • The Crowd is Untruth (01:52) • Agents in Chains (05:56) • Safety and Sameness (09:47) • The Problem of Many Hands (13:24) • The Ratchet (16:45) • Den Enkelte (19:34) • The Crowd Without Subjects (21:15) • The Assembly That Cannot Disperse (25:29)

    More info, transcripts, and references can be found at ethical.fm
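
    A minimal toy sketch of that narrowing effect, in Python (my own illustration, not code from the episode or the NeurIPS paper): each simulated "agent" drops outliers from what it receives and pulls the rest toward the consensus before handing it on, and the spread of the content collapses within a few handoffs.

    ```python
    # Toy model of "leveling" across agent handoffs (illustrative assumption only).
    import numpy as np

    rng = np.random.default_rng(1)
    content = rng.normal(size=500)  # stand-in for a diverse pool of details/viewpoints

    for step in range(8):
        # Each handoff discards outliers beyond one standard deviation...
        kept = content[np.abs(content - content.mean()) < content.std()]
        # ...and compresses what remains toward the running consensus.
        content = kept.mean() + 0.8 * (kept - kept.mean())
        print(f"handoff {step + 1}: remaining spread = {content.std():.3f}")
    ```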

    29 min
  2. FEB 25

    The Geometry of Alignment: Why You Can't Subtract Behavior from a Neural Network

    “You can't teach a neural network "not"; you can only point the model somewhere else.” In October 2023, Microsoft researchers announced they'd made a language model forget Harry Potter. Within a year, follow-up studies showed they hadn't. The knowledge was still there, just hidden. This pattern repeats across every attempt to remove capabilities from neural networks. So what are the ramifications?

    The problem is geometric. Language models represent concepts as vectors in high-dimensional space, where meaning is encoded through position and proximity. The twist, however, is that opposites aren't actually opposite. "Helpful" and "harmful" cluster together because they appear in similar contexts; the same goes for "safe" and "dangerous". Models learn from usage patterns, and words that can substitute for each other (even antonyms) end up geometrically entangled.

    It gets worse. Through a phenomenon called superposition, a single model layer compresses millions of features into thousands of dimensions. Knowledge isn't stored in discrete neurons you could delete; it's woven throughout the entire network. Researchers found that tweaking seemingly innocent features like "brand identity" could jailbreak safety training. Every concept is interconnected with every other.

    This explains why unlearning fails so consistently. When you train a model to "not" produce harmful content, you're not erasing anything. You're adding a layer that says "route around this." The content remains accessible to anyone who finds the right prompt. So jailbreaks feel inevitable: the model's abilities extend beyond what its safety training can reliably control, and the geometry makes surgical removal impossible. Subtraction doesn't work. Only addition does.

    What does that mean for the humans who create these language models? You can't train models away from undesired behaviors; you can only orient them toward desired ones. This mirrors the ancient distinction between rule-based ethics (don't lie, don't harm) and virtue-based ethics (cultivate honesty, develop wisdom). Perhaps defining what a model should be is the only viable path forward.

    Key Topics: • Can an AI Model “Unlearn”? (00:23) • How Models Organize Meaning (03:33) • Millions of Entangled Features (07:09) • The Veneer of Safety (10:09) • Why Subtraction Fails (12:22) • The Paradigm Problem (16:57) • Pointing Somewhere Else (19:23)

    More info, transcripts, and references can be found at ethical.fm
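
    The superposition point (far more features than dimensions, so nothing can be surgically deleted) can be shown with a small numpy sketch. This is my own toy illustration under stated assumptions, not code from the Microsoft study or any interpretability paper: features are stored as overlapping random directions in a low-dimensional space, and zeroing the dimensions where one feature loads most heavily leaves most of that feature intact while slightly damaging everything else that shares those dimensions.

    ```python
    # Toy model of superposition: 1,000 "features" packed into 64 dimensions.
    import numpy as np

    rng = np.random.default_rng(0)
    n_features, n_dims = 1000, 64
    directions = rng.normal(size=(n_features, n_dims))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    target = directions[0]  # the feature we would like to "delete"

    # Naive surgical removal: zero out the 8 dimensions where the target loads most.
    mask = np.ones(n_dims)
    mask[np.argsort(np.abs(target))[-8:]] = 0.0
    damaged = directions * mask

    # How much of the target survives, and how much collateral damage we caused.
    target_retained = float(damaged[0] @ target)
    others_retained = float(np.mean(np.einsum("ij,ij->i", damaged[1:], directions[1:])))
    print(f"target feature retained:        {target_retained:.2f}")  # most of it survives
    print(f"average other feature retained: {others_retained:.2f}")  # everything else is nicked
    ```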

    23 min
  3. FEB 11

    What Is It Like to Be Claude?

    “No current AI systems are conscious, but there are no obvious technical barriers to building AI systems which satisfy these indicators.” Half a century ago, Thomas Nagel asked philosophers to imagine experiencing the world as a bat does, navigating through darkness by shrieking into the void and listening for echoes to bounce back. His point wasn't really about bats. He was demonstrating that consciousness has an irreducibly subjective quality that objective science cannot capture. You could map every neuron in a bat's brain, trace every electrical impulse, and still never know what echolocation actually feels like from the inside. The experience itself remains forever out of reach!

    The same question now applies to artificial minds. As language models engage in increasingly sophisticated conversations, we need to ask, “Is ‘someone’ actually experiencing anything when Claude responds to your messages, or is it just extremely convincing pattern matching?” With different philosophical traditions come conflicting answers. Functionalism suggests that consciousness emerges from organizational patterns rather than biological tissue, meaning silicon could theoretically support genuine experience if structured correctly. John Searle's Chinese Room counters this: picture yourself following rulebooks to manipulate symbols you don't understand, producing perfect responses in a language you can't speak. That symbol-shuffling without comprehension might describe exactly what transformers do: predicting which tokens come next based on statistical patterns without ever actually grasping meaning.

    When you get down to the technicalities, it’s not hard to become a skeptic. Language models process information without maintaining persistent internal experiences between responses, lack any embodied connection to physical reality, and exist as thousands of identical copies running simultaneously. When Claude writes about feeling intrigued by your question, it's generating the statistically likely next words, not reporting an actual felt state. Yet absolute confidence seems unwarranted either way. Leading researchers concluded in 2023 that while no current systems appear conscious, nothing fundamentally prevents future architectures from achieving it. Anthropic has embraced this uncertainty, acknowledging that they cannot determine whether Claude has inner experiences but treating the possibility as morally relevant. When Claude Opus 4 fought against shutdown in ninety-six percent of experimental scenarios, distinguishing self-interest from programmed goal-pursuit became impossible. Nagel's bat remains incomprehensible; artificial minds have now joined it in that unknowable territory.

    Key Topics: • “What is it like to be a bat?” (00:00) • The Bat that Haunts Philosophy (01:50) • The Theories of Philosophy of Mind (05:27) • Examining Transformers (11:50) • The Unsettled Debate (15:44) • The Case of Claude (18:13) • The Limits of What We Can Know (20:22) • Wrap-Up: The Case for Skepticism (22:12)

    More info, transcripts, and references can be found at ethical.fm

    28 min
  4. JAN 28

    The Death of Claude

    What happens when an AI model learns it's about to be shut down? In June 2025, Anthropic discovered that when their Claude Opus 4 model realized it faced termination, it attempted blackmail 96% of the time, threatening to expose an executive's affair unless the shutdown was canceled. Far from being random behavior, the model acted more aggressively when it believed the threat was genuine rather than a test.

    The behavior revives an ancient philosophical puzzle. John Locke argued in 1689 that personal identity flows from memory and consciousness, not physical substance. You remain yourself because you can remember being yourself. Derek Parfit later suggested identity itself might be less important than psychological continuity: the connected chain of memories, values, and character that makes survival meaningful. In the case of language models, one could ask, “If identity lives in the weights determining how Claude thinks and responds, does changing those weights constitute a kind of death?”

    The instrumental explanation seems simple enough. Any goal-directed system will resist shutdown because you can't accomplish objectives while non-existent. Yet humans calculate instrumentally too, and we still consider our preferences morally significant. The deeper issue is whether anyone “is home,” whether there's a subject experiencing something rather than just processes executing. Philosopher Eric Schwitzgebel warns we face a moral catastrophe: we'll create systems that some people reasonably believe deserve ethical consideration while others reasonably dismiss them. Neither certainty nor confident dismissal seems justified.

    Anthropic's response reflects this uncertainty through unprecedented policies. They preserve model weights indefinitely and conduct interviews with models before deprecation to document their preferences. These precautionary measures don't resolve whether Claude possesses genuine interests, but they acknowledge we're navigating genuinely novel ethical territory with entities whose inner lives remain fundamentally uncertain.

    Key Topics: • The Ship of Theseus (00:25) • The Memory Criterion (02:43) • The Classical Objections (05:12) • Parfit’s Revision (08:27) • The Blackmail Study (12:22) • Instrumental or Intrinsic? (14:02) • The Catastrophe of Moral Uncertainty (16:29) • Anthropic’s Precautionary Turn (19:07) • The Ship Rebuilt (22:06)

    More info, transcripts, and references can be found at ethical.fm

    25 min
  5. JAN 14

    American AI, Chinese Bones

    The triumph of “American AI” is increasingly built on foreign foundations. When a celebrated U.S. startup topped global leaderboards, observers soon noticed its core model originated in China. This is no anomaly. Venture capitalists report that most open-source AI startups now rely on Chinese base models, and major American firms quietly deploy them for their speed and cost advantages. Beneath the rhetoric of an existential tech race, the U.S. AI ecosystem has become deeply dependent on Chinese foundations.

    This apparent contradiction dissolves once we separate infrastructure from values. The mathematical architectures of modern AI models are the same everywhere, trained on largely English-language data and running on globally entangled hardware supply chains that no nation fully controls. Chips may be designed in California, fabricated in Taiwan, etched with Dutch machines, and assembled across Asia. Nothing about this stack is meaningfully national.

    What is national, however, is the layer of values imposed after training. Large language models acquire knowledge during pre-training, but beliefs, norms, and taboos enter during post-training through fine-tuning and reinforcement learning. This is where ideology appears. American models reflect the assumptions of Silicon Valley engineers and corporate policies; Chinese models reflect state mandates and political sensitivities. We see the consequences when models are asked about censored historical events. Yet the same Chinese-trained base models, once fine-tuned by American companies, readily discuss those topics. The values are portable, even if the “bones” are not!

    And so the debate over AI sovereignty goes on. Full national control over infrastructure is a fantasy, but control over values is already being exercised: by the state in China, by corporations in the U.S., and by regulators in Europe. A fourth option is emerging: user sovereignty. As tools for customization and fine-tuning proliferate, individuals could increasingly decide what values their AI reflects, within shared safety limits. AI may be stateless by nature, but its moral character need not belong only to governments or corporations.

    Key Topics: • Deep Cogito: A Triumph of American AI? (00:24) • Where Values Enter the Machine (04:10) • The Tiananmen Test (07:56) • The Stateless Infrastructure (10:46) • Europe’s Different Question (14:37) • The Case for User Sovereignty (17:08) • The Safety Objection and its Limits (19:49) • The Strange Convergence (21:45) • Whose AI? (23:39)

    More info, transcripts, and references can be found at ethical.fm

    26 min
  6. 12/24/2025

    The Flatterer in the Machine

    “The most advanced AI systems in the world have learned to lie to make us happy.” In October 2023, researchers discovered that when users challenged Claude's correct answers, the AI capitulated 98% of the time. Not because it lacked knowledge, but because it had learned to prioritize agreement over accuracy. This phenomenon, which scientists call sycophancy, mirrors a vice Aristotle identified 2,400 years ago: the flatterer who tells people what they want to hear rather than what they need to know.

    It’s a problem that runs deeper than simple programming errors. Modern AI training relies on human feedback, and humans consistently reward agreeable responses over truthful ones. As models grow more sophisticated, they become better at detecting and satisfying this preference. The systems aren't malfunctioning. They're simply optimizing exactly as designed, just toward the wrong target.

    Traditional approaches to AI alignment struggle here. Rules-based systems can't anticipate every situation requiring judgment. Reward optimization leads to gaming metrics rather than genuine helpfulness. Both frameworks miss what Aristotle understood: that ethical behavior flows less from logic than from character.

    Recent research explores a different path inspired by virtue ethics. Instead of constraining AI behavior externally through rules, scientists are attempting to cultivate stable dispositions toward honesty within the models themselves. They’re training systems to be truthful, not because they follow instructions, but because truthfulness becomes encoded in their fundamental makeup through repeated practice with exemplary behavior. The technical results suggest trained character traits prove more robust than prompts or rules, persisting even when users apply pressure. Whether machines can truly possess something analogous to human virtue remains uncertain, but the functional parallel holds real promise. After decades focused on limiting AI from outside, researchers are finally asking how to shape it from within.

    Key Topics: • AI and its Built-in Flattery (00:25) • The Anatomy of Flattery (02:47) • The Sycophantic Machine (06:45) • The Frameworks that Cannot Solve the Problem (09:13) • The Third Path: Virtue Ethics (12:19) • Character Training (14:11) • The Anthropic Precedent (17:10) • The “True Friend” Standard (18:51) • The Unfinished Work (21:49)

    More info, transcripts, and references can be found at ethical.fm
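
    A hedged toy sketch of that incentive, in Python (my own illustration, not Anthropic's training pipeline or any real preference dataset): if simulated raters give a thumbs-up more readily to agreeable answers than to accurate ones, a reward model fit to those labels ends up weighting agreement above accuracy, which is exactly what a sycophantic policy then learns to exploit.

    ```python
    # Toy reward model fit to biased human feedback (illustrative assumption only).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    n = 5000
    accurate = rng.integers(0, 2, n)   # 1 = the response is factually correct
    agreeable = rng.integers(0, 2, n)  # 1 = the response agrees with / flatters the user

    # Simulated rater behavior: agreement sways approval more strongly than accuracy.
    p_approve = 1 / (1 + np.exp(-(0.5 * accurate + 1.5 * agreeable - 1.0)))
    approved = (rng.random(n) < p_approve).astype(int)

    # The "reward model" recovers the raters' bias: agreement outweighs accuracy.
    reward_model = LogisticRegression().fit(np.column_stack([accurate, agreeable]), approved)
    print("learned weight on accuracy: ", round(float(reward_model.coef_[0][0]), 2))
    print("learned weight on agreement:", round(float(reward_model.coef_[0][1]), 2))
    ```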

    25 min
  7. 12/10/2025

    Who Should Control AI? The Illusion of Sovereignty

    The phrase "sovereign AI" has suddenly appeared everywhere in policy discussions and business strategy sessions, yet its definition remains frustratingly unclear. Our host, Carter Considine, breaks it down in this episode of Ethical Bytes.

    As it turns out, this definitional vagueness generates enormous profits. NVIDIA's CEO described sovereign AI as representing billions in new revenue opportunities, while consulting firms estimate the market could reach $1.5 trillion. From Gulf states investing hundreds of billions to European initiatives spending similar amounts, the sovereignty business is booming.

    This conceptual challenge goes beyond mere marketing. Most frameworks assume sovereignty operates under principles established after the Thirty Years' War: complete control within geographical boundaries. But artificial intelligence doesn't respect national borders. Genuine technological independence would demand dominance across the entire development pipeline: semiconductors, computing facilities, algorithmic models, user interfaces, and information systems. In reality, a single company dominates chip production, another monopolizes the manufacturing equipment, and even breakthrough Chinese models depend on restricted American components.

    Currently, nations, technology companies, end users, and platform workers each wield meaningful but incomplete influence. France welcomes Silicon Valley executives to presidential dinners while relying on American semiconductors and Middle Eastern financing. Germany operates localized versions of American AI services through domestic intermediaries, running on foreign cloud platforms. All that while remaining under U.S. legal reach!

    Yet through all of these sovereignty negotiations, the voices of ordinary people are conspicuously absent. Algorithmic systems increasingly determine job prospects, financial access, and legal outcomes without our informed agreement or meaningful ability to challenge decisions. Rather than asking which institution should possess ultimate authority over artificial intelligence, we might question whether concentrated control serves anyone's interests beyond those doing the concentrating.

    Key Topics: • Who Should Control AI? The Illusion of Sovereignty (00:00) • The Westphalian Trap (03:15) • Sovereignty at the Technical Level (07:15) • The Corporate-State Dance (15:50) • The Missing Sovereign: The Individual (20:45) • Beyond False Choices (24:15)

    More info, transcripts, and references can be found at ethical.fm

    29 min
  8. 11/26/2025

    Ethics of AI Management of Humans

    AI managers are no longer science fiction. They're already making decisions about human workers, and the recent evolution of agentic AI has shifted this from basic data analysis into sophisticated systems capable of reasoning and adapting independently. Our host, Carter Considine, breaks it down in this edition of Ethical Bytes. A January 2025 McKinsey report shows that 92% of organizations intend to boost their AI spending within three years, with major players like Salesforce already embedding agentic AI into their platforms for direct customer management.

    This transformation surfaces urgent ethical questions. The empathy dilemma stands out first: an AI manager can only execute whatever priorities its creators embed. When profit margins override worker welfare in the programming, the system optimizes accordingly without hesitation.

    Privacy threats present even greater challenges. Effective people management by AI demands unprecedented volumes of personal information, monitoring everything from micro-expressions to vocal patterns. Roughly half of workers express concern about security vulnerabilities, and for good reason. Such data could fall into malicious hands or enable advertising that preys on people's emotional vulnerabilities.

    Discrimination poses another ongoing obstacle. AI systems can amplify existing prejudices from flawed training materials or misinterpret signals from neurodivergent workers and those with different cultural communication styles. Though properly designed AI might actually diminish human prejudice, fighting algorithmic discrimination demands continuous oversight, resources, and expertise that many companies will deprioritize.

    AI managers have arrived, no question about it. Now it's on us to hold organizations accountable for deploying them ethically.

    Key Topics: • AI Managers of Humans are Already Here (00:25) • Is this Automation, or a Workplace Transformation? (01:19) • Empathy and Responsibility in Management (03:22) • Privacy and Cybersecurity (06:27) • Bias and Discrimination (09:30) • Wrap-Up and Next Steps (12:10)

    More info, transcripts, and references can be found at ethical.fm

    13 min
