Thought Experiments with Kush

Technology, curiosity, progress and being human.

thekush.substack.com

  1. 03/01/2026

    The Brain-AI Gap

    This article illustrates how artificial intelligence’s path to powerful general intelligence will require architectural changes rather than continued scaling. We’re not facing a temporary bottleneck but a fundamental mismatch between transformer architectures and biological intelligence. Recent research reveals two critical gaps:

    * Biological neurons are vastly more complex than their artificial counterparts - each human neuron contains 3-5 computational subunits capable of sophisticated nonlinear processing.
    * Brains use fundamentally different learning mechanisms than transformers, leveraging localized, timing-based learning without requiring backward passes.

    This isn’t a technical limitation. It’s a design gap that scaling alone can’t bridge. The path forward requires architectures that mimic how brains process information. Let’s examine the evidence in concrete terms.

    Why the Scaling Hypothesis is Fundamentally Flawed

    Industry leaders now acknowledge this reality. Microsoft CEO Satya Nadella admitted at Microsoft Ignite in late 2024: “there is a lot of debate on whether we’ve hit the wall with scaling laws... these are not physical laws. They’re just empirical observations.” Similarly, OpenAI co-founder Ilya Sutskever told Reuters that “everyone is looking for the next thing” to scale AI models, while industry reports confirm OpenAI’s Orion model showed diminishing returns compared to previous generation leaps.

    What’s happening here is not a temporary slowdown. It’s a fundamental limit that reveals how our current approach misunderstands intelligence itself. Consider the ARC benchmark developed by François Chollet - this tests genuine abstraction, not just memorization. The best LLM-based systems achieve only around 15% on this task, while humans score roughly 80%. This isn’t about “slower computers” - it’s about architecture that can’t replicate human reasoning.

    The deeper truth: Bringing the brain into AI isn’t about “scaling” but about recognizing that intelligence emerges from biological mechanisms that transformers ignore entirely. When you consider how the brain processes information, it becomes clear: we’ve been building systems that process text - not intelligence. How does this gap manifest in practical terms?

    How Brains Outperform AI: Concrete Evidence

    Biological neurons aren’t simple switches - they’re sophisticated computational engines. Artificial neurons use weighted sums followed by nonlinear activation - a simplification dating back to the McCulloch-Pitts model of 1943. But human neurons use dendritic trees as independent processors. Each neuron contains 3-5 computational subunits that detect patterns like XOR - tasks once thought impossible for single neurons.

    Consider a real-world example: When you flip a coin, it seems random. But if you slow it down, you see the physics: air resistance, gravity, and even the coin’s microscopic imperfections affect outcomes. Similarly, biological neurons detect patterns through subcellular mechanisms - no “black box” needed.

    Why this matters: Human brains operate on 12-20 watts - about the same as a light bulb - while training GPT-4 required energy equivalent to powering 1,000 homes for five to six years. This 200-million-fold efficiency gap stems from biology’s “local processing” approach: no global error signals, only millisecond-scale learning.

    Think about city navigation: You don’t process every light and street sign at once - you focus on what’s relevant to your current path. Similarly, the brain uses sparse coding where only 5-10% of neurons activate at any moment. This creates an energy-efficient system that processes information without overload.
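    To make the dendritic-subunit and sparse-coding ideas concrete, here is a minimal toy sketch in Python/NumPy (the thresholds and sizes are illustrative choices of mine, not figures from the research): a classic point neuron is just a weighted sum plus a threshold, a neuron with two nonlinear dendritic subunits can compute XOR, and a k-winner-take-all step keeps only about 5% of units active.

    ```python
    import numpy as np

    def point_neuron(x, w, b):
        """Classic artificial neuron: weighted sum plus threshold (McCulloch-Pitts style).
        With weights (1, 1) and bias -0.5 it computes OR; no choice of (w, b) gives XOR."""
        return int(np.dot(w, x) + b > 0)

    def dendritic_neuron(x):
        """Toy neuron with two nonlinear dendritic subunits feeding the soma.
        Each subunit detects one 'exclusive' input pattern; the soma fires if
        either does, so the whole cell computes XOR."""
        branch1 = int(x[0] - x[1] > 0.5)   # responds to input (1, 0)
        branch2 = int(x[1] - x[0] > 0.5)   # responds to input (0, 1)
        return int(branch1 + branch2 > 0)  # soma: OR over the subunit outputs

    def k_winner_take_all(activations, frac=0.05):
        """Sparse-coding sketch: keep only the top ~5% of units, silence the rest."""
        k = max(1, int(len(activations) * frac))
        threshold = np.sort(activations)[-k]
        return np.where(activations >= threshold, activations, 0.0)

    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        xa = np.array(x, dtype=float)
        print(x,
              "point neuron (OR):", point_neuron(xa, np.array([1.0, 1.0]), -0.5),
              "| dendritic neuron (XOR):", dendritic_neuron(xa))

    activations = np.random.rand(1000)
    sparse = k_winner_take_all(activations)
    print("active units:", np.count_nonzero(sparse), "of", len(activations))  # ~50
    ```

    Real dendrites are far richer than this two-branch caricature, but the structural point stands: nonlinear subunits inside a single cell, combined with keeping most units silent, buy computational power and efficiency that a plain weighted sum does not.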
    Another concrete illustration: Imagine identifying a cat. You don’t process every hair individually - you recognize the shape, size, and movement patterns. Your brain’s visual system filters out irrelevant details through hierarchical processing. This isn’t “faster” processing - it’s selective information handling that brains do through local computation.

    The Core Limitations of Transformer Architecture

    The scaling hypothesis is crumbling. Here’s why:

    * Transformers use global error signals (backward passes) to update weights.
    * Brains use local learning rules (e.g., spike-timing-dependent plasticity) that require no global gradients.

    The real problem isn’t size - it’s architecture. Even if you build a 100-billion-neuron transformer, it won’t match the brain’s computational density. Why? Because brains use:

    * Dendritic computation (100+ effective units per neuron)
    * Glial cells that actively process information (not just support neurons)
    * Neuromodulators like dopamine to control learning rates

    This is more than theoretical. Consider the 2024 Nature study showing that dopamine and serotonin work in opposition during reward learning: dopamine increases with reward while serotonin decreases, and blocking serotonin alone actually enhanced learning. This three-factor learning rule (pre × post × neuromodulator) allows the same spike timing to produce different outcomes based on behavioral relevance - enabling what neuroscientists call “gated plasticity.”

    The computational gap: While a transformer model processes information sequentially across billions of parameters, biological systems achieve similar results through localized learning. When you see a car approaching, your brain doesn’t process each pixel individually - instead, it quickly identifies the vehicle through hierarchical processing that prioritizes relevant features.

    Consider another example: Imagine solving a puzzle. A transformer might look at every piece individually - but brains focus on patterns and relationships. The brain uses “gated plasticity” to strengthen connections only when relevant - no global gradient calculations needed.

    Let’s examine a specific case: When learning a new language, humans don’t memorize every word - instead, they detect patterns through contextual learning. Similarly, the brain uses neuromodulators to adjust learning rates based on attention and relevance. This isn’t “better memory” - it’s adaptive learning that transformers cannot replicate.
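    As a rough sketch of what a local, three-factor update looks like (a generic illustration of the pre × post × neuromodulator principle, not the specific rule from the Nature study), the loop below changes each weight using only quantities available at that synapse - decaying traces of recent pre- and post-synaptic spikes - gated by a single scalar neuromodulator, with no backward pass anywhere:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    n_pre, n_post = 20, 10
    w = rng.normal(0.0, 0.1, size=(n_post, n_pre))  # synaptic weights
    pre_trace = np.zeros(n_pre)                     # decaying memory of recent
    post_trace = np.zeros(n_post)                   # pre-/post-synaptic spikes

    decay, lr = 0.9, 0.01

    for t in range(100):
        pre_spikes = (rng.random(n_pre) < 0.1).astype(float)            # sparse input
        drive = w @ pre_spikes
        post_spikes = (drive >= np.quantile(drive, 0.9)).astype(float)  # sparse output

        # eligibility traces stand in for spike-timing information
        pre_trace = decay * pre_trace + pre_spikes
        post_trace = decay * post_trace + post_spikes

        # a reward-like, dopamine-style signal gates whether learning happens at all
        neuromodulator = 1.0 if t % 10 == 0 else 0.0

        # three-factor rule: delta_w ~ post_trace outer pre_trace, scaled by the modulator
        w += lr * neuromodulator * np.outer(post_trace, pre_trace)

    print("mean weight after purely local learning:", round(float(w.mean()), 4))
    ```

    Every term in the update is local to the synapse except the scalar neuromodulator, which is exactly the point: the same spike coincidences produce lasting change only when that modulatory signal marks them as behaviorally relevant.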
    Why Scaling Isn’t the Answer

    The industry is recognizing this shift. Reports indicate that OpenAI’s Orion model showed diminishing returns compared to previous generation leaps. Microsoft has pivoted toward “test-time compute” methods, allowing models more time to reason at inference. This acknowledges implicitly that raw pattern matching cannot substitute for deliberate reasoning. The evidence is clear:

    * The ARC benchmark tests genuine abstraction: tasks require inferring novel transformation rules from just a few examples, as humans easily do. Human performance reaches approximately 80%; the best AI systems achieve only 31% using non-LLM approaches, with LLM approaches scoring around 15%.
    * Compositional reasoning reveals especially severe limitations. A 2024 study of transformers trained from scratch found 62.88% of novel compounds failed consistent translation, even when models had learned all component parts.
    * Hallucination appears to be an inescapable feature rather than a fixable bug. Xu et al. (2024) proved formally that hallucination cannot be eliminated in LLMs used as general problem solvers - a consequence of the computability-theoretic fact that LLMs cannot learn all computable functions.

    The industry response is shifting. By late 2024, leaders who built their careers on scaling began hedging. Marc Andreessen reported that current models are “sort of hitting the same ceiling on capabilities.” OpenAI’s o1 models represent this pivot, performing explicit chain-of-thought reasoning that can be extended at test time - a further implicit acknowledgment that raw pattern matching cannot substitute for deliberate reasoning.

    Academic analysis questions whether the scaling hypothesis is even falsifiable. A 2024 paper from Pittsburgh’s philosophy of science community argues it “yields an impoverished framework” due to reliance on unpredictable “emergent abilities,” sensitivity to metric choice, and lack of construct validity when applying human intelligence tests to language models. The strong claim that intelligence emerges automatically from scale remains unproven and increasingly challenged.

    A deeper exploration of the scaling paradox: If intelligence truly emerged from scaling, we’d see consistent improvements with more parameters. But we don’t. Even with 1.3 trillion parameters in GPT-4, performance plateaus at around 80% on composition tasks. This isn’t an engineering problem - it’s a fundamental mismatch between how we model intelligence and how intelligence actually works.

    The real question: What if intelligence isn’t about pattern recognition but about biological computation? That’s the insight we’re missing in our scaling approach.

    How to Fix AI Without Scaling

    The path forward isn’t bigger models - it’s smarter designs.

    * Build event-driven systems: Instead of processing all data simultaneously (like transformers), mimic the brain’s “sparse coding” where only 5-10% of neurons activate at any moment. Intel’s Loihi 2 chip already does this, using 1 million neurons at 1 watt.
    * Use neuromorphic hardware: IBM’s NorthPole chip achieves 22x faster inference than GPUs while using 25x less energy. It’s not just better hardware - it’s biologically inspired architecture.
    * Prioritize local learning: Backpropagation requires global error signals. Brains use local plasticity - no backward passes needed. This avoids the weight transport problem and non-local credit assignment that plague transformers.

    Real-world impact:

    * World models like V-JEPA 2 enable robots to grasp objects without task-specific training (Meta, 2025).
    * AlphaGeometry combines neural + symbolic reasoning to solve math problems - proving hybrid approaches work better than pure scaling.

    Let’s examine a

    31 min
  2. 25/11/2025

    Brain Short-Circuiting

    The Pattern We Should Have Seen Coming

    Our ancestors consumed somewhere between 30 teaspoons and 6 pounds of sugar annually, depending on their environment. Today, Americans average 22-32 teaspoons daily—roughly 100 pounds per year. This isn’t a failure of willpower. It’s the predictable result of engineering foods that trigger evolutionary reward systems more intensely than anything in nature ever could.

    The food industry discovered how to short-circuit the biological mechanisms that kept us alive for millennia. Our brains evolved to crave sweetness because calories were scarce and obtaining them required real effort. That drive made perfect sense when finding honey meant risking bee stings and climbing trees. It makes considerably less sense when a vending machine dispenses 400 calories for a dollar.

    We’ve seen this movie before. Multiple times. And we’re watching it again, right now, with artificial intelligence and human cognition. The difference is that we’re living through this mismatch in real-time, conducting an uncontrolled experiment on human intelligence at population scale. The stakes are higher, the effects more subtle, and the window for conscious intervention rapidly closing. Within a generation, we may have millions of young people missing cognitive capacities, not because they lost them, but because they never built them in the first place.

    But here’s what makes this moment different from previous technological revolutions: we actually understand the mechanism. Neuroscience can now measure what happens when we outsource cognition. We can track attention degradation. We can document memory changes. We can quantify reasoning decline. And critically, we can identify the exact design choices that determine whether AI enhances or erodes human capability.

    The central insight is deceptively simple: the same technology that can double learning outcomes can also devastate critical thinking, and everything depends on how we deploy it. This isn’t about choosing between technological progress and human flourishing. It’s about understanding evolutionary psychology well enough to achieve both.

    The Anatomy of a Hijacking

    Every major technological revolution follows a similar arc. We create systems that trigger evolutionary adaptations, producing outcomes that would have been advantageous in ancestral environments but prove harmful in modern contexts. The pattern is so consistent it’s almost boring—and yet we keep falling for it.

    Consider fossil fuels. Over millions of years, ancient organic matter was compressed and transformed into concentrated energy reserves—coal, oil, natural gas. This process took geological time scales our minds cannot truly comprehend. Then, within the span of two centuries, we developed the technology to extract and burn these reserves, releasing in moments the energy that took eons to accumulate. We short-circuited time itself, compressing millions of years of stored sunlight into decades of explosive industrial growth. The benefits were immediate and transformative. The costs—climate disruption, ecological degradation, resource depletion—were deferred to future generations who had no voice in the transaction.

    This temporal short-circuiting appears throughout technological history. Agriculture solved acute hunger but triggered our thrifty genes—the tendency to store excess energy as fat during times of abundance. This adaptation saved lives during famines. Now it drives a global obesity crisis.

    We collapsed the ancient cycle of scarcity and abundance into perpetual plenty, and our bodies responded exactly as evolution programmed them to. Industrial food systems engineered supernormal stimuli: foods sweeter than any fruit, more caloric than any nut, more instantly rewarding than anything our ancestors encountered. Our bodies seek maximum calories for minimum effort. The problem isn’t us. It’s the mismatch between Paleolithic physiology and industrial food engineering.

    Social media exploited our tribal psychology. We evolved in bands of 50-150 people where reputation was built through direct interaction. Now we perform for invisible audiences, comparing ourselves to millions of curated presentations while feeling increasingly isolated. The platforms are designed to maximize engagement by triggering social anxiety and status competition—adaptive responses to ancestral social dynamics that misfire catastrophically at internet scale.

    Digital platforms fragmented our attention. Gloria Mark’s longitudinal research, tracking screen attention from 2004 to 2023, documents a 69% decline in attention duration: from 150 seconds in 2004 to just 47 seconds by 2021. After an interruption, returning to the original task requires an average of 25 minutes. This isn’t cognitive decline—it’s environmental design. Our attention capacity remains intact; our environments are deliberately structured to prevent sustained focus.

    Each revolution shares common features. Scale exceeds what our psychology can process. Supernormal stimuli trigger our evolved responses more intensely than natural stimuli ever could. Benefits become immediate while costs defer to the future. And complexity overwhelms our intuitive cause-and-effect reasoning.

    But the AI revolution is different in a crucial way: it short-circuits cognition itself. We’re not just exploiting peripheral drives like hunger or status-seeking. We’re outsourcing the core cognitive functions that define human intelligence—pattern recognition, reasoning, memory formation, creative synthesis. Every query delegated to an AI system, every decision automated by an algorithm, every creative task offloaded to generative models represents potential atrophy of irreplaceable capabilities.

    Your Brain on AI: What the Neuroscience Actually Shows

    The most sophisticated evidence comes from a 2025 study using electroencephalography to monitor 54 participants over four months. Researchers compared brain activity patterns across three groups: people using AI text generation, people using search engines, and people writing independently.

    The results were stark. Large language model users showed the weakest brain connectivity patterns across all groups. When these participants later switched to writing independently, they exhibited reduced alpha and beta connectivity—patterns indicating cognitive under-engagement. Their brain activity scaled inversely with prior AI use: the more they had relied on AI assistance, the less neural activity they showed during independent work. Most troublingly, 83% of AI users could not recall key points from essays they had completed minutes earlier. Not a single participant could accurately quote their own work.

    This introduces the concept of cognitive debt: deferring mental effort in the short term creates compounding long-term costs that persist even after tool use ceases. Like technical debt in software development, cognitive shortcuts create maintenance costs that accumulate over time.

    Beyond this specific study, meta-analysis of 15 studies examining 355 individuals with problematic technology use versus 363 controls found consistent reductions in gray matter in the dorsolateral prefrontal cortex, anterior cingulate cortex, and supplementary motor area—regions critical for executive function, cognitive control, and decision-making.

    The hippocampus shows particular vulnerability. Groundbreaking longitudinal research tracked individuals over three years and established causation rather than mere correlation: GPS use didn’t attract people with poor navigation skills; GPS use caused spatial memory to deteriorate. Lifetime GPS experience correlated with worse spatial memory, reduced landmark encoding, and diminished cognitive mapping abilities.

    The counterpoint demonstrates neuroplasticity in the opposite direction. London taxi drivers who spend years memorizing thousands of streets develop significantly larger posterior hippocampi compared to controls. A 2011 longitudinal study followed 79 aspiring taxi drivers for four years: those who successfully earned licenses showed hippocampal growth and improved memory performance, while those who failed showed no changes. This definitively proved that intensive spatial navigation training causes brain growth. Remarkably, a 2024 study found that taxi drivers die at significantly lower rates from neurodegenerative disease—approximately 1% compared to 4% in the general population—suggesting that maintaining active spatial navigation throughout life provides neuroprotection.

    The principle is clear: the same neuroplastic mechanisms that allow AI dependence to shrink cognitive capacity also allow deliberate cognitive training to enhance it. The question is which direction we’re moving.

    The Astronaut’s Paradox: Why Resistance Matters

    In the microgravity environment of the International Space Station, astronauts experience what might seem like liberation from one of Earth’s most constant burdens. Without gravity’s relentless pull, movement becomes effortless. Heavy objects float weightlessly. The physical strain that accompanies every terrestrial action simply disappears.

    Yet this apparent freedom comes at a devastating biological cost. Without the constant resistance that gravity provides, astronauts lose 1-2% of their bone density per month—a rate roughly ten times faster than postmenopausal osteoporosis. Muscle mass atrophies rapidly, with some muscles losing up to 20% of their mass within two weeks. The heart, no longer working against gravity to pump blood upward, begins to weaken and shrink. Even the eyes change shape as fluid pressure shifts, causing vision problems that can persist long after return to Earth.

    NASA’s solution is counterintuitive but essential: astronauts must exercise for approximately two hours every day using specialized equipment that simulates the resistance gravity would naturally provide. The Advanced Resi

    42 min
  3. 31/08/2025

    AI Interpretability

    In 1507, John Damian strapped on wings covered with chicken feathers and leapt from Scotland’s Stirling Castle. He broke his thigh upon landing and later blamed his failure on not using eagle feathers. For centuries, would-be aviators repeated this pattern: they copied birds’ external appearance without understanding the principles that made flight possible. Today, as we race to build increasingly powerful AI systems, we’re confronting a strikingly similar question: are we genuinely understanding intelligence, or merely building sophisticated imitations that work for reasons we don’t fully grasp?

    When Jack Lindsey, a computational neuroscientist turned AI researcher, sits down to examine Claude’s neural activations, he’s not unlike a brain surgeon peering into consciousness itself. Except instead of neurons firing in biological tissue, he’s watching patterns cascade through billions of artificial parameters. Lindsey, along with colleagues Joshua Batson and Emmanuel Ameisen at Anthropic, represents the vanguard of a new scientific discipline: mechanistic interpretability—the ambitious effort to reverse-engineer how large language models actually think.

    The stakes couldn’t be higher. As AI systems become increasingly powerful and pervasive, understanding their internal mechanisms has shifted from academic curiosity to existential necessity. The history of human flight offers a compelling parallel and a warning: we may be at the crossroads between sophisticated imitation and genuine understanding.

    The Anatomy of Flight and Mind

    The history of human flight offers a compelling parallel to our current AI predicament. Early aviation pioneers spent centuries trying to copy birds directly—from medieval tower jumpers like John Damian to Leonardo da Vinci’s elaborate ornithopter designs that relied on flapping wings. Even Samuel Langley, Secretary of the Smithsonian Institution, failed spectacularly in 1903 when his scaled-up flying machine plunged into the Potomac River just nine days before the Wright Brothers’ success.

    The breakthrough came not from better imitation but from understanding fundamental principles: Sir George Cayley’s revolutionary insight in 1799 to separate thrust from lift, systematic wind tunnel testing, and the Wright Brothers’ three-axis control system. Modern aircraft far exceed birds’ capabilities precisely because we stopped copying and started understanding. With artificial intelligence, we’re now at a similar crossroads. Recent breakthroughs in mechanistic interpretability—the science of reverse-engineering AI systems to understand their inner workings—suggest we’re beginning to move beyond the “flapping wings” stage of AI development.

    The journey into Claude’s mind begins with a fundamental challenge that Emmanuel Ameisen describes as the “superposition problem.” Unlike traditional computer programs where each variable has a clear purpose, neural networks encode multiple concepts within single neurons, creating a tangled web of overlapping representations. It’s as if each neuron speaks multiple languages simultaneously, making interpretation nearly impossible through conventional analysis.

    To untangle this complexity, the Anthropic team developed a powerful technique called sparse autoencoders (SAEs). Think of it as a sophisticated translation system that decomposes Claude’s compressed internal representations into millions of interpretable features. When they applied this method to Claude 3 Sonnet in May 2024, scaling up to 34 million features, the results were revelatory. They discovered highly abstract features that transcended language and modality—concepts that activated whether Claude encountered them in English, French, or even as images.
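    For readers who want to see the shape of the idea, here is a minimal sparse autoencoder sketch in Python/PyTorch. It is a generic illustration of the technique, not Anthropic’s actual architecture or scale; real SAEs are trained on residual-stream activations with millions of features and carefully tuned sparsity penalties, and every dimension below is an arbitrary placeholder.

    ```python
    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        """Toy SAE: expand dense activations into a wider, mostly-zero feature space,
        then reconstruct them. The L1 penalty pushes most features to stay silent,
        so each active feature tends to capture one interpretable direction."""
        def __init__(self, d_model=512, n_features=4096):
            super().__init__()
            self.encoder = nn.Linear(d_model, n_features)
            self.decoder = nn.Linear(n_features, d_model)

        def forward(self, activations):
            features = torch.relu(self.encoder(activations))  # sparse, non-negative codes
            reconstruction = self.decoder(features)
            return reconstruction, features

    sae = SparseAutoencoder()
    optimizer = torch.optim.Adam(sae.parameters(), lr=1e-3)
    l1_weight = 1e-3  # strength of the sparsity penalty (illustrative value)

    # stand-in for a batch of model activations captured at one layer
    activations = torch.randn(64, 512)

    for step in range(200):
        reconstruction, features = sae(activations)
        loss = ((reconstruction - activations) ** 2).mean() + l1_weight * features.abs().mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    with torch.no_grad():
        _, features = sae(activations)
    print("fraction of features active:", (features > 0).float().mean().item())
    ```

    The printed sparsity is the whole game: if only a small fraction of the learned features fire on any given input, each one can be inspected, named, and, as the Anthropic work shows, causally steered.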
    Inside the Mystery Box, Finally

    The transformation began in earnest in May 2024, when Anthropic researchers published groundbreaking research on Claude 3 Sonnet, extracting approximately 33.5 million interpretable features from the model’s neural activations using sparse autoencoders. These features represent concepts the model has learned—everything from the Golden Gate Bridge to abstract notions of deception. When researchers activated the Golden Gate Bridge feature artificially, Claude began obsessively relating every conversation topic back to the San Francisco landmark, demonstrating that these features causally influence the model’s behavior.

    But features alone don’t explain how Claude thinks. That’s where Joshua Batson’s work on circuit tracing becomes crucial. In 2025, the team published research revealing the step-by-step computational graphs that Claude uses to generate responses. Using what they call “attribution graphs,” they can trace exactly how information flows through the model’s layers, identifying which features interact to produce specific outputs. It’s analogous to mapping the neural pathways in a brain, except with perfect visibility and the ability to intervene at any point.

    The implications stunned even the researchers. When Claude writes rhyming poetry, it doesn’t simply generate words sequentially—it identifies potential rhyme words before starting a line, then writes toward that predetermined goal. When solving multi-step problems like “What’s the capital of the state containing Dallas?” the model performs genuine two-hop reasoning, first identifying Texas, then retrieving Austin. This isn’t mere pattern matching; it’s evidence of planning and structured thought.

    Most remarkably, the research revealed that Claude uses what appears to be a shared “universal language of thought” across different human languages. When processing concepts in French, Spanish, or Mandarin, the same core features activate, suggesting that beneath the linguistic surface, the model has developed language-agnostic representations of meaning. This finding challenges fundamental assumptions about how language models work and hints at something profound: artificial systems may be converging on universal principles of information representation that transcend their training data.

    Neuroscience Meets Silicon

    The parallels between studying Claude’s mind and investigating the human brain aren’t accidental. Jack Lindsey’s background in computational neuroscience from Columbia’s Center for Theoretical Neuroscience exemplifies a broader trend: the field of AI interpretability increasingly draws from decades of neuroscientific methodology. The technique of activation patching, central to understanding Claude’s circuits, directly mirrors lesion studies in neuroscience, where researchers disable specific brain regions to understand their function.
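    Activation patching itself is conceptually simple. Here is a self-contained toy illustration in Python/PyTorch (a two-layer stand-in network of my own invention, not Claude or any production tooling): run the model on a clean and a corrupted input, splice the clean run’s intermediate activation into the corrupted run, and check whether the original output comes back.

    ```python
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # A tiny stand-in "model": two layers whose intermediate activations we can patch.
    model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

    clean_input = torch.randn(1, 8)        # input where the behavior of interest appears
    corrupted_input = clean_input.clone()
    corrupted_input[0, 0] += 3.0           # minimally perturbed input that changes the output

    # 1. Capture the intermediate activation from the clean run.
    cache = {}
    def save_hook(module, inputs, output):
        cache["clean"] = output.detach()

    handle = model[1].register_forward_hook(save_hook)
    clean_out = model(clean_input)
    handle.remove()

    # 2. Re-run on the corrupted input, but splice in the clean activation (the "patch").
    def patch_hook(module, inputs, output):
        return cache["clean"]

    handle = model[1].register_forward_hook(patch_hook)
    patched_out = model(corrupted_input)
    handle.remove()

    corrupted_out = model(corrupted_input)

    # If patching this layer restores the clean output, the layer carries the
    # information responsible for the behavior - the artificial analogue of a lesion test.
    print("clean:    ", clean_out.detach().numpy())
    print("corrupted:", corrupted_out.detach().numpy())
    print("patched:  ", patched_out.detach().numpy())
    ```

    In real interpretability work the same splice is performed on transformer components (attention heads, MLP blocks, individual SAE features), and the degree of restoration is used to score how much each component contributes to the behavior under study.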
    “We’re essentially doing cognitive neuroscience on artificial systems,” explain researchers working in this space. The methods translate remarkably well because both systems face similar challenges—distributed processing, emergent behaviors, and the need to efficiently encode information. This cross-pollination has accelerated discoveries on both sides. Techniques like representational similarity analysis, originally developed to compare brain recordings, now help researchers understand how AI models organize information.

    Yet important differences remain. Biological neurons operate through complex electrochemical processes, use local learning rules, and consume mere watts of power. Artificial neurons are mathematical abstractions, trained through global optimization, and require orders of magnitude more energy. As Chris Olah, who coined the term “mechanistic interpretability,” notes: “We’re finding deep computational similarities wrapped in radically different implementations.”

    The Technical Revolution Accelerates

    The technical breakthroughs of 2024-2025 have transformed interpretability from a niche research area into a practical discipline with industrial applications. Beyond Anthropic’s pioneering work, the field has seen remarkable advances across multiple laboratories and approaches.

    OpenAI’s 2024 study applying sparse autoencoders to GPT-4 represented one of the largest interpretability analyses of a frontier model to date, training a 16 million feature autoencoder that could decompose the model’s representations into interpretable patterns. While the technique currently degrades model performance—equivalent to using 10 times less compute—it provides unprecedented visibility into how GPT-4 processes information. The team discovered features corresponding to subtle concepts like “phrases relating to things being flawed” that span across contexts and languages.

    DeepMind’s Gemma Scope project took a different approach, releasing over 400 sparse autoencoders for their Gemma 2 models, with 30 million learned features mapped across all layers. The project introduced the JumpReLU architecture, which solves a critical technical problem: previous methods struggled to simultaneously identify which features were active and how strongly they fired.

    MIT’s revolutionary MAIA system represents perhaps the most ambitious integration of these techniques. The Multimodal Automated Interpretability Agent uses vision-language models to automate interpretability research itself—generating hypotheses, designing experiments, and iteratively refining understanding with minimal human intervention. When tested on computer vision models, MAIA successfully identified hidden biases, cleaned irrelevant features from classifiers, and generated accurate descriptions of what individual components were doing.

    These tools have revealed surprising insights about model capabilities. Research on mathematical reasoning shows that models use parallel computational paths—one for rough approximation, another for precise calculation. Studies of “hallucination circuits” reveal that models’ default state is actually skepticism; they only answer questions when “known entity” features suppr

    41 min
  4. 04/08/2025

    Bloomers - The Alternative Middle Path for Doomers and Boomers

    As humanity inches towards ever more powerful AI, we find ourselves caught between two destructive extremes: the doomer despair that sees only catastrophe ahead, and the boomer/accelerationist overconfidence that pushes forward without adequate consideration of consequences. Yet emerging from ancient wisdom, contemporary psychology research, and real-world examples comes a third way - the Bloomers approach, a regenerative philosophy inspired by the Buddha’s 2,500-year-old discovery of the middle path. Research reveals that regenerative approaches consistently outperform both extremes in complex systems, offering a psychologically sustainable and empirically validated framework for navigating the challenges of AGI development and global transformation.

    The middle path is not a compromise between extremes, but a transcendent alternative that integrates the valid insights of opposing positions while avoiding their destructive aspects. Like flowers that bloom through understanding natural cycles rather than forcing growth, the Bloomers approach offers sustainable flourishing rather than boom-and-bust cycles or paralyzed pessimism.

    Why we need Bloomers: the psychological trap of extremes

    Contemporary psychological research reveals why humans naturally gravitate toward extreme positions and why these approaches ultimately fail in complex environments.

    The neurological basis of polarization

    Research published in Nature Reviews Psychology demonstrates that political polarization stems from three cognitive-motivational mechanisms: ego-justifying motives (defending pre-existing beliefs to protect self-esteem), group-justifying motives (defending in-group identity), and system-justifying motives (supporting existing hierarchies despite personal disadvantage). Cognitive inflexibility emerges as a key factor. Studies show that highly polarized individuals exhibit reduced ability to update beliefs when presented with new information or switch between different thinking patterns. This creates a self-reinforcing cycle where extreme positions become more entrenched over time.

    The evolutionary trap of binary thinking

    Binary thinking served evolutionary advantages for rapid threat assessment - seeing a shadow in the grass, our ancestors needed to quickly categorize it as “predator” or “safe” rather than engage in nuanced analysis. However, this same mechanism becomes maladaptive when facing complex modern challenges that require sophisticated responses.

    Research from psychological literature shows that all-or-nothing thinking is a cognitive distortion associated with increased anxiety and depression, reduced problem-solving ability, impaired relationship functioning, and higher stress levels. People caught in binary thinking use absolutes like “always,” “never,” “disaster,” or “perfect,” eliminating the nuanced middle positions that complex problems require.

    Breakthrough research by Kvam et al. (2022), published in Scientific Reports, found that even rational decision-makers naturally develop polarized and extreme views when making binary choices. In their study of 180 participants, binary decision-making led to under-sampling of moderate information while over-sampling extreme information. However, when participants were asked to make relative judgments rather than binary choices, polarization was significantly reduced and they gathered more balanced information.

    The psychology of sustainable motivation

    C.R. Snyder’s extensive research defines hope as a cognitive process involving three components: clear goals, agency (belief in one’s ability to pursue goals despite obstacles), and pathways (ability to generate multiple routes to achieve goals). Studies show that hopeful individuals demonstrate greater resilience to setbacks, maintain motivation longer when facing challenges, generate more creative solutions to problems, and experience better physical and mental health outcomes. Hope and despair create self-reinforcing cycles - hope builds confidence leading to more ambitious goals and greater persistence, while despair creates helplessness, reducing effort and increasing likelihood of failure.

    The Bloomers approach emerges from this research as psychologically optimal: it maintains hope while acknowledging genuine challenges, develops multiple pathways rather than single solutions, and builds agency through practical engagement rather than abstract theorizing.

    The Buddha’s template: from extremes to the middle way

    The foundation for understanding the Bloomers approach begins with Prince Siddhartha Gautama’s transformative journey 2,500 years ago. His path from extreme luxury through extreme asceticism to the revolutionary discovery of the middle way provides a timeless template for navigating complex challenges.

    The boomer extreme: palace optimization

    Siddhartha’s early life represented the ultimate in optimized comfort and acceleration of pleasure. As recorded in the Pali Canon, he lived in “refinement, utmost refinement, total refinement” with lotus ponds designed specifically for his enjoyment, sandalwood from Varanasi, and three palaces for different seasons. His father deliberately maintained this paradise to prevent exposure to suffering that might lead to spiritual seeking.

    Yet this extreme of luxury left Siddhartha profoundly unfulfilled. The encounter with the Four Sights - an aged man, a diseased person, a corpse, and a wandering ascetic - shattered his sheltered existence and revealed the fundamental inadequacy of pure optimization for pleasure and comfort.

    The doomer extreme: ascetic rejection

    Siddhartha’s turn to extreme asceticism represented the opposite pole. For six years, he practiced severe self-mortification, surviving on single grains of rice and suppressing his breath until near death. The Mahā Saccaka Sutta provides graphic detail: “My body became extremely emaciated… my spine stood out like a string of beads… The skin of my belly became stuck to internal organs.” This represents the doomer extreme - the belief that only through complete rejection of worldly engagement, through radical restriction and pessimistic withdrawal, could truth be found.

    The bloomer realization: neither extreme works

    The breakthrough came when Siddhartha realized the futility of both approaches. He remembered a peaceful meditative state from childhood - sitting in the cool shade of a rose-apple tree - and recognized this natural, balanced state as pointing toward awakening.

    The pivotal moment arrived when he accepted rice milk from a villager. This simple act represented his rejection of extreme asceticism and acceptance of the middle way. His five ascetic companions, seeing this as abandonment of their spiritual practice, left him in disgust - much like how contemporary safety purists or acceleration maximalists often react to balanced approaches.

    The first bloomer teaching: articulating the middle path

    In his first sermon at Sarnath, the Buddha articulated the principle that would become central to addressing complex challenges: “There are these two extremes that are not to be indulged in by one who has gone forth… That which is devoted to sensual pleasure… base, vulgar, common, ignoble, unprofitable; and that which is devoted to self-affliction: painful, ignoble, unprofitable. Avoiding both of these extremes, the middle way… leads to calm, to direct knowledge, to self-awakening.”

    Scholar Y. Karunadasa emphasizes that the middle way “does not mean moderation or a compromise between the two extremes” but rather a transcendent third alternative that goes “without entering either of the two extremes.” This distinction proves crucial for contemporary applications - the middle path is not splitting the difference, but discovering a fundamentally different approach.

    The AGI landscape: acceleration versus safety extremes

    The contemporary AGI development landscape perfectly illustrates the dynamics between extremes and the emerging Bloomers path.

    The acceleration extreme: pushing forward at all costs

    For example, the accelerationist position, embodied in some governments attempting to “remove barriers to leadership in artificial intelligence,” prioritizes AI dominance over global collaboration or regulatory oversight. This approach criticizes “engineered social agendas” in AI systems, adopts unilateral stances over international cooperation, and eliminates extensive equity protections from AI governance. This extreme mirrors Siddhartha’s palace period - optimization for immediate gratification (economic advantage, technological supremacy) while avoiding uncomfortable realities about potential consequences.

    The safety extreme: paralysis through precaution

    As an example of the opposite pole, the Machine Intelligence Research Institute (MIRI) underwent a dramatic strategic pivot in 2024, shifting from technical alignment research to advocating for complete suspension of frontier AI research. MIRI’s statement reflects deep pessimism: “We now believe that absent an international government effort to suspend frontier AI research, an extinction-level catastrophe is extremely likely.” This position mirrors Siddhartha’s ascetic period - the belief that only through complete rejection of the problematic activity (AI development) can safety be achieved.

    Emerging Bloomers approaches in AGI governance

    Some signs of middle-ground positions are emerging in the AI community that transcend the acceleration-versus-safety binary. The EU AI Act may be viewed as a step in the right direction for a Bloomers approach. Officially entering into force on August 1, 2024, it establishes risk-based rules that neither prohibit AI development nor allow unrestricted progress. The framework creates specific obligations for high-risk AI systems while preserving innovation space for beneficial applications. Industry collaboration on safety has reached unprecedented levels, with

    44 min
  5. 11/06/2025

    When AI Meets Culture

    A Conversation with History

    Last week, I found myself in an unexpectedly intimate conversation with a 19th-century Peranakan kamcheng pot. Not metaphorically - literally. At a presentation during ATxSG, AskMona and the OpenAI Forum demonstrated their groundbreaking collaboration with Singapore's Peranakan Museum, and I was among the fortunate few invited to witness what might be the future of cultural engagement.

    The setup seemed deceptively simple: scan a QR code next to a museum artifact with your phone, and suddenly you're chatting with an AI that embodies the cultural knowledge surrounding that piece. I started with the kamcheng - a delicate porcelain container traditionally used for storing precious items in Peranakan households. Within seconds, my phone screen came alive with responses about the pot's significance in wedding ceremonies, its symbolic role in family heritage, and the intricate trade networks that brought such Chinese ceramics to the Straits Settlements centuries ago.

    Next, I moved to a stunning kebaya, the traditional blouse that represents the elegant fusion of Malay, Chinese, and European influences that defines Peranakan culture. The AI spoke about the embroidery techniques, the social status conveyed by different fabrics, and how the garment evolved across generations of Peranakan women. When I pointed my phone at historical photographs of Peranakan families, the AI wove stories about the individuals pictured, their roles in Singapore's colonial society, and the cultural traditions they preserved and transformed (See short video below for a glimpse of this experience).

    It was mesmerizing, educational, and somehow deeply moving. Yet as I walked away from that presentation, a nagging question followed me: Was I genuinely connecting with Peranakan culture, or was I experiencing an algorithmic approximation of cultural meaning, dressed up in conversational interfaces and multilingual accessibility?

    This question has haunted me because it strikes at the heart of perhaps the most profound challenge facing us as we develop artificial general intelligence: How do we build AI systems that honor the irreducible specificity of human cultures while creating tools that can serve our shared humanity? The more I've reflected on my conversation with that kamcheng pot, the more I've come to see it as a perfect microcosm of the tensions that will define the next phase of AI development.

    The Architecture of Understanding

    To understand why this tension matters, we need to examine what happens when artificial intelligence encounters culture. At its core, modern AI - including the generative models that powered my museum conversation - operates on statistical architectures. These systems learn by identifying patterns across vast datasets, finding correlations and connections that allow them to generate contextually appropriate responses. When I asked about the kamcheng pot's role in Peranakan weddings, the AI didn't "know" about weddings in any human sense. Instead, it recognized statistical patterns between words like "kamcheng," "wedding," "ceremony," and "tradition" that had appeared together frequently enough in its training data to suggest meaningful relationships.

    This statistical approach has proven remarkably powerful. The AI could seamlessly switch between discussing the pot's practical uses, its symbolic significance, and its historical context because its training had exposed it to texts that connected these different domains. When I asked follow-up questions, it adapted gracefully, demonstrating the kind of linguistic flexibility that makes such systems feel almost magical.
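    A deliberately tiny illustration of what "statistical patterns between words" means in practice - a toy co-occurrence count over a made-up three-sentence corpus of my own, nothing like the scale or method of a real model's training:

    ```python
    from collections import Counter
    from itertools import combinations

    # Hypothetical miniature "corpus": three invented sentences, purely for illustration.
    corpus = [
        "the kamcheng held gifts at the wedding ceremony",
        "a kamcheng passed between generations carries family tradition",
        "the wedding ceremony followed peranakan tradition",
    ]

    # Count how often pairs of words appear in the same sentence.
    pair_counts = Counter()
    for sentence in corpus:
        words = set(sentence.split())
        for a, b in combinations(sorted(words), 2):
            pair_counts[(a, b)] += 1

    # Words that co-occur often become "related" in the statistical sense.
    for pair, count in pair_counts.most_common(5):
        print(pair, count)
    ```

    Real models work with learned vector representations over billions or trillions of tokens rather than raw counts, but the underlying logic is the same: association by co-occurrence, not lived understanding, which is exactly the gap the essay turns to next.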
    But here's where the complexity begins: Culture isn't just information that can be extracted and recombined statistically. It's lived experience, embodied knowledge, and intergenerational wisdom that exists in the spaces between words. When a Peranakan grandmother teaches her granddaughter about the proper way to arrange offerings during Hungry Ghost Festival, she's not just transmitting data points about ritual practice. She's passing on an understanding of relationships - between the living and the dead, between tradition and adaptation, between individual identity and collective memory - that emerges from decades of participation in a cultural community.

    The AI I conversed with could tell me that kamcheng pots were used to store wedding gifts, but could it understand the way a young bride might have felt touching her grandmother's kamcheng on her wedding morning? It could explain the symbolic meaning of different kebaya colors, but could it capture the pride and anxiety of a teenage girl wearing her first adult kebaya to a family gathering? These emotional and relational dimensions of culture resist statistical capture not because our AI systems aren't sophisticated enough, but because they operate on fundamentally different principles of meaning-making.

    The Universality Imperative

    Yet we can't dismiss the statistical approach to cultural AI as inherently inadequate, because it serves a crucial democratizing function. Before my conversation with that kamcheng pot, my knowledge of Peranakan culture was embarrassingly superficial - limited to what I'd absorbed from food blogs and heritage tourism. The AI didn't just provide me with information; it created an accessible entry point into a rich cultural world that might otherwise have remained closed to me.

    This accessibility isn't trivial. Traditional cultural education often requires significant cultural capital: knowing the right people, speaking the right languages, or growing up in the right communities. The barriers can be particularly high for cultural traditions that developed in specific geographic or social contexts, like Peranakan culture's emergence among Chinese diaspora communities in the Straits Settlements. By making cultural knowledge conversational and multilingual, AI systems like the one I encountered can break down these barriers in ways that traditional museum exhibits never could.

    The economic logic of AI development also pushes toward universal rather than culturally specific solutions. Building and maintaining thousands of culturally distinct AI systems would be exponentially more expensive than developing a single system capable of engaging with multiple cultural contexts. From a resource allocation perspective, it makes sense to focus on the shared cognitive and emotional patterns that unite human experience across cultures rather than the distinctive features that separate us.

    This universalizing tendency isn't necessarily problematic. Some aspects of human experience genuinely transcend cultural boundaries. The emotions evoked by family heirlooms, the pride associated with traditional craftsmanship, or the complex feelings surrounding cultural preservation in changing societies - these experiences resonate across cultural contexts even when their specific expressions vary dramatically. An AI system that can recognize and respond to these universal patterns might actually achieve more authentic cultural engagement than one narrowly trained on culture-specific datasets.

    The question is whether this universal approach can maintain enough cultural specificity to avoid what I call the "McDonald's-ization" of cultural AI - systems that provide globally accessible but culturally generic experiences that sacrifice authenticity for reach.

    Where Patterns Meet Meaning

    The tension between statistical accuracy and cultural authenticity becomes most visible when we examine how AI systems handle cultural context collapse. During my museum conversation, the AI could explain that kamcheng pots symbolized prosperity and family continuity, but it struggled with more contextual questions about when such symbolism would or wouldn't be appropriate to invoke in contemporary Peranakan families. It knew that kebaya embroidery patterns had regional variations, but it couldn't help me understand how a Peranakan woman today might navigate the politics of choosing between traditional and modernized kebaya styles for different social occasions.

    These limitations reflect a deeper challenge: Culture exists not just in explicit knowledge but in implicit understanding of context, relationship, and appropriateness. A Peranakan elder doesn't just know facts about cultural traditions; they understand the delicate social dynamics that determine when and how those traditions should be practiced, modified, or respectfully set aside. This contextual intelligence emerges from years of participation in cultural communities, from learning through embodied experience how cultural meaning shifts across different social situations.

    Current AI architectures struggle with this kind of situated knowledge because they rely on patterns extracted from text rather than patterns learned through social participation. When the AI told me about the significance of family photographs in Peranakan households, it was drawing on documentary sources rather than lived understanding of how families actually use such photographs to negotiate questions of identity, belonging, and cultural continuity across generations.

    This limitation becomes particularly problematic when we consider power dynamics in cultural representation. The AI systems that mediated my museum experience were trained primarily on English-language sources about Peranakan culture, which means they inevitably reflect the perspectives of scholars, tourists, and cultural institutions rather than the voices of Peranakan community members themselves. Even when these systems incorporate community perspectives, they tend to formalize and standardize cultural knowledge in ways that may not reflect how that knowledge actually circulates within cultural communities.

    The risk isn't just inaccuracy - it's the possibility that AI-mediated cultural experiences might gradually replace more authentic forms of cultural engagement. If future visitors to Singa

    37 min
  6. 01/05/2025

    Computer Empathy

    While other teenagers kicked soccer balls across sun-drenched fields during lunch breaks at my high school in Italy, I found sanctuary in the cool darkness of the physics lab. There, among oscilloscopes and circuit boards, I built a world I could understand. My soldering iron became an extension of my hand, and electronic components - with their predictable behaviors and clear rulebooks - felt more comprehensible than the bewildering social dynamics unfolding in the courtyard outside.

    I wasn't antisocial; I was differently social. Human emotions seemed like a foreign language - one with no dictionary, where the rules changed without warning. Technology, by contrast, followed logical patterns. If you understood the principles, you could predict the outcomes. When a circuit worked, it was because you'd connected things correctly, not because it arbitrarily decided to cooperate that day.

    I can't be the only one who has found technology more approachable than the seemingly enigmatic landscape of human connection. For many of us, the digital world offers clarity where human interaction brings confusion. But what if technology could serve not as an alternative to human connection, but as a bridge toward better understanding it? What if the very precision that makes technology accessible to minds like mine could be harnessed to decode the subtle complexities of human emotion? And what if these tools could then help us build stronger connections not just between individuals, but across the chasms that separate cultures, political systems, and socioeconomic realities? This is the promise of Computer Empathy.

    The Vision That Started It All

    In the mid-1960s, computer scientists embarked on what they believed would be a relatively straightforward summer project: teaching machines to see. They predicted it might take a season to solve. Six decades later, computer vision remains a vibrant, evolving field that has transformed everything from healthcare to autonomous vehicles. What these pioneers underestimated was not just the technical complexity of vision, but the profound depth of human visual perception - a system refined through millions of years of evolution to not merely capture pixels, but to understand the world.

    Today, we stand at a similar threshold with a new frontier: Computer Empathy. Just as computer vision moved beyond simple edge detection to deep scene understanding, Computer Empathy represents a paradigm shift from basic emotion recognition toward machines that truly understand the rich, contextual, and dynamic nature of human emotional experience. It is the leap from simply detecting a smile to comprehending the complex emotional narratives that unfold in every human interaction.

    The term "Computer Empathy" deliberately echoes "Computer Vision," suggesting a parallel evolutionary path. While today's affective computing focuses primarily on classifying emotions into discrete categories from limited signals, Computer Empathy aspires to develop systems that can perceive, interpret, and respond to human emotions with nuance and depth comparable to human empathetic capabilities. It aims to make the same transformative leap that machine learning provided to computer vision - moving from rule-based, symbolic approaches to contextually aware, data-driven understanding.

    This article explores how the pioneers of computer vision can inspire a similar revolution in emotional intelligence for machines, how such systems might develop, and what impact they could have on society. Drawing from the historical trajectory of computer vision, we will map out a future where machines don't just detect our emotional states but understand them in the full complexity of human experience. Perhaps most importantly, we'll examine how this technology can be developed responsibly to become a force for good, enhancing human connection rather than diminishing it - potentially transforming not just personal relationships but the very fabric of global understanding.

    From Rule-Based Vision to Deep Learning: The Pioneer's Journey

    The Vision Revolution: A Path of Discovery

    The story of computer vision reads like a classic hero's journey, offering profound lessons for our quest toward Computer Empathy. In those early days, luminaries like Seymour Papert and Marvin Minsky at MIT approached vision with the same structured logic I once applied to my circuit boards in that Italian physics lab - they believed the world could be parsed through explicit rules and symbolic logic. Their "Summer Vision Project" aimed to teach machines to see through programmed instructions, much like following a recipe or wiring diagram.

    But nature proved far more complex than circuitry. These brilliant minds quickly discovered that vision - something humans do effortlessly from infancy - resisted being reduced to programmatic rules. The world wasn't a schematic; it was a living, breathing, ever-changing canvas of light and shadow, context and meaning.

    For nearly three decades after this humbling realization, computer vision advanced through a patchwork of specialized approaches. Researchers worked on edge detection to find object boundaries, feature extraction to identify key visual patterns, motion analysis to track movement through space. It was progress, but fragmented and limited - vision systems that worked perfectly in laboratory settings would fail spectacularly when confronted with the messy reality of the outside world.

    The transformative spark came from Yann LeCun, who in the late 1980s and early 1990s developed convolutional neural networks (CNNs). Rather than programming explicit rules for vision, LeCun's approach allowed systems to learn visual patterns directly from examples. It was a fundamentally different philosophy - instead of telling machines how to see, researchers began showing them what to see and letting them discover the patterns themselves.

    Yet LeCun's revolutionary ideas initially faced significant constraints. Computer processing power was limited, and examples were few. The watershed moment arrived when Fei-Fei Li created ImageNet in 2009 - a vast library of over 14 million labeled images spanning thousands of categories. For the first time, machines had enough examples to learn the rich visual patterns that humans intuitively grasp.

    The 2012 ImageNet competition became computer vision's Promethean moment. Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton unveiled AlexNet, a deep learning system that slashed error rates nearly in half compared to traditional approaches. This wasn't just incremental improvement; it was a paradigm shift that transformed the entire field. Within a remarkably short span, vision systems began exceeding human performance on specific tasks, from diagnosing certain medical conditions to identifying microscopic manufacturing defects.

    Learning from Vision's Legacy: The Path Toward Emotional Understanding

    This remarkable journey from rule-based systems to deep learning offers us a narrative blueprint for developing Computer Empathy. The parallels are not just technological but philosophical, revealing how we might transcend current limitations in machine understanding of human emotions.

    The most profound lesson concerns the inherent limitations of rule-based thinking. When early computer vision researchers tried to program what makes a chair a chair or a face a face, they discovered the infinite variations that defy simple categorization. Similarly, our current emotion recognition systems, which might equate a smile with happiness or lowered brows with anger, fail to capture how emotions blend and transmute across contexts. The teenager who smiles while receiving criticism might be expressing embarrassment rather than joy; the furrowed brow might indicate concentration rather than anger.

    The ImageNet moment for Computer Empathy will require not just more emotional data, but richer, more contextually nuanced data. Where ImageNet cataloged objects, we need expansive libraries of emotional expressions that capture how emotions manifest across cultures, situations, and individual differences. These won't be simple facial expression datasets but complex, multimodal records combining facial movements, vocal tones, linguistic content, bodily gestures, and - crucially - the contextual situations in which they unfold.

    Just as convolutional neural networks were specifically designed to handle the peculiarities of visual data - recognizing that visual patterns maintain their identity regardless of position in an image - Computer Empathy will require architectures tailored to the unique nature of emotional expression. These systems must understand that emotions unfold over time rather than existing in static moments, that they blend and transform, and that they manifest differently across modalities.
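    What such an architecture might look like, in the roughest terms, is sketched below in Python/PyTorch: per-modality encoders, a recurrent layer so that emotion can unfold over time, and a fusion head producing a blend of emotion scores. This is a hypothetical toy of my own, not an existing Computer Empathy system; every dimension and modality size is an arbitrary placeholder.

    ```python
    import torch
    import torch.nn as nn

    class ToyEmpathyNet(nn.Module):
        """Hypothetical sketch: encode each modality separately, fuse them per time
        step, track the sequence with a GRU (emotions unfold over time), then map
        the final state to a handful of blended emotion scores."""
        def __init__(self, face_dim=32, voice_dim=24, text_dim=48, hidden=64, n_emotions=6):
            super().__init__()
            self.face_enc = nn.Linear(face_dim, hidden)
            self.voice_enc = nn.Linear(voice_dim, hidden)
            self.text_enc = nn.Linear(text_dim, hidden)
            self.temporal = nn.GRU(hidden * 3, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_emotions)

        def forward(self, face, voice, text):
            # each input: (batch, time, features) for one modality
            fused = torch.cat(
                [torch.relu(self.face_enc(face)),
                 torch.relu(self.voice_enc(voice)),
                 torch.relu(self.text_enc(text))],
                dim=-1,
            )
            _, last_state = self.temporal(fused)                      # summary of the sequence
            return torch.softmax(self.head(last_state[-1]), dim=-1)   # blended emotion mix

    model = ToyEmpathyNet()
    batch, timesteps = 2, 10
    scores = model(torch.randn(batch, timesteps, 32),
                   torch.randn(batch, timesteps, 24),
                   torch.randn(batch, timesteps, 48))
    print(scores.shape)  # (2, 6): a distribution over emotion categories per sequence
    ```

    Even this toy makes the point visible: the hard part is not the wiring but the data, namely the contextual, culturally varied, temporally extended examples needed to make those emotion scores mean anything.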
    The parallels are not just technological but philosophical, revealing how we might transcend current limitations in machine understanding of human emotions.

    The most profound lesson concerns the inherent limitations of rule-based thinking. When early computer vision researchers tried to program what makes a chair a chair or a face a face, they discovered the infinite variations that defy simple categorization. Similarly, our current emotion recognition systems, which might equate a smile with happiness or lowered brows with anger, fail to capture how emotions blend and transmute across contexts. The teenager who smiles while receiving criticism might be expressing embarrassment rather than joy; the furrowed brow might indicate concentration rather than anger.

    The ImageNet moment for Computer Empathy will require not just more emotional data, but richer, more contextually nuanced data. Where ImageNet cataloged objects, we need expansive libraries of emotional expressions that capture how emotions manifest across cultures, situations, and individual differences. These won't be simple facial expression datasets but complex, multimodal records combining facial movements, vocal tones, linguistic content, bodily gestures, and - crucially - the contextual situations in which they unfold.

    Just as convolutional neural networks were specifically designed to handle the peculiarities of visual data - recognizing that visual patterns maintain their identity regardless of position in an image - Computer Empathy will require architectures tailored to the unique nature of emotional expression. These systems must understand that emotions unfold over time rather than existing in static moments, that they blend and transform, and that they manifest differently across modalities.

    The computational demands of processing this emotional complexity will likely require breakthroughs similar to how GPUs accelerated deep learning for vision. Processing multiple streams of data - facial expressions, voice tone, linguistic content, physiological signals - while maintaining their temporal relationships and contextual meaning presents computational challenges beyond current capabilities.

    Perhaps most importantly, the development of foundational models of emotional understanding could mirror how pre-trained vision models became the basis for specialized applications. Once systems develop core emotional comprehension, they could be fine-tuned for specific contexts - from mental health support to educational environments to cross-cultural communication.

    As Yann LeCun presciently observed, natural signals from the real world result from multiple interacting processes where low-level features must be interpreted relative to their context. This principle, which proved transformative for vision, becomes even more crucial for emotions, where context isn't just helpful - it's essential. A tear can signal
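    The architectural requirements sketched above - separate signal streams, dynamics that unfold over time, and context shaping the final interpretation - can be made concrete with a toy model. What follows is a minimal sketch in PyTorch (a framework choice the article does not make); every module name, feature dimension, and emotion count is invented purely for illustration. It is not a proposed Computer Empathy system, only the general shape such an architecture might take.

import torch
import torch.nn as nn

class MultimodalEmotionSketch(nn.Module):
    """Toy multimodal model: one temporal encoder per signal stream, fused with context."""
    def __init__(self, face_dim=136, voice_dim=40, text_dim=300, hidden=128, n_emotions=8):
        super().__init__()
        # Each modality keeps its own recurrent encoder so its temporal dynamics are preserved.
        self.face_enc = nn.GRU(face_dim, hidden, batch_first=True)
        self.voice_enc = nn.GRU(voice_dim, hidden, batch_first=True)
        self.text_enc = nn.GRU(text_dim, hidden, batch_first=True)
        # A simple projection for situational context (hypothetical 16-dimensional feature vector).
        self.context_proj = nn.Linear(16, hidden)
        # Late fusion across modalities plus context, then a reading over emotion categories.
        self.fuse = nn.Sequential(
            nn.Linear(hidden * 4, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_emotions),
        )

    def forward(self, face_seq, voice_seq, text_seq, context):
        # Use the final hidden state of each stream as its summary.
        _, f = self.face_enc(face_seq)
        _, v = self.voice_enc(voice_seq)
        _, t = self.text_enc(text_seq)
        joint = torch.cat([f[-1], v[-1], t[-1], self.context_proj(context)], dim=-1)
        return self.fuse(joint)  # unnormalized scores over emotion categories

# Toy usage: a batch of 2 clips, 50 time steps per stream, 16 context features.
model = MultimodalEmotionSketch()
scores = model(torch.randn(2, 50, 136), torch.randn(2, 50, 40),
               torch.randn(2, 50, 300), torch.randn(2, 16))
print(scores.shape)  # torch.Size([2, 8])

    Even at this toy scale, the design choice the passage argues for is visible: each modality is modeled over time on its own terms, and the final reading emerges only after context is folded in.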

    1h 6m
  7. 15/04/2025

    Human Agency in a World of Chaos

    On July 19, 1989, at 37,000 feet above America's heartland, in the cockpit of United Airlines Flight 232, Captain Al Haynes was enjoying a routine flight when a catastrophic failure changed everything. Without warning, the DC-10's tail engine exploded, severing all three hydraulic systems - the aircraft's entire control mechanism. No commercial airliner had ever survived such complete control failure. The flight manual offered no procedures. There was no playbook.

    "I have no control," First Officer Bill Records announced as the aircraft began an unstoppable right turn. No rudder. No ailerons. No elevators. No flaps. No landing gear control. By all conventional wisdom, the 296 people aboard were doomed.

    Yet in this moment of absolute chaos, the crew discovered something remarkable. Though they had lost all conventional controls, they still had thrust levers - the ability to adjust each wing engine's power independently. By carefully increasing power to one engine while decreasing it to the other, they found they could crudely steer the crippled aircraft. What followed was a masterclass in human ingenuity, collaboration, and grace under pressure. For nearly 45 minutes, the crew performed an aerial ballet with blunt instruments, using only engine power to create a semblance of control. Against overwhelming odds, they managed to bring their aircraft to the Sioux City runway. While the crash landing was devastating, claiming 112 lives, 184 people survived what should have been certain death for all.

    Flight 232 offers us a powerful metaphor for our current moment. We live in times where traditional systems and institutions seem to be failing simultaneously. The climate crisis, technological disruption, political polarization, and a global pandemic have shattered our illusion of stability. Like those pilots, we may feel we've lost our normal control surfaces. But what if, like them, we still have thrust levers? What if, in the midst of overwhelming complexity and chaos, we still retain powerful forms of agency we've overlooked? This is not about false optimism or denying the gravity of our challenges. It's about finding meaningful control where possible and recognizing that even under severe constraints, our choices still matter - perhaps more than ever.

    The Human Need for Control: Hard-Wired for Predictability

    The pilots' first instinct when Flight 232's controls failed wasn't acceptance - it was disbelief, followed by a frantic search for some way, any way, to reassert control. This response wasn't just professional training; it was deeply human. Our brains are exquisitely engineered prediction machines. From our earliest ancestors watching for predator patterns to modern humans checking weather forecasts, we're constantly seeking to anticipate what comes next. This isn't merely a preference - it's a neurological imperative.

    Neuroscientist Lisa Feldman Barrett explains that our brains are constantly creating models of the world, making predictions to conserve precious metabolic energy. When reality matches our predictions, we experience the comfort of confirmation. When it doesn't, our brains generate anxiety, forcing attention to the mismatch. This explains why uncertainty isn't just intellectually challenging - it's physically distressing. Studies show that unpredictable negative events trigger significantly more stress than predictable ones, even when the outcomes are identical. We'd rather know bad news is coming than wonder if it might. 
    Control, then, isn't just something we want - it's something we need. Without it, we experience what psychologists call "cognitive entropy" - a disorienting state where mental energy dissipates into worry rather than focused action. Prolonged uncertainty depletes our cognitive resources, impairs decision-making, and in extreme cases, manifests as depression, anxiety, or learned helplessness.

    In the face of overwhelming global complexity, many of us feel what futurist Alvin Toffler predicted decades ago as "future shock" - the dizzying disorientation that comes when change outpaces our ability to adapt. We feel control slipping away because, in many traditional senses, it is. But the human spirit has always found ways to navigate chaos. Like the pilots of Flight 232, our salvation lies not in denying reality but in discovering the controls that remain available to us - the thrust levers still responding to our touch.

    Our World in Overdrive: Change at Dizzying Speed

    The pace of change today would be unrecognizable to previous generations. While humans have always experienced change, never has it occurred at such velocity or scale. Consider technology's exponential trajectory. In 1965, Gordon Moore observed that the number of transistors in a dense integrated circuit was doubling roughly every year - a rate he later revised to about every two years - and the pattern has held remarkably consistent for decades. What does exponential growth mean in human terms? It means the smartphone in your pocket contains more computing power than all of NASA had during the moon landing. It means technologies that seemed like science fiction a decade ago - artificial intelligence writing essays, editing genes, or creating photorealistic images from text prompts - are now everyday realities.

    This acceleration isn't confined to silicon chips. The global economy has transformed from relatively distinct national markets to an interconnected ecosystem where trillions of dollars change hands daily in currency markets alone - the Bank for International Settlements' triennial survey put the figure at around $6.6 trillion a day in 2019 and roughly $7.5 trillion by 2022. Supply chains wrap around the planet, making the production of even simple objects dependent on dozens of countries. A disruption anywhere - a pandemic in China, a war in Ukraine, a ship stuck in the Suez Canal - creates ripples everywhere.

    Meanwhile, social norms that once evolved over generations now transform within years or even months. Attitudes toward marriage, gender, work, and personal identity have shifted dramatically in our lifetimes. Institutions that provided stability for centuries - religious organizations, civic groups, extended families - have weakened as organizing forces, leaving many adrift in a sea of individual choice. Add to this the background drum of climate change - ecosystems stressed beyond historical patterns, weather growing more extreme, and the carbon clock ticking toward dangerous thresholds - and we face a perfect storm of disruption.

    Yet this dizzying pace contains a paradox. While change accelerates in the aggregate, our individual days often feel remarkably unchanged. We wake, work, eat, scroll, sleep, repeat. This creates a dissonance - intellectually, we know the world is transforming rapidly, but experientially, we feel stuck in routines while forces beyond our control reshape our world. This dissonance breeds a dangerous fatalism. 
    When change seems too vast and rapid to comprehend, we're tempted to disengage completely. We retreat into private pleasures, cynical detachment, or nihilistic doom-scrolling. "What could I possibly do?" becomes the rhetorical question that absolves us of responsibility. But this is precisely when our choices matter most. At inflection points in history, small forces applied at the right leverage points can cascade into transformative change. Like the pilots of Flight 232 discovering that subtle adjustments to engine thrust could influence their trajectory, we need to recognize the controls still available to us.

    The Control Paradox

    Here's the great irony of our age: we simultaneously overestimate and underestimate our control. We obsess over optimizing our personal productivity while ignoring our influence on larger systems. We meticulously track our fitness metrics while feeling powerless about climate change. We curate our social media presence while accepting political dysfunction as inevitable. This control paradox manifests in curious ways. Many of us experience intense anxiety about personal decisions - which career to pursue, where to live, whom to date - while accepting collective outcomes as fixed and immutable. We're control freaks about our daily schedules but fatalistic about humanity's future.

    The truth is more nuanced. In some domains, we have far less control than we imagine. Despite our best intentions, much of our behavior is governed by unconscious processes, environmental cues, and biological predispositions. Behavioral economists have thoroughly documented how predictably irrational we are, making the same cognitive errors again and again. For instance, we consistently overestimate our ability to resist temptation (what psychologists call "restraint bias"), believe we're less vulnerable to bias than others (the "bias blind spot"), and attribute our failures to circumstances while attributing others' failures to their character (the "fundamental attribution error"). These humbling findings suggest that even our core sense of agency is somewhat illusory.

    Yet paradoxically, we drastically underestimate our collective influence. Throughout history, small groups of committed individuals have repeatedly changed seemingly immovable systems through coordinated action. From civil rights movements to environmental regulations, from consumer boycotts to technological adoption curves, human society regularly transforms based on shifting behaviors and expectations. The British Empire never imagined that a slender man in homespun cloth could challenge their colonial rule through nonviolent resistance. Record executives didn't foresee how file-sharing would completely restructure their industry. And oil companies didn't anticipate how rapidly renewable energy could become cost-competitive once scaled. The lesson is clear: while our personal control may be more constrained than we'd like to admit, our collective agency is far more powerful than we gene

    52 min
  8. 27/02/2025

    Metal Axolotl

    In today's rapidly evolving technological landscape, a new form of artistic expression is emerging - one that blurs the line between human creativity and artificial intelligence. This intersection, frequently referred to as human-AI co-creation, is redefining our understanding of the creative process and challenging our perceptions of artistic authorship. As AI tools become increasingly sophisticated, artists, designers, and creators like myself are discovering novel ways to collaborate with these technologies, producing works that would have been impossible through human effort alone.

    The Renaissance of "Art for Art's Sake"

    The concept of "art for art's sake" (l'art pour l'art) emerged in the 19th century as a reaction against the notion that art must serve some moral or didactic purpose. Today, this philosophy is experiencing a renaissance in the context of AI-assisted creation. In a world dominated by commercial imperatives and market-driven content, many creators are turning to AI tools not to maximize productivity or profit, but simply to explore new creative horizons.

    This shift is something I experienced firsthand in a recent creative experiment. After watching a presentation organized by OpenAI featuring Manuel Sainsily and Will Selviz about using early versions of Sora for cultural art projects, I was inspired to prioritize spending time on something creative with no commercial intent. Coincidentally, one of the AI art groups I follow on LinkedIn called #artgen prompted followers to create artwork with the theme "Beat goes on." This made me think of a children's song that went viral on TikTok called "Ask an axolotl" by Doctor Waffle. It had become a comfort song for many people in today's turbulent times, and I wanted to re-imagine the same words expressed in a much more aggressive, enraged tone to reflect the current state of the global psyche.

    Having experimented with many AI music generation tools like Udio and Suno, I knew I could probably come up with something that matched my vision with a bit of tweaking. After countless trials, I ended up with elements I felt I could work with. Using more manual tools familiar to me, like Adobe Audition, I put together a song that started growing on me. Then I went on to make an equally nonsensical music video to go with it.

    What was particularly fascinating about this process was how it mirrored my traditional creative workflows while simultaneously transcending their limitations. Inspired by Manuel and Will's explanation of how they used AI to see what happens, and approaching it with the classic Bob Ross mentality of embracing happy accidents, I generated hundreds of visuals to see what I would end up with. Using LLMs to rewrite and revise the long text-to-image and text-to-video prompts made the process a bit less tedious, and the fact that I could iterate on these visuals without the need for practical video shooting made a huge difference.

    One thing I noticed during this process was how, almost out of muscle memory, I mimicked some of the approaches to making videos I've taken in the past. Typically, I would have a loosely defined concept and a tentative shot list with storyboard and framing snippets, go out on location or work with a studio setup to gather a large amount of footage and b-roll elements, and then work with them in Adobe Premiere to come up with a plausible sequence. I took a similar approach to put together the resulting music video, wanting the visuals to get increasingly bizarre as the music intensified. 
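    The workflow described above - draft a prompt, let an LLM rewrite it, generate visuals, keep the happy accidents, and push the next pass a little further - can be summarized as a simple loop. The sketch below is illustrative only: the llm, video_model, and keep callables are hypothetical stand-ins for whatever language model, text-to-video tool, and human curation step are actually involved; no real API is implied.

def rewrite_prompt(llm, base_prompt, direction):
    """Ask the (stand-in) LLM to revise a text-to-video prompt toward a creative direction."""
    return llm(f"Rewrite this text-to-video prompt, pushing it {direction}: {base_prompt}")

def co_creation_loop(llm, video_model, seed_prompt, directions, keep):
    """Generate variations, let the human 'keep' function curate, and iterate on what survives."""
    selected = []
    prompt = seed_prompt
    for direction in directions:          # e.g., "slightly more bizarre" on every pass
        prompt = rewrite_prompt(llm, prompt, direction)
        clip = video_model(prompt)        # stand-in for a text-to-video generation call
        if keep(clip):                    # the human stays the curator
            selected.append((prompt, clip))
    return selected

# Toy run with placeholder callables, just to show the shape of the loop.
picks = co_creation_loop(
    llm=lambda instruction: instruction.upper(),            # pretend "LLM"
    video_model=lambda prompt: f"<clip for: {prompt[:40]}...>",
    seed_prompt="an axolotl drumming in a neon cave",
    directions=["slightly more bizarre", "much more bizarre"],
    keep=lambda clip: True,                                  # accept everything in the demo
)
print(len(picks))  # 2

    The point of the sketch is the division of labor the article goes on to describe: the machine proposes variations at a scale no studio shoot could match, while the human remains the curator deciding what survives each pass.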
    Historical Parallels: New Technologies and Artistic Expression

    The relationship between technology and art has always been complex and multifaceted. Throughout history, new technological developments have repeatedly transformed artistic practice, often triggering initial resistance before becoming incorporated into the artistic mainstream.

    From Camera Obscura to Photography

    The development of the camera obscura in the 16th and 17th centuries revolutionized how artists approached visual representation. Artists like Vermeer may well have used this technology to achieve the photorealistic effects that characterize their work. When photography emerged in the 19th century, it was initially dismissed as a mechanical process rather than a true art form. Painters feared it would render their skills obsolete. Instead, photography liberated painting from the burden of realistic representation, helping to catalyze movements like Impressionism, which focused on capturing light, atmosphere, and subjective experience rather than precise visual details.

    The parallel with AI art is striking: just as photography didn't replace painting but pushed it to explore new territories, AI tools aren't replacing human creativity but extending its boundaries. In my own experience, the process of creating with AI still involves very human decisions about selection, curation, and aesthetic judgment.

    Algorithmic Art and Computer-Generated Creativity

    The roots of AI art stretch back further than many realize. Algorithmic art dates back to at least the 1960s, when artists like Vera Molnár (who began implementing algorithmic programs by hand as early as 1959 and started using computers in 1968) and Manfred Mohr (who moved from abstract expressionism to computer-generated algorithmic geometry in the late 1960s) began using computers to generate visual works based on mathematical algorithms. The AARON program, developed by Harold Cohen in the early 1970s, was one of the earliest AI systems designed to create original artworks; Cohen began developing this pioneering program after a period as a visiting scholar at Stanford's Artificial Intelligence Laboratory in 1971. These early experiments laid the groundwork for today's more sophisticated AI art tools.

    What distinguishes our current moment is not just the increased technical capability of AI systems but their accessibility. Tools like Adobe Firefly, Midjourney, DALL-E, Stable Diffusion, Sora for video, and Suno and Udio for music have democratized access to AI-assisted creation, allowing artists without technical backgrounds to experiment with these new forms of co-creation.

    The Evolution of Human-AI Co-Creation

    Human-AI co-creation represents a significant evolution in the creative process, one that challenges traditional notions of authorship and originality.

    From Tools to Collaborators

    Historically, artists have always used tools - from brushes and chisels to cameras and computers. What makes AI different is its capacity for autonomous generation based on learned patterns. Unlike traditional tools, which passively respond to human input, generative AI systems actively contribute to the creative process, suggesting possibilities that might not have occurred to the human artist. Manuel Sainsily, a futurist, artist, TED speaker, and instructor at McGill University who works at the intersection of mixed reality and AI, describes this as a shift from "tools to collaborators." 
    In his work with Will Selviz through their collaborative project Protopica, the pair explore how emerging technologies can drive positive cultural change, emphasizing that AI doesn't replace human creativity but amplifies it, and use AI tools like Sora to demonstrate how artificial intelligence can serve cultural preservation and storytelling.

    The Creative Process Reimagined

    The process of creating with AI involves what researchers term "exploratory creativity" - a back-and-forth dialogue between human and machine. The artist inputs prompts or parameters, the AI generates outputs, the artist selects promising directions, refines the prompts, and the cycle continues. This iteration process resembles traditional artistic methods but with a crucial difference: the machine can generate variations and possibilities at a scale and speed impossible for humans. In my music video creation process, I generated hundreds of visuals and used LLMs to rewrite and revise the long text-to-image and text-to-video prompts to make the process less tedious. This approach paralleled my previous experience with traditional video production, where I would gather a large amount of footage and b-roll elements before editing them into a coherent sequence. This resemblance to traditional creative processes is important, as it suggests that AI isn't replacing creativity but transforming how it's expressed. The fundamental human impulses toward creative expression remain, but the means of realizing those impulses are evolving.

    Expert Perspectives on Human-AI Co-Creation

    The rise of AI art has sparked intense debate among artists, critics, and researchers. Opinions range from enthusiastic embrace to strong skepticism, with many nuanced positions in between.

    The Optimistic View: AI as Creative Amplifier

    Proponents of AI art, like Manuel Sainsily and Will Selviz, see these technologies as tools for expanding human creative capabilities. They emphasize that AI allows artists to transcend technical limitations, visualize ideas more quickly, and explore creative directions that might otherwise remain unexplored. A study published in Scientific Reports suggests that AI tools can enhance perceptions of human creativity by providing contrast: when viewers are aware that a work is created through human-AI collaboration, they often perceive the human contribution as more significant and valuable, suggesting that AI might actually heighten our appreciation for human creative input. The "Sora Selects" program, featuring ten artists who created short films using OpenAI's text-to-video generator, demonstrates how artists can use AI tools to realize ambitious visions that would be impractical or impossible with traditional production methods. These artists approach AI not as a replacement for their creativity but as a medium through which to express it.

    The Cautionary View: Concerns and Criticisms

    Critics raise important concerns about AI art, particula

    28 min
