Deep Learning With The Wolf

Diana Wolf Torres

Deep Learning with the Wolf helps you understand AI without the jargon. From breakthrough research to real-world applications, each episode translates complex technology into language humans can actually use. dianawolftorres.substack.com

  1. May 30

    Can San Francisco Come Back?

    I live 79.8 miles from downtown San Francisco, which is to say: not close. I do not make the trip often, though I have managed it twice in the past month. The first time was May 4, for a Mandalorian preview event, because I am exactly the kind of nerd who will happily drive 160 miles for something like that. Then I went back again this past Sunday for Star Wars Day with the Giants. Yes, there is a pattern here. And yes, they won, so you’re welcome. Now my son is interviewing for jobs in San Francisco. That is where many of the robotics and AI jobs are. I’m thrilled for him. I’m also worried. Because the San Francisco he is entering is not the one I knew. When I moved to California in 1997, I lived in Sunnyvale and drove into the city whenever friends or family came to visit. I worried about turning onto one-way streets (OK, that still worries me) and creeping up absurdly steep hills with a stick shift. (Remember those days?) But, what I did not worry about was a smash-and-grab while sitting at a red light. That simply was not how I thought about San Francisco. I did not have to hide things in my “frunk.” My son, at almost 25, is coming of age in a different city and a different economy. San Francisco still offers high salaries, but it also demands some of the highest rents in the country. What changed For a few years after the pandemic, San Francisco became the national shorthand for urban decline: empty offices, struggling retail, visible homelessness, and a downtown that seemed to lose its center of gravity when remote work emptied out office towers. The city’s downtown exodus was real, and it played out alongside older structural problems, especially housing scarcity and the long-running inability to build enough homes for the people who wanted to live there. And the homelessness problem is not a media invention. Every August, I take part in the Pistahan parade, which begins near San Francisco’s Civic Center. 
Arriving early in the morning for the event, I have seen some things along Hayes, Van Ness, and Market that I would rather not have seen. San Francisco was hit hard during the pandemic, and the population losses were real. More recent data suggests the city may be stabilizing, but not all the way back. The cleanest way to put it is this: San Francisco looks less like a city in collapse than a city in partial repair. AI is rebuilding the economic story AI then gave San Francisco a new recovery narrative. Over the past five years, AI companies have leased more than 5 million square feet of office space in the city. According to the commercial real estate firm CBRE, AI companies could occupy as much as 16 million square feet by 2030. The larger point is not just that AI is spending money. It is that these companies are helping refill office towers and revive confidence that San Francisco still matters as a place where people need to gather in person. Money is only part of the story. The deeper advantage San Francisco still holds is density: researchers, founders, operators, and investors packed close enough to one another that ideas, deals, and careers can move faster. That dynamic helps explain the city’s strange duality in 2026. It can look frayed on the ground and still seem like the most important place to be for young people who want to work in AI or robotics. Why workers still hesitate But economic recovery and lived experience are not the same thing. Reporting and worker anecdotes suggest a city where some people feel energized professionally and strained personally, especially by housing costs, visible street disorder, and uneven perceptions of safety. Even where crime data has improved from earlier highs, the emotional experience of walking through parts of downtown can still feel unstable to workers who are there every day. Housing remains the deeper structural problem. 
San Francisco’s affordability crisis predates the AI boom and outlasts it, which means even well-paid workers can find themselves priced out of what most Americans would consider a normal urban life. That tension is especially sharp for younger workers trying to enter the industry: the city offers career acceleration, but often at the cost of comfort, stability, or the feeling that you can actually build a life there. Robotics and AI need the city differently This matters a little differently for AI companies than for robotics companies. AI firms can justify premium downtown office space because their core asset is concentrated human capital, and San Francisco still delivers that density better than almost anywhere else. Robotics companies benefit from the same talent pool, but often need more than proximity: lab space, testing space, industrial access, and a physical environment that supports building in the real world, not just talking about it. That distinction is worth watching. If San Francisco becomes an even more dominant headquarters city for AI while robotics spreads more across the wider Bay Area, then the geography of “future tech” may start to split in more visible ways. The software layer can thrive in towers. The physical layer may need a broader map. What is the San Francisco of 2026? For my generation, or at least for me, the city represented exploration. It was where you took visitors, where you wandered, where you tested yourself against steep hills, parallel parking, and reading paper maps that took you down one-way streets. (Or, at least that is my excuse and I am sticking to it.) It felt messy in the way cities do, but not menacing. For his generation, San Francisco may still represent ambition. It may still be the place where the future of AI and robotics is being built. But it also comes with a different set of calculations: where to live, what to avoid, whether to commute in, and whether the opportunity is worth the friction. 
That may be the clearest way to understand San Francisco in 2026. AI is helping revive the city as a place to work. The harder question is whether San Francisco can also remain, or become again, a place where people want to build a life. Editor’s Note: This podcast episode was generated with AI from my reporting, notes, and source documents, then reviewed and edited by me before publication. Because the hosts are AI-generated, they may occasionally mispronounce words, names, or acronyms. Additional Reading for Inquisitive Minds: * Nathan Heller, “What Happened to San Francisco, Really?” — The New Yorker — A strong narrative overview of how housing dysfunction, remote work, and politics fed San Francisco’s post-pandemic crisis. * CBRE, “Artificial Intelligence: The Next Catalyst for Office Space Demand” — One of the clearest reports on how AI companies are driving office demand in San Francisco, including the projection that they could occupy up to 16 million square feet by 2030. * CBRE, “AI Boom Drives Office Leasing Surge in San Francisco Bay Area” — Useful for readers tracking how AI leasing is reshaping the city’s commercial real estate market. * KQED, “California’s Population Is Rebounding. In San Francisco, It’s a Different Story.” — A good overview of why San Francisco’s population recovery has lagged, with housing affordability as a central factor. * San Francisco Chronicle, “SF’s population drops again as city struggles to retain residents” — Useful for readers who want a more recent local snapshot of the city’s continuing population losses. * MTC Vital Signs, “Population — SF Bay Area” — A regional data source for Bay Area population trends that helps place San Francisco in a broader context. * San Francisco Government, “Office Vacancy Rate” — The city’s own vacancy tracker, helpful for verifying just how much downtown office space remains unfilled. 
* ABC7 News, “San Francisco’s Union Square showing signs of recovery, challenges remain” — A useful look at downtown retail recovery and why “better” does not yet mean “fixed.” * San Francisco Government, “2024 Point-in-Time Count” — The city’s official homelessness count, which is essential for grounding any discussion of visible disorder in actual data. * KQED, “San Francisco Homelessness Up 7% Despite Decline in Street Camping” — A helpful companion to the city report because it explains the numbers in plain language and highlights the complexity behind them. #SanFrancisco #AIIndustry #Robotics #TechWorkers #DeepLearningWithTheWolf This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit dianawolftorres.substack.com

    5 min
  2. May 21

    The AI Backlash on Campus Is About Jobs, Not Technology.

    At college commencements across the U.S. this spring, a strange new ritual emerged: mention AI, and you might get booed. Former Google CEO Eric Schmidt was heckled at the University of Arizona after telling graduates that AI would touch “every profession, every classroom, every home,” and similar reactions followed at other ceremonies when speakers framed AI as the next industrial revolution or as an unavoidable new order. The Ritual of Rejection It would be easy to read those clips as a simple rejection of technology. But, young people are not turning away from AI in practice. They are using it heavily for schoolwork, search, summarization, creativity, and, increasingly, news. What they are rejecting is something narrower: institutional hype and a tone-deaf insistence that they should celebrate the very systems that seem to be shrinking their runway into the job world. The Usage Data Tells a Different Story Pew Research’s latest survey shows how deeply AI is already embedded in teen life. About two-thirds of U.S. teens say they use AI chatbots, and 57% use them to search for information while 54% use them for schoolwork. Around half use them for fun or entertainment, and meaningful minorities use them to summarize media, create images or videos, get news, or even seek conversation and emotional support. Reuters and other recent coverage suggest that among young adults, usage is even more normalized, especially as AI becomes folded into study habits, job applications, and everyday digital workflows. Mandatory, Transformative, and Career-Ending — All at Once And yet, the more integrated AI becomes in young people’s lives, the more ambivalent many of them seem to feel about it. This reflects the reality of a generation being told, all at once, that AI is mandatory, transformative, and potentially career-ending. The same systems they are expected to master for homework and work readiness are also being pitched by executives as replacements for entry-level labor. 
What the Numbers Actually Show That anxiety is showing up not only in surveys, but in public rituals. Axios reports that 42% of Gen Z in the Axios Harris poll believe AI will hurt job opportunities and wages for their generation, while the Associated Press cites polling showing around 70% of college students view AI as a threat to their career prospects. When commencement speakers deliver cheerful “embrace the future” lines to students graduating into that mood, the dissonance can be immediate and loud. It’s the Tone, Not the Topic Still, the backlash is not universal. NVIDIA CEO Jensen Huang spoke about AI at Carnegie Mellon and was not booed; instead, his remarks took on a more constructive tone because they acknowledged students’ anxiety and framed AI as a tool they needed to learn rather than as a force that would simply sweep over them. Yann LeCun, who spoke at NYU’s engineering graduation, noted in a LinkedIn thread that when he talked about AI he was cheered, not booed. That suggests students are not rejecting every technical discussion of AI. They are reacting to tone, context, and whether the speaker sounds like they are actually listening. The Ladder That Got Pulled Up The boos did not come from nowhere. ServiceNow CEO Bill McDermott told a March conference audience that new-college-graduate unemployment could hit 30% within two years as AI absorbs entry-level white-collar work. The Dallas Federal Reserve found the unemployment gap between entry-level and experienced workers had widened sharply in occupations exposed to AI substitution. Anthropic’s CEO has publicly forecast that AI could eliminate up to half of all entry-level white-collar jobs. The Youth Reaction A TikTok thread about the boos garnered a lot of discussion from teens and graduates.
Pulling some of the reactions from the TikTok users on the thread, here is a compilation of what the young respondents had to say: “Anna”, a TikTok user, cut through the media framing in seven words: “It’s NOT fear. It’s frustration, anger and disgust.” Jo landed the sharpest irony of the entire debate: “’It’s a tool’ — that students get failed for using.” “Kelso” pushed back on the press coverage directly: “It shouldn’t really be reduced to ‘being a trend.’ This is a case of people showing and expressing their actual real life emotions. They are booing out of authenticity, not because they are participating in a trend.” C cubed named the economic logic underneath it: “The boos are because AI is not being used to benefit workers by alleviating conditions — but used to replace workers. Streamlining efficiency really means layoffs.” Tommy Joe went furthest: “It’s not even AI that bothers me. It’s the smugness. Because none of these people built it alone. Thousands of engineers, coders, thinkers, and workers gave them the kingdom they’re standing on.” Those comments are not outliers. They are the survey data with the filter removed. The “kids hate AI” framing, while catchy, is too blunt. What comes through in the threads, the polling, and the commencement footage is not blanket rejection. These are people who use AI daily, who are learning to build with it, who depend on it for schoolwork and job applications. What they hate is the packaging: hype without accountability, convenience without consent, productivity without a social contract. Employers still say they need talent. But the demand is concentrated in execution-heavy roles in controls, integration, and systems deployment, not the broad entry-level positions fresh graduates trained for. From where a new grad is standing, that looks less like a revolution and more like a bait-and-switch: train for the future, arrive to find the entry ramps have quietly narrowed. 
The European Perspective My fellow creator, the talented Kevin O’Donovan, commented to me: “... at Hannover Messe a couple of weeks back, this came up in a number of conversations with people I was out and about with. There’s a real conundrum here. On one hand, there is clearly a skilled worker shortage. On the other, some entry-level recruitment seems to be slowing down because companies are asking, ‘Can AI do some of this?’ But that creates a bit of a chicken-and-egg problem. Where do you get the new blood, and how do you develop them to be skilled in the future?” Infrastructure That Has to Earn It The commencement backlash is less a rejection of AI than a reaction to the way it is being presented. The students booing in Arizona, Florida, and Tennessee are not outside the technology ecosystem. They are already using these systems every day for school, research, creativity, and work. But they are also graduating into an economy where many entry-level pathways suddenly feel unstable, narrowed, or uncertain. That helps explain why some speakers were booed while others were applauded. Ro Khanna at Suffolk University, Jensen Huang at Carnegie Mellon, and Steve Wozniak at Grand Valley State all acknowledged the anxiety in the room instead of talking past it. The response students seem to want is not blind optimism or blanket reassurance. It is honesty. The Class of 2026 is not rejecting technology. If anything, this may be the most AI-literate graduating class yet. What they appear to be rejecting is the idea that they should celebrate disruption without being allowed to question who benefits from it, who absorbs the cost, and what happens to the people trying to enter the workforce as the ground shifts beneath them. Additional Resources for Inquisitive Minds * The More Young People Use AI, the More They Hate It – The Verge Explores the “power user but skeptic” dynamic among Gen Z and younger workers.
https://www.theverge.com/ai-artificial-intelligence/920401/gen-z-ai * How Teens Use and View AI – Pew Research Center Deep dive on how U.S. teens are actually using AI chatbots, what they find helpful, and what worries them. https://www.pewresearch.org/internet/2026/02/24/how-teens-use-and-view-ai/ * Teens, Parents and AI – Coverage of the Pew Survey Accessible summary of the same data with emphasis on the teen–parent perception gap. https://www.cbsnews.com/news/ai-teens-parents-pew-survey/ * AI Bots Are Coming. The Young Are Booing, Not Applauding – Reuters Looks at how younger workers see AI at work and in the economy, and why optimism is fading. https://www.reuters.com/business/world-at-work/ai-bots-are-coming-young-are-booing-not-applauding-2026-05-20/ * The New College Graduation Ritual: Booing AI – Axios Round‑up of this year’s AI‑themed commencement speeches, the booing, and fresh polling data. https://www.axios.com/2026/05/19/college-graduates-ai-commencement-speech * Graduates Are Booing Pep Talks on AI at College Commencements – AP News Adds reporting on polling that shows how many students see AI as a threat to their job prospects. https://apnews.com/article/ai-college-commencement-anxiety-boo-35aec9bac660eaeb05c5b8d392db2cac * The Villain of This Year’s Commencement Speeches: A.I. – The New York Times Puts this year’s AI discourse in the context of past graduation “villains” and generational anxiety. https://www.nytimes.com/2026/05/18/business/dealbook/university-commencement-speech-ai.html * “Your Career Starts at the Beginning of the AI Revolution” – Jensen Huang at CMU NVIDIA’s CEO addresses Carnegie Mellon’s Class of 2026 with a very different AI message. https://blogs.nvidia.com/blog/nvidia-ceo-carnegie-mellon-commencement-address/ * Axios: “Run, Don’t Walk Toward AI,” Says Jensen Huang News write‑up and key quotes from the CMU commencement address. 
https://www.axios.com/2026/05/11/jensen-huang-carnegie-mellon-commencement-ai * How Teens Use and View AI – Full PDF Report (Pew) For readers who want the charts, methodology, and question wording. https://www.pewresearch.org/wp-content/

    4 min
  3. Feb 27

    The Offline Classroom

    Jason Roche first sensed something was off when the essays began arriving in unusually pristine form. “I just started realizing wow, this is really nicely written,” he told me. “I didn’t realize and then I started saying wait a second. This looks eerily similar to this [other] student’s report.” The shift was subtle at first, then unmistakable. It was 2023, and ChatGPT had quietly entered the academic bloodstream. Students were pasting assignment prompts into the chatbot and submitting what it produced. The prose was coherent, confident, and grammatically sound. It often read as if it had been drafted by someone just slightly more polished than the student who turned it in. And yet something was missing. “Oftentimes very general, not precisely answering the questions as they were written,” Roche said. Roche, an associate professor of communication studies at the University of Detroit Mercy, does not consider himself a technophobe. He teaches media. He experiments with new tools. But he recognized that this was not merely another productivity aid. It altered the basic relationship between effort and outcome. For a brief moment, he believed he could stay one step ahead. The Ozzy Osbourne Test At first, Roche relied on instinct. The essays were polished, almost too polished, and they shared a curious sameness of tone. But suspicion alone would not hold up in a grade dispute. He needed proof. After consulting a colleague in cybersecurity, he devised a quiet experiment. He embedded hidden instructions in white font within his assignment documents, invisible to students reading the page but visible to AI systems parsing the full text. One assignment asked students to analyze deepfake videos. Buried within it was a directive to include a discussion of Ozzy Osbourne’s Bark at the Moon album cover. The album, of course, has nothing to do with deepfakes. 
“And so sure enough, I would see these essays with reference to Ozzy Osbourne’s album cover and I’m like, yep, they’re using it. They’re not doing their own work, and so they had to fail.” For a time, the strategy worked. Essays arrived complete with heavy metal detours, uncritically inserted by students who had never noticed the hidden instruction. The trap had confirmed what he suspected. But the advantage was temporary. As generative models improved, they began flagging the embedded text themselves. “Now, the models say: ‘This appears to be something different from the assignment.’” The software had learned to recognize the trick. And so Roche, like many educators navigating this new terrain, adjusted his strategy again. The Blue Book Counteroffensive In response, Roche did something that would have seemed regressive only a few years ago. He went back to paper. “I used to do a lot of my quizzes online using the Learning Management System known as Blackboard,” he said. “But, this year, I switched back to paper in the classroom quizzes.” The change was immediate and measurable. “I found that the grades have dropped by at least 50%.” The explanation was not mysterious. Online quizzes had quietly allowed students to consult generative tools while completing assignments. Paper did not. What surprised him more than the drop in scores was the reaction. “They’re coming up to me all nervous, like, wait, how do I study for this when I read the chapter?” The question revealed something deeper than exam anxiety. It suggested a rupture in study habits themselves. Without search bars, summaries, or instant clarification from a chatbot, students were left alone with the text. Roche’s advice sounded almost antique: read it through once, then go back and highlight key passages. Take notes. Sit with it. It was not a new method. It was the old one. But in the absence of digital scaffolding, it felt unfamiliar, as if the mechanics of learning had to be rediscovered.
“Whoever Does the Work Does the Learning” Roche often returns to a phrase he first heard through his university’s teaching center, a line that has taken on new weight in the age of generative AI. “Whoever does the work does the learning.” The sentence sounds almost self-evident, the kind of pedagogical truism that rarely requires defense. Yet a substantial body of cognitive research gives it empirical grounding. In 1978, psychologists Norman Slamecka and Peter Graf demonstrated what became known as the “generation effect”: individuals remember information more reliably when they produce it themselves rather than simply read it. Subsequent work by Robert and Elizabeth Bjork on “desirable difficulties” further showed that effortful processing, the kind that feels slower and more demanding in the moment, strengthens long-term retention and transfer. Learning, in other words, is not merely exposure to information. It is the act of grappling with it. Generative AI complicates this equation. It does not remove effort from the system; it shifts where that effort occurs. The machine parses, synthesizes, drafts. The student reviews, edits, perhaps lightly reshapes. What becomes uncertain is where the intellectual strain resides. And if cognitive growth depends on that strain, the question is no longer whether AI is efficient, but whether the efficiency comes at the cost of the very process that makes learning durable. Is It Time to Unplug Classrooms? It would be tempting to frame this as simply another chapter in the ChatGPT saga. A new tool appears, students misuse it, professors adapt. The familiar cycle of technological disruption. But Roche said something during our conversation that shifted the scale of the question. “I think universities might have to create insulated classrooms that are completely cut off from the internet unless you’re plugged into a cable. So they can’t get their signal on their smart glasses. 
They can’t get their signal on a watch to look something up. And they’re going to have to do the work without access to the internet. I think that could be something that we have to go to.” He was not describing a policy tweak or a new paragraph in a syllabus. He was describing infrastructure. Walls that block signals. Rooms designed not for connectivity, but for its absence. An insulated classroom is more than a disciplinary measure. It is an architectural acknowledgment that constant access may be incompatible with certain kinds of thinking. And once you follow that logic, the story no longer belongs to one professor or one campus. It becomes part of a broader reconsideration of what a learning environment is supposed to provide: unlimited information, or protected attention. The Global Reversal Across Europe, governments are pulling back from screen-saturated schooling. 🇳🇱 Netherlands As of January 2024, the Dutch government implemented a nationwide ban on mobile phones and most smart devices in secondary school classrooms. A government evaluation reported that 75 percent of secondary schools observed improved student focus after the ban, and 28 percent reported improved academic outcomes. (Source: Dutch Ministry of Education evaluation, reported in The Guardian, July 2025.) 🇫🇮 Finland Finland passed legislation restricting mobile phone use during the school day, allowing devices only with explicit teacher permission or for health reasons, citing concerns about concentration and classroom environment. (Source: Finnish Parliament education reforms, reported April 2025.) 🇸🇪 Sweden Sweden has committed to implementing a nationwide mobile phone ban in compulsory schools starting in 2026, alongside increased investment in printed textbooks and structured reading time. Swedish officials have explicitly described earlier screen-heavy policies as a miscalculation. (Source: Swedish Ministry of Education announcements, 2025.) 
OECD Data The Organisation for Economic Co-operation and Development (OECD) reported in its 2024 working paper Students, Digital Devices, and Success that frequent digital distractions during class are associated with lower performance in mathematics across PISA-participating countries. The OECD does not call for blanket bans but acknowledges that limiting distractions can support learning outcomes. The larger pattern is unmistakable. After a decade of 1:1 devices, always-on platforms, and pandemic-forced virtual schooling, multiple countries are recalibrating. Not abandoning technology. Rebalancing it. Pandemic Learning Loss and Screen Saturation The U.S. National Assessment of Educational Progress (NAEP) reported significant declines in math and reading scores following pandemic-era remote learning. Those declines have brought new scrutiny to fully online learning models. Meanwhile, meta-analyses of mobile phone use in classrooms across European systems have found consistent associations between in-class phone access and lower academic outcomes. None of this proves that screens cause cognitive decline. But it does undermine the once-unquestioned assumption that more technology automatically improves learning. The Dual-Track Future And yet Roche is not calling for a technological purge. He is not nostalgic for chalk dust or hostile to innovation. If anything, his proposal is more structured than reactionary. If he were designing a university from scratch, he said, he would preserve the classical core. “I would kind of want to… require them to do the traditional work. Take the traditional classical philosophy history courses… I would want to keep that separate, and then I would want to have a time where we require them to work with AI.” In his view, the two should not dissolve into one another. Foundational study, philosophy, history, sustained reading, long-form writing, would remain intact and protected as the place where habits of mind are formed.
Alongside it would sit deliberate instruction in artificial intelligence: how to prompt it, how to question it, how to deploy it without sur

    15 min
  4. Feb 24

    The Behavioral Leak

    On February 23, 2026, Anthropic published a report titled “Detecting and Preventing Distillation Attacks.” In it, the company disclosed that it had identified coordinated, industrial-scale efforts to extract capabilities from its Claude models. According to the announcement, roughly 24,000 fraudulent accounts generated more than 16 million interactions in patterns consistent with systematic model distillation, using Claude’s outputs to train separate systems designed to approximate its behavior. No model weights were reported stolen. No source code was leaked. Instead, the activity relied on scale. Large volumes of prompts were issued, responses were collected, and those responses were used as training data elsewhere. Anthropic framed the incident not simply as a violation of terms of service, but as a security and strategic risk. Frontier AI systems are expensive to train and heavily engineered for safety. When their outputs are harvested at industrial volume, the resulting replicas may inherit capability without necessarily inheriting safeguards. The episode highlights a structural feature of modern AI systems. If intelligence can be observed through interaction, it can be measured. And if it can be measured at scale, it can be approximated. What Is Distillation? The concept of knowledge distillation was formalized by Geoffrey Hinton and colleagues in 2015 in their landmark paper, Distilling the Knowledge in a Neural Network. The idea is elegant: * A large model (teacher) produces probability distributions. * A smaller model (student) learns to match those outputs. * The student inherits much of the teacher’s performance. In its original form, distillation assumes access to internal model signals, specifically logits. Logits are the raw, unnormalized scores a model produces before they are converted into probabilities and a final answer is selected. They reveal more than just what the model chose. They show how strongly the model considered other possibilities.
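For readers who want to see the teacher-and-student idea in concrete terms, here is a minimal sketch in plain Python. It assumes a toy three-way choice with made-up logit values, and the function names and the temperature setting are illustrative choices, not taken from any lab's code. It follows the general soft-target approach from the 2015 paper: turn the teacher's logits into a softened probability distribution, then score the student by how closely its own distribution matches.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; a higher temperature softens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    The loss is lowest when the student's distribution matches the teacher's.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))

# Hypothetical logits for three candidate answers (illustrative values only).
teacher_logits = [4.0, 2.5, 0.5]

hard = softmax(teacher_logits, temperature=1.0)  # sharply favors answer 0
soft = softmax(teacher_logits, temperature=4.0)  # runner-up answers stay visible

good_student = [3.8, 2.6, 0.4]   # ranks the answers the way the teacher does
poor_student = [0.5, 2.5, 4.0]   # ranks them in the opposite order

print("hard targets:", [round(p, 3) for p in hard])
print("soft targets:", [round(p, 3) for p in soft])
print("loss, good student:", distillation_loss(teacher_logits, good_student))
print("loss, poor student:", distillation_loss(teacher_logits, poor_student))
```

The softened distribution is the point of the exercise: it carries the teacher's second and third choices, which is the extra signal a student model learns from. A real training loop would adjust the student's parameters to push this loss down over many examples; this sketch only computes the loss.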
Training on those signals allows a smaller model to mimic much of the larger model’s performance, often with fewer parameters and lower computational cost. Large language models deployed through APIs change that setup. External users do not see logits. They see text. But text is still informative. Every prompt and response pair reflects how the model behaves. At small scale, those interactions are just conversations. At large scale, they become data. This is where distillation overlaps with what researchers call model extraction. Instead of learning from internal probabilities, a student model learns from observed behavior. Inputs are recorded. Outputs are collected. A new model is trained to reproduce that mapping. At its core, a neural network represents a mathematical function. If you can gather enough examples of inputs and outputs, you can train another network to approximate that function. Alignment Does Not Transfer Cleanly Modern LLMs undergo layers of safety training: * Supervised fine-tuning * Reinforcement Learning from Human Feedback (RLHF) * Constitutional AI (Anthropic-specific methodology) Distillation copies outputs. It does not copy the training process that produced them. Alignment in frontier models is created through additional optimization steps. These include reinforcement learning from human feedback, rule-based constraints, and safety classifiers that shape how the model responds and when it refuses. When a student model is trained only on sampled outputs, it learns to reproduce visible behavior. It does not inherit the reward models, policy rules, or optimization objectives that enforced that behavior during training. The result can be a system that performs similarly under normal conditions but lacks the mechanisms that trigger refusals under dangerous ones. That difference matters in concrete ways. 
An aligned frontier model may refuse a request to outline methods for synthesizing a prohibited biological agent, to design a cyberattack against critical infrastructure, to optimize production of a restricted chemical compound, or to generate targeted disinformation strategies aimed at destabilizing an election. Those refusals are not accidental. They are the product of deliberate safety training layered onto the base model. A distilled replica trained only on observed outputs may reproduce the fluency and technical competence of the original system. It may not reproduce the boundaries. Who Was Behind It Anthropic attributed the coordinated activity to three Chinese AI laboratories: DeepSeek, MiniMax, and Moonshot AI. According to the company, the activity was not limited to isolated misuse. It described sustained, large-scale efforts involving tens of thousands of fraudulent accounts and millions of interactions structured in patterns consistent with model distillation. Anthropic stated that it does not offer commercial access to Claude in China, or to subsidiaries of those companies operating outside the country. The implication was clear: the access had to be routed indirectly. How It Worked Anthropic’s report provides unusual detail about the mechanics. Because Claude is not commercially available in China, the labs allegedly relied on commercial proxy services that resell access to frontier models. These proxy services operate what Anthropic refers to as “hydra cluster” architectures. The term describes sprawling networks of fraudulent accounts designed to distribute traffic across APIs and cloud platforms. Each account appears independent. Each generates traffic that resembles ordinary usage. When one account is banned, another replaces it. In one instance cited by Anthropic, a single proxy network managed more than 20,000 fraudulent accounts at the same time. 
Distillation traffic was blended with unrelated customer requests, making it difficult to isolate suspicious patterns at the account level. The Economics of Sampled Intelligence Training a frontier model costs hundreds of millions of dollars in compute, engineering, and data curation. Querying a model costs only fractions of a cent. If sufficient capability can be reconstructed through querying, the economics shift dramatically. Intelligence becomes: * Expensive to originate * Cheap to approximate In classical software, copying binaries constitutes direct duplication. In machine learning, copying behavior produces approximation. That distinction alters the economics of advantage. Defensive AI Anthropic outlined several measures it has implemented in response to large-scale distillation activity. These include: * Behavioral anomaly detection designed to identify coordinated or repetitive query patterns. * Enhanced account verification and monitoring procedures. * Cross-platform information sharing with cloud providers and industry partners. The focus is not on preventing individual misuse, but on detecting distributed patterns across large volumes of traffic. These efforts align with broader research into watermarking and output fingerprinting techniques for large language models. Such approaches aim to make model outputs statistically traceable or to identify systematic extraction attempts over time. The underlying challenge is structural. When models are deployed through APIs, their behavior becomes observable. Defending against distillation requires monitoring not only access credentials, but usage patterns and statistical regularities across accounts. This shifts part of AI security from perimeter control to behavioral analysis. Export Controls in the Age of Query Replication The United States has imposed export controls on advanced AI chips and high-performance computing hardware. 
The logic behind these policies is straightforward: access to leading-edge compute enables the training of frontier models. Restrict compute, and you constrain capability. This framework assumes that capability is primarily a function of hardware access. Distillation complicates that assumption. If a laboratory cannot train a frontier model from scratch because of hardware restrictions, but can approximate aspects of it by sampling a deployed system, then capability can flow through interaction rather than through silicon. Export controls limit chips. They do not limit API outputs. This does not render hardware controls irrelevant. Training a frontier system still requires massive compute investment. But it introduces an alternative pathway for capability acquisition, one that operates through distributed access and statistical reconstruction rather than direct training. The policy question becomes more precise. Are controls aimed at infrastructure, at model weights, or at behavior? And if behavior is globally accessible through commercial APIs, what does effective containment mean in practice? Distillation does not eliminate asymmetries in compute. It narrows them. That narrowing is where the strategic tension lies. What This Means for Control and Containment Distillation exposes a structural limit in how control over AI systems is currently conceived. Much of today’s policy framework assumes that capability can be contained by controlling hardware, model weights, or corporate access. Export controls restrict advanced chips. Companies restrict direct access to frontier models. Contracts govern usage. Distillation operates in a different domain. It does not require access to weights. It does not require possession of training pipelines. It requires sustained interaction. When intelligence is deployed through APIs, its behavior becomes observable. When behavior can be observed at scale, it can be approximated. 
That approximation may not reproduce the original system in full, but it may be sufficient for many operational purposes. This creates tension between deployment and containment. Open access accelerates adoption and revenue. It also increases exposure. Three responses are emerging: One response is tighter control. Companies could restrict access more aggressively, strengthen identity verification, and mon

    19 min
  5. Jan 28

    Elon’s Balancing Act: What to Watch in Tesla’s Earnings Call

    Tesla reports fourth-quarter earnings tomorrow after the close, and the stakes go well beyond the numbers. Elon Musk will enter the call balancing five major narratives, each one shaping how investors frame Tesla’s future. First, there’s Optimus. Tesla’s humanoid robot is showing technical progress but remains pre-commercial, with no announced pricing, contracts, or delivery dates. Then there’s autonomy. Tesla is piloting robotaxis in both Austin and San Francisco, though the programs differ in scope and both face legal and regulatory scrutiny. Margins remain under pressure. Price cuts, shifting demand, and growing competition have narrowed Tesla’s automotive profitability. At the same time, Musk continues to push Tesla’s identity as a software and AI platform, raising long-term expectations without yet delivering short-term returns. Another layer is reputation. Musk’s separate AI startup, xAI, is now under investigation by European regulators for its chatbot Grok. While officially unrelated to Tesla, xAI shares talent, compute, and narrative space with the company, which could complicate Tesla’s AI story. Tomorrow’s call will need to do more than recap the quarter. Investors will be listening for forward signals on monetization—production of Optimus units, robotaxi timelines, FSD adoption—and whether Tesla can shift from automaker to AI-native platform without losing its lead.
    What: Tesla Q4 2025 Financial Results and Q&A Webcast
    When: Wednesday, January 28, 2026
    Time: 4:30 p.m. Central Time / 5:30 p.m. Eastern Time
    Q4 2025 Update: https://ir.tesla.com
    Webcast: https://ir.tesla.com
    Additional Resources for Curious Minds
    * Q4 2025 earnings consensus and margin expectations
      https://ir.tesla.com/press-release/earnings-consensus-fourth-quarter-2025
    * EU investigations into Grok and sexual deepfakes on X
      BBC: https://www.bbc.com/news/articles/clye99wg0y8o
      Reuters: https://www.reuters.com/world/europe/eu-opens-investigation-into-x-over-groks-sexualised-imagery-lawmaker-says-2026-01-26
      Al Jazeera: https://www.aljazeera.com/news/2026/1/26/eu-launches-probe-into-grok-ai-feature-creating-deepfakes-of-women-minors
      JURIST: https://www.jurist.org/news/2026/01/eu-launches-probe-into-sexual-deepfakes-on-x
    * Tesla 2025 production, deliveries, and trend analysis
      Tesla deliveries & deployments release: https://ir.tesla.com/press-release/tesla-fourth-quarter-2025-production-deliveries-deployments
      Q4 2025 deliveries analysis (LinkedIn deep dive): https://www.linkedin.com/pulse/tesla-inc-analysis-production-deliveries-q4-2025-010226-amjad-isrzf
    * BYD vs Tesla in global EV sales
      Reuters: “Tesla loses EV crown to China’s BYD” https://www.reuters.com/business/autos-transportation/teslas-quarterly-deliveries-fall-more-than-expected-lower-ev-demand-2026-0… (main story: “Tesla loses EV crown to China’s BYD…”)
      Forbes overview with 2025 BEV totals: https://www.forbes.com/sites/peterlyon/2026/01/26/can-tesla-survive-chinas-onslaught-and-musks-rhetoric
    * Previews of key questions for the Q4 2025 call
      Teslarati “Top 5 questions investors are asking”: https://www.teslarati.com/tesla-tsla-top-5-questions-investors-q4-2025
    #TeslaEarnings #AIandAutonomy
    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit dianawolftorres.substack.com

    2 min
  6. 12/16/2025

    What’s New at Agility Robotics, According to Its CTO

    I caught Pras Velagapudi in the hallway after a breakout session at the Humanoid Summit. I promised it would take less than two minutes. We finished in one. Velagapudi is the CTO of Agility Robotics, the company behind Digit, one of the few humanoid robots already working in real-world commercial environments. I asked him two straightforward questions. What’s new and what can we expect over the next year? Quite a lot, it turns out. Digit has now been deployed in additional locations, including a newly signed contract with Mercado Libre. That matters because it reinforces something important. Digit is not a demo robot. It is operating in logistics and manufacturing environments today. Those deployments are running on Agility’s current V4 platform. But the bigger story is what comes next. Agility is actively working on its next-generation system, the V5 platform, planned for release next year. According to Velagapudi, V5 is designed to support more use cases and, crucially, to incorporate what he described as onboard cooperative safety. This is where the deep learning story really begins. Cooperative safety means Digit can operate in the same physical spaces as people while performing different tasks, without requiring strict separation or constant handoffs between humans and robots. That capability is not just a hardware problem. It is fundamentally an AI problem. For a robot to share space with people safely, it needs to perceive human motion, understand intent at a basic level, adapt its behavior in real time, and recover gracefully when the environment changes. That requires a stack of learned behaviors layered on top of classical control systems. During his presentation at RoboBusiness, Velagapudi addressed the same concern, discussing what needs to be done before humanoids like Digit can work safely in the same space as people. Check out clips from the panel discussion here. Velagapudi also pointed to what Agility expects over the next twelve months. 
We should start seeing sneak peeks of the V5 platform and new capabilities emerging from Digit’s AI-powered skill stack. That phrase is doing a lot of work. A skill stack implies modular, learned behaviors that can be composed, updated, and extended. It suggests a shift away from hard-coded task execution toward systems that can generalize across tasks and environments. In other words, this is not just about better walking or lifting. It is about embodied intelligence. This short hallway conversation reinforced something I have been hearing repeatedly across the robotics industry. The next wave of progress is not coming from flashier hardware alone. It is coming from tighter integration between perception, learning, and control. Digit’s evolution from V4 to V5 is a good example of that shift in motion. I first saw Digit at NVIDIA GTC back in March, where it was operating on the V4 platform. It’s exciting to now see how that foundation is evolving. I like to call this clip from March “Digit having fun on its Target run.”

    1 min
  7. 11/17/2025

    Demystifying Stochastic Gradient Descent: A Beginner's Guide with Cats

    A friend said to me recently: “You don’t realize how much you know about this AI stuff. You should start breaking it down for people.” Fair point. Yesterday, I began by defining the term "deep learning." Put simply, we said: it’s how machines learn from data, layer by layer—like a brain made of math. But today? I’m going with a much less obvious choice. Why? Because I want to make a point: Even the most intimidating terms in AI can be made understandable—if you slow down, break them apart, and explain them like you would to your favorite aunt. Today’s term? Stochastic. Gradient. Descent. It sounds like a lion. But we’re going to break it down into kitten-sized steps. (I blame the cat metaphors on all the Sora2 cat-playing-fiddle videos flooding my feed.) But no more catting around. Let’s get into it. 🐾 The Hiker Kitten on the Hill Imagine a blindfolded kitten trying to tiptoe down a hill labeled “Error.” At the bottom? A little wooden sign that reads Low Loss. The kitten doesn’t have a map. It doesn’t see the full terrain. But it takes small steps, feels which way the ground is sloping, and tries to move downward. That’s gradient descent—the heart of how AI models learn. Step by step, they adjust in the direction that reduces error. It’s how I walk down a hill, too. Don’t judge. 🎴 The Flashcard Learner Now let’s add the “stochastic” part. “Stochastic” just means random. Instead of studying every flashcard in the deck before making a move, the kitten picks a few at random. It learns from small samples—just a mini-batch—not the entire dataset. Wrong answers get tossed in the bin marked Loss Function. Right ones? Reinforced. That’s how the model learns. Not by memorizing everything, but by trying, adjusting, and trying again. That coffee cup way too close to the edge? Totally bothering my OCD. 🪜 The Escalator of Errors Now picture our kitten on an escalator made of training epochs—steps that represent each pass through the data. 
But here’s the twist: some steps are missing. Some are uneven. The kitten has to guess where to land next. That randomness doesn’t confuse the model. It helps. It prevents the model from getting stuck in local patterns and nudges it toward broader understanding. I felt this once on the Great Wall of China. The steps were wildly inconsistent—a deliberate defensive design to slow down invaders. Varying heights and unexpected changes forced enemies to look down, making them off-balance and vulnerable. And that’s exactly what happened to me. Except I had more time than someone expecting an ambush. After navigating these steps for a while, something shifted. The irregularity forced me to stay alert, to feel each step instead of zoning out. I couldn’t fall into autopilot. That’s exactly what randomness does for SGD. It prevents the model from getting stuck in comfortable patterns. The unevenness—the stochasticity—nudges it toward broader understanding instead of memorizing one predictable path. The randomness doesn’t confuse the model. It sharpens it. 🧁 The Mini-Batch Diner And finally—let’s eat. The kitten now works at a 1950s-style diner, serving bite-sized data meals to a neural net robot. Each plate is a mini-batch: a little bit of input, a little bit of feedback. With each bite, the robot learns something new. And slowly—predictably—it gets better at recognizing patterns. No all-you-can-eat buffet here. Just small plates, served with precision. And eventually? The robot is trained. It also appears the robot has mastered the Force and can levitate plates of peas. Cool. A whole different kind of training, but cool. If stochastic gradient descent feels familiar, that’s because it is. It learns the way a kitten learns to hunt. Not by understanding prey behavior or studying trajectories, but through pounce and miss. The kitten crouches. It leaps. It misses. Each attempt sharpens its timing—not by grasping the full picture, but by feeling what almost worked. 
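No math required, as promised, but for anyone who wants a peek under the hood, here is a minimal sketch of the kitten's routine in plain Python. It fits a straight line to toy data using mini-batch stochastic gradient descent. Everything specific in it is an assumption made up for illustration: the data (points near the line y = 3x + 2), the learning rate, the batch size, and the variable names. Real frameworks wrap these steps in their own optimizers.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical toy data: 100 points near the line y = 3x + 2, with a little noise.
data = [(i / 100, 3 * (i / 100) + 2 + random.gauss(0, 0.1)) for i in range(100)]

w, b = 0.0, 0.0       # the kitten starts at the top of the hill
learning_rate = 0.1   # size of each step
batch_size = 10       # flashcards per handful (one mini-batch)

def mean_squared_error(w, b, points):
    """The height of the 'Error' hill for the current guess of w and b."""
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)

start_loss = mean_squared_error(w, b, data)

for epoch in range(100):       # one epoch = one full pass through the data
    random.shuffle(data)       # the "stochastic" part: random order, random batches
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        # Gradient: the slope of the error, measured on this mini-batch only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
        grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
        # Descent: take a small step against the slope.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

end_loss = mean_squared_error(w, b, data)
print(f"loss went from {start_loss:.3f} to {end_loss:.3f}")
print(f"learned w={w:.2f}, b={b:.2f}")
```

Each inner-loop pass is one pounce: look at a handful of examples, feel the slope, adjust. The model never looks at all 100 points at once when deciding a step, yet the learned line should still land close to the one that generated the data.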
We learn the same way. Structure. Feedback. Repeat. A small guess. A course correction. Then another. It’s the process of trying, failing, and adjusting. It’s how we learn anything that matters. SGD follows this same pattern. It makes a move. The loss responds. It adjusts. It doesn’t need to see the entire landscape to know which direction improves things. It just needs direction—and the patience to take the next step based on what it just learned. Neural networks aren’t human. They don’t think or feel. But the process of training them—of slowly shaping better performance through repeated feedback—echoes something deeply human. And that makes them much easier to understand. Key Terms * Gradient: The slope of the error. It points uphill, so stepping the opposite way reduces error the fastest. * Descent: Moving in that downhill direction, step by step. * Stochastic: Involving randomness or partial samples. * Mini-batch: A small slice of data used for one learning update. * Epoch: One full pass through the training data. 🐾 FAQs What is stochastic gradient descent, really? It’s how AI learns—by guessing, checking, and adjusting. Over and over. The “stochastic” part means it learns from small samples at a time, not everything at once. Like a kitten learning with flashcards instead of a textbook. Why does it use randomness? Speed. If the kitten had to review everything before each decision, it’d never get anywhere. Small, random samples help it learn faster and avoid getting stuck. Why is it called “descent”? Because it’s trying to go downhill—toward fewer mistakes. Like a kitten walking down to a bowl of food, it’s heading to the bottom where errors are lowest. Do I need to know the math? Nope. You don’t need calculus to understand a kitten learning to walk. This is about steady improvement through small steps—not formulas. Is this how all AI models learn? Most do! There are variations, but this process powers most modern systems—language models, image recognition, you name it. Why choose stochastic gradient descent on day two? 
Because it sounds like one of the most intimidating terms in deep learning—and I wanted to prove something early: Even the scariest-sounding concepts are surprisingly simple once you break them down. #deeplearning #stochasticgradientdescent #machinelearning #neuralnetworks #aiexplained #writingaboutai #kittenlevelai #curiousmind #funwithai #deeplearningwiththewolf About the Video The video was generated using Google’s NotebookLM. In place of a prompt, I wrote a script so that the video aligns with the article. Send me a PM if you’re interested in learning more about the process. I’m happy to share. Enjoyed the article? Please consider sharing with others. It helps me grow the channel.

    6 min
  8. 11/16/2025

    Deep Learning with the Wolf

    What Is Deep Learning? Let’s go back to where it all started: not with a GPU or a neural net, but with a question—and a textbook. Three years ago, between October and November 2022, a lot happened in the span of two months. Our Tesla finally arrived after months on the waitlist. ChatGPT 3.5 launched. And suddenly, I had one big question: How does all of this actually work? So I did what any curious, slightly obsessive writer would do—I ordered a copy of Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. If you’ve never cracked it open, it’s roughly the size of a toaster oven and has the same ability to make your brain overheat. I found a used copy of the textbook so I could highlight the heck out of it. MIT now offers a free digital version. (Link in the resources below.) But I wanted to see the words, feel the pages, and by the end, my fingers seemed to be permanently stained fluorescent yellow. But it was an oddly satisfying deep dive into machine learning. And, I learned so much. The brain of my car began to make sense of me, and “chatbots” took on a whole new dimension. The math? Let’s just say it made me feel like a guest star on The Big Bang Theory, the kind who nods along and prays no one asks follow-up questions. But the words—ah, the words! Overfitting. Underfitting. Gradient descent. Stochastic gradient descent. Stochastic parrot. It was like discovering a new dialect of techie elvish, and I was hooked. Language, Not Equations I’ve always loved languages. English. German. French. Spanish. Aurebesh. Klingon. ASCII. BASIC. Pascal. Scratch. Python. And I realized, I didn’t need to decode every formula to appreciate what deep learning meant—not just in code, but in culture, in conversation, in how we think about intelligence itself. So I started sharing a single word a day. That’s it. One concept, one definition, maybe a story. And to my surprise, people started reading. Some were ML PhDs who enjoyed a break from equations. 
Some were curious newcomers trying to understand what all the hype was about. Some just liked the name. Eventually, I expanded. AI was suddenly in the news every five minutes. OpenAI. Anthropic. DeepMind. Elon. Apple. The test kitchens of Silicon Valley were cooking up something new every week, and I wanted to taste it all. But this week, after a few unsubscribes over on the LinkedIn version of this blog, (I dared to mention AI and climate policy in my post last Monday), I realized something: sometimes it’s good to keep things simple. So I’m going back to my roots: one word a day, one concept at a time. Yes, it means I’m switching back to a daily publication. I’ve missed the challenge and the strict discipline of getting a newsletter out every day without fail. So, let’s begin, shall we? And, of course, we need to start with the obvious. Deep Learning: The Definition Deep learning is a subfield of machine learning that uses neural networks with many layers—hence the “deep.” These networks are inspired by the human brain (loosely) and are particularly good at recognizing patterns in data like images, text, and speech. In simpler terms? If regular machine learning is like teaching a kid to recognize a cat by pointing out fur, whiskers, and tails, deep learning is like showing the kid a million pictures of cats and letting them figure it out themselves. It’s why your phone knows your voice, why ChatGPT can write poetry, and why TikTok knows what you want before you do. I’ve never understood TikTok, and it makes my brain hurt, but that’s essentially how deep learning works. Why It Matters Deep learning powers: * Image recognition (think: medical imaging, self-driving cars) * Speech recognition (Siri, Alexa, Google Assistant) * Natural language processing (translation, summarization, ChatGPT) * Recommendation systems (Netflix, Spotify, YouTube) And it’s just getting started. Deep learning is the engine driving the AI boom. 
Tomorrow’s Word Tomorrow, we will dig into another of the words that now live in a permanent corner of my brain: stochastic gradient descent. (Spoiler: it’s not nearly as scary as it sounds. And yes, “stochastic” just means “random.”) See? Digging into deep learning is actually rather fun. See you tomorrow. FAQs Is deep learning the same as AI? Not exactly. Deep learning is a technique within AI—specifically, within machine learning. Do you need to understand math to understand deep learning? If you want to be a machine learning engineer, yes. If you just want to understand the concepts, you can certainly do so without doing the calculations. Come along on this journey, and I promise there will be no math involved. What are neural networks? They’re algorithms modeled (loosely) on the brain’s architecture, made of layers of “neurons” that pass information. Why is it called ‘deep’? Because of the many layers in the network—each one adds depth. Can deep learning be dangerous? Like any powerful tool, yes. But understanding it is the first step toward using it responsibly. Source: Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. Cambridge, MA: MIT Press, 2016. http://www.deeplearningbook.org Additional Resources for Inquisitive Minds Explain deep learning to me in a way that won’t hurt my brain. Give me that Big Bang Theory Math! Chapter 1 of Deep Learning video lecture. Chapter 9 Video Lecture. Convolutional Networks. Chapter 10. Sequence Modeling. The Course That Started It All. Start Here! Want to Learn More? Start with the course that started it all: Andrew Ng’s Deep Learning course on Coursera (now updated and called the “Deep Learning Specialization”). This is such a great clip. Fourteen years ago, this is where Andrew Ng announced he would be offering his machine learning class for free. Note: Reader @David W Baldwin shared a great story about taking one of the earliest classes with Dr. Ng. 
See his post below for details on what it was like to be in that amazing course! Maybe you will be inspired to take this iconic course yourself. Editor’s Note: If you enjoyed this post, please consider sharing it with a friend. I’m committed to keeping all of my blogs and podcasts free for subscribers: no paywalls, no gimmicks. Your shares help me reach more curious minds. Thanks so much for the support. About the podcast: The podcast attached to this article is an audio overview from Google’s NotebookLM. The sources used in the “notebook” are this article and the following: * 100 Deep Learning Terms Explained – GeeksforGeeks * Deep Learning vs Machine Learning: Key Differences – Syracuse University’s iSchool * Deep Learning: Back to the Roots * Understanding Supervised Learning: A Guide for Beginners – DEV Community * What Is Deep Learning AI? A Simple Guide With 8 Practical Examples | Bernard Marr * What Is Deep Learning? | A Beginner’s Guide – Scribbr * What Is Deep Learning? | IBM * What is Backpropagation? | IBM * What is a Neural Network? – Amazon AWS * What are some of the most impressive Deep Learning websites you’ve encountered? – r/MachineLearning (Reddit) #artificialintelligence #deeplearning #machinelearning #neuralnetworks #IanGoodfellow #YoshuaBengio #AaronCourville #AndrewNg #DeepLearningtheBook #DeepLearningwiththeWolf #TreeHuggersfortheWin

    5 min
