Please support this podcast by checking out our sponsors:
- SurveyMonkey: Using AI to surface insights faster and reduce manual analysis time - https://get.surveymonkey.com/tad
- KrispCall: Agentic Cloud Telephony - https://try.krispcall.com/tad
- Lindy: Your ultimate AI assistant that proactively manages your inbox - https://try.lindy.ai/tad

Support The Automated Daily directly:
Buy me a coffee: https://buymeacoffee.com/theautomateddaily

Today's topics:

AI matches doctors in ER - A Science study found OpenAI’s o1-preview matched or beat attending physicians on ER triage notes, raising questions about clinical evaluation, bias, and accountability.

AI drug discovery hits immunology - ByteDance’s Anew Labs presented a preclinical IL-17 inhibitor designed with generative AI, spotlighting the race to crack ‘undruggable’ biology and produce oral immunology drugs.

Astrocytes clear Alzheimer's plaques - Baylor College of Medicine researchers report that boosting the Sox9 protein supercharged astrocytes in mice, clearing more amyloid-beta plaque and improving memory, an early but notable new angle on Alzheimer's.

Browser development meets AI agents - Paul Kinlan argues faster models and agent tooling could shift browsers toward ‘spec to unit test’ workflows, even hinting at future intent-generated browsers with major security and reproducibility concerns.

Coding with specs and tests - Two essays converge on the same lesson: when AI makes code cheap, the scarce resource is clear requirements—using acceptance-criteria IDs, guardrails, and verifiable tests to prevent ‘lost requirements.’

AI goes deeper into defense - The US Department of Defense says it’s bringing advanced AI into classified cloud environments with multiple vendors, intensifying debates over reliability, human control, and ethics in military planning.

Google staff push back on AI - More than 600 Google employees urged leadership to block Pentagon use of Google AI, echoing past protests and highlighting how worker influence has weakened as defense contracts grow.
Alphabet challenges Nvidia valuation crown - Barron’s reports Alphabet is rapidly closing the market-cap gap with Nvidia, driven by AI-fueled momentum across Search, YouTube, and Cloud as investors watch the next earnings catalysts.

Power grids strain under AI - US electricity demand is surging due to data centers, electrification, and reshoring, widening an ‘electricity gap’ and pushing prices up while solar-and-battery buildouts race permitting and policy headwinds.

Meta faces child-safety restrictions - New Mexico prosecutors are seeking child-safety restrictions on Meta’s recommendation systems after jurors ordered major penalties, setting up a pivotal fight over algorithm regulation and free speech.

Starlink smuggling breaks internet blackouts - An underground network is smuggling Starlink terminals into Iran amid a prolonged internet shutdown, showing how satellite connectivity is reshaping censorship, risk, and information access.

Starship costs reshape SpaceX IPO - Reuters says SpaceX has spent over $15B on Starship, a key pillar of its IPO narrative—making cadence, reliability, and near-term test flights central to investor confidence.

Episode Transcript

AI matches doctors in ER

We’ll start in healthcare, with a study that’s going to fuel a lot of debate. Researchers reported in Science that an AI “reasoning model” evaluated on real emergency-department triage notes matched or outperformed attending physicians in diagnostic accuracy. In a small set of Boston ER cases, the model produced the exact—or very close—diagnosis more often than two doctors did.

The catch is the most important part: passing a diagnostic test is not the same as practicing medicine. The AI didn’t examine patients, order labs, respond to changing symptoms, or handle the human side of care. But the result still matters, because it suggests that clinical decision support may be nearing a point where accuracy isn’t the only question.
The hard questions become oversight, bias, when to trust it, and who’s accountable when it’s wrong.

AI drug discovery hits immunology

Staying in biology, ByteDance’s drug discovery group, Anew Labs, publicly presented its first AI-designed therapy candidate—an oral small molecule aimed at inhibiting IL-17, which plays a major role in autoimmune disease. The interesting wrinkle: they’re aiming at a kind of protein interaction that’s been notoriously difficult for small molecules, the sort of target people often label as “undruggable.” Anew also released a preprint describing a generative framework trained on millions of biomolecular complexes.

It’s a crowded field now, with big names and big budgets trying to turn AI training infrastructure into drug pipelines. For ByteDance, this is a statement of intent—but it’s still preclinical. The real scorecard will be clinical data, where drug development’s failure rate is brutally high.

Astrocytes clear Alzheimer's plaques

On the neuroscience front, researchers at Baylor College of Medicine reported a striking Alzheimer’s result in mice: increasing a single protein called Sox9 appeared to supercharge astrocytes—the brain’s support cells—so they cleared more amyloid-beta plaques. In mouse models that already had plaques and measurable memory problems, boosted Sox9 was linked to less plaque accumulation and better memory performance over months.

It’s early, and translating to humans is a long road. But the angle is notable: instead of only targeting plaques directly or focusing purely on neurons, this points to strengthening the brain’s own cleanup crew. If that holds up, it could broaden how researchers think about Alzheimer’s interventions.

Browser development meets AI agents

Now to AI and the web—where two different threads are starting to weave into the same story. First, Chrome and Edge are experimenting with small language models running directly in the browser.
That opens the door to private, offline features—summarizing, rewriting, quick assistance—without shipping your data off-device and without usage fees. But there’s a web-standards dilemma here. If browsers standardize AI features too early, developers may end up tuning experiences to one vendor’s model behavior, recreating the bad old days of browser-specific sites—except this time it’s prompt-specific sites. And because model outputs can be unpredictable, there are legitimate questions about whether this belongs in a stable, standardized API surface before governance, safety tooling, and fallback behavior are mature.

That uncertainty ties into a bigger, more speculative idea from Paul Kinlan: what if browser development itself gets reinvented by AI? Kinlan argues that as AI-assisted coding improves and models get faster—along with the hardware running them—browser vendors could shift from hand-building features toward a workflow driven by clearer specs and vastly more automated tests. In his vision, comprehensive test suites become the guardrails: if the spec is precise and the tests are exhaustive, AI systems can implement features more reliably, and vendors spend more time fixing failures and tracking spec changes than writing everything from scratch.

Further out, he even imagines “instant generation” browsers that assemble capabilities in real time from intent plus device constraints—potentially shrinking the web platform into a minimal, secure runtime. It’s a fascinating future, but it comes with heavy baggage: security, privacy, provenance of generated behavior, and even whether a URL still means a consistent experience across regions and devices. Even if the far horizon never arrives, the near-term point stands: expect a lot more AI inside browser teams, and a lot more pressure on standards and tests to be unambiguous.
Coding with specs and tests

And if you build software for a living, here’s the practical companion to that idea: the new bottleneck isn’t writing code—it’s keeping the requirements from evaporating as assistants and agents churn through changes. One essay argues that the main failure mode is no longer “bad code,” but “lost requirements,” especially when context windows reset and handoffs happen. The proposed fix is simple but powerful: stable, numbered acceptance criteria that can be referenced across implementation and tests, so teams can talk about coverage of intent, not just file diffs.

Another related push comes from Addy Osmani, who’s been promoting “agent skills”—structured checklists that force agents through the unglamorous steps: planning, tests, trust boundaries, reviewable pull requests, and evidence that changes are correct. The theme across both: if agents are going to write more of the code, humans will need sharper guardrails around what ‘done’ actually means.

AI goes deeper into defense

Speaking of AI moving from experiments to high-stakes environments: the US Department of Defense says it’s integrating advanced AI capabilities into sensitive—and even classified—cloud systems, with support from a roster of major tech providers. The Pentagon framed it as part of an “AI-first” acceleration strategy, with potential uses ranging from intelligence sorting to simulations and planning.

This is significant not because it’s surprising the military wants AI, but because it signals operationalization inside the most restricted environments. That raises the bar for reliability, auditability, and human control—especially when decisions can escalate fast and consequences are irreversible.

Google staff push back on AI

Inside Google, that shift toward defense work is clearly not universally popular. More than 600 employees reportedly signed a letter urging CEO Sundar Pichai to block Pentagon use of Google’s AI in classified operations.
It echoes the Project Maven protests from 2018, but the company’s posture appears different this time—more aligned with national-security contracting than pulling back. The report also paints a picture of a tighter internal climate, with workers describing restrictions on political discussion.