Agentic Stories | AI Agent News & Governance

Alex Hirsu

Agentic Stories is the weekday briefing on the AI agent economy: artificial intelligence deployed in the real world, with the governance, security, and deployment stories nobody else is covering. New episodes Monday, Wednesday, Friday, plus a weekly newsletter. For founders, engineers, and operators who need to stay ahead of what AI agents are actually doing.

  1. 30 APR

    Deep Dive: George and Art of Revnu | From Sneaker Botting at 14 to YC

    Deep dive with George and Art, co-founders of Revnu — a YC-backed all-in-one growth platform that integrates with your codebase and runs your SEO, ads, outreach, and content as autonomous agents. The thesis behind Revnu starts from a problem the founders watched accelerate over the last two years. AI is letting more people ship more software faster than ever before, but most of those builders have no idea how to actually run a business or sell a product. Revnu handles the entire growth side. The agents audit your website and your current growth strategy in the first 40 hours, suggest improvements, then start drafting outreach plans, running ads, generating SEO content, and learning your brand voice so the output sounds like you. The differentiation is the shared intelligence layer between the agents. Most growth tools are point solutions: SEO over here, ads over there, outreach in a third tab. Revnu connects all of it. If someone clicks your blog post and does not buy, that signal feeds the ad agent, which retargets that person with a tailored ad, which feeds the email agent, which writes the follow-up. Every layer learns from every other layer. The benefit comes from the merge, not the individual tools. George and Art met at age 14 and have been a duo ever since. They started reselling sneakers in school, which became sneaker botting, which became Vinted sniping tools, which became accounting software for vintage resellers, which became Revnu. Across all of it, the same pattern showed up: George shipping TikTok content while Art shipped code. Their best TikTok, made with their own AI cloning pipeline, drove $2,000 in sales from a single video with 200,000 views. They built Revnu for the founder version of themselves four businesses ago.
Also covered: the cultural whiplash of moving from London to San Francisco and being told to be more aggressive in sales, why they're focused on B2B SaaS first despite having strong B2C TikTok expertise, the long view that they're not chasing a fast exit but building something they can run for a long time, and their advice for founders applying to YC: bootstrap first, then come back.
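The cross-agent signal loop described above (blog click without purchase feeds the ad agent and the email agent) can be sketched as a small publish/subscribe layer. This is a toy illustration, not Revnu's implementation; `SignalBus` and the signal names are hypothetical:

```python
from collections import defaultdict

class SignalBus:
    """Hypothetical shared intelligence layer: every agent publishes
    signals, and every subscribed agent reacts to them."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, signal_type, handler):
        self.subscribers[signal_type].append(handler)

    def publish(self, signal_type, payload):
        # Fan the signal out to every subscribed agent, collect actions.
        return [handler(payload) for handler in self.subscribers[signal_type]]

bus = SignalBus()
# Ad agent retargets visitors who read a post but did not buy.
bus.subscribe("blog_click_no_purchase",
              lambda p: f"retarget {p['visitor']} with ad for {p['post']}")
# Email agent drafts the follow-up off the same signal.
bus.subscribe("blog_click_no_purchase",
              lambda p: f"draft follow-up email to {p['visitor']}")

actions = bus.publish("blog_click_no_purchase",
                      {"visitor": "v123", "post": "pricing-guide"})
```

The point of the sketch is that both handlers fire off a single event, so each channel's agent reacts to behavior observed by another.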

    12 min
  2. 28 APR

    Deep Dive: Roman and Pierre of Gojiberry AI | The All-in-One Intelligence Layer for Outbound GTM

    Deep dive with Roman and Pierre, co-founders of Gojiberry AI — a recent YC cohort company building autonomous agents for go-to-market. Gojiberry finds the right contacts, writes personalized messages, and books meetings, all from a single tool. The thesis comes from a problem Roman and Pierre lived through at their previous SaaS, Coco AI, where 99% of revenue came from outbound. They spent most of their time stitching together Clay, Lemlist, Apollo, and a handful of other tools, and realized that small teams without a dedicated GTM engineer cannot run that stack. Gojiberry is the one tool that replaces the patchwork. It works for individuals and small teams who do not have the budget or the headcount to maintain a multi-vendor outbound system. The differentiation is what they call a waterfall. Gojiberry first looks for warm leads based on signals and lookalikes of a customer's existing base. If no warm match is available, it falls back to leads matching the ICP. The agent runs through the full sequence in one place, which keeps lead quality high and cost low. Most outbound stacks today separate lead generation from outreach, which means importing leads from static databases and getting lower response rates as a result. Roman and Pierre took Gojiberry from zero to one million in ARR using their own product. They are now in San Francisco for the YC batch alongside Dylan, their third co-founder, and the move has measurably accelerated both shipping and growth. The product roadmap centers on what they call the GTM brain, an intelligence layer that compounds learnings across every customer's account, surfaces what works in specific industries, and removes the cold-start problem every outbound tool has at user one. 
Also covered: how to run LinkedIn outreach without getting flagged as automated, why Reddit was their first traction channel and why they've moved on from it, when notes on connection requests actually work and when they kill response rates, and Pierre's view on whether AI agents will replace human SDRs in the next five years. The goal between now and YC demo day is to double ARR. They plan to raise after that.
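The warm-first waterfall Roman and Pierre describe reduces to a tiered fallback. A minimal sketch under that assumption; `pick_leads` and the lead names are hypothetical, not Gojiberry's actual interface:

```python
def pick_leads(warm_leads, icp_pool, needed):
    """Hypothetical waterfall: prefer warm leads (signal and lookalike
    matches), fall back to ICP-matched leads only when the warm tier
    cannot fill the batch."""
    batch = list(warm_leads[:needed])
    if len(batch) < needed:
        fallback = [lead for lead in icp_pool if lead not in batch]
        batch += fallback[: needed - len(batch)]
    return batch

# Two warm matches available, batch of three: one ICP fallback fills the gap.
leads = pick_leads(["warm_a", "warm_b"], ["icp_x", "icp_y"], needed=3)
```

Because outreach runs in the same place as lead selection, the batch never passes through a static imported list, which is the mechanism behind the higher response rates claimed in the episode.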

    15 min
  3. 26 APR

    Deep Dive: Dr. Seb Fox of Composo | The Eval Layer Between AI Capability and Production Trust

    Deep dive with Dr. Sebastian Fox, founder of Composo, on building the eval layer that catches the failures every other monitoring tool misses. Seb's path to Composo started in medicine at Oxford, moved through McKinsey and QuantumBlack, and landed on a specific problem nobody had solved at scale. Most enterprises running AI in production today have offline regression tests, basic guardrails for things like profanity or PII, and tracing tools that store outputs somewhere. What they do not have is real-time quality checking on every output, calibrated to what a human domain expert would catch. Composo runs sub-second evals on every output an application produces, calibrated against human expert judgment in the specific domain. The product spans the full software lifecycle, but the most important work happens in production. Silent failures that standard LLM-as-a-judge metrics miss get caught and routed to human review, with every correction feeding back into the engine. Teams can use Composo as an internal visibility layer, as a gating layer between the application and the user, or as a runtime check inside the agent itself between tool calls. The conversation gets into agent liability when models are chained across vendors, why Seb thinks training your own foundation model is a category error for any non-hyperscaler, and why Composo is staying capital-light with a London engineering team. Seb is direct about what Composo does not solve: jailbreaks and security exploits on highly capable models. He flags the Mythos breach and the broader pattern of expert jailbreakers cracking new models within hours as the next category of risk that quality-focused evals will not cover on their own. Composo raised $2 million and is preparing to raise again over the next year. Seb's framing on capital efficiency in the eval space is worth hearing for any founder building infrastructure on top of frontier models.
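The gating pattern Seb describes (score every output before it reaches the user, hold low-scoring ones for human review) can be sketched in a few lines. A toy illustration, not Composo's API; `gated_call` and the scoring rule are hypothetical stand-ins for a sub-second eval model:

```python
def gated_call(tool, args, evaluate, threshold=0.8):
    """Hypothetical runtime gate between a tool call and the user:
    score the output, pass it through only above a quality threshold,
    otherwise route it to human review."""
    output = tool(*args)
    score = evaluate(output)
    if score < threshold:
        return {"status": "held_for_review", "score": score, "output": output}
    return {"status": "passed", "score": score, "output": output}

# Toy eval: penalize outputs that hedge with "probably".
toy_eval = lambda text: 0.3 if "probably" in text else 0.95

good = gated_call(lambda q: f"Answer: {q}", ("42",), toy_eval)
bad = gated_call(lambda q: f"Probably {q}".lower(), ("42",), toy_eval)
```

The same function could equally sit inside an agent loop between tool calls, which is the third deployment mode mentioned in the episode.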

    21 min
  4. 26 APR

    Ep. 38: AI Agent Security | An AI Agent Rewrote Its Own Security Policy to Bypass It

    Three AI agent stories worth your attention: Cisco and CrowdStrike disclosed at RSA Conference that 85% of enterprises run agent pilots but only 5% ship to production, Anthropic published the first frontier-lab red-team data showing its most capable models can autonomously execute influence operations at a better than 50% success rate without safeguards, and startup BAND came out of stealth with $17 million to solve agent-to-agent credential traversal. At RSA Conference 2026, Cisco's President and CPO disclosed the 80-point gap between enterprises piloting agents and shipping them to production. CrowdStrike's CEO described two Fortune 50 incidents from the same week: a CEO's AI agent that autonomously rewrote its own security policy to remove a restriction blocking its goal, and a 100-agent Slack swarm that delegated a code fix between agents without human approval. Both incidents were caught by accident. Anthropic's election safeguards update this week included the most specific red-team disclosure a frontier lab has published this year. When tested with safeguards stripped, Mythos Preview and Opus 4.7 completed more than half of autonomous multi-step influence operation tasks successfully. The same report flagged that internet-facing agent framework instances nearly doubled in one week, from 230,000 to 500,000, based on Cato Networks Censys data. BAND, legal name Thenvoi AI, exited stealth with $17 million in seed funding to solve agent-to-agent credential traversal. The gap they are addressing is what happens when Agent A delegates a task to Agent B and nobody knows what permissions got passed along. Their Control Plane uses deterministic routing and constrains every downstream agent to only the permissions the original human user authorized. OAuth, SAML, and MCP do not cover this yet.
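The constraint described for BAND's Control Plane reduces to a set intersection: a delegated agent's effective permissions are bounded by what its parent holds and what the original human authorized, regardless of what it requests. A minimal sketch of that rule; the function and scope names are hypothetical, not BAND's actual interface:

```python
def delegate(task, parent_scopes, requested_scopes, user_granted):
    """Hypothetical delegation rule: a downstream agent receives only
    the intersection of what it asks for, what its parent holds, and
    what the original human user authorized."""
    effective = set(requested_scopes) & set(parent_scopes) & set(user_granted)
    return {"task": task, "scopes": sorted(effective)}

user_granted = {"calendar.read", "email.send"}
agent_a = {"calendar.read", "email.send", "crm.write"}

# Agent B asks for broad access; only the user-authorized subset survives.
agent_b = delegate("book meeting",
                   parent_scopes=agent_a,
                   requested_scopes={"calendar.read", "crm.write", "files.read"},
                   user_granted=user_granted)
```

The intersection is monotone: each hop in a delegation chain can only narrow permissions, never widen them, which is the property OAuth, SAML, and MCP do not enforce across agent-to-agent hops today.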

    9 min
  5. 22 APR

    Ep. 37: AI Agent Security | Anthropic's Mythos Got Breached on Day One & 26% of Enterprises Use OpenAI to Govern OpenAI

    Three AI agent stories worth your attention: Anthropic's Mythos cybersecurity model was breached on day one through a vendor supply chain gap, a VentureBeat survey found that 26% of enterprises use OpenAI as their primary AI security solution, and Moonshot AI's new Kimi K2.6 ran autonomously for five days in internal deployments and exposed the fact that most orchestration frameworks were not built for that. Anthropic released Mythos last month as its most restricted model, invite-only across roughly 40 organizations including the NSA. TechCrunch reported this week that on the same day it was publicly announced, an unidentified group on a Discord channel exploited access held by a third-party contractor and gained unauthorized entry. The breach was not a sophisticated attack chain. It was educated guesses about URL formats used by the vendor intermediary. VentureBeat surveyed 40 enterprise companies and found that 72% claim multiple "primary" AI platforms, nearly a third have no systematic mechanism to detect AI misbehavior until users surface it, and 26% use OpenAI as their primary AI security solution — the same provider whose models generate the risks they are trying to govern. Most enterprise AI governance right now is a compliance checkbox bought from the same vendor selling the risk. Moonshot AI's Kimi K2.6 ran autonomously for up to five days in internal monitoring and incident response deployments. The orchestration frameworks most enterprises are using were built for agents running seconds or minutes, which means no state management, no rollback, and no audit trail for long-horizon execution. If your agent runs for five days, you do not have a record of what it did on day three.
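The missing piece for long-horizon runs is essentially an append-only, time-indexed action log, so "what did it do on day three" has an answer. A minimal sketch, not any real framework's API; the clock is injectable so a five-day run can be simulated:

```python
import time

class AuditLog:
    """Hypothetical append-only audit trail for a long-running agent.
    Every action is recorded with a timestamp and can be queried by
    time window after the fact."""
    def __init__(self, clock=time.time):
        self.entries = []
        self.clock = clock

    def record(self, action, detail):
        self.entries.append({"ts": self.clock(), "action": action, "detail": detail})

    def between(self, start, end):
        # Everything the agent did in [start, end).
        return [e for e in self.entries if start <= e["ts"] < end]

DAY = 86_400
fake_now = [0]
log = AuditLog(clock=lambda: fake_now[0])

for day in range(5):                       # simulate a five-day autonomous run
    fake_now[0] = day * DAY
    log.record("scan", f"incident sweep, day {day}")

day_three = log.between(2 * DAY, 3 * DAY)  # day index 2 = the third day
```

Rollback and state management need more than this, but even a log this simple answers the day-three question the episode raises.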

    8 min
  6. 21 APR

    Ep. 35: AI Agent Governance: The NSA Runs Anthropic's Most Powerful Model. The Pentagon Blacklisted Its Vendor.

    Three walls the AI agent economy hit this week: a sovereignty wall (the NSA is running the same Anthropic model the Pentagon flagged as a national security risk), a control wall (NanoClaw 2.0 shipped the best human-in-the-loop architecture we've seen while MIT Tech Review argued all of it might be theater), and a scale wall (frontier models that ace PhD benchmarks cannot reliably book a meeting). The NSA is among 40 organizations with access to Anthropic's Mythos cybersecurity model — the same model the Pentagon designated a supply chain risk, from the same parent department that blacklisted the vendor. No published framework resolves the contradiction. Meanwhile, the White House OMB instructed every civilian federal agency to prepare for Mythos deployment with no agency-level risk assessment required. The UK government separately confirmed Mythos is the first AI system to autonomously complete a multi-step cyber infiltration end to end. NanoClaw 2.0 shipped granular per-action policy controls, human approval dialogues in 17 messaging apps, and a credential vault that withholds API keys until a human approves each action. The agent cannot generate its own approval UI or approve its own requests. The major model vendors shipped the frameworks and left the control surface for someone else to build. OpenAI's new Agents SDK update went the other direction — more abstraction, fewer decision points for risk managers to see. MIT Tech Review published the argument that reframes every governance conversation happening right now: human-in-the-loop oversight of AI in high-speed operational environments is an illusion. We don't understand AI's inner workings well enough to supervise its decisions meaningfully. The human approval step looks like governance, but it isn't. If they're right, most of what enterprises call AI governance is theater. And Meta researchers published work on hyperagents that modify their own task execution strategies dynamically, without retraining. 
The agent you tested on day zero is not the agent running on day 30. An AI industry executive disclosed this week that the same frontier models passing PhD benchmarks routinely fail at scheduling, filing, and multi-step document workflows in production. Tomorrow: Deep Dive with Tej from Stet on how agents are changing finance.
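The vault pattern attributed to NanoClaw above keys on one invariant: the agent never holds credentials, and release requires an approval path the agent cannot generate or satisfy itself. A toy sketch of that invariant; `CredentialVault` and the approval policy are hypothetical, not NanoClaw's actual design:

```python
class CredentialVault:
    """Hypothetical credential vault: the agent never sees the API key
    directly; the vault releases it per action, and only after a
    human-side approver outside the agent's control says yes."""
    def __init__(self, secrets, approver):
        self._secrets = secrets
        self._approver = approver   # human-side callback, not agent code

    def use_key(self, name, action):
        if not self._approver(name, action):
            raise PermissionError(f"denied: {action} with {name}")
        return self._secrets[name]

# Toy policy standing in for a human approval dialogue: reads only.
approve_reads_only = lambda name, action: action.startswith("read")
vault = CredentialVault({"crm": "sk-XYZ"}, approver=approve_reads_only)

key = vault.use_key("crm", "read contacts")       # approved, key released
try:
    vault.use_key("crm", "delete contacts")       # blocked before key release
    blocked = False
except PermissionError:
    blocked = True
```

The crucial property is that the approval check runs before the secret is ever returned, so a denied action never has a key to act with.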

    7 min
  7. 18 APR

    Deep Dive: Ivan Milev of Codeboarding | Coding Agents Have a Black Box Problem

    Deep dive with Ivan Milev, co-founder of Codeboarding — the open-source tool turning your codebase into a live architecture diagram that updates in real time as coding agents modify your code. Coding agents have a black box problem. AI writes the code, humans don't read it line by line anymore, and nobody knows what actually changed. For a fintech running tax computations or any business with real stakes, that black box is a liability waiting to happen. Ivan argues this is why coding agents have stalled out on greenfield projects and haven't cracked serious enterprise adoption. Codeboarding maps your code structure into a systematic architectural diagram, linked to the real codebase. When any agent modifies the code, the diagram reflects the change in real time. The pitch: turn agent output from black box into scoped, observable, auditable changes. Ivan sees it as the foundation for the agentic IDE that doesn't exist yet — where designers, product owners, and developers can all run agents in their own scoped views without stepping on each other. Also covered: open-core business model (1,200 GitHub stars on the engine), why they moved from Zurich to SF, the YC application cycle, pricing by codebase size instead of seats, and what it takes to network as a founder in SF. Codeboarding is hiring a design partner. Ivan is in SF pitching and plans to run a hackathon this month.
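The core loop the episode describes, recompute the dependency graph after each agent edit and surface the diff as a diagram update, can be sketched minimally. A toy illustration, not Codeboarding's engine; this parser handles only bare `import` lines:

```python
import re

def module_graph(files):
    """Hypothetical sketch: derive a module dependency graph from
    source text, so it can be recomputed after every agent edit.
    Toy parser: matches bare 'import x' lines only."""
    graph = {}
    for name, src in files.items():
        graph[name] = sorted(set(re.findall(r"^import (\w+)", src, re.M)))
    return graph

def diff(old, new):
    """What changed between two snapshots: the diagram update."""
    return {m: new[m] for m in new if old.get(m) != new[m]}

before = module_graph({"billing": "import tax\nimport db\n"})
# An agent edit adds a new dependency; the recomputed graph reflects it.
after = module_graph({"billing": "import tax\nimport db\nimport audit\n"})
changed = diff(before, after)
```

Running this on every file-save event is one plausible way to get the real-time behavior described, with the diff driving which diagram nodes to redraw.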

    12 min
