The Chief AI Officer Show

Front Lines

The Chief AI Officer Show bridges the gap between enterprise buyers and AI innovators. Through candid conversations with leading Chief AI Officers and startup founders, we unpack the real stories behind AI deployment and sales. Get practical insights from those pioneering AI adoption and building tomorrow’s breakthrough solutions.

  1. AI Won't Break Your Security Program. Your Gaps Will.

    MAR 26

    AI Won't Break Your Security Program. Your Gaps Will.

    Most security leaders treat AI as a new threat category requiring new defenses. Rohit Parchuri, SVP and Chief Information Security Officer at Yext, pushes back hard on that. His argument: if your foundational controls are solid, AI does not require you to rebuild anything. What it does is amplify whatever you already have, gaps included, which makes the real question not "what new controls do we need?" but "how well are we actually executing on what we already built?" Rohit walks host Ben Gibert through how Yext is operationalizing this at scale: threat-modeling AI as just another system with inputs, processing, and outputs; building AI security testing directly into the existing CI/CD pipeline rather than standing it up as a separate track; investing heavily in data classification and taxonomy to solve DLP before deploying any AI tool internally; and establishing an AI Excellence Committee with cross-functional representation to run a single governance funnel across every AI request in the company. He also makes the case that the CISO who earns a seat at the AI strategy table is the one who deeply understands the business value chain, not just the threat landscape. Topics discussed: Threat-modeling AI as a system instead of a threat category Why existing security controls are sufficient for AI today Integrating AI security testing into CI/CD without adding process overhead Data classification and taxonomy as prerequisites for safe internal AI adoption Using an AI Bill of Materials as a transparency mechanism How Yext's AI Excellence Committee runs a single governance funnel Build vs. buy decision-making for AI security tooling What separates strategic CISOs from tactical operators in the age of AI The CISO's role in enabling AI adoption rather than blocking it

    45 min
  2. Building AI agents that fix production incidents before engineers wake up

    MAR 12

    Building AI agents that fix production incidents before engineers wake up

    Diamond Bishop has spent 15 years building AI systems at Microsoft (Cortana), Amazon (Alexa), and Facebook (PyTorch) before founding an AI DevOps startup that Datadog acquired. Now running Datadog's AI Skunk Works, a deliberately small interdisciplinary team modeled on Lockheed's original, he's focused on a question most enterprise AI teams aren't asking yet: what does your product look like if humans are no longer the primary customer? That question drives everything from Bits AI, their production SRE and security agent, to a set of longer-range bets organized around three pillars: personalized agent learning, enterprise agent infrastructure, and eval. Diamond breaks down how he structures each one, why the demo-to-production gap comes down to data and eval rather than model capability, and where the real unsolved problems in agent development still sit. Topics discussed: Bits AI's capabilities in production across SRE incident response, security analysis and code generation Three-pillar agent development framework: personalized learning, enterprise infrastructure and eval LoRA-style adapter architecture for layering custom per-user agents on top of first-party agents Why SRE agent startups without proprietary observability data face a structural disadvantage at production scale Service graph and entity relationship context as a structured alternative to RAG for DevOps agents Skunk Works team design: staying small and interdisciplinary to move like a startup inside a public company The shift from human-operated cloud services to ambient AI-native services built to run with fewer humans over time Crawl-walk-run path for enterprise agent adoption: from LangGraph-based Python agents to continuously learning systems Why concentrating AI research investment in transformer scaling creates long-term architectural risk Building agent-native tooling rather than repurposing interfaces designed for humans

    42 min
  3. How Xoriant ties compensation to AI metrics: The revenue, margin, and brand multiple framework

    FEB 26

    How Xoriant ties compensation to AI metrics: The revenue, margin, and brand multiple framework

    Most enterprise AI initiatives die in pilot purgatory because organizations chase peripheral use cases instead of embedding AI into core business processes. Vineet Moroney, Chief Transformation Officer at Xoriant, a 6,000-person engineering services firm, has built a measurement system that eliminates this problem: tie AI directly to three financial metrics (revenue, margin, brand multiple) and make 50% of performance bonuses dependent on them. His framework separates AI revenue into two categories: "with AI" (AI-led service transformation like platform modernization) and "for AI" (building AI capabilities on customer platforms). AI margin captures efficiency gains from tool usage that improve project delivery economics. AI multiple quantifies brand value and downstream revenue from innovative deployments. This structure forces teams to distinguish between projects that matter and expensive experiments. When Xoriant's CFO wanted to reduce Days Sales Outstanding, Vineet built an invoice payment prediction model at 87% accuracy that eliminated a five-person AR team and cut DSO by two days. The solution required no expensive models, just strategic business case selection. For manufacturing clients, he's deploying edge AI on legacy sensor infrastructure for predictive maintenance without sensor replacement, creating new service revenue streams from installed equipment bases. Topics discussed: Three-part AI revenue model distinguishing "with AI" service transformation from "for AI" capability building on customer platformsCompensation structure allocating 50% of performance bonuses across AI revenue generation, margin improvement, and brand multipleThe EXB framework quantifying AI returns through efficiency gains, experience improvements via customer lifetime value, and business impact from downstream revenueTwo-week POC to 90-day production methodology with AI assurance testing protocols for non-deterministic system validationFive prerequisite elements for POC survival: strategic alignment, C-suite sponsorship, urgent business need, allocated budget, and core process focusEdge AI monetization on legacy sensor infrastructure for predictive maintenance and service offering creation without hardware replacementInvoice payment prediction at 87% accuracy reducing five-person AR teams to single-person operations while cutting DSO by two daysWhy golden dataset POCs fail at scale due to latency, inconsistency, and infrastructure readiness gapsSales approach for skeptical executives: lead with customer pain points, prove with similar completed work, commit to rapid production timelinesMiddle management resistance as the primary adoption barrier despite CEO enthusiasm and junior staff willingness to adopt AI tools

    47 min
  4. The infrastructure mistake that kills AI pilots: Why sandboxes can't reach enterprise data centers

    FEB 12

    The infrastructure mistake that kills AI pilots: Why sandboxes can't reach enterprise data centers

    Lenovo cut parts planning from six hours to 90 seconds by treating infrastructure architecture as a first-class constraint, not an afterthought. Linda Yao, VP and GM of Hybrid Cloud and AI Solutions, has deployed AI across manufacturing, healthcare diagnostics, and enterprise operations. Her core thesis: most organizations fail at scale not because of use cases or data quality, but because they architect pilots in sandboxes that can't translate to production enterprise data centers. Through Lenovo's internal deployments and customer implementations, Yao has built a systematic approach to moving past experimentation. Her team developed what they call an AI library of battle tested use cases with proven deployment architectures, from computer vision systems that augment special education therapists to diagnostic tools preventing blindness in underserved regions. The methodology centers on a critical insight: ongoing monitoring and model management represents the capability gap causing implementations to plateau after initial deployment. Topics discussed: Five-stage methodology where ongoing monitoring of drift, model updates, and agent evolution separates successful deployments from stalled pilots Infrastructure architecture coherence requirement between pilot and production environments to enable actual scaling Enterprise planning agents orchestrating across personal wellness, workload management, and digital employee experience using full device stack ownership AI factory model for rapid diagnostic tool development and field distribution in resource constrained healthcare settings Hybrid deployment trend reversing decade long cloud first mentality due to data governance and compliance requirements Four pillar readiness assessment covering security, data quality, people capability, and technology infrastructure before deployment Build leverage partner philosophy for full stack integration with pre tested component validation and reference architectures Liquid cooling technology deployment addressing GPU energy consumption and data center sustainability constraints at scale

    44 min
  5. How incident.io built AI agents that draft code fixes within 3 minutes of an alert

    JAN 29

    How incident.io built AI agents that draft code fixes within 3 minutes of an alert

    Lawrence Jones, product engineer at incident.io, describes how their AI incident response system evolved from basic log summaries to agents that analyze thousands of GitHub PRs and Slack messages to draft remediation pull requests within three minutes of an alert firing. The system doesn't pursue full automation because the real value lies elsewhere: eliminating the diagnostic work that consumes the first 30-60 minutes of incident response, and filtering out the false positives that wake engineers unnecessarily at 3am. The core architectural decision treats each organization's incident history as a unique immune system rather than fitting generic playbooks. By pre-processing and indexing how a specific company has resolved incidents across dimensions like affected teams, error patterns, and system dependencies, incident.io generates ephemeral runbooks that surface the 3-4 commands that actually worked last time this type of failure occurred. This approach emerged from recognizing that cross-customer meta-models fail because incident response is fundamentally organization-specific: one company's SEV-0 is an airline bankruptcy, another's is a stolen laptop. The engineering challenge centers on building trust with deeply skeptical SRE teams who view AI as non-deterministic chaos in their deterministic infrastructure. Lawrence's team addresses this through custom Go tooling that enables backtest-driven development: they rerun thousands of historical investigations with different model configurations and prompt changes, then use precision-focused scorecards to prove improvements objectively before deploying. This workflow revealed that traditional product engineers struggle with AI's slow evaluation cycles, while the team succeeded by hiring for methodical ownership over velocity. Topics discussed: Balancing precision versus recall in agent outputs to earn trust from SRE teams who are "hardcore AI holdouts" Pre-processing incident artifacts (PRs, Slack threads, transcripts) into queryable indexes that cross-reference team ownership, system dependencies, and historical resolution patterns Model selection strategy: GPT-4.1 for cost-effective daily operations, Claude Sonnet for superior code analysis and agentic planning loops Backtest infrastructure that reruns thousands of past investigations with modified prompts to objectively validate changes through scorecard comparisons Building ephemeral runbooks by extracting which historical commands and fixes worked for similar incidents, filtered by what the organization learned NOT to do in subsequent incidents Prioritizing alert noise reduction over autonomous remediation because the false positive problem has clearer ROI and lower risk Why AI engineering teams fail when staffed with traditional engineers optimized for fast feedback loops rather than tolerance for non-deterministic iteration Building entirely custom tooling in Go without vendor frameworks due to early ecosystem constraints and desire for native product integration The evaluation problem where only engineers who invested hundreds of hours building a system can predict how prompt changes cascade through multi-step agentic workflows

    45 min
  6. Building AI agents for infrastructure where one mistake makes Wall Street Journal headlines

    JAN 16

    Building AI agents for infrastructure where one mistake makes Wall Street Journal headlines

    Alexander Page transitioned from sales engineer to engineering director by prototyping LLM applications after ChatGPT's launch, moving from initial prototype to customer GA in under four months. At Big Panda, he's building Biggy, an AIOps co-pilot where reliability isn't negotiable: a wrong automation execution at a major bank could make headlines. Big Panda's core platform correlates alerts from 10-50 monitoring tools per customer into unified incidents. Biggy operates at L2/L3 escalation: investigating root causes through live system queries, surfacing remediation options from Ansible playbooks, and managing incident workflows. The architecture challenge is building agents that traverse ServiceNow, Dynatrace, New Relic, and other APIs while maintaining human approval gates for any write operations in production environments. Page's team invested months building a dedicated multi-agent system (15-20 steps with nested agent teams) solely for knowledge graph operations. The insertion pipeline transforms unstructured data like Slack threads, call transcripts, and technical PDFs with images into graph representations, validating against existing state before committing changes. This architectural discipline makes retrieval straightforward and enables users to correct outdated context directly, updating graph relationships in real-time. Where vector search finds similar past incidents, the knowledge graph traces server dependencies to surface common root causes across connected infrastructure. Topics discussed: Moving LLM prototypes to production in months during GPT-3.5 era by focusing on customer design partnershipsEvaluating agentic systems by validating execution paths rather than response outputs in non-deterministic environmentsBuilding tool-specific agents for monitoring platforms lacking native MCP implementationsArchitecting multi-agent knowledge graph insertion systems that validate state before write operationsImplementing approval workflows for automation execution in high-consequence infrastructure environmentsDesigning RAG retrieval using fusion techniques, hypothetical document embeddings, and re-representation at indexingScaling design partnerships as extended product development without losing broader market applicabilitySeparating read-only investigation agents from write-capable automation agents based on failure consequence modeling

    47 min
  7. ACC’s Dr. Ami Bhatt: AI Pilots Fail Without Implementation Planning

    12/18/2025

    ACC’s Dr. Ami Bhatt: AI Pilots Fail Without Implementation Planning

    Dr. Ami Bhatt's team at the American College of Cardiology found that most FDA-approved cardiovascular AI tools sit unused within three years. The barrier isn't regulatory approval or technical accuracy. It's implementation infrastructure. Without deployment workflows, communication campaigns, and technical integration planning, even validated tools fail at scale. Bhatt distinguishes "collaborative intelligence" from "augmented intelligence" because collaboration acknowledges that physicians must co-design algorithms, determine deployment contexts, and iterate on outputs that won't be 100% correct. Augmentation falsely suggests AI works flawlessly out of the box, setting unrealistic expectations that kill adoption when tools underperform in production. Her risk stratification approach prioritizes low-risk patients with high population impact over complex diagnostics. Newly diagnosed hypertension patients (affecting 1 in 2 people, 60% undiagnosed) are clinically low-risk today but drive massive long-term costs if untreated. These populations deliver better ROI than edge cases but require moving from episodic hospital care to continuous monitoring infrastructure that most health systems lack. Topics discussed: Risk stratification methodology prioritizing low-risk, high-impact patient populations Infrastructure gaps between FDA approval and scaled deployment Real-world evidence approaches for AI validation in lower-risk categories Synthetic data sets from cardiovascular registries for external company testing Administrative workflow automation through voice-to-text and prior authorization tools Apple Watch data integration protocols solving wearable ingestion problems Three-part startup evaluation: domain expertise, technical iteration capacity, implementation planning Real-time triage systems reordering diagnostic queues by urgency

    45 min
5
out of 5
11 Ratings

About

The Chief AI Officer Show bridges the gap between enterprise buyers and AI innovators. Through candid conversations with leading Chief AI Officers and startup founders, we unpack the real stories behind AI deployment and sales. Get practical insights from those pioneering AI adoption and building tomorrow’s breakthrough solutions.

You Might Also Like