Techy Surgeon Podcast

Christian Pean MD, MS

Decoding AI, health tech & policy transforming healthcare—practical playbooks for clinicians, operators, & builders, from the OR to the boardroom. techysurgeon.substack.com

  1. Jun 6

    You're Prompting Wrong: A Claude Cowork session with Dr. Christian Péan

    Thank you to everyone who tuned into my live video!

    Prompt With CARE

    What watching someone else prompt taught me about the skill we all think is basic—plus two prompt packs that set up your Claude Cowork in one afternoon

    Yesterday I asked a student to sit at my computer and prompt Claude Cowork to build a presentation. He typed a single sentence, hit enter, and waited. Watching him, I realized two things: the conversational interface with these tools is a genuine skill, and I’d been doing it so long I’d forgotten it had to be learned. I teach people AI tools constantly, yet I had never once sat down and watched someone else prompt. It was the most instructive thing I’ve done in months.

    So this piece is the session I ran afterward, in writing. One framework, one habit loop, and—because readers keep telling me prompt packs are the most useful thing I publish—two packs you can paste into Claude Cowork today: a Cold Start Interview that sets up your workspace from scratch, and five Workflow Interviews that teach it your voice, your week, and your standards. Nothing here is specific to medicine. If you have a job and a desk, this applies.

    Housekeeping: I don’t type my prompts—I speak them. You will never type as fast as you talk, and the more descriptive your language, the better your output. I use Wispr Flow (function + spacebar, then just talk—you can literally whisper). If you want to try it, downloading through my affiliate link supports this newsletter.

    The Frappuccino Problem

    A close friend of mine has a habit that drives me a little crazy. He’ll say, “I don’t like the way AI writes X,” or he’ll say, “AI never makes my images or PowerPoints right.” And every time, I think: it’s not that you don’t like what AI makes. It’s that you don’t like what AI makes for you… because you’re giving it poor instructions.

    Think of these tools as a very smart intern. Maybe a PhD-level Harvard graduate of an intern.
If I tell that intern “go get me coffee,” and they come back with a Frappuccino when I wanted an Americano with oat milk, that’s on me. The intern did their best with what I gave them. A language model is the same: every detail you leave out, it fills in with the most average assumption available. Vague prompts don’t produce bad output. They produce median output, which for any real professional task is the same thing.

The fix is a framework I call prompting with CARE:

C — Context. Who you are and what situation this output lives in.
A — Audience. Who actually reads, sees, or uses this.
R — Role. Who the AI should be—“you are a patient educator,” “you are a skeptical CFO”—so it pulls the right body of knowledge into the task.
E — End product. The exact artifact you want: format, length, tone, what it should be usable for the moment it’s done.

In my live session I demonstrated this with two prompts about the same topic. “Tell me about hip fractures” got me a long, jargon-heavy wall of text. The CARE version: “I’m a trauma surgeon, fewer than a third of my patients ever start bone-protecting treatment after a fracture, you’re a patient educator, write for families at a sixth-grade reading level, one-page handout, warm and non-alarmist, end with three questions to ask your doctor.” That prompt produced something I could physically hand to a patient that afternoon. Same model. Same topic. The difference was entirely in the briefing.

Your First Output Is the Input

The second principle, and the one that separates people who get compounding value from people who quit: the first output is not the final output. It’s the input for the next cycle. You will be very frustrated if you try to one-shot these tools. The entire power of a large language model is the compression of iteration cycles: you get unlimited at-bats, and each swing takes seconds.

When my bone-health handout came back and I didn’t love it, I didn’t start over. I said what needed to change.
Make it Duke-branded, add interactive components, also give me a PowerPoint I can click through. And the output became the raw material for something better. Later I dragged that same HTML into a fresh Cowork task and rebranded it for RevelAi in one prompt. Output becomes input. That’s the loop.

Two force multipliers on that loop. First, speak instead of typing. Being verbose is usually a weakness of mine; it turns out it’s a superpower with these models, because descriptive, rambling, context-rich instruction is exactly what they reward. You’re not on a timer, and nobody’s grading your dictation.

Second, meta-prompting: when the end product demands a better description than you can produce—a detailed image prompt, say—don’t write the prompt yourself. Ask the tool to write the best-practice prompt for you, then carry it wherever you need it. The model is better at describing what the model needs than you are.

The Step Most People Never Take

Everything above makes a single task go well. The better approach is making sure you never start from zero again. This is what I think of as building sustainable context systems.

The low-hanging fruit is Projects: a folder where your chats, files, and outputs accumulate so the tool carries the thread across sessions. I run my academic promotion tracker, cap table, and investor touchpoints this way, connected to Live Artifacts.

But the foundation underneath all of it—the thing I do with every person I onboard one-on-one—is to have Claude interview you. Instead of laboring to write the perfect context document about yourself, you flip it: make the model ask you questions, one at a time, until it understands your role, your projects, your people, and your standards. Answering questions is cognitively cheap. Composing instructions is expensive.
And in Cowork, the answers don’t evaporate when the chat ends—they become standing infrastructure, so every future task starts with the CARE already half-written.

The question is what the interview should cover. That’s the part most people improvise badly. Below the line are the exact packs I give people in one-on-ones—six prompts that stand up your workspace from nothing, and five that teach it your recurring workflows.

The two prompt packs below are for paid subscribers. Pack One stands up a Claude Cowork workspace from scratch in about thirty minutes—it’s the same starter kit I use when I onboard people one-on-one. Pack Two teaches your Cowork your writing voice, your calendar, and your definition of done. Copy, paste, answer the questions out loud, and you’re all set.

As a bonus, in the paid section… It seems many people are fans of the one-shot video note that I wrote recently. I also included a video of me installing the Higgsfield MCP.

Prompt Pack One: The Cold Start Interview

Run these in order, in Claude Cowork, ideally in one sitting. Each prompt instructs Claude to interview you and then write what it learns to memory—so say “save this” when the summary looks right. Total time: about thirty minutes. Less if you dictate.

1 — The Role Brief

You’re setting up as my long-term work assistant, and right now you know nothing about me. Interview me, one question at a time, until you understand: what my job actually is day to day (not the title—the verbs), who I’m accountable to, what I produce, and what a great week versus a bad week looks like. Don’t accept vague answers—push for specifics. When you’re confident, write a one-page working profile of me, show it to me for corrections, and save it to memory.
2 — The Project Map

Interview me about everything I’m currently working on—projects, initiatives, recurring responsibilities, and the things that are stalled but still mine. One question at a time. For each, capture: what it is, current status, who else is involved, the next milestone, and what I’m most worried about. Then organize it into a project map ordered by what deserves my attention, show me, and save it to memory.

3 — The People Directory

I’m going to mention names, nicknames, and abbreviations constantly, and I never want to explain them twice. Interview me about the people and organizations in my work orbit: who they are, their role relative to me, how I refer to them in shorthand, and anything you should know before drafting something they’ll read. One question at a time, and prompt me with categories I might forget—boss, reports, clients, vendors, collaborators, the person I always forward things to. Save the directory to memory.

4 — The Preference Card

Interview me about my output preferences so you stop guessing. Cover: how long answers should be by default, bullet points versus prose, how formal my drafts should run, formatting pet peeves, words and phrases I’d never use, and how I want you to behave when you’re uncertain—guess, ask, or give options. Get concrete by showing me two short sample paragraphs in different styles and asking which is closer. Save the result as my standing preference card.

5 — The Definition of Done

For each type of deliverable I produce regularly—you have the list from my project map—interview me about what “finished” means. What does a ready-to-send email have that a draft doesn’t? What makes a document ready for my boss versus ready for a client? Build me a short rubric per deliverable type, confirm it with me, and save it. From now on, check your own work against these rubrics before showing me anything.

6 — The Standing Orders

Last one.
Interview me about the rules of engagement: things you should always do without asking, things you should never do without asking, topics where I want pushback versus execution, an

    35 min
  2. May 2

    The AI Content Flywheel: How I Build an Audience as a Surgeon Without Sounding Like a Robot

    This is a free preview of a paid episode. To hear more, visit techysurgeon.substack.com

    Thank you Doug Fullington, MD, Alex Rivero, HealthMind Insights, Eric Burgh, MD, and many others for tuning into my live video!

    I can publish a paper in an upper-tier orthopedic journal and, if I’m being candid, a few hundred to maybe a few thousand people will read it. Some of those people will cite it in their own papers. A small number will change anything in their practice because of it.

    Then I share an insight about CMS payment model mechanics on Substack, and a health system CFO I’ve never met emails me because she’s been trying to articulate that exact problem to her board. A policy researcher in Geneva follows the thread. Three orthopedic surgeons I’ve never spoken to start a conversation about what I wrote that turns into an ongoing dialogue that has genuinely shaped how I think.

    That asymmetry in exposure is what motivated me to start writing in public. Not personal branding, not monetization, not the promise of newsletter revenue (though those have their own logic). The primary driver was reach and dialogue: the feeling that the ideas worth communicating in healthcare were reaching a fraction of the people who needed them, mostly because we’d built a science communication infrastructure that made academic publishing the gold standard and everything else secondary.

    AI changes the equation a bit. Not by doing the thinking. The thinking is still yours, and it’s still the most important part. But by handling enough of the operational needs that someone running a clinical practice, a research agenda, and a startup can also maintain a consistent presence in public.

    That’s what I want to share here: how I’ve structured that system, in enough detail that you can build your own version of it if it seems useful. What works for me may not work for you. I can’t promise to optimize your content calendar.
But I wanted to describe infrastructure that has made the practice of writing in public sustainable for me, and that has produced a lot of unexpected good in the process. Why This Isn’t Primarily About Content Creation A minor but important framing note before the setup details: the value I’ve gotten from Techy Surgeon isn’t primarily the newsletter metrics. It’s the policymakers who’ve reached out. The research collaborators who found me because of a piece on care coordination. The clinicians who wrote to say that an article crystallized something they’d been trying to explain to administrators for years. The people building interesting companies in healthcare who wanted to connect because we were apparently thinking about similar problems from different angles. None of those connections would have happened if I hadn’t been willing to put my thoughts into a form that others could interact with. And it wasn’t a white paper or journal article that did that. It was something closer to a public conversation, where the format invites response and the distribution reaches people outside the academic bubble. I think medicine, and clinical research more broadly, is underinvested in this kind of communication. Not because clinicians don’t have things worth saying (clearly they do), but because the infrastructure for saying them publicly has been either unavailable, considered taboo, or too costly in time. AI is changing that. The skill of generating multimodal media whether written, visual, or video, is becoming something a motivated clinician can build and maintain without a full production team. We probably need more of us doing this. The Flywheel, Briefly Before the setup details, the concept: a content flywheel is a system where each piece of output feeds the next, and where the marginal cost of producing content decreases over time rather than remaining constant. 
The alternative is what most clinician-writers default to: the brute-force model, where every piece starts from scratch, involves a Sunday afternoon staring at a blank editor, and depends entirely on having energy left over from the clinical and research work. That’s a fragile system. It produces good work occasionally and nothing the rest of the time. The flywheel I’ve built runs on five loops: Ideate (keep a backlog so you never start from nothing), Research (get sources before you start writing, not after), Write (use AI constrained by your voice, not AI in its default state), Distribute (publish once, distribute many times across platforms), and Repurpose (one strong piece seeds two weeks of downstream content). The setup below is how I’ve operationalized each of those loops.

    5 min
  3. Apr 27

    Clinical AI Faceoff: OpenAI's ChatGPT for Clinicians vs OpenEvidence vs DoxGPT

    This is a free preview of a paid episode. To hear more, visit techysurgeon.substack.com

    Thank you to everyone who tuned into my live video!

    I went live at 6:45 the other morning to open three tabs, ChatGPT for Clinicians, Doximity GPT, and OpenEvidence, and ask them the same questions. A few dozen clinicians and subscribers joined at that hour on a Sunday, which I did not expect and am grateful for.

    The headline finding isn’t who won. It’s that soon, it seems, you won’t be able to tell the three tools apart from the navigation bar. Each one now has an ambient scribe (or a form of one). Each one tracks CME. Each one has a “skills” or “dot flows” tab that, today, mostly amounts to baked prompts dressed up as workflows. OpenEvidence has a feature literally called the dialer — Doximity has had a dialer for a decade. The product surface is converging fast.

    A quick disclaimer before we go further: opinions here are mine alone. I have no financial relationship with any of these companies. I selected these three because they appear to be getting the most traction in the marketplace — not because they’re the only ones worth your time. UpToDate Expert AI, Glass Health, Abridge’s embedded answering, and others all deserve their own look.

    What each tool is best at right now

    ChatGPT for Clinicians is the new entrant. Verification is rigorous — NPI, photo of a driver’s license, a ClearID face match — which I read as a deliberate credibility signal. Underneath, the experience is polished, but the clinical answers were the weakest of the three on the queries I ran. There is a skills surface that hints at where this is going, but most of the entries today function as prompts rather than true agentic workflows. I did not see a Business Associate Agreement presented during signup, and I have not yet found a satisfying answer on PHI handling.

    Doximity GPT quietly has the best one-off clinical answers right now.
Not by a wide margin (the others are good), but on a hip arthroplasty question and a DVT prophylaxis report, Doximity surfaced the PREVENT CLOT trial and the CRISTAL trial at the top of the response, where a domain expert would put them. For a clinician, citation prioritization is trust. Doximity also brings a distribution moat the others can’t replicate quickly — the dialer, fax, telehealth, the news network, and Peer Check (where physician experts grade the answers) — and a redesigned interface that’s the cleanest of the three.

OpenEvidence has the lowest friction and the lowest latency. They are clearly throwing serious compute at the answer surface. The differentiator most clinicians never find is Deep Consult. Turn it on, answer two or three follow-up questions, and you get a research-grade brief with embedded figures from JAMA and NEJM, made possible by the licensing partnerships OpenEvidence has signed with NEJM Group and other major publishers. When I asked Deep Consult to brief me on secondary fracture prevention for a quality improvement committee, the output was something I could have walked into a department meeting with that morning.

Distribution beats product when the products converge

All three are free. All three answer questions credibly (ChatGPT least so). All three are racing to bolt on the same surrounding capabilities. Doximity wins on installed clinician base. OpenEvidence wins on speed, trajectory, raw capability, and Deep Consult. ChatGPT for Clinicians wins, today, on almost nothing — but the verification gate suggests they intend to be taken seriously, and they hold a foundational model and a patient-facing asset the others don’t.

The chat interface is no longer the moat. The moat belongs to whoever first connects grounded clinical evidence to native multimodal output, real workflow extensibility, and physician-earned trust without forcing the clinician to play copy-paste between four tabs to get there.

This is the worst these tools will ever be.
That should change how we evaluate them: I’m less interested in the question “does it work today?” and more drawn to “how do we shape what it becomes?” Date several. Marry none. Use the tool that fits the question in front of you. Send the teams behind them living, breathing feedback.

On the Horizon

None of these tools is built for patients. The answer ChatGPT gave me this morning on the updated guidelines for incidental hepatic steatosis was reasonable, and I am a bone surgeon. I read it the way a layperson would, and I would not stake decisions on it without help. The literacy gap between clinical outputs and patient comprehension is not a UX problem. It is a safety problem. Tearing down the gate before we have built tools that respect that gap is how we get harm.

The administrative arms race — prior authorization letters, denial appeals, faster note-writing — is a symptom, not a cure. We went deep and fast on the workflows where the money lives, which are the workflows our payer infrastructure forces clinicians to spend their evenings on. That work is valuable, and it is not patient-facing. The places where AI could actually move outcomes — secondary fracture prevention, fall prevention, post-op care navigation, osteoporosis treatment rates that sit around 20% after a fragility fracture when the evidence base for treatment is overwhelming — are still under-resourced.

Trust and co-design with clinicians is the unlock. OpenAI has not earned it yet for clinical use. Doximity and OpenEvidence have, in different ways, by being physician-forward from day one. That posture is not optional going forward — it is the moat.

The path from clinical intelligence in your pocket to democratized, evidence-based care that actually moves quality and outcomes runs straight through clinicians willing to show up and iterate. That is the dream. We are not there. We can get there.

Christian Péan, MD, MHS, is an orthopedic trauma surgeon in Durham, North Carolina.
He is core faculty at the Duke-Margolis Institute for Health Policy and CEO and co-founder of RevelAi Health, an AI care management platform for value-based care. Opinions are his own.

🔒 For paid subscribers — the full demo and the operator’s notes

The complete screen-share from the morning’s livestream. Side-by-side queries across all three tools, the Deep Consult walkthrough, the prior authorization generation, the acetabular fracture surgical-plan comparison, the live multimodal handoff into a branded HTML committee deck, the connector detour inside Claude, and the clinical trials map I discovered on stage.

    14 min
  4. Mar 22

    Claude Skills in the Clinic with Hadi Javeed, CTO and serial health tech founder

    Thank you Edward M. DelSole, MD, Danny Goldenberg, Audley Mackel III, Darren Michael, and many others for tuning into my live video with Hadi Javeed!

    The Skill Is the New Workflow

    Clinical AI won’t scale through better models. It will scale through better instructions.

    Interested in deploying clinical AI for your practice, value-based care organization, or health system? RevelAi Health partners with clinics and health systems to build AI workflows for CMS models (TEAM, ASM, ACCESS), care coordination, and clinical operations. We bring the software, the clinical expertise, and the AI-fluent staff to deliver outcomes, not just tools. Schedule a demo or reach us directly at hello@revelaihealth.com.

    A billboard on Market Street in San Francisco advertises “skills,” the hot new paradigm in AI development. Walk three blocks in any direction and you’ll find someone who can explain, in considerable detail, what a skill is, why it matters, and which framework implements it best. Fly to any hospital in the country and ask the same question. You’ll get a blank stare.

    This gap between what AI can do and what healthcare is doing with it has become the defining tension of clinical AI’s current era. The models are smart. Sixty-six percent of physicians now report using health AI tools, a 78% increase from 2023. Billions have been invested. Yet nearly four years after the ChatGPT moment, most large health systems still haven’t deployed a single patient-facing AI application beyond ambient documentation.

    The question is why. And the answer, increasingly, points not to the intelligence of the models but to the architecture around them: the instructions, the context, the workflows that translate raw capability into clinical utility.
The Context Problem Nobody Wants to Admit This past weekend, my co-founder Hadi and I sat down for what we’ve been calling Founders Coffee, a live conversation on Substack about what we’re seeing in clinical AI, what’s working, what isn’t, and what comes next. Hadi brings a particular vantage point: before we started RevelAi Health together, he was one of the earliest applied AI engineers at Capital One, building voice AI for banking back in 2016, when the technology was, as he puts it, “not that cool and less practical.” The lessons from that era are uncomfortably relevant now. At Capital One, text-based chatbots found product-market fit. Voice did not. It got dates of birth wrong. It misread credit card numbers. And the core insight that emerged, one that the current wave of healthcare AI companies would do well to internalize, was deceptively simple: people hate chatbots. Not because the technology is bad, but because it fails to deliver unique value. Empathy for the sake of empathy, as Hadi noted, does not work. People engage with AI when it solves their problem. They disengage quickly, permanently, when it doesn’t. “People only would chat to a chatbot if it solves their needs,” Hadi said. “As long as the chatbot is not providing unique value, it does not work.” This observation lands differently in 2026 than it would have in 2016. Today, the models are dramatically more capable. But capability without context is just expensive latency. And in healthcare, the context lives behind a walled garden. The data gravity (the patient charts, the encounter histories, the medication lists, the imaging orders) sits in electronic health records. Epic. Cerner. Athena. And without that context flowing securely into AI systems, even the most sophisticated models are left prompting in the dark. 
As one survey found, hospitals on Epic had roughly 90% AI usage, while those on smaller EHR platforms averaged just 50%, a disparity that reveals how tightly AI adoption is coupled to infrastructure access. “AI is not the bottleneck,” Hadi argued. “It’s the context that’s the bottleneck right now. Models are pretty smart. But if you cannot get patient chart information securely into AI, you can do only enough.” What an AI “Skill” Means for Healthcare A skill, in this context, is a structured set of instructions that teaches AI how to perform a specific task when triggered by specific conditions. Think of it less as a prompt and more as a protocol manual for a very capable but context-dependent assistant. A prompt says: summarize this note. A skill says: whenever a patient mentions diabetes in an encounter, trigger a downstream workflow. Draft dietary counseling documentation for the staff. Generate a glucose monitoring plan. Prepare a patient-facing message at an appropriate reading level. Format all outputs according to this template. Ground clinical recommendations in these evidence-based guidelines. Hadi framed the clinical application nicely: “Healthcare workflows are very if-then-else logic. If BMI is 30, do this. If they have diabetes, go on this path. And traditionally with software systems, it was so hard to scale healthcare because who’s going to build this if-then-else logic? You’re going to rely on your dev team or maybe Epic consultants, and that takes forever.” Skills collapse that timeline. They translate clinical protocols (the ones that live in binders, in the heads of experienced nurses, in institutional memory that evaporates with staff turnover) into executable AI instructions. And critically, they can be built by clinicians, not engineers. You describe your workflow conversationally. The AI interviews you, iterates, produces the skill. You test it against real examples and refine. 
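To make the if-then-else framing concrete, here is a minimal Python sketch of a skill as a trigger condition plus a list of instructions. It is an illustration only, not how Claude Skills or RevelAi Health implement anything: the `Encounter` and `Skill` classes, the skill names, the BMI cutoff of 30 or above, and the weight-management instruction are all assumptions invented for this example. The diabetes instructions are taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Encounter:
    """Hypothetical, minimal view of a patient encounter."""
    bmi: float
    diagnoses: set[str]


@dataclass
class Skill:
    """A trigger condition plus the instructions to run when it fires."""
    name: str
    trigger: Callable[[Encounter], bool]
    instructions: list[str]


# Hypothetical skill registry; names and thresholds are illustrative.
SKILLS = [
    Skill(
        name="diabetes-followup",
        trigger=lambda e: "diabetes" in e.diagnoses,
        instructions=[
            "Draft dietary counseling documentation for the staff.",
            "Generate a glucose monitoring plan.",
            "Prepare a patient-facing message at an appropriate reading level.",
        ],
    ),
    Skill(
        name="weight-management",
        trigger=lambda e: e.bmi >= 30,  # assumed cutoff for illustration
        instructions=["Flag for weight-management counseling."],
    ),
]


def triggered_skills(encounter: Encounter) -> list[str]:
    """Return the names of every skill whose trigger matches the encounter."""
    return [skill.name for skill in SKILLS if skill.trigger(encounter)]


def task_list(encounter: Encounter) -> list[str]:
    """Flatten the instructions of all triggered skills into one task list."""
    return [
        step
        for skill in SKILLS
        if skill.trigger(encounter)
        for step in skill.instructions
    ]


if __name__ == "__main__":
    patient = Encounter(bmi=31.0, diagnoses={"diabetes"})
    print(triggered_skills(patient))
    for step in task_list(patient):
        print("-", step)
```

The sketch stops at deciding which skills fire. In a real deployment the instructions would be handed to a model together with the chart context, and that part is where the access problem described above comes in.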
Consider the practical applications that emerged from our conversation: a pre-clinic screening skill that reviews a panel of patients before Monday morning, flags missing imaging orders, and surfaces relevant history in a style you specify. A prior authorization appeal skill that ingests a denial letter and produces a structured response matching the format that has historically succeeded with a specific payer. An independent medical examination skill that parses 6,000 pages of records into a timeline of treatment, imaging, and interventions, work that currently requires hours of manual review or a dedicated team.

These aren’t hypothetical. We’re building and deploying versions of these at RevelAi Health right now, integrated with EHR data through FHIR resources, with the clinical team able to customize and test skills through a user interface rather than filing engineering tickets.

The Compliance Reckoning

There’s another thread from our conversation worth pulling. Earlier this month, allegations surfaced that Delve, a Y Combinator-backed compliance startup that had raised $32 million, had generated 494 fabricated SOC 2 Type II reports for its clients. The reports were 99.8% identical boilerplate, with pre-written auditor conclusions filed before companies even submitted their evidence. The auditors Delve marketed as “US-based CPA firms” were traced to offshore operations using virtual addresses. The revelation emerged, almost poetically, because someone left a Google spreadsheet open to the internet.

For health tech, this extends beyond a compliance scandal to become an ecosystem problem. Hundreds of companies, including health tech startups handling protected health information, may now hold invalid security certifications.
The ripple effects will tighten an already rigorous procurement environment at a moment when health system CIOs were only beginning to open the door to smaller vendors. “You can’t outsource security responsibility,” Hadi said. “If someone is trusting you with their patient data, you have a huge responsibility to protect it. Security and compliance is not a cost center. It’s the most important foundational thing you have to do.” We felt the FOMO ourselves at RevelAi. We went through Vanta, checked every box, invested heavily in governance, and watched competitors claim they completed SOC 2 Type II in three weeks. The temptation to move faster was real. But in healthcare, the “move fast and break things” mantra will also break your company. We’ve watched it happen. Babylon, once valued at $4.2 billion, collapsed in 2023. Olive AI, valued at $4 billion, shut down the same year. The outward appearance of success, it turns out, is often inversely correlated with the rigor underneath. Curious about the tools that I use to put together Techy Surgeon and leverage AI to improve my personal productivity? Check out my article below — The Clinician Founder’s AI Stack. Where the Bridges Are Being Built Not everything is stalled. The interoperability landscape is shifting, unevenly but meaningfully. Athena has emerged as an unlikely leader. At HIMSS 2026, the company previewed an industry-first Model Context Protocol server, infrastructure that allows AI agents to securely access patient chart data in real time. They’re building athenaConnect, an intelligent interoperability layer connecting 170,000 providers serving 20% of the U.S. population. This matters enormously. Model Context Protocol (MCP) is what makes skills practical at scale. It’s the plumbing that lets an AI agent not just follow instructions but access the clinical context those instructions require. 
When Hadi built a FHIR integration with Cerner’s proprietary APIs, it took him one hour using skill-based development. Previously, that work took two weeks. That’s the offline version, engineers using skills to accelerate code. The online version, where skills execute in real time against live patient data, is coming but isn’t here yet in production. Anthropic, notably, has published a FHIR skill on their marketplace, the

    1h 5m
  5. Mar 10

    What a $110 Million ACO Actually Looks Like From the Inside: The VBC Operator's Playbook with Sarah Habeeb, MHA

    Thank you Hadi Javeed, Mike Logan, MD, Rachel, and many others for tuning into my live video with Sarah Habeeb!

    🗓️ TEAM Connect Virtual Summit — April 30, 2026

    A full-day virtual event for hospital leaders navigating TEAM implementation. Speakers from Mass General Brigham, CommonSpirit, AdventHealth, Duke-Margolis, and more. I’ll be presenting on technology-driven care coordination. Early bird registration closes March 20th. → Reserve your spot at apmconnect.com/virtual-summit

    Sarah Habeeb is System Director for Medicare Value-Based Care Products at Baylor Scott & White Health and co-founder of APM Connect. Connect with her on LinkedIn or at

    Dr. Christian Pean is an orthopedic trauma surgeon and faculty member at Duke University School of Medicine, core faculty at the Duke-Margolis Institute for Health Policy, and CEO and Co-Founder of RevelAi Health. He writes Techy Surgeon at the intersection of clinical AI, health policy, and care coordination. You can find him at the TEAM Connect Virtual Summit on April 30, 2026.

    Sarah Habeeb started pre-med at Texas A&M. Then she drove an ambulance, decided she didn’t want to touch patients, and pivoted into health administration. She was sitting in a grad school classroom around 2014 when someone started explaining the Medicare Shared Savings Program—a program that was, at the time, so new that almost no one understood it. She listened, thought it made intuitive sense, and made a career bet on it.

    That bet paid off. Today, she oversees a program that generates roughly $110 million in annual savings against a CMS benchmark, retaining about 75 cents of every dollar through MSSP’s Enhanced Track. That’s approximately $77 million flowing back out to physicians annually, for a program covering 120,000–125,000 Medicare beneficiaries across an entire major health system.
She also co-founded APM Connect (a free community for hospitals navigating mandatory payment models) and will be speaking at the TEAM Connect Virtual Summit on April 30, 2026. I’ll be there too. If you’re a hospital leader trying to make sense of what TEAM actually requires in practice, this event is where you want to be.

But before the summit, here’s what an hour with Sarah Habeeb taught me about what value-based care actually looks like from the inside, and what that means for every clinician and operator who is about to be pulled into it, whether they’re ready or not.

The Mechanics Most Clinicians Never Learn

Risk, in the value-based care sense, is a word that gets thrown around like clinicians should already know what it means. Most don’t. Here’s the actual structure:

When you enter a total cost of care contract, a payer assigns you a benchmark: an expected spend per member, per year, risk-adjusted based on patient complexity. If you keep costs under that benchmark, you share in the savings. If you don’t, you may owe back a portion of the overage.

The HCC risk adjustment model is how CMS calibrates those benchmarks. It’s essentially a formula that assigns a risk score to each patient based on documented diagnoses. A patient with diabetes, COPD, and heart failure carries a higher score than one with no documented conditions, so their expected cost is set higher.

This is where documentation integrity enters the picture. Sarah is direct about it: “If you’re a cardiologist in the heart failure cohort, you need to be sure that you’re getting credit for the risk of your patients because that directly affects your benchmark, which then directly affects your performance in the contract, which affects your Part B adjustments.”

Physicians often experience risk coding conversations as administrative irritation: another box to check, another form to sign. The translation layer is missing.
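
For readers who think better in numbers, here is a minimal Python sketch of the benchmark logic described above. Everything in it is an assumption for illustration: the condition weights, the baseline score, the $10,000 base rate, and the symmetric 75% sharing rate are made up and are not CMS-HCC coefficients or actual MSSP settlement rules. The function names `risk_score` and `settle` are hypothetical.

```python
# Illustrative only: weights, baseline, and dollar figures are invented,
# not CMS-HCC coefficients or MSSP settlement rules.
HYPOTHETICAL_WEIGHTS = {"diabetes": 0.30, "copd": 0.35, "heart_failure": 0.40}
BASE_SCORE = 0.50  # assumed baseline score for a patient with nothing documented


def risk_score(documented_diagnoses):
    """Baseline plus one weight per documented condition."""
    return BASE_SCORE + sum(HYPOTHETICAL_WEIGHTS[dx] for dx in documented_diagnoses)


def settle(panel, actual_spend, base_rate=10_000, share=0.75):
    """Return (benchmark, payout) for a panel of patients.

    panel is a list of diagnosis lists, one per patient.
    A negative payout means money owed back (assumes symmetric sharing).
    """
    benchmark = sum(base_rate * risk_score(dx) for dx in panel)
    return benchmark, share * (benchmark - actual_spend)


# Same three patients, same actual spend; only the documentation differs.
fully_documented = [["diabetes", "copd", "heart_failure"], ["diabetes"], []]
under_documented = [["diabetes"], ["diabetes"], []]
ACTUAL_SPEND = 25_000

for label, panel in [("full", fully_documented), ("under", under_documented)]:
    benchmark, payout = settle(panel, ACTUAL_SPEND)
    print(f"{label}: benchmark={benchmark:,.0f} payout={payout:,.0f}")
```

With these made-up inputs, the fully documented panel lands under its benchmark and earns a payout, while the identical patients with missing diagnoses land over a lower benchmark and owe money back. The clinical care and the spend are the same in both runs; only the documented risk changes.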
If your patients are genuinely sicker than their documented diagnoses suggest, you’re being benchmarked against a population that looks healthier than yours. The comparison doesn’t hold. You look like you’re mismanaging costs when you’re actually managing a high-acuity panel with inadequate documentation. The fix isn’t gaming the system. It’s accuracy. And it starts with clinicians understanding why it matters.

What “Operator” Actually Means on a Monday Morning

Baylor Scott & White’s ACO is structured around what Sarah calls product owners: people responsible for specific contracts (MSSP, Medicare Advantage risk agreements, direct-to-employer deals) who identify their contract’s cost drivers and design remediation plans. Care management, quality teams, and marketing function as internal vendors to those product owners, not parallel departments chasing their own initiative lists.

This sounds obvious until you’ve seen the alternative, which is how most health systems actually operate: multiple teams, multiple initiatives, no clear accountability, and no way to know what’s actually moving the needle. “Because if everybody’s working on a bunch of things and we’re not talking to each other,” Sarah explains, “you can’t figure out what actually made the difference.”

The answer for Baylor was forced prioritization. Pick three initiatives. Measure them monthly. Make every stakeholder meeting about those three things. It’s not sophisticated, but discipline consistently outperforms sophistication in operations.

The Inpatient Rehab Problem (Which Is Probably Your Problem Too)

If you work in orthopedics, you already know what Sarah is about to say. If you don’t, here’s the version that will make you understand it.

At Baylor Scott & White, inpatient rehab facility utilization is, in Sarah’s words, “completely unmanaged.” Against Milliman benchmarks, they’re over-utilizing inpatient rehab while simultaneously running near-zero skilled nursing facility use.
They’re tracking 55% of hip fracture patients going to inpatient rehab, with a 30-something percent readmission rate that isn’t meaningfully better than SNF. “So was that the right decision? Were they even ready to discharge from the hospital?” she asks. And the honest answer is: often, no.

This is the central tension in any ACO that’s embedded in a health system with joint ventures. The health system may have financial interests in keeping patients flowing through high-cost post-acute settings. The ACO’s job is to reduce unnecessary utilization of those same settings. Getting alignment between those two forces is genuinely hard. Sarah doesn’t pretend otherwise. “You have to find a way.”

What Baylor has built: a six-to-eight-person team of post-acute care nurses who follow ACO patients through preferred SNFs, with direct EMR access (a requirement of network participation), weekly interdisciplinary calls with each facility, expected discharge dates set within seven days, and a target average stay under 28 days. They track return-to-acute rates and share scorecards with SNF partners quarterly. It’s brute-force infrastructure, but it’s working better than the alternative, which is patients in a black box beyond hospital discharge.

From the orthopedic side, I’ve been doing telephone visits at two weeks for all my hip fracture patients, closing them with absorbable sutures so the visit doesn’t require a trip in. Half the time, I’m chasing down whether they’re still in the SNF, whether they’ve bounced back to a different ED, whether anyone has even talked to them since discharge. The clinical relationship doesn’t just end at the door of the facility. But the information infrastructure does.

TEAM, ASM, and the Art of Not Panicking

Here’s Sarah’s read on the two mandatory models that are consuming health system bandwidth right now:

On TEAM: Most hospitals are in year one, upside only, and either don’t know they’re in it or aren’t taking it seriously. This is a mistake.
The regional benchmarking structure means your performance is being measured against peers in your geography. “If your regional peers are paying attention and you’re not,” Sarah says, “that affects your benchmark.” By the time year two arrives with downside risk up to 20%, you’ll be starting from behind, not from neutral. Only four hospitals voluntarily opted into TEAM early from prior CJR participation, which tells you something about the expected economics. But ignoring it isn’t a viable option.

On ASM: CMS’s Ambulatory Specialty Model is designed as physician-level accountability: NPI-specific participation, measuring cardiologists on heart failure costs and a range of specialists on low back pain. The design is conceptually provocative: specialists competing against each other within the same market for Part B adjustments. The implementation, however, came in lighter than expected. At Baylor, the initial projection was 200–400 physicians selected. The actual list: 51. Nationally, organizations are reporting five or six physicians per system. The cost of the required quality reporting infrastructure may exceed the penalty exposure for smaller lists.

Baylor’s response: build a shadow bundle internally. Treat ASM as an MSSP workstream. Develop heart failure and low back pain strategies that produce dividends in 2027 regardless of what the formal adjustment looks like.

It’s the right call. These conditions were selected because they represent genuine opportunities to improve care and reduce low-value utilization: guideline-directed medical therapy for heart failure, fewer unnecessary MRIs and high-risk opioid prescriptions for low back pain. The model may be imperfect, but the clinical direction is sound.

Where AI Actually Fits, and Where It Doesn’t

Sarah told me upfront: she doesn’t spend much of her day thinking about AI.
Her ACO generates north of $100 million in savings the old-fashioned way, through data discipline, care management operations, and physician engagement. “I try not to [use AI],” she said.

    1 hr
