Show Me The Evidence

Anthony G. Gallagher, Flux Learning Ltd

Most training is sold on confidence. Show Me The Evidence is built on data. In every episode we take a single study, clinical trial, or systematic review and work through what it found, how it was designed, and what it means for the way we teach and assess skill. We focus on metrics-based training and proficiency-based progression, the approach that asks learners to demonstrate measurable competence before moving on, and we trace its results across surgical, medical, and professional education. This is a podcast for learning professionals and medical educators who want more than opinion. Expect plain-language breakdowns of the research, honest discussion of what the evidence does and does not support, and conversations with the people behind the studies. If you make decisions about how people are trained, we think you deserve to see the evidence first.

Episodes

  1. May 29

    Dr. Richard Angelo: From Apprenticeship to Proficiency — Rethinking How We Train Surgeons

    Episode 4: Dr. Richard Angelo

    Guest: Dr. Richard (Rick) Angelo, arthroscopic surgeon based in Seattle and former President of the Arthroscopic Association of North America (AANA). Holds a PhD in proficiency-based progression training.

    Host: Tony (his relationship with Rick spans ~15 years, originating from a chance meeting at a conference in Sweden)

    Episode Overview

    A deep-dive conversation on the fundamental failures of traditional surgical training and how proficiency-based progression (PBP) training offers a scientifically rigorous alternative. The discussion centres on the landmark Copernicus Study, the first study in medicine to use proficiency demonstration as an outcome measure.

    Key Topics Covered

    1. Limitations of the Traditional Apprenticeship Model
    - The "see one, do one, teach one" model lacks objective assessment
    - Despite decades of training and significant investment, AANA could not verify whether skill acquisition was actually occurring
    - Complication rates and suboptimal outcomes weren't improving with existing training efforts

    2. The Founding Question
    - Rick, during his time in the AANA presidential line, asked: "Is there a better way to train surgical skills?"
    - This led to engagement with Tony's work on proficiency-based progression training

    3. Proficiency-Based Progression (PBP) Training: Core Principles
    - Define a clear target: what does quality performance of a procedure look like?
    - Deconstruct tasks into discrete, trainable components
    - Develop objective, binary metrics (did it occur or not?) rather than global rating scales
    - Establish inter-rater reliability between assessors
    - Trainees must demonstrate a benchmark at each stage before progressing (including cognitive pre-course material, with an 83% threshold)
    - Errors and deviations from optimal performance are trained explicitly, not just steps

    4. The Bankart Repair: Why It Was Chosen
    - Common procedure with a broad, transferable skill set
    - Suited to task deconstruction and partial task simulation
    - Chosen by Rick and endorsed by the AANA core group

    5. Curriculum Before Simulation
    - A critical insight: the curriculum and metrics must be developed first; simulation is chosen to match, not the other way around
    - Contrast with the wider medical field's focus on "eye candy" VR simulators that lack meaningful metrics
    - The FAST model (Fundamentals of Arthroscopic Surgery Training) was developed with Rob Pedowitz for knot tying: a low-cost, highly accurate partial task trainer
    - Even a simple conical nail punch from a garage became an effective tool for measuring loop elongation

    6. The Copernicus Study: Design & Results
    Three study groups:
    - Group A (Traditional): Lectures, open-access knot-tying lab, cadaver session (the standard AANA approach)
    - Group B (Simulator only): Access to the simulator without the PBP curriculum or metrics
    - Group C (PBP): Proficiency benchmarks at every stage: cognitive, knot-tying, and shoulder model
    Results:
    - Group B was 1.4× more likely than Group A to meet the benchmark (marginal)
    - Group C participants (assigned to PBP, even without passing all benchmarks): 5.5× more likely than Group A
    - Group C participants who met all proficiency benchmarks: 7.5× more likely to meet the final benchmark
    - Error reduction: ~56% decrease in Bankart errors; ~58% for rotator cuff repair
    - In one follow-up weekend cohort of 18 trainees: 89% demonstrated proficiency in Bankart repair; 83% in rotator cuff repair

    7. Key Finding: The Deficiency is in Training, Not Trainees
    - Pre-study concern about a "weed-out process" proved unfounded
    - With quality training, almost all trainees can master the required skills
    - Referenced Frank Lewis (former Chair, American Board of Surgery) sharing the same observation
    - Stefano Pogliani's study demonstrated near-universal proficiency is achievable

    8. The Role of Errors in Surgical Training
    - The difference between novice and expert performers is best predicted by error enactment, not step completion
    - Each deviation from optimal performance creates a cascade risk, even if consequences aren't immediate
    - Upcoming study expected to show errors are the best predictor of patient outcomes

    9. Broader Applicability to Procedure-Based Medicine
    - Principles apply across disciplines: cardiology, robotics, and beyond
    - Contrast drawn with VR simulator manufacturers at the European Heart Rhythm Association Conference (Paris), where most simulations had no metrics
    - Chicken tissue models used successfully in robotic surgery training at €5 per chicken: effective without being high-tech

    10. Credentialing and Quality Assurance
    - Discussion of whether PBP methodology could or should underpin credentialing for new procedures or devices
    - Device failures in the field often attributable to inadequate clinician preparation, not device defects
    - Practical challenges for societal credentialing (procedure selection, remediation pathways, cost of metric development, legal defensibility)
    - European Commission is moving toward micro-credentials for technical skills, awarded by universities and recognised across EU member states
    - Both speakers agree: medicine must develop objective, procedure-based performance assessment for the public good
    - Analogy: getting a driver's licence requires more objectively assessed demonstration of skill than is currently required of surgeons

    48 min
  2. May 25

    From the FDA to the operating theatre: how proficiency rewrote the rules of surgical training

    Guest: Professor Anthony G Gallagher
    Host: Patrick Kiely
    Episode focus: two landmark studies, the 2004 JAMA carotid stenting paper and the Copernicus arthroscopy trial

    Episode summary

    Professor Tony Gallagher, the founder of Proficiency-Based Progression (PBP), joins Patrick Kiely to revisit two studies that changed how we think about surgical competence. The first is the 2004 JAMA paper describing a closed-door meeting at which the US Food and Drug Administration agreed, for the first time, that simulation training should form part of how doctors are approved to perform a procedure. The second is the Copernicus trial in shoulder arthroscopy, which showed that a simulator only improves training when it is paired with validated metrics and a clear proficiency benchmark. Together they make a simple, evidence-led case: competence should be measured by the skill a clinician can demonstrate, not by years served or cases counted.

    Chapters

    0:00 Introduction
    0:50 Inside the 2004 closed-door FDA meeting on carotid stenting
    4:43 Why carotid stenting forced the conversation
    7:25 Skill over specialty: ending the turf war
    9:44 The FDA precedent: simulation becomes part of credentialing
    12:03 Why procedure volume is a crude proxy for competence
    14:17 Why the argument had to appear in JAMA
    16:22 A homogeneous skill set, devices, and patient safety
    20:42 The Copernicus Initiative: a paradigm shift in training
    22:53 Three groups, one lesson: a simulator alone is not enough
    25:44 The results: 56 per cent fewer errors and the 7.5 times finding
    27:53 The trainees who did not pass, and distributed training
    30:37 Pass the cognitive exam before the skills lab
    32:33 Task deconstruction: 45 steps and 77 possible errors
    36:04 Errors versus sentinel errors: why minor errors matter most
    39:09 Why fidelity is not the point
    41:59 Why a multi-site trial mattered
    44:20 Where to start: begin with the metrics

    Key points

    - Carotid artery stenting is high risk and crossed three specialties, so the FDA needed a way to be sure each clinician was safe to perform it. PBP simulation let credentialing rest on demonstrated skill rather than on specialty or case numbers.
    - In 2004 the FDA accepted virtual reality simulation as part of the training package for a new device. This was the first time a regulator tied device approval to a training standard.
    - Procedure volume and hours logged are weak indicators of skill. Demonstrated proficiency is a far better one.
    - In the Copernicus arthroscopy trial, traditional training performed worst, adding a simulator alone helped only slightly, and PBP plus the simulator produced the strongest and safest performance.
    - The PBP group made roughly 56 per cent fewer errors, and residents who met every benchmark were 7.5 times more likely to reach the final standard.
    - Minor errors, not only critical ones, predict poor outcomes, so trainees are taught to avoid every avoidable error.
    - To build PBP: find people who are genuinely good at the task, define and validate the metrics, choose simulations that let trainees practise the key steps, train faculty on the metrics first, and require a pass on the online didactic before anyone enters the skills lab.

    Studies referenced

    - Gallagher AG, Cates CU. Approval of virtual reality training for carotid stenting: what this means for procedural-based medicine. JAMA. 2004;292(24):3024-3026.
    - Angelo RL, Ryu RKN, Pedowitz RA, et al. A Proficiency-Based Progression Training Curriculum Coupled With a Model Simulator Results in the Acquisition of a Superior Arthroscopic Bankart Skill Set. Arthroscopy. 2015;31(10):1854-1871.

    Connect and follow

    - Professor Tony Gallagher on LinkedIn
    - Professor Tony Gallagher on Google Scholar

    46 min
  3. May 25

    The VR-OR Study — Proof That Simulation Training Transfers to the Operating Room & The Methodology of Proficiency-Based Progression

    Guest: Professor Anthony G Gallagher
    Topic: The VR-OR Study: Proof That Simulation Training Transfers to the Operating Room & The Methodology of Proficiency-Based Progression

    Episode Summary

    In this episode, Patrick Kiely sits down with Professor Tony Gallagher to examine two landmark papers that transformed simulation-based surgical training. The first, the 2002 Yale VR-OR study, provided the first prospective randomised blinded proof that virtual reality simulator training transfers directly to improved operating room performance. The second, a 2005 Annals of Surgery paper, provided the field with the recipe for how to actually implement it. Together, they form the scientific and methodological backbone of Proficiency-Based Progression. Tony explains why the design decisions that made these studies credible (blinding, objective metrics, proficiency benchmarks, construct validity) are the same decisions most training programs still fail to make today.

    Key Topics Covered

    1. The Problem VR Training Was Designed to Solve (0:00)
    - The apprenticeship model and why laparoscopic surgery broke it
    - The fundamental cognitive challenge of moving from direct vision to a monitor
    - The fulcrum effect: why instrument manipulation on a monitor creates a proprioceptive conflict the brain must automate
    - Rick Satava's proposal: acquire basic skills outside the OR, on simulators

    2. The Simulator That Changed Things (3:21)
    - Johnson & Johnson's Ethicon simulator: an emulator, not a physics-based model
    - Why abstract psychomotor tasks work better than tissue simulation
    - The surgical community's scepticism, and why Yale provided the opportunity to test it properly

    3. The Proficiency Benchmark: How It Was Set (4:51)
    - Rejecting time and trial number as training endpoints
    - Using objectively assessed performance of experienced (not world-class) surgeons as the benchmark
    - Mean vs. median performance, and how to handle outlier experts (>2 SD from mean are excluded)
    - Frank Lewis (American Board of Surgery) on why the benchmark is deliberately high, and why that's fine

    4. The Results: What Happened in the OR (6:57)
    - VR-trained residents: six times fewer errors in the OR
    - Control group: nine times more likely to fail to progress during a procedure

    5. Failure to Progress: What It Reveals (7:23)
    - Defining the metric: instruments moving but the procedure not advancing
    - Why it indicates the person was not ready to perform the task independently
    - How it predicted the need for online didactic preparation before the skills lab

    6. Why the Study Had to Be Prospective, Randomised, and Blinded (13:11)
    - The gold standard language clinicians understand
    - Why senior figures in surgery said it wasn't doable, and why they were wrong
    - How double-blinding protected the integrity of intraoperative assessment
    - The study design that subsequently became the default methodology for evaluating simulation tools in medicine

    7. Objective Metrics vs. Likert Scales (15:22)
    - Why Likert scales fail for technical skill assessment
    - Inter-rater reliability below 0.8 invalidates any assessment tool by default
    - The subjectivity problem: two surgeons from the same year, same school, scoring the same video differently
    - Why errors are the most sensitive measure of change as a result of training
    - Steps vs. errors: trainees learn what to do; what they don't learn systematically is what not to do

    8. The 2005 Annals Paper: The Recipe for PBP (27:33)
    - Why the VR-OR paper alone wasn't enough. Randy Halleck: "You assume we know how to use the methodology"
    - What the 2005 paper added: how to develop metrics, who to involve, how to set the benchmark, how to validate
    - The core principles of PBP that remain unchanged today
    - Publication: Gallagher, A.G. & Seymour, N.E. (2002). Virtual reality training for laparoscopic surgery. Annals of Surgery, October 2002. https://journals.lww.com/annalsofsurgery/abstract/2002/10000/virtual_reality_training_improves_operating_room.8.aspx

    9. Education vs. Training: Why the Distinction Matters (29:05)
    - Education = knowledge transmission; Training = skill acquisition
    - Why medicine has done excellent education for centuries but apprenticeship-based training no longer fits the 21st century
    - The online didactic benchmark: trainees don't enter the skills lab until they've demonstrated knowledge to the level of experienced practitioners
    - What this saves in skills lab time, and what it tells supervisors about where to direct help

    10. The Pre-Trained Novice and Attentional Capacity (31:31)
    - Chunking: how the brain compresses discrete information units into automated sequences
    - Why unautomated technical skills consume attentional capacity that should be available for situational awareness
    - The bicycle analogy: looking at the handlebars vs. seeing the pothole
    - Why automation must occur outside the OR: stress in the operating room compounds cognitive load

    11. Case Volume as a Surrogate for Skill (37:04)
    - Why procedure numbers are a weak and noisy predictor of surgical competence
    - The Berkmar study: intraoperative performance, not experience, predicts patient outcomes
    - Building wisdom vs. accumulating numbers
    - Why use procedure numbers when you can actually measure skill?

    12. Simulators Can Teach Bad Behaviour (40:44)
    - Buying the wrong simulator is a fundamental and common mistake
    - Simulators are built by engineers, not clinicians, so metrics must precede procurement
    - Two concrete examples: fluoroscopy pedal use with no consequence; syringe plunger speed in mechanical thrombectomy training teaching dangerous injection technique
    - How insisting on metric-aligned design led a simulation company to patent an improved device

    13. Why the Benchmark Should Not Be Set on the Top 1% (45:20)
    - Setting on top 1% means almost no trainee reaches it
    - The top 1% may not always be who you think: statistical identification of outliers
    - The Monday-to-Friday surgeon doing a first-class job is the right model
    - Trainees can develop beyond the benchmark; the goal is safe, competent, timely performance

    14. Why Time Alone Is a Dangerous Metric (47:10)
    - Historical roots of speed as a surgical measure: pre-anaesthesia amputation
    - Speed-accuracy trade-off: faster = more errors
    - The stroke thrombectomy example: speed matters in triage, but a fast operator who lacerates a vessel causes a worse outcome
    - Training for skill automation produces speed as a downstream consequence, not the other way around

    15. Where the 2005 Prediction Has Landed (49:34)
    - PBP applied across: laparoscopic surgery, robotic surgery, endovascular procedures, cardiology, radiology, anaesthetics, intensive care, communication skills
    - ~60% improvement in performance outcomes consistently across domains
    - The PLOS ONE ut...

    54 min
  4. May 23

    The Crisis in Surgical Training & Proficiency-Based Progression (PBP)

    Guest: Professor Anthony G Gallagher
    Topic: The Crisis in Surgical Training & Proficiency-Based Progression (PBP)

    Episode Summary

    In this inaugural episode, Patrick Kiely sits down with Professor Tony Gallagher, founder of Proficiency-Based Progression and one of the world's leading researchers in surgical skills assessment and simulation-based training, to examine a deeply uncomfortable truth: that professional credentialing in medicine tells us almost nothing about actual clinical competence. Tony shares 30 years of evidence challenging the assumptions underpinning surgical and procedural training worldwide, and makes the case for Proficiency-Based Progression (PBP) as the superior, and inevitable, alternative.

    Key Topics Covered

    1. The Competence Problem in Surgery (0:00)
    - Why credentials don't equal competence
    - The Halsted training paradigm, developed in the late 19th/early 20th century, and why it's still in use
    - How to actually find out if your surgeon is good (hint: ask the theater sister)

    2. Why Current Training Metrics Are Failing (5:08)
    - Procedure volume and hours logged as proxies for competence, and why they're wrong
    - The misuse of Likert-type scales in surgical assessment
    - Reduced work hours legislation (Europe/US) and its impact on trainee experience
    - The Libby Zion case (New York) and how it changed US residency hours

    3. The Yale Study That Changed Everything (9:53)
    - The landmark 2002 Yale study showing simulator-trained residents made 60% fewer errors
    - Why it became a citation classic, and why change was still slow
    - Publication: Gallagher & Seymour (2002). Virtual reality training for laparoscopic surgery. Annals of Surgery, October 2002. (Presented at American Surgical Association, April 2002) https://journals.lww.com/annalsofsurgery/abstract/2002/10000/virtual_reality_training_improves_operating_room.8.aspx

    4. The American College of Surgeons Response (12:29)
    - Gerry Healy's pivotal leadership shift at the Boston meeting
    - The establishment of Accredited Educational Institutes (2006)
    - Why 100+ accredited simulation centers still aren't producing the training outcomes expected

    5. The Experience ≠ Competence Myth (16:47)
    - Why procedure volume is a noisy surrogate for surgical skill
    - How some practicing consultants perform worse than residents in training
    - Objective intraoperative performance assessment as the gold standard

    6. Proficiency-Based Progression: How It Works (20:30)
    - The mechanics of PBP: phases, steps, errors, critical errors, and the benchmark
    - Establishing benchmarks from experienced, not world-class, practitioners
    - Construct validity, inter-rater reliability, and why Likert scales fail
    - The role of deliberate practice (Ericsson) and why explicit, formative feedback accelerates learning
    - Publication: Mazzone, Elio; Puliatti, Stefano; Amato, Marco; Bunting, Brendan; Rocco, Bernardo; Montorsi, Francesco; Mottrie, Alexandre; Gallagher, Anthony G. A Systematic Review and Meta-analysis on the Impact of Proficiency-based Progression Simulation Training on Performance Outcomes. Annals of Surgery 274(2): 281-289, August 2021. DOI: 10.1097/SLA.0000000000004650

    7. Why PBP Hasn't Been Adopted Universally (35:27)
    - "It's a failure of leadership"
    - Organisations that have adopted PBP: AANA, ERUS, ORSI Academy
    - Incentive structures in healthcare and medical device manufacturing that slow adoption
    - The Center for Medicare Services complication-rate accountability model as a potential lever

    8. The Economics of PBP (37:49)
    - Publication: Puliatti, S., Rodriguez Peñaranda, N., Amato, M., De Groote, R., Farinha, R., Bunting, B., van Cleynenbreugel, B., Mottrie, A. and Gallagher, A.G. (2026), Randomised trial on the economic impact of proficiency-based progression vs conventional robotic surgical training. BJU Int, 137: 493-501. https://doi.org/10.1111/bju.70130 https://bjui-journals.onlinelibrary.wiley.com/doi/full/10.1111/bju.70130
    - Cost-effectiveness analysis of PBP vs. conventional training. At 500 trainees/year: PBP ~€1.7M vs. conventional ~€3.5M; cost equivalence at just 25 trainees; 100% of PBP trainees reached proficiency vs. 58% conventional

    9. Surgeon Skill Predicts Patient Outcomes (40:24)
    - PBP applied to communication skills: deteriorating patient handover study, Cork University Hospital
    - PBP applied to epidural training: 50%+ reduction in epidural failure rates compared with the non-PBP-trained group

    10. PBP Beyond Medicine: The Utilities Sector (45:42)
    - Reach Active case study: PBP training for utility workers to safely identify and excavate buried cables
    - Over €1 million saved in avoided utility strikes in year one
    - Same methodology, same results, across a non-university workforce

    11. How to Implement PBP in Your Organisation (48:25)
    - Start by identifying individuals who are objectively good at the task
    - Work out the metrics: phases, steps, errors, critical errors
    - Validate: consensus, not just agreement
    - Build or select simulation tools around the validated metrics
    - Train faculty on the metrics first
    - Require trainees to pass the online didactic benchmark before entering the skills lab
    - Train to benchmark, not to time, not to hours

    12. AI, Robotics, and the Future of Training (1:01:31)
    - Why AI currently measures process, not performance
    - What AI will need to assess: granular, step-level surgical actions
    - The endovascular sphere as the likely first domain for AI integration
    - Why simulation and AI must be seen as tools, not solutions
    - The inevitability of PBP adoption: the question is only when

    Connect & Follow

    - Show Me The Evidence Podcast
    - Tony Gallagher / KU Leuven: https://www.linkedin.com/in/anthony-g-gallagher/
    - Google Scholar: https://scholar.google.com/citations?hl=en&user=rNTScRMAAAAJ&view_op=list_works&sortby=pubdate

    Timestamps

    Topic | Time
    The competence problem: credentials vs. skill | 0:00
    Why current training metrics are wrong | 5:08
    The 2002 Yale simulator study | 9:53
    ACS response and simulation center rollout | 12:29
    Experience ≠ competence | 16:47
    Proficiency-based progression: mechanics | 20:30
    Why PBP hasn't been widely adopted | 35:27
    The economics of PBP | 37:49
    Surgeon skill predicts patient outcomes (NEJM) | 40:24
    PBP in utilities / Reach Active case study | 45:42
    How to implement PBP in your organisation | 48:25
    AI, robotics, and the future of training | 1:01:31

    1h 5m
