AI - Beyond the Hype

Data Quality Part 2: Fixing It - Critical Data Elements, Contracts, and the One Question That Stops Robodebts

Part 2 of 2 in our Data Quality series.

In Part 1, James came in skeptical and walked out sold on the problem. In Part 2, we deliver the fix — the discipline, the architecture, and the eight concrete moves executives can make on Monday morning. This is the episode for leaders who heard last week's case studies and asked "okay, but what do we actually do?"

What we cover:

  • The one question every CEO should be asking this week: what are our Critical Data Elements, who owns each one, and how do we know each is fit for purpose?
  • Why fixing all the data is how data quality programs die — and how ruthless tiering (50-300 fields, not 50,000) is how they survive
  • Data contracts: the quiet revolution in how serious organisations manage producer-consumer relationships, popularised by Andrew Jones at GoCardless and Chad Sanderson
  • The five default checks every Critical Data Element should pass: freshness, volume, schema, distribution, referential integrity
  • The five-layer reference architecture: contracts, validation, observability, lineage, governance — and why governance is where most organisations fail
  • Unity Technologies 2022: how contaminated training data cost $110M in revenue and $5B in market capitalisation in a single day
  • Robodebt: the Australian government program that issued ~470,000 invalid debt notices, ended in a Royal Commission, and cost $1.8B in settlement. Plus the one question that would have stopped it
  • The eight-step Monday-morning move: a complete executive action plan
  • The case study James can't name: a global enterprise (90,000 people, $50B+ revenue) six years into a serious data strategy — with every right concept on paper, an aggressive AI rollout underway, and a green dashboard hiding the reality. Why "the mandate is not the implementation" is the most dangerous gap in enterprise AI today.
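For listeners who want to see the five default checks in concrete form, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method described in the episode: the `CdeContract` class, its field names, the thresholds, and the sample rows are all hypothetical, and real programs would typically run checks like these inside a validation or observability tool rather than hand-written code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from statistics import mean


@dataclass
class CdeContract:
    """Hypothetical contract for one Critical Data Element (all names illustrative)."""
    name: str
    owner: str                   # who owns this element
    purpose: str                 # answers "fit for purpose for what?"
    required_fields: dict        # field name -> expected Python type
    max_age: timedelta           # freshness threshold
    min_rows: int                # volume threshold
    numeric_field: str           # field whose distribution is watched
    expected_mean_range: tuple   # (low, high) bounds for that field's mean
    foreign_key: str             # field that must resolve to a parent key


def run_default_checks(contract, rows, loaded_at, parent_keys, now=None):
    """Return a pass/fail flag for each of the five default checks."""
    now = now or datetime.now(timezone.utc)
    values = [
        r[contract.numeric_field]
        for r in rows
        if isinstance(r.get(contract.numeric_field), (int, float))
    ]
    low, high = contract.expected_mean_range
    return {
        # Freshness: the data was loaded recently enough.
        "freshness": now - loaded_at <= contract.max_age,
        # Volume: at least the expected number of rows arrived.
        "volume": len(rows) >= contract.min_rows,
        # Schema: every row has every required field with the expected type.
        "schema": all(
            isinstance(r.get(field), expected)
            for r in rows
            for field, expected in contract.required_fields.items()
        ),
        # Distribution: the mean of the watched field sits in its expected range.
        "distribution": bool(values) and low <= mean(values) <= high,
        # Referential integrity: every foreign key points at a known parent.
        "referential_integrity": all(
            r.get(contract.foreign_key) in parent_keys for r in rows
        ),
    }


# Usage with made-up data.
contract = CdeContract(
    name="customer_income",
    owner="Head of Payments",
    purpose="fortnightly debt assessment",
    required_fields={"customer_id": str, "income": float},
    max_age=timedelta(hours=24),
    min_rows=2,
    numeric_field="income",
    expected_mean_range=(500.0, 5000.0),
    foreign_key="customer_id",
)
rows = [
    {"customer_id": "c1", "income": 1200.0},
    {"customer_id": "c2", "income": 1800.0},
]
now = datetime(2025, 1, 2, tzinfo=timezone.utc)
results = run_default_checks(
    contract, rows, loaded_at=now - timedelta(hours=3),
    parent_keys={"c1", "c2"}, now=now,
)
print(results)
```

Note that the contract carries an owner and a stated purpose alongside the thresholds, so the checks are always read against a specific use of the data rather than in the abstract.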

The one question that stops Robodebts: "Fit for purpose for what?"

Key references:

  • Wang & Strong (1996), foundational dimensions of data quality: https://doi.org/10.1080/07421222.1996.11518099
  • DAMA UK — Six Core Data Quality Dimensions: https://www.sbctc.edu/resources/documents/colleges-staff/commissions-councils/dgc/data-quality-deminsions.pdf
  • Critical Data Elements Explained: https://www.dataversity.net/articles/critical-data-elements-explained/
  • ISO/IEC 25012:2008 — Data Quality Model: https://www.iso.org/standard/35736.html
  • Sambasivan et al., "Everyone wants to do the model work, not the data work" — data cascades in high-stakes AI (Google Research, CHI 2021): https://research.google/pubs/everyone-wants-to-do-the-model-work-not-the-data-work-data-cascades-in-high-stakes-ai/
  • IBM Institute for Business Value — 2025 CDO Study: https://www.ibm.com/thought-leadership/institute-business-value/en-us/report/2025-cdo
  • BCBS 239 — Principles for effective risk data aggregation and risk reporting: https://www.bis.org/publ/bcbs239.htm
  • Royal Commission into the Robodebt Scheme — Final Report (2023): https://robodebt.royalcommission.gov.au/publications/report
  • Unity Technologies Data Quality Issue: https://www.fool.com/investing/2022/07/17/2-reasons-unity-softwares-virtual-world-is-facing/
  • Andrew Jones — Driving Data Quality with Data Contracts: https://andrew-jones.com/data-contracts-101.pdf
  • Chad Sanderson — The Rise of Data Contracts: https://dataproducts.substack.com/p/the-rise-of-data-contracts
  • Chad Sanderson — Data Products and Contracts (Data Quality Camp): https://www.youtube.com/watch?v=1CSTSdfe0qg


If this series helped, share it with the loudest voice on AI strategy in your organisation. If their AI strategy doesn't have a data quality strategy underneath it, you now know what to ask them.

Better AI still starts with better foundations.
