Platform Engineering Playbook Podcast

vibesre

The Platform Engineering Playbook Podcast is where AI meets open-source infrastructure knowledge—and you're part of the editorial process. Every episode is researched, scripted, and produced with AI, then reviewed by the community and published on GitHub for anyone to improve. Facing tool sprawl across 130+ platforms? Justifying PaaS costs to your CFO? Navigating the Shadow AI crisis hitting 85% of organizations? We tackle the messy realities of platform engineering that most content avoids, delivering data-backed insights and decision frameworks you can use Monday morning. Built for senior engineers, SREs, and DevOps practitioners with 5+ years in production, we dissect cloud economics, AI governance, infrastructure trade-offs, and career strategy—with the receipts to back it up. Think we got something wrong? Have better data? Open a pull request at platformengineeringplaybook.com. This is infrastructure podcasting as a living document, where the community keeps us honest and the content gets better with every contribution. Read the playbook at https://platformengineeringplaybook.com

  1. 1 DAY AGO

    OpenTofu vs Terraform: What Enterprise Teams Are Actually Doing (2026)

    **Is your infrastructure strategy about to become obsolete?** By 2025, half of all Terraform installations could be running OpenTofu - and the implications for platform engineering teams are massive. In today's deep dive, we break down the OpenTofu vs. Terraform battle that's reshaping infrastructure as code. You'll learn the real mechanics behind migrating between these tools, practical decision frameworks for enterprise teams, and why this choice could define your platform's next five years.

    **What You'll Learn:**
    • The technical and business drivers behind the OpenTofu fork
    • Step-by-step migration strategies and gotchas to avoid
    • How to evaluate which tool fits your team's needs
    • Real-world implications for existing Terraform workflows

    **Episode Chapters:**
    • 0:00 Cold Open - The OpenTofu prediction
    • 2:15 Today's Platform Engineering News
    • 8:30 Deep Dive Act 1 - Understanding the OpenTofu vs Terraform landscape

    **Plus:** AWS Elastic Beanstalk's new GitHub Actions integration, Dapr Runtime updates, and scaling secure DevOps practices across enterprise teams.

    Perfect for platform engineers, DevOps practitioners, and infrastructure leaders navigating the evolving IaC landscape.

    **Sources & References:**
    • OpenTofu vs. Terraform Guide: https://www.env0.com/blog/opentofu-vs-terraform-a-practical-guide-for-enterprise-infrastructure-teams
    • AWS Elastic Beanstalk GitHub Actions: https://aws.amazon.com/about-aws/whats-new/2026/02/aws-elastic-beanstalk-github-action
    • Dapr Runtime v1.17.0-rc.6: https://github.com/dapr/dapr/releases/tag/v1.17.0-rc.6
    • ByteDance AI Video Generator: https://www.theverge.com/ai-artificial-intelligence/877931/bytedance-seedance-2-video-generator-ai-launch
    • Secure DevOps at Scale: https://devops.com/secure-devops-at-scale-integrating-sre-devsecops-and-compliance/

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    18 min
  2. 2 DAYS AGO

    Why Databases Inside Kubernetes Are Becoming Technical Debt

    **Is running databases in Kubernetes about to become legacy technical debt overnight?** By 2026, the inference cloud revolution is forcing platform engineers to completely rethink database architecture - and the implications are massive. In today's deep dive, we break down the "container paradox" that's reshaping how we think about stateful workloads in Kubernetes. You'll discover why the rise of AI inference is making traditional database-in-K8s patterns unsustainable and what this means for your platform strategy.

    **What You'll Learn:**
    • Why the inference cloud demands decoupled database architectures
    • A practical framework for assessing your statefulness spectrum
    • How operator complexity is becoming a hidden cost center
    • Real-world lessons from World Bank's hybrid cloud transformation with Terraform

    **Key Topics Covered:**
    • The container paradox driving database architecture changes
    • Kubernetes cultural shifts enabling AI expansion
    • CloudFront's new mTLS authentication for zero trust architectures
    • Latest developments in federated variational inequalities

    **Timestamps:**
    • 0:00 Cold Open - The 2026 Database Prediction
    • 2:15 Platform Engineering News Roundup
    • 8:30 Deep Dive: The Container Paradox
    • 15:45 Operator Complexity Analysis

    Perfect for platform engineers, DevOps teams, and infrastructure architects navigating the evolving Kubernetes landscape.

    **Sources & References:**
    • The Container Paradox: https://www.digitalocean.com/blog/the-container-paradox-k8s-databases
    • World Bank Terraform Case Study: https://www.hashicorp.com/blog/how-world-bank-manages-hybrid-cloud-complexity-with-terraform
    • Kubernetes AI Culture Impact: https://www.infoq.com/news/2026/02/kubernetes-ai-culture-impact/
    • CloudFront mTLS Update: https://www.infoq.com/news/2026/02/amazon-cloudfront-mtls-origins/
    • Federated Variational Inequalities: https://arxiv.org/abs/2602.09164

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    18 min
  3. 3 DAYS AGO

    47% of CNCF Projects Slowed Down in 2025 — Why That’s Actually Good News

    **Why did 47% of CNCF projects slow down their development velocity in 2025, and why should platform engineers celebrate this trend?** In today's Platform Engineering Playbook, we decode what declining commit velocity across cloud native projects actually reveals about infrastructure maturity and what it means for your platform strategy.

    **What You'll Learn:**
    • How to interpret CNCF project velocity metrics as leading indicators for platform decisions
    • Why slower development cycles might signal stronger, more stable infrastructure foundations
    • Strategic insights for platform engineers navigating the evolving cloud native landscape
    • Breaking analysis of agentic AI transforming DevOps automation and autonomous infrastructure

    **Episode Breakdown:**
    • 0:00 Cold Open - The 47% velocity drop revelation
    • 2:15 Today's Platform Engineering News Roundup
    • 8:30 Deep Dive Act 1 - Decoding CNCF Velocity Data

    **Today's News Coverage:**
    - Harness unveils agentic AI for autonomous infrastructure management
    - TiDB's emergence as the first truly AI-native database
    - Oracle Cloud's new DevSecOps-as-a-Service offering
    - Ex-Google team building revolutionary video data infrastructure

    Whether you're architecting platforms or optimizing cloud native workflows, this episode delivers the strategic insights you need to stay ahead of infrastructure trends.

    **Sources & References:**
    - CNCF Project Velocity Analysis: https://www.cncf.io/blog/2026/02/09/what-cncf-project-velocity-in-2025-reveals-about-cloud-natives-future/
    - Agentic AI in DevOps: https://www.harness.io/blog/agentic-ai-in-devops-the-architects-guide-to-autonomous-infrastructure
    - TiDB AI-Native Database: https://thenewstack.io/tidb-and-the-rise-of-the-ai-native-database/
    - Oracle DevSecOps-as-a-Service: https://about.gitlab.com/blog/devsecops-as-a-service-on-oracle-cloud-infrastructure-by-data-intensity/
    - Video Data Infrastructure: https://techcrunch.com/2026/02/09/ex-googlers-are-building-infrastructure-to-help-companies-understand-their-video-data/

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    19 min
  4. 4 DAYS AGO

    The Claude Skills That Stop AI From Writing Dangerous Infrastructure as Code

    **Are 87% of DevOps teams unknowingly creating security vulnerabilities with AI-generated infrastructure code?** Today's Platform Engineering Playbook dives deep into the hidden risks of AI in DevOps workflows and reveals the specialized skills that top-performing teams use to harness AI safely and effectively.

    **What You'll Learn:**
    • Why AI-generated infrastructure code is creating blind spot vulnerabilities
    • The 8 Claude skills that actually move the needle for DevOps engineers
    • How to identify and automate your repetitive workflows with AI guardrails
    • Breaking news: Cloud complexity becomes the #1 security threat
    • Cloudflare's new vertical microfrontend template for edge routing
    • Red Hat's latest Developer Hub integration with OpenShift GitOps

    **Episode Chapters:**
    • 00:00 Cold Open - The 87% Problem
    • 02:15 Today's Platform Engineering News
    • 08:30 Deep Dive: AI Skills That Actually Work for DevOps
    • 15:45 The Security Reality Check

    Perfect for platform engineers, DevOps practitioners, and engineering leaders who want to leverage AI without compromising security or reliability.

    **Sources & References:**
    • The Claude Skills I Actually Use for DevOps: https://www.pulumi.com/blog/top-8-claude-skills-devops-2026/
    • How to integrate Developer Hub with OpenShift GitOps: https://developers.redhat.com/articles/2026/02/09/how-integrate-developer-hub-openshift-gitops
    • AI makes the easy part easier and the hard part harder: https://www.blundergoat.com/articles/ai-makes-the-easy-part-easier-and-the-hard-part-harder
    • Cloudflare Launches Vertical Microfrontend Template: https://www.infoq.com/news/2026/02/cloudflare-vmfe-template/
    • Cloud Complexity Is the New Security Vulnerability: https://www.darkreading.com/cloud-security/cloud-complexity-is-the-new-security-vulnerability
    • Agentic DataOps With Guardrails: https://feeds.dzone.com/link/23568/17272926/agentic-dataops-with-guardrails-mcp-mwaa

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    19 min
  5. 5 DAYS AGO

    Docker vs Nix: Why Your Builds Aren’t Actually Reproducible

    97% of Docker containers can't reproduce the exact same build six months later. What does this mean for platform engineering, and why should you care? In today's episode of the Platform Engineering Playbook, we delve into the critical issue of reproducibility in Docker containers. Discover why this seemingly technical detail could significantly impact your workflows and productivity. We'll explore the limitations of traditional package managers and discuss how they can be a bottleneck in achieving true reproducibility.

    **Timestamps:**
    - **[00:00] Cold Open:** Dive into the startling statistic about Docker containers.
    - **[01:15] Intro:** Welcome and overview of today's topics.
    - **[03:30] Deep Dive - Act 1:** Understanding reproducibility in platform engineering.
    - **[12:45] Deep Dive - Act 2:** Analyzing the core problems with Dockerfile package managers.

    Why listen? This episode not only highlights a pressing issue but also provides actionable insights and strategies to tackle reproducibility challenges. Enhance your platform engineering skills and stay ahead of industry trends with our expert analysis.

    **Sources & References:**
    - [Docker versus Nix: The quest for true reproducibility](https://thenewstack.io/docker-versus-nix-the-quest-for-true-reproducibility/)
    - [Qwen3 Coder Next as first "usable" coding model 60 GB for me](https://www.reddit.com/r/LocalLLaMA/comments/1qz5uww/qwen3_coder_next_as_first_usable_coding_model_60/)
    - [LeakWatch 2026 – Security incidents, data leaks, and IT incidents in the current calendar week 6 - igor´sLAB](https://www.igorslab.de/en/leakwatch-2026-security-incidents-data-leaks-and-it-incidents-in-the-current-calendar-week-6/)
    - [Big Tech's $650-700 Billion AI Infrastructure Push Reshapes Cash Flow Dynamics - MLQ.ai](https://mlq.ai/news/big-techs-650-700-billion-ai-infrastructure-push-reshapes-cash-flow-dynamics/)
    - [Goldman Sachs Rolls Out Anthropic's Claude AI to Automate Accounting and Compliance Tasks - MLQ.ai](https://mlq.ai/news/goldman-sachs-rolls-out-anthropics-claude-ai-to-automate-key-accounting-and-compliance-tasks/)

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    18 min
  6. FEB 7

    The Data Canary Pattern: How Netflix Prevents Bad Metadata Deploys

    **What happens when 2 billion daily metadata events could crash Netflix's entire platform with one bad transformation?** Today's Platform Engineering Playbook dives deep into Netflix's Data Canary system - a masterclass in building trust and validation into your data pipelines at scale. Plus, we cover the latest platform engineering news that's reshaping how we deploy and monitor distributed systems.

    **What You'll Learn:**
    • How Netflix validates massive data transformations without risking production
    • Container readiness strategies for Spring Boot in Kubernetes environments
    • LinkedIn's redesigned SAST pipeline using GitHub Actions and CodeQL
    • Why GitOps is becoming essential for platform engineering teams
    • Datadog's new LLM observability tools with Google's Agent Development Kit

    **Episode Chapters:**
    • 0:00 - Cold Open: Netflix's 2 billion event challenge
    • 2:15 - Platform engineering news roundup
    • 8:30 - Deep Dive: Netflix Data Canary system breakdown
    • 15:45 - Trust frameworks for platform validation

    Whether you're scaling data pipelines, improving deployment reliability, or building platform trust frameworks, this episode delivers actionable insights from real-world implementations at companies like Netflix and LinkedIn.

    **Sources & References:**
    • Netflix Data Canary: https://netflixtechblog.medium.com/the-data-canary-how-netflix-validates-catalog-metadata-18b699d58e36?source=rss-c3aeaf49d8a4------2
    • Spring Boot Container Readiness: https://medium.com/@AlexanderObregon/container-readiness-checks-for-spring-boot-deployments-535ab60ca32a
    • LinkedIn SAST Pipeline: https://www.infoq.com/news/2026/02/linkedin-redesigns-sast-pipeline/
    • GitOps Course: https://platformengineering.org/blog/announcing-new-course-gitops-for-platform-engineering
    • AWS Revenue Growth: https://techcrunch.com/2026/02/05/aws-revenue-continues-to-soar-as-cloud-demand-remains-high/
    • Datadog LLM Observability: https://www.infoq.com/news/2026/02/datadog-google-llm-observability/

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    15 min
  7. FEB 6

    Claude Opus 4.6: The First AI That Feels Like a Teammate

    **Claude Opus 4.6 just demolished GPT-4 on every coding benchmark - and it's about to reshape how we think about platform engineering automation.** In today's episode, we break down Anthropic's game-changing AI release and what it means for platform teams worldwide. We dive deep into the autonomous capabilities that could revolutionize how we handle infrastructure operations, but also explore the new risks this creates for production environments.

    **What You'll Learn:**
    • How Claude Opus 4.6's coding performance impacts platform tooling decisions
    • Why autonomous AI operations require new safety frameworks
    • Practical strategies for identifying AI automation opportunities in your platform
    • Analysis of Resolve AI's $125M funding and the AI SRE market explosion
    • Key Kubernetes updates affecting platform teams

    **Timestamps:**
    • 0:00 Cold Open - Claude's benchmark dominance
    • 2:15 Today's platform engineering news roundup
    • 8:30 Deep Dive: Claude Opus 4.6 for platform teams
    • 15:45 Risk analysis: Autonomous AI in production

    Whether you're evaluating AI tools for your platform team or wondering how to safely implement autonomous operations, this episode gives you the framework to make informed decisions without getting caught up in the hype.

    **Sources & References:**
    • Claude Opus 4.6: https://www.anthropic.com/news/claude-opus-4-6
    • Anthropic AI upgrade - Reuters: https://www.reuters.com/business/retail-consumer/anthropic-releases-ai-upgrade-market-punishes-software-stocks-2026-02-05/
    • Resolve AI $125M funding: https://techcrunch.com/2026/02/04/ai-sre-resolve-ai-confirms-125m-raise-unicorn-valuation/
    • Kubernetes OpenAPI updates: https://github.com/kubernetes/kubernetes/pull/136582
    • Prow contributors: https://docs.prow.k8s.io/docs/getting-started-develop
    • Video mesh recovery research: https://arxiv.org/abs/2602.04257

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    16 min
  8. FEB 5

    Autonomous AI in DevOps Is Here — And Most Teams Are Doing It Wrong

    **Will 87% of DevOps teams really be obsolete by 2026?** As AI agents take control of production infrastructure, we're witnessing the biggest transformation in platform engineering history. In today's episode, we dive deep into **autonomous AI agents in DevOps workflows** and explore how they're reshaping everything from monitoring to incident response. You'll discover real-world examples of AI agents managing production systems, plus critical insights on when and how to safely implement these powerful tools in your own infrastructure.

    **What You'll Learn:**
    • How AI agents are revolutionizing observability and SRE practices
    • Practical implementation strategies for autonomous monitoring systems
    • Why you should wait at least 4 months before deploying AI agents in production
    • The latest trends in GenAI and OpenTelemetry integration
    • Kubernetes IPv6 adoption and what it means for your platform

    **Episode Chapters:**
    • 0:00 - Cold Open: The AI DevOps Revolution
    • 2:15 - Today's Platform Engineering News
    • 8:30 - Deep Dive: AI Agents in Production (Setup)
    • 15:45 - Real-World Implementation Examples

    Whether you're a platform engineer, SRE, or DevOps leader, this episode provides actionable insights for navigating the AI-driven future of infrastructure management.

    **Sources & References:**
    • MCP-Powered Agentic AI in DevOps: https://devops.com/mcp-powered-agentic-ai-in-devops-building-secure-scalable-multi-agent-pipelines-for-autonomous-sre-and-observability/
    • Observability trends for 2026: https://www.elastic.co/blog/2026-observability-trends-generative-ai-opentelemetry
    • CNCF LFX Mentorship 2025: https://www.cncf.io/blog/2026/02/04/cncf-celebrates-successful-mentees-from-lfx-mentorship-2025-term-3/
    • CNCF Fluid with Amazon EKS: https://aws.amazon.com/blogs/containers/build-deep-learning-model-training-apps-using-cncf-fluid-with-amazon-eks/
    • Agent-Assisted Intelligent Observability: https://www.infoq.com/articles/agent-assisted-intelligent-observability/
    • Kubernetes and IPv6: https://cloudnativenow.com/features/kubernetes-and-ipv6-together-at-last/

    #PlatformEngineering #DevOps #CloudNative #Kubernetes

    19 min
