KubeFM

The Data Engineer's guide to optimizing Kubernetes, with Niels Claeys

Niels Claeys shares how his team at Dataminded built Conveyor, a data platform processing up to 1.5 million core hours monthly. He explains the specific optimizations they discovered through production experience, from scheduler changes that immediately reduce costs by 10-15% to achieving 97% spot instance usage without reliability issues.

You will learn:

  • Why the default Kubernetes scheduler wastes money on batch workloads and how switching from "least allocated" to "most allocated" scheduling enables faster scale-down and better resource utilization

  • How to achieve 97% spot instance adoption through strategic instance type diversification, region selection, and Spark-specific techniques

  • Node pool design principles that balance Kubernetes overhead with workload efficiency

  • Platform-specific gotchas like AWS cross-AZ data transfer costs that can spike bills unexpectedly
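To make the first point concrete, here is a toy sketch of why packing pods enables faster scale-down than spreading them. It is not the real Kubernetes scheduler and not Conveyor's configuration: the function names, the four-node cluster, and the 16-core node size are all assumptions chosen for illustration, and only CPU is modelled. The two scoring functions loosely mirror the "least allocated" and "most allocated" strategies mentioned in the episode.

```python
from typing import Callable, List

# Hypothetical scoring functions for illustration only; the real scheduler
# weighs several resources and plugins, not just CPU.
def least_allocated(used: int, capacity: int, request: int) -> float:
    """Higher score for nodes that stay emptier after placement (spreads pods)."""
    return (capacity - used - request) / capacity

def most_allocated(used: int, capacity: int, request: int) -> float:
    """Higher score for nodes that end up fuller after placement (packs pods)."""
    return (used + request) / capacity

def place(pods: List[int], node_count: int, capacity: int,
          score: Callable[[int, int, int], float]) -> List[int]:
    """Place each pod on the highest-scoring node that still has room.

    Returns the cores in use on each node. Ties go to the lowest node index.
    """
    used = [0] * node_count
    for request in pods:
        candidates = [i for i in range(node_count) if used[i] + request <= capacity]
        if not candidates:
            raise RuntimeError("no node has room for this pod")
        best = max(candidates, key=lambda i: score(used[i], capacity, request))
        used[best] += request
    return used

def idle_nodes(used: List[int]) -> int:
    """Nodes with nothing running, which an autoscaler could remove."""
    return sum(1 for cores in used if cores == 0)

# Assumed workload: eight batch pods of 2 cores each on four 16-core nodes.
pods = [2] * 8
spread = place(pods, node_count=4, capacity=16, score=least_allocated)
packed = place(pods, node_count=4, capacity=16, score=most_allocated)

print("least allocated:", spread, "idle nodes:", idle_nodes(spread))
print("most allocated: ", packed, "idle nodes:", idle_nodes(packed))
```

In this toy model the spreading strategy leaves every node partly used, so none can be removed, while the packing strategy fills one node and leaves the other three idle. How much that saves in practice depends on the workload mix and autoscaler settings; the 10-15% figure above is the guest's reported experience, not something this sketch demonstrates.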

Sponsor

This episode is brought to you by Testkube, where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive-scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io.

More info

  • Find all the links and info for this episode here: https://ku.bz/hGRfkzDJW

  • Interested in sponsoring an episode? Learn more.