
10 episodes

Break Things on Purpose
Gremlin
- Technology
5.0 • 6 Ratings
-
A monthly podcast about Chaos Engineering, presented by Gremlin, Inc. Find us on Twitter at @BTOPpod.
-
Kelsey Hightower
This episode we speak with Kelsey Hightower. Kelsey is a Principal Developer Advocate at Google.
Topics include: Promise Theory, is Kubernetes hard, running databases on Kubernetes, the meat cloud, empathy sessions, how Kubernetes has helped standardize Ops practices, learning from failure at scale at Google, and the importance of the Inclusion part of D&I.
-
Kolton Andrus
This episode we speak with Kolton Andrus, the CEO and co-founder of Gremlin. Topics include: The role of a Call Leader in incidents, using Chaos Engineering as runtime validation, FIT and application level fault injection, Jesse Robbins and early experiments at Amazon, oncall training, Lineage Driven Fault Injection (LDFI), the value of looking at real traffic instead of synthetic transactions, and the challenges people face when starting to do Chaos Engineering.
-
Haley Tucker
This episode we speak with Haley Tucker. Haley is a Senior Software Engineer on the Resilience Engineering team at Netflix. Topics include: Running Chaos Engineering experiments as A/B tests, testing dependencies, fallbacks, testing in production, and why Chaos Monkey is less interesting at Netflix now.
-
Matthew Simons
This episode we speak with Matthew Simons. Matthew is a Senior Product Development Manager at Workiva and he leads the Quality Assessment team there. Topics include: Supporting and encouraging reliability at Workiva, why Workiva moved from App Engine to EKS, how to tighten the customer feedback loop, how Chaos Engineering can help folks who are oncall, and fatal optimism and the asteroid that may hit the Earth in the year 2181 (it’s real y’all).
-
Subbu Allamaraju
This episode we speak with Subbu Allamaraju. Subbu is a Senior Technologist at the Expedia Group. Topics include: Learning from incidents, changing culture, Why Complex Systems Fail, drifting into failure, forming a hypothesis, showing value from your reliability work, and the importance of understanding how your business makes money.
-
Adrian Hornsby
This episode we speak with Adrian Hornsby, a Senior Tech Evangelist at Amazon Web Services. Topics include: Curiosity and breaking things, the cost of downtime, Jesse Robbins and early failure injection at Amazon, making the case to management for Chaos Engineering, forming a hypothesis, and random experiments vs Game Days.
Customer Reviews
Great Show
It's been so great being able to hear from different companies and find out what they are working on and lessons learned.
Great idea!
Excellent listen, thanks for sharing these customer stories.
First Chaos Engineering Podcast
Really neat to see this podcast for the wider engineering community to learn about Chaos Engineering and SRE.