7 episodes

A show about public-facing incidents, resilience, and human factors in computer systems.

Getting There Heavybit

    • News

    Ep. #7, The March 2023 Datadog Outage with Laura de Vesine

    In episode 7 of Getting There, Nora and Niall speak with Laura de Vesine of Datadog. Laura shares a unique perspective on the March 2023 Datadog outage: how the incident was handled internally, the damage the outage caused, and the many lessons learned.

    • 1 hr
    Ep. #6, The Impacts of the 2022 Twitter Acquisition

    In episode 6 of Getting There, Nora and Niall discuss Twitter’s 2022 acquisition by Elon Musk. This talk unpacks the acquisition's cultural and social implications, the fallout from massive layoffs, and the deprioritization of reliability standards within the company.

    • 39 min
    Ep. #5, The State of SRE and Beyond

    In episode 5 of Getting There, Nora and Niall meet for a conversation at SREcon. This talk explores the history of the conference, the state of SRE, the role of company historians, insights on complexity management, and the value of SREs and systems thinkers.

    • 30 min
    Ep. #4, The April 2022 Atlassian Outage

    In episode 4 of Getting There, Nora Jones and Niall Murphy discuss the Atlassian outage of April 2022. This talk explores Atlassian’s 20-year history, key takeaways from this 14-day outage, surprising findings from the incident report, and critical discussion of Atlassian’s response.

    • 34 min
    Ep. #3, The October 2021 Roblox Outage

    In episode 3 of Getting There, Nora Jones and Niall Murphy unpack the Roblox outage of October 2021. Together they review the incident report, discuss the contributing factors and the users affected, and examine the attributes of Roblox’s business model that led to this 73-hour outage.

    • 40 min
    Ep. #2, The December 2021 AWS Outages

    In episode 2 of Getting There, Nora and Niall discuss the socio-technical aspects of the AWS outages that occurred in December 2021. Together they unpack what happened, the implications, and how organizations can learn from outages at such scale.

    • 29 min
