44 min

Episode 67 - Dealing With Failure Special AWS TechChat


In this themed episode of AWS TechChat, we explore how to deal with failure because, as we say, everything fails all the time.

The show starts by level-setting on some acronyms to ensure we are all on the same page. RTO (Recovery Time Objective) and RPO (Recovery Point Objective) will mean something to everyone by the end of this episode.
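To make the two acronyms concrete, here is a minimal sketch that checks one incident against an RPO and an RTO. The function name, the timeline, and the one-hour and two-hour targets are hypothetical values chosen for illustration; they are not taken from the episode.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, failure, restored, rpo, rto):
    """Return (rpo_met, rto_met) for a single incident.

    RPO bounds how much data you can afford to lose: the gap between
    the last recovery point and the failure.
    RTO bounds how long you can afford to be down: the gap between
    the failure and the restored service.
    """
    data_loss_window = failure - last_backup
    downtime = restored - failure
    return data_loss_window <= rpo, downtime <= rto


# Hypothetical incident: last backup 45 minutes before the failure,
# service restored 3 hours after it.
failure = datetime(2020, 5, 1, 12, 0)
result = meets_objectives(
    last_backup=failure - timedelta(minutes=45),
    failure=failure,
    restored=failure + timedelta(hours=3),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=2),
)
print(result)  # (True, False): the RPO is met, the RTO is missed
```

In this made-up incident the backup cadence is good enough, but the restore takes too long, which is the kind of gap that pushes you from backup and restore toward a faster DR approach.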

Disaster Recovery (DR) is often thought of in many organizations as an insurance policy, and we discuss impact versus risk and how you can put some structure around your decision-making.

We then pivot to various approaches you can use for DR:
• Pilot Light - ensuring you replicate your systems of record and are able to instantiate your stacks via infrastructure as code.
• Warm Standby - running a scaled-down version of your stack that you can scale up with Auto Scaling groups and by increasing the number of running container tasks.
• Backup and Restore - the traditional approach, which is still very valid. LTO may be gone, but in 2020 you can use most backup applications with Amazon S3 and Amazon S3 Glacier as a target, and if that's not an option there is also a Virtual Tape Library (VTL) option in AWS Storage Gateway.
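As a rough illustration of the Warm Standby scale-up described in the list above, the sketch below builds the request parameters for resizing an Auto Scaling group and raising a container service's task count. It assumes the containers run on Amazon ECS, which the episode notes do not specify. The function name and all resource names are hypothetical. The keys match the boto3 `update_auto_scaling_group` and ECS `update_service` calls, but nothing is sent to AWS here.

```python
def warm_standby_scale_up(asg_name, cluster, service,
                          min_size, max_size, desired_capacity, task_count):
    """Build the parameters needed to bring a warm standby to full size.

    Returns two dicts: one for the Auto Scaling group resize and one for
    the ECS service task count. No AWS call is made in this function.
    """
    if not (min_size <= desired_capacity <= max_size):
        raise ValueError("desired_capacity must be between min_size and max_size")
    if task_count < 0:
        raise ValueError("task_count must not be negative")

    asg_update = {
        "AutoScalingGroupName": asg_name,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired_capacity,
    }
    ecs_update = {
        "cluster": cluster,
        "service": service,
        "desiredCount": task_count,
    }
    return asg_update, ecs_update


# Hypothetical standby stack: grow from a small footprint to 6 instances
# and 12 running tasks.
asg_update, ecs_update = warm_standby_scale_up(
    asg_name="dr-web-asg", cluster="dr-cluster", service="web",
    min_size=2, max_size=10, desired_capacity=6, task_count=12,
)

# With credentials and boto3 installed, these could then be applied as:
#   boto3.client("autoscaling").update_auto_scaling_group(**asg_update)
#   boto3.client("ecs").update_service(**ecs_update)
```

Keeping the parameter building separate from the API calls makes the failover step easy to review and rehearse before it is needed.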

Pete talks about a nifty trick for auto recovery using Auto Scaling groups set to Min 1 | Max 1, as well as Amazon EC2 recovery options.
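A minimal sketch of the Min 1 | Max 1 idea, assuming a launch template and a list of subnets already exist: with both limits set to 1, the group replaces the instance if it fails its health check, and listing subnets in more than one Availability Zone lets the replacement launch elsewhere. The function name and resource identifiers are hypothetical. The keys follow the boto3 `create_auto_scaling_group` request, but the sketch only builds the parameters and does not call AWS.

```python
def single_instance_asg_params(name, launch_template_name, subnet_ids):
    """Build parameters for an Auto Scaling group pinned to one instance."""
    if not subnet_ids:
        raise ValueError("at least one subnet ID is required")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        # Min and Max both 1: never more than one instance, and a failed
        # instance is replaced to get back to one.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        # Comma-separated subnet IDs; use subnets in different AZs so the
        # replacement can launch in another zone.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "HealthCheckType": "EC2",
    }


# Hypothetical names and subnet IDs for illustration only.
params = single_instance_asg_params(
    name="bastion-asg",
    launch_template_name="bastion-template",
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
)
print(params["MinSize"], params["MaxSize"], params["VPCZoneIdentifier"])

# With credentials and boto3 installed, this could be applied as:
#   boto3.client("autoscaling").create_auto_scaling_group(**params)
```

Note that the replacement is a fresh instance built from the launch template, so any state it needs has to live outside the instance.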

We then pivot to what it takes to architect a multi-region application, allowing you to run your solution across multiple AWS Regions in an active/active topology, and speak through the challenges you may face and the tools that are available.

Before closing out, we talk about multi-Availability Zone (AZ) architectures, which are a key differentiator of AWS from other providers. We give a refresher on what AZs are and explain that all AWS services are either multi-AZ by default or offer it as a tick-box option, allowing you to build robust architectures that withstand AZ failure.

Speakers:
Shane Baldacchino - Solutions Architect, ANZ, AWS
Peter Stanski - Head of Solution Architecture, AWS

Resources:
Amazon EKS Workshop https://eksworkshop.com/
AWS Glossary https://docs.aws.amazon.com/general/latest/gr/glos-chap.html
Amazon Disaster Recovery https://aws.amazon.com/disaster-recovery/
Auto Scaling Groups https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroup.html
Amazon S3 Glacier https://aws.amazon.com/glacier/
Regions, Availability Zones, and Local Zones https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html

AWS Events:
AWS Innovate AIML Edition https://aws.amazon.com/events/aws-innovate/machine-learning/
AWS Innovate DeepRacer Challenge https://aws.amazon.com/events/aws-innovate/machine-learning/deepracer/
AWS Builders Online Series on-demand https://aws.amazon.com/events/builders-online-series/
AWS Summits https://aws.amazon.com/events/summits/
AWS Events and Webinars https://aws.amazon.com/events/