Problem solver Tod Hansmann of Catalyst joins me to discuss "observability": What it is, why it means different things to different people, and how to get started if it's new for you.
In this episode:
- What is observability (o11y)?
- What can observability do for you?
- What metrics should you track?
- How does observability relate to logging, alerting, monitoring, and other practices?
- Who should be responsible for observability?
- How heavily should upper management be involved?
- How does observability relate to culture?
- CI/CD as a prerequisite for observability
- Why metrics are better than logs
- Surprising metrics that can be important
- The relationship between monitoring and automated testing
- Good observability as an enabler for canary deployments, testing in production, and other practices
- How to define service level objectives
- How do you define "uptime"?
- How to address corner cases
- Why being on call is desirable
Guest
Tod Hansmann
Twitter: @todpunk
LinkedIn: Tod Hansmann
Catalyst
Resources
Book: Site Reliability Engineering
Watch this episode on YouTube.
Information
- Show
- Frequency: Updated weekly
- Published: September 27, 2022 at 07:00 UTC
- Length: 45 minutes
- Episode: 44
- Rating: Clean