Guests
- Lawrence Jones, Founding Engineer at Incident.io
- Ed Dean, Product Lead for AI at Incident.io
Key Takeaways
- AI’s biggest impact comes from compressing time: identifying causes in minutes instead of hours.
- Retrieval-augmented reasoning still benefits from simplicity: deterministic tagging and re-ranking often beat complex vector setups.
- Post-incident “time travel” evals let teams score AI accuracy after they know what really happened.
- Building trust in AI isn’t just about precision—it’s about showing reasoning and uncertainty in ways humans understand.
Mentioned Tools & Concepts
- Slack as the interface for human-AI collaboration
- pgvector and Postgres for retrieval experiments
- RAG (Retrieval-Augmented Generation)
- Multi-agent orchestration
- “AI as your company’s immune system”
Chapters
- 00:00 Meet the Founders: Lawrence and Ed
- 00:41 Introduction to Incident.io
- 01:25 Evolution of Incident.io Products
- 02:14 Understanding SRE and Its Importance
- 04:01 Real-World Incident Management
- 05:51 The Role of AI in Incident Management
- 10:12 Challenges and Innovations in AI SRE
- 12:14 Prototyping and Iterating AI Solutions
- 16:25 Refining Retrieval Strategies
- 21:52 Balancing AI and Human Interaction
- 32:06 User Experience and Trust in AI Systems
- 36:08 Interactive Slack Integration
- 37:08 Understanding the AI Investigation Process
- 37:50 Parallel Checks and Data Sources
- 38:35 Building Hypotheses and Refining Findings
- 40:09 Human-Agent Collaboration
- 49:23 Evaluating AI Effectiveness
- 01:04:13 Future Developments and Integrations
Information
- Frequency: Updated weekly
- Published: 6 November 2025 at 09:00 UTC
- Length: 1h 8m
- Episode: 9
- Rating: Clean
