DevOps Foundations: Effective Postmortems

English | MP4 | AVC 1280×720 | AAC 48KHz 2ch | 1h 36m | 490 MB

What comes to mind when you think about incident postmortems in the tech space? If you picture a catastrophic incident, a scramble to pinpoint the root cause, and some sort of fix, you’re not alone; this was an established way of dealing with security crises for years. But the buildup of small, unresolved problems, both in your code and with your team, can end up causing bigger issues than any one showstopper, so this manner of investigation isn’t productive. In this course, Ernest Mueller shows how to approach incident investigations the DevOps way, explaining how to conduct a formal, blameless postmortem. Discover how to clearly communicate an incident’s scope, pinpoint the proper corrective actions, continuously improve your resiliency using incremental retrospectives, and more.

Topics include:

  • Resilience engineering
  • Communicating the scope and impact of an incident
  • Determining which incident metrics are helpful
  • Facilitating the postmortem meeting
  • Leading a group postmortem analysis
  • Controlling cognitive bias
  • Identifying contributing factors
  • Determining corrective actions
  • How incremental retrospectives complement postmortems

Table of Contents

Introduction
1 Learning through postmortems
2 What to expect from this course

About Postmortems
3 Why postmortems
4 Resilience engineering
5 Blamelessness
6 Root cause is a myth

Postmortem Preparation
7 Incident descriptions
8 Incident timelines
9 Incident metrics
10 Challenge: Your incident write-up
11 Solution: Your incident description
12 Solution: Your incident timeline

Postmortem Analysis
13 Controlling cognitive bias
14 What went well
15 Contributing factors
16 Challenge: Your contributing factors
17 Solution: Your contributing factors
18 Corrective actions
19 Challenge: Your corrective actions
20 Solution: Your corrective actions
21 Facilitating the postmortem meeting
22 Leading a group postmortem analysis

After the Postmortem
23 Transparent uptime
24 Retrospectives and near misses

Conclusion
25 Next steps