DevOps Foundations: Incident Management

English | MP4 | AVC 1280×720 | AAC 48KHz 2ch | 1h 03m | 326 MB

Uptime is critical for modern systems, but downtime and security incidents are inevitable. Your users’ experience depends on your ability to respond quickly, confidently, and consistently when things go awry. In this course, learn how to handle unexpected crises in information systems from a DevOps perspective. Instructor Ernest Mueller steps through the overall incident response process, explaining how to define what constitutes an incident for your organization and select the tools you’ll need to mitigate these high-stakes problems. He also explains how to detect and report incidents, communicate with users and internal employees about issues, troubleshoot problems, and continuously improve your incident management process.

Topics include:

  • The incident response process
  • Detecting and reporting incidents
  • Communicating effectively about a problem
  • Best practices for diagnosis and repair
  • Cleaning up after an incident
  • Continuously improving
  • Implementation challenges for incident response

Table of Contents

1 Handling incidents with excellence
2 What to expect from this course
3 Why do I need incident management?
4 The incident command system
5 Scoping the problem
6 Your incident toolchain
7 Incident toolchain example
8 Detecting and reporting incidents
9 First response and escalation
10 Incident communication with your users
11 Communicating inside your organization
12 Best practices for diagnosis and repair
13 Cleaning up after
14 Continuously improving
15 Training and game days
16 Implementation challenges
17 Next steps