
The Postmortem and De-escalation

Every significant incident produces a postmortem within 48 hours. The postmortem answers one question: which chain level was missing?

The postmortem

A postmortem that concludes "the developer should have been more careful" has failed. It stops the trace at the person and misses the structural gap they fell into.

Postmortem output | Verdict
"The amigos template had no prompt for configuration discoverability. We are adding one. Owner: Maya. Check: zero scenario-gap bugs of this class in the next release." | Good. Produces a structural change: testable, owned, dated.
"We need to be more careful with configuration changes going forward and improve communication." | Produces nothing. A feeling. Evaporates before the next incident.

The first produces a structural change — testable, owned, dated. The second produces a feeling. The feeling evaporates before the next incident. If the same class of incident happens twice, the postmortem did not fix the system — it documented it. Documentation is not repair.

De-escalation

De-escalation is as important as escalation. Lingering war rooms drain energy and create alert fatigue. Stand down when any one of these holds:

  • The issue is resolved and the fix is confirmed by monitoring.
  • A stable workaround is deployed.
  • Investigation reveals the impact is lower than initially assessed.
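The stand-down rule can be written as a small predicate. This is a minimal sketch, not a prescribed tool: the IncidentStatus type and its field names are hypothetical, chosen only to mirror the three conditions in the list.

```python
from dataclasses import dataclass


@dataclass
class IncidentStatus:
    """Hypothetical snapshot of an incident; field names are illustrative."""
    resolved: bool = False
    confirmed_by_monitoring: bool = False
    stable_workaround_deployed: bool = False
    impact_lower_than_assessed: bool = False


def should_stand_down(status: IncidentStatus) -> bool:
    """Return True when any one of the stand-down conditions holds."""
    # A fix only counts once monitoring confirms it.
    fixed_and_confirmed = status.resolved and status.confirmed_by_monitoring
    return (
        fixed_and_confirmed
        or status.stable_workaround_deployed
        or status.impact_lower_than_assessed
    )
```

Note that a fix marked resolved without monitoring confirmation does not, on its own, close the war room in this sketch.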

Archive the war room channel rather than deleting it; the postmortem needs the record. Check in on the people involved: ask "How are you doing?", not just "What happened?" Burnout from incident response is cumulative.

Resolution gate — Postmortem to Retrospective

The gate is passed when you know enough to say which chain level produced the incident.

The structural gap is named. A specific change is owned, with a check date. The escalation and de-escalation ran cleanly. The war room is archived.
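The gate conditions above can be sketched as a checklist that reports what is still missing. The Postmortem type, its field names, and the gate_failures helper are assumptions made for illustration, not part of any existing tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Postmortem:
    """Hypothetical postmortem record; field names are illustrative."""
    structural_gap: str = ""           # the named gap, e.g. a missing template prompt
    change: str = ""                   # the specific structural change
    owner: str = ""                    # who owns the change
    check_date: Optional[date] = None  # when the change is verified
    escalation_ran_cleanly: bool = False
    war_room_archived: bool = False


def gate_failures(pm: Postmortem) -> List[str]:
    """List the gate conditions this postmortem does not yet meet."""
    failures = []
    if not pm.structural_gap.strip():
        failures.append("structural gap is not named")
    if not pm.change.strip():
        failures.append("no specific change recorded")
    if not pm.owner.strip():
        failures.append("change has no owner")
    if pm.check_date is None:
        failures.append("change has no check date")
    if not pm.escalation_ran_cleanly:
        failures.append("escalation or de-escalation did not run cleanly")
    if not pm.war_room_archived:
        failures.append("war room is not archived")
    return failures
```

An empty list means the gate is passed. A postmortem whose only content is "be more careful" fails this sketch on every field, which matches the point made earlier: it names no gap, no change, no owner, and no check.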


200apps · How We Work · NWIRE