Glossary

Incident postmortem

Definition

The written document produced after a production incident review, capturing the timeline of events, root cause analysis, contributing factors, impact assessment, and corrective action items — serving as both an accountability record and an organizational learning artifact.

Incident postmortem and incident review describe related but distinct things: the review is the meeting; the postmortem is the document. The distinction matters because the document lives on. Well-written incident postmortems accumulate into an organization's institutional memory — patterns emerge, recurring root causes get addressed, and new team members learn from failures they didn't experience.

Standard incident postmortem sections:

  • Summary: Two to three sentences. What happened, when, impact magnitude.
  • Timeline: Chronological event log with timestamps.
  • Root cause: The single underlying cause, written precisely.
  • Contributing factors: Other conditions that enabled the root cause or amplified the impact.
  • Impact: Users affected, duration, revenue or data implications.
  • Action items: Changes to be made, with owner and due date.
  • Lessons learned: Takeaways for the team and organization.

Good postmortem writing: Precise, factual, blameless. Avoid passive voice that obscures agency. Name systems and processes, not people. 'The deploy pipeline allowed the change to reach production without a canary stage' is more useful than 'the engineer deployed without testing.'

Distribution: Incident postmortems should be shared with the team, relevant stakeholders, and leadership. Public postmortems (shared with customers) are a trust-building practice for companies that commit to radical transparency.

Start the draft from the whiteboard: snap the incident review timeline and root cause diagrams with BoardSnap, then use the structured summary as the draft skeleton.

Examples

  • Google's SRE team maintains a library of internal incident postmortems that teams consult when investigating similar issues — the accumulated knowledge prevents repeat incidents.
  • A cloud infrastructure provider publishes sanitized incident postmortems on its status page within 72 hours of any significant outage.
  • A startup writes its first formal incident postmortem after a database migration gone wrong — the document becomes the template for all future reviews.
  • A team's incident postmortem reveals that the same contributing factor — no circuit breaker on the payment processor API — appeared in three separate incidents over 18 months.

Snap a incident postmortem. Ship its actions.

BoardSnap turns any whiteboard — including this one — into a summary and action plan.

Free · 1 project, 30 boards Pro $9.99/mo · everything unlimited Pro $69.99/yr · save 42%
BoardSnap Free on the App Store Get