Timeline
A chronological list of events from the first signal (alert, user report, or internal observation) to full recovery. Write times, not durations. 'At 14:32 UTC, alert fired.' Write the detection event, the investigation pivots, the mitigation steps, and the recovery. The timeline is the most objective section — stick to what happened, not why.
Root cause
The technical condition that allowed the incident to occur. Not the person who made the change — the system condition that made the change impactful. A blameless postmortem names the root cause as a system problem, not a human error. Use the five whys if needed: ask 'why did this happen?' five times until you reach a systemic cause.
Impact
Users affected (number or percentage), duration of impact, services degraded, and business impact (support tickets, revenue, SLA breaches). Write specific numbers. 'Some users saw errors' is not impact analysis. '12% of users on iOS 17.4 experienced 503 errors for 47 minutes from 14:32 to 15:19 UTC' is impact analysis.
What went well
Detection time, escalation speed, communication to users, rollback success, or any part of the response that worked better than expected. Blameless postmortems credit the responders as well as holding the system accountable. What the team did well should be recognized and repeated.
Follow-up actions
The specific changes — technical, process, or monitoring — that will prevent this incident or reduce its impact. Each action has: a description, an owner, and a due date. No action item without an owner. This section is the entire point of the postmortem.