A retro that listed

The retro produced eleven sticky notes. The team felt productive. Three cycles later, the team is having the same retro. The retro that listed is the retro that compounded nothing.

The artefact

Excerpt — Retro notes, "2026-Q2 cycle", 2026-06-17
What worked:
Trio sign-off held all four briefs.
The 48h watch caught the queue.render spike.
Maya's Gherkin sets were really thorough.
What didn't work:
Observation sessions slipped twice.
The signal reading was written by the PO alone, no CS voice.
We had two stories that turned out to be epics.
The locale map needed extending mid-cycle.
Maya was on PTO during release week and there was a gap.
Some PRs sat in review for >3 days.
The flag inventory grew to 12 active flags.
The standup ran long all week 3.
What we will change:
The TL: write an ADR for flag-cleanup process.
Alex: get CS involved in signal reading earlier.
The senior dev: tighten the PR review SLA.
Maya: train a backup QA.
Whole team: be better at story sizing.
Action items: all of the above. Owners assigned.

The retro felt productive. Sticky notes everywhere. Five action items, multiple owners. Everyone left with a sense of progress.

Three months later:

The flag inventory is at 14, not 12.
The PO is still writing signal readings alone.
The PR review SLA was forgotten by week 2.
A backup QA was not trained; the next QA gap repeated.
Story sizing did not improve.

The retro produced eleven observations and five commitments. Three cycles later, zero of them landed.

What's wrong?

Stop. Find three things wrong before reading the diagnosis.

Diagnosis (open when ready)

1. The retro produced five changes, not one

The corpus rule from Practice · Retrospective · Step 5: one change. Owned. Dated. Testable.

Five changes is not one change. Five changes means the team noticed five things and committed to none. The cost of writing a single change is the discipline of choosing: what is the one thing whose absence is costing us the most? Without that choice, the retro is an observation log, not a chain artefact.

What the corpus would have produced from this same conversation:

What we will change:
  One change: The signal reading is co-written with CS Lead
  within 48 hours of the check date.
  Owner:    Alex (PO) — schedules; Dina (CS) — co-writes.
  Dated:    Process change in 2026-Q3 cycle (first signal
            reading: 2026-09-01).
  Testable: 2026-Q3 signal reading has a CS-signed
            voice-of-customer paragraph, written within 48h
            of the check date.

The other ten observations are still real. They are noted. Some are pushed to next retro. Some are merged into existing artefacts (the flag inventory becomes a flag cleanup story; the long standup becomes its own retro item next cycle). The discipline of one change is what makes any of the eleven actually compound.
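The rule (one change, owned, dated, testable) can be sketched as a structural check. This is a minimal illustration, not part of the corpus: the `Change` record, its field names, and the `check_commitments` helper are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Change:
    """One retro commitment. Field names are illustrative, not from the corpus."""
    summary: str
    owner: Optional[str] = None
    due: Optional[date] = None
    test: Optional[str] = None  # the observable check that shows it landed


def check_commitments(changes: list[Change]) -> list[str]:
    """Return the reasons the commitments fail the rule; an empty list means pass."""
    problems = []
    if len(changes) != 1:
        problems.append(f"expected exactly one change, got {len(changes)}")
    for change in changes:
        if not change.owner:
            problems.append(f"no owner: {change.summary}")
        if change.due is None:
            problems.append(f"no date: {change.summary}")
        if not change.test:
            problems.append(f"not testable: {change.summary}")
    return problems


# The five commitments from the retro that listed: owners only, no dates, no tests.
listed = [
    Change("Write an ADR for flag-cleanup process", owner="the TL"),
    Change("Get CS involved in signal reading earlier", owner="Alex"),
    Change("Tighten the PR review SLA", owner="the senior dev"),
    Change("Train a backup QA", owner="Maya"),
    Change("Be better at story sizing", owner="Whole team"),
]

# The single change from the rewritten commitment.
one_change = [
    Change(
        "Signal reading is co-written with CS Lead within 48 hours of the check date",
        owner="Alex (PO) schedules; Dina (CS) co-writes",
        due=date(2026, 9, 1),
        test="2026-Q3 signal reading has a CS-signed voice-of-customer paragraph",
    )
]

for problem in check_commitments(listed):
    print(problem)
print(check_commitments(one_change))  # an empty list: the rule is satisfied
```

The check is only structural. It cannot tell whether the chosen change is the right one, or whether "Whole team" is a real owner; that judgement stays with the people in the room.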

2. The "What didn't work" section has no chain-level tags

Every item in the list is a symptom, not a level.

Observation sessions slipped twice → Level 2 (Discovery)
Signal reading was written by the PO alone → Level 5 (Operation — CS routing)
Two stories were epics in disguise → Level 3 (Scope — slicing)
Locale map needed extending mid-cycle → Level 2 (Discovery — locale set was an unvalidated assumption)
Maya was on PTO during release week, gap → Level 5 (Operation — coverage planning)
PRs sat in review >3 days → Level 4 (Execution — review discipline)
Flag inventory grew to 12 → Level 4 (Execution — cleanup discipline)
Standup ran long week 3 → Level 5 (Operation — meeting discipline)

When the eight items are tagged, the team sees the pattern: the most-felt friction was Level 2 (Discovery slipped, locale map was an assumption never tested). The one change that addresses Level 2 is more valuable than five changes that each address one symptom of Level 4 or Level 5.

The corpus's discipline at the retro: trace every "didn't work" item to a level before picking the change. Without the tags, the retro picks based on whoever spoke last.
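Once the items are tagged, the tally is easy to produce. The sketch below uses the tags from the list above; the dictionary layout and variable names are assumptions made for illustration.

```python
from collections import Counter

# Item -> chain level, copied from the tagged list; the layout is illustrative.
tagged = {
    "Observation sessions slipped twice": 2,
    "Signal reading written by the PO alone": 5,
    "Two stories were epics in disguise": 3,
    "Locale map needed extending mid-cycle": 2,
    "QA gap during release week": 5,
    "PRs sat in review >3 days": 4,
    "Flag inventory grew to 12": 4,
    "Standup ran long week 3": 5,
}

# Count how many "didn't work" items fall at each level.
by_level = Counter(tagged.values())
for level in sorted(by_level):
    print(f"Level {level}: {by_level[level]}")
```

A raw count is an input to the conversation, not the decision. By count alone, Level 5 has the most items here; the choice of Level 2 rests on which friction the team felt most, which the tally cannot measure.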

3. The previous retro's change was not read at the start

There is no record in the retro notes of what we committed to last time and whether it landed. The corpus rule: open every retro with the previous retro's named change. If it landed, name what it produced. If it didn't, name what the chain learned about why.

Without reading the previous retro, this retro had no continuity. The team produced eleven new observations without first checking whether the previous retro's commitments had survived. The next retro will produce eleven more. The team will feel productive while compounding nothing.
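The opening rule can be sketched as a small guard that refuses to start the retro until the previous change has a status and an outcome. The `PreviousChange` record and the `read_first` helper are hypothetical names introduced for this example, and a team's very first retro would need a separate path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreviousChange:
    """The named change from the last retro. Field names are illustrative."""
    text: str
    landed: Optional[bool] = None  # None means nobody has checked yet
    outcome: str = ""  # what it produced, or what was learned about why not


def read_first(previous: Optional[PreviousChange]) -> str:
    """Build the opening line of the retro, or fail if continuity is missing."""
    if previous is None:
        raise ValueError("no previous change on record; find it before starting")
    if previous.landed is None:
        raise ValueError("previous change has no status")
    if not previous.outcome:
        raise ValueError("name what it produced, or what was learned about why not")
    status = "LANDED" if previous.landed else "DID NOT LAND"
    return f"Previous retro's change: \"{previous.text}\" STATUS: {status}. {previous.outcome}"


previous = PreviousChange(
    text="TL writes the TDB before the trio signs the Feature Brief, not after.",
    landed=True,
    outcome="Three of four briefs were produced this way; the fourth was an outlier.",
)
print(read_first(previous))
```

Raising an error rather than returning a blank line is a deliberate choice in this sketch: the missing status should block the retro, not become one more observation on the list.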

The fix

The retro the corpus would have run:

# Retrospective — 2026-Q2 cycle · Hebrew-name grading flow

## Read first
Signal reading (5 lines, read aloud).
Previous retro's change: "TL writes the TDB before the trio
signs the Feature Brief, not after." STATUS: LANDED. Produced
three of four briefs with feasibility signed; the fourth was
an outlier (re-opened mid-cycle).

## What worked
- TDB-before-FB-sign held; engineering had constraint clarity.
- 48h watch caught queue.render spike; the TL's follow-up patch
  shipped in window.

## What didn't work   (with chain levels)
- Observation slipped twice            → L2 Discovery
- Signal reading without CS voice      → L5 Operation
- Two stories were epics in disguise   → L3 Scope
- Locale map extended mid-cycle        → L2 Discovery
- QA gap during release week           → L5 Operation
- PR review SLA broke                  → L4 Execution
- Flag inventory grew                  → L4 Execution
- Standup ran long week 3              → L5 Operation

## Pattern
Two of the eight are L2 (Discovery). The two L2 items are
also the two that affected the prediction's signal reading.
Picking one L2 change is highest leverage.

## What we will change
One change: Initiative Brief assumption section is reviewed
at the start of each cycle's amigos, with a named owner for
the most-risk-bearing unvalidated assumption.
Owner:    Alex (PO).
Dated:    First amigos of 2026-Q3 cycle (week starting
          2026-07-08).
Testable: The locale-map assumption is reviewed; the cycle
          either tests it or names why it is being deferred.

## Other items noted (not committed this retro)
- Flag inventory cleanup — push to Q3 retro.
- QA backup coverage — Maya to schedule a 1:1 with a backup QA;
  not a retro change, a normal management action.
- PR review SLA — the senior dev to read [Practice · Story writing] for
  sizing discipline; not a retro change.
- Standup length — bring back if it persists in 2026-Q3.

Signed: Alex (PO), the TL, Maya (QA),
        Dina (CS), the senior dev.

The retro produced one chain-level change and explicitly named four other observations as not committed. Three cycles later, the L2 change has landed and the team's calibration on Discovery has improved. The four uncommitted items either resolved themselves, surfaced again with more clarity, or were quietly handled outside the retro.

The retro that listed felt productive. This retro compounds.

Where this comes from in the chain

This failure traces to Operation (Level 5) — but compounds across every level. The retro is the operation that makes the cycle's lessons survive into the next cycle. When the operation degrades to a list, every other level pays interest.

The structural fix is in the retro's discipline itself, not in any particular operational practice. The Retrospective checklist is the gate.
