An ADR with one option

The ADR is written. The status is Accepted. The MADR sections are all filled. But the Considered Options list has one entry. The decision was a default the team announced, not a choice the team made — and the precedent is silently wrong.

The artefact

Excerpt — ADR-031, "Background job runner", March 2026
Status: Accepted
Date: 2026-03-04
Author: the senior dev
Context and Problem Statement: The new feature requires background processing for nightly batch normalisation. We need a job runner.
Decision Drivers:
Reliability — jobs must not be lost on crash.
Team familiarity.
Time to ship.
Considered Options:
1. BullMQ
Options Analysis:
BullMQ has good documentation and the team has used it before.
Decision Outcome: Use BullMQ.
Consequences: We will need to set up Redis if we don't already have it.
Signed: the senior dev, the TL

The ADR is structurally complete by the letter of the MADR format. Status is Accepted. Two signatures. The team started building on BullMQ that afternoon.

Three months later, the next team to join the codebase asked: "Why BullMQ and not Inngest, Hatchet, or Temporal? We just chose a fundamentally different one for the support service." The original team's answer was "we already had Redis." The new team's answer was "we don't want Redis as a dependency for this service." Both decisions were now precedents, and they contradicted each other.

A quarter later, a third service needed a runner. The team had two precedents, opposite outcomes. The architecture review ran for ninety minutes. Nothing was decided. A fourth service shipped with a third choice (Inngest) because that team got tired of waiting.

What's wrong?

Stop. Find three things wrong before reading the diagnosis.

Diagnosis (open when ready)

1. The Considered Options list has one entry

Considered Options:
1. BullMQ

This is the failure. An ADR with one option is not an ADR; it is an announcement. The corpus rule from Practice · Writing ADRs · Step 5: at least two real options, including do nothing.

A real Considered Options section here would have been:

1. BullMQ — Redis-backed, team has used it.
2. Inngest — managed, no Redis, language-agnostic.
3. Temporal — workflow-shaped, strong durability story.
4. Do nothing — run the batch in-process as a scheduled task,
   accepting the failure mode if the process restarts mid-batch.

If those four had been on the page, the team would have had to weigh the Redis dependency against runners that do not need it. Maybe BullMQ would still have won. But the win would have been a choice, and a choice survives as precedent.
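The "at least two real options, including do nothing" rule is mechanical enough to check before a human reviews the ADR. The sketch below is one way to do it, assuming the ADR is Markdown with `## ` section headings and a numbered list under "Considered Options", as in the rewritten ADR later on this page. The function names `considered_options` and `option_problems` are hypothetical, not part of any MADR tooling.

```python
import re


def considered_options(adr_text: str) -> list[str]:
    """Return the numbered entries under the '## Considered Options' heading."""
    options = []
    in_section = False
    for line in adr_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Entering or leaving a section; track whether it is the one we want.
            in_section = stripped[3:].strip().lower() == "considered options"
            continue
        if in_section:
            # Only lines like "2. Inngest ..." count; wrapped continuation
            # lines of a long entry are ignored.
            match = re.match(r"\d+\.\s+(.*)", stripped)
            if match:
                options.append(match.group(1))
    return options


def option_problems(adr_text: str) -> list[str]:
    """List the ways the Considered Options section breaks the rule."""
    options = considered_options(adr_text)
    problems = []
    if len(options) < 2:
        problems.append(f"only {len(options)} considered option(s); need at least 2")
    if not any(option.lower().startswith("do nothing") for option in options):
        problems.append("no 'Do nothing' option listed")
    return problems
```

A count cannot tell a real option from a strawman, so this only catches the announcement case. Judging whether the alternatives were honestly considered stays with the reviewer.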

2. The Options Analysis is a single positive sentence

Options Analysis: BullMQ has good documentation and the team has used it before.

No cons. No trade-offs. This is not analysis; it is endorsement. The corpus's MADR pattern requires pros and cons per option, anchored in the drivers. The cons section forces the team to confront what the chosen option costs — and that confrontation is what makes the precedent durable.

When the next service evaluated BullMQ and ran into a real con (Redis dependency they didn't want), they had no record of why the first team had thought that was acceptable. So they made a new choice with no acknowledgement of the precedent — and the architecture started to fragment.
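The pros-and-cons requirement can be checked the same way. This is a minimal sketch, assuming each option gets a `### ` subheading under `## Options Analysis` with `Pros:` and `Cons:` lines, which is the layout of the rewritten ADR on this page rather than something MADR mandates. The name `missing_pros_or_cons` is hypothetical.

```python
import re


def missing_pros_or_cons(adr_text: str) -> dict[str, list[str]]:
    """Map each analysed option to the labels ('Pros', 'Cons') it lacks."""
    found: dict[str, set[str]] = {}
    in_section = False
    current = None
    for line in adr_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            in_section = stripped[3:].strip().lower() == "options analysis"
            current = None
        elif in_section and stripped.startswith("### "):
            # A new option subsection begins.
            current = stripped[4:].strip()
            found[current] = set()
        elif in_section and current is not None:
            # The label must be followed by some text; a bare "Cons:" does not count.
            match = re.match(r"(Pros|Cons):\s*\S", stripped)
            if match:
                found[current].add(match.group(1))
    return {
        option: [label for label in ("Pros", "Cons") if label not in labels]
        for option, labels in found.items()
        if labels != {"Pros", "Cons"}
    }
```

An analysis with no option subheadings at all, like the original ADR-031, returns an empty result here. A real gate would need to treat that case as a failure too.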

3. The Decision Outcome has no trade-off

Decision Outcome: Use BullMQ.
Consequences: We will need to set up Redis if we don't already have it.

The Redis dependency is the chosen option's trade-off, not its consequence. Trade-off ≠ consequence:

Trade-off accepted is what we are choosing to lose by picking this option. It belongs in Decision Outcome.
Consequence is what happens next because of this decision. It is downstream.

The Redis dependency was a trade-off, and a load-bearing one. The next service rejected BullMQ precisely because of it. If the first ADR had said "Trade-off accepted: we are coupling this service to Redis; future services may need different runners", the second team could have read the precedent honestly and decided.
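The placement rule, trade-off in Decision Outcome rather than in Consequences, can also be approximated in code. The sketch below assumes `## ` section headings and a line beginning with "Trade-off" inside the Decision Outcome section. Both the convention and the function name `decision_outcome_names_trade_off` are assumptions made for illustration.

```python
def decision_outcome_names_trade_off(adr_text: str) -> bool:
    """True if a line starting with 'Trade-off' sits inside '## Decision Outcome'."""
    in_section = False
    for line in adr_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            in_section = stripped[3:].strip().lower() == "decision outcome"
        elif in_section and stripped.lower().startswith("trade-off"):
            # Matches both "Trade-off accepted:" and "Trade-offs explicitly accepted:".
            return True
    return False
```

A trade-off written under Consequences does not satisfy the check, which is exactly the mistake in the original ADR-031.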

The fix

# ADR-031 — Background job runner for normalisation service

Status:  Accepted
Date:    2026-03-04
Author:  the senior dev

## Context and Problem Statement
The normalisation service requires background processing for
nightly batch normalisation of ~50 M submissions. The runner
must survive worker restarts without losing in-flight jobs.

## Decision Drivers
- Reliability (jobs must not be lost mid-batch).
- Team familiarity (cycle time to first batch).
- Operational dependency surface area (each runner brings
  its own infra; we already run several services with Redis).

## Considered Options
1. BullMQ — Redis-backed, team has shipped with it before.
2. Inngest — managed, no Redis, language-agnostic.
3. Temporal — workflow-shaped; strong durability; new infra.
4. Do nothing — run the batch in-process as a scheduled task,
   accepting jobs lost on process restart.

## Options Analysis
### BullMQ
Pros: Team shipped it before; Redis already in stack.
Cons: Couples this service to Redis; debugging requires
      Redis-side tooling.

### Inngest
Pros: No new infra; language-agnostic; durable.
Cons: Adds a managed dependency; cost model changes with
      job volume.

### Temporal
Pros: Strong durability story; suits workflow shape.
Cons: Significant new infra; team learning curve.

### Do nothing
Pros: Zero new infra; ships fastest.
Cons: Mid-batch restart loses progress; prediction's check
      method requires complete batches.

## Decision Outcome
Chosen: BullMQ.

Rationale:
  Team familiarity is the dominant driver this cycle —
  shipping the cycle's prediction is more important than
  optimising the runner choice. Redis is acceptable for
  this service because it is already operationally owned.

Trade-offs explicitly accepted:
  - This service is now coupled to Redis. Future services
    may need a different runner; this ADR is *not* a
    precedent for new services. Each new service should
    re-evaluate.
  - Debugging requires Redis-side tooling; on-call rotation
    documented separately.

## Consequences
Positive:
  - Cycle ships with familiar infra; prediction's check
    method is protected.

Negative:
  - One more system tied to Redis liveness.

Risks:
  - Redis becomes a single point of failure across multiple
    services. Mitigation: tracked in capacity-plan ADR-018.

## Implementation Notes
Configuration lives in services/normalisation/jobs/.
Migration from existing in-process scheduler in story J6-jobs.

## Sign-off
Author:   the senior dev · 2026-03-04
Reviewer: the TL · 2026-03-05

The key change: the "Trade-offs explicitly accepted" section names the boundary of the precedent ("this ADR is *not* a precedent for new services"). The second service team reads that and knows. The third team reads it and knows.

Where this comes from in the chain

This failure traces to Scope (Level 3). The ADR was written, but the discipline of writing it was skipped. The structural fix is also at Level 3: the ADR checklist gates on at least two real options, and the writing practice teaches that the trade-off belongs in Decision Outcome.

A senior practitioner catches this in 60 seconds: any ADR with one entry under Considered Options is rejected at review. The PR does not merge until the alternatives are honestly considered.
