docs(sf): finish PDD v2 propagation into Purpose Gate, requesting/receiving review

Tail-end of the PDD v2 work (Assumptions field + safety/liveness split +
machine-executable Evidence). Three documents that still referenced v1's
4-field Purpose Gate are updated to the full 8-field PDD packet:

- docs/SPEC_FIRST_TDD.md — Purpose Gate now lists all 8 fields with the
  Assumptions and Failure-boundary additions inline.
- skills/requesting-code-review — replaces "Purpose & Consumer" section with
  "PDD packet (all 8 fields)" restated verbatim from .sf/active/{unit-id}/pdd.md.
  Falsifier and Scope-defence sections clarified vs Failure-boundary and
  Non-goals to remove overlap.
- skills/receiving-code-review — Purpose Gate criterion updated to demand
  the full PDD packet with machine-executable Evidence, not just
  Purpose/Consumer/Value-at-risk.

PDD packet (inline):
- Purpose: every artefact that references "Purpose Gate" agrees on the same
  8-field definition; reviewers and reviewees read the same packet.
- Consumer: spec-first-tdd, requesting-code-review, receiving-code-review.
- Contract: all three documents list the same 8 fields with the same
  Assumptions / safety+liveness / machine-executable-Evidence wording.
- Evidence: grep confirms PDD packet references in all three;
  `npm run typecheck:extensions` exits 0.
- Non-goals: no edits to the PDD skill itself (already v2); no edits to other
  skills referencing the v1 Purpose Gate beyond these three (none exist).
- Invariants: existing review-loop sections preserved; only Purpose-Gate-
  related sections rewritten.
- Assumptions: PDD v2 SKILL.md is the canonical source of field definitions;
  these three documents are projections of it.
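
The grep-style Evidence above could be approximated with a small script. This is a minimal sketch, not the check the author ran: the function name `missing_pdd_fields` is hypothetical, and it only tests that each of the 8 field names appears somewhere in a document's text, which is as coarse as a plain grep.

```python
import re

# The eight field names as they appear in the updated Purpose Gate text.
PDD_FIELDS = (
    "Purpose", "Consumer", "Contract", "Failure boundary",
    "Evidence", "Non-goals", "Invariants", "Assumptions",
)


def missing_pdd_fields(text: str) -> list:
    """Return the PDD field names that never occur in `text` as whole words.

    Hypothetical helper: a presence check only, it does not validate wording.
    """
    return [
        name for name in PDD_FIELDS
        if not re.search(r"\b" + re.escape(name) + r"\b", text)
    ]


if __name__ == "__main__":
    # A v1-style reference names only two of the eight fields.
    print(missing_pdd_fields("Purpose / Consumer / Value-at-risk"))
```

Run against the contents of each of the three documents, an empty list would correspond to the "references in all three" claim.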

parent b48e6d5dd7
commit 082526c0e4
3 changed files with 28 additions and 11 deletions

@@ -57,14 +57,18 @@ Every unit (milestone, slice, task) sits in one of those rows. If a piece of wor

 ## Purpose Gate

-Every artifact (slice plan, task plan, function, test, ADR) must answer:
+Every artifact (slice plan, task plan, function, test, ADR) must answer the same 8 PDD fields captured by the [`purpose-driven-development`](../src/resources/extensions/sf/skills/purpose-driven-development/SKILL.md) skill — these fields ARE the Purpose Gate:

-- **why** this behaviour exists
-- **what value** it creates or protects
-- **who** uses it in production (real consumer, not just tests)
-- **what breaks** if it returns the wrong answer
+- **Purpose**: why this behaviour exists.
+- **Consumer**: who depends on the outcome in production (real caller, not just tests).
+- **Contract**: what observable behaviour proves success — what the consumer receives, not how the implementation works internally.
+- **Failure boundary**: what *correct failure* looks like if the purpose can't be fulfilled — degrade, surface, do not swallow.
+- **Evidence**: the test, metric, or repro that proves the contract. Each criterion must be machine-executable (named test, queryable metric, runnable command) OR explicitly tagged `[MANUAL: reviewer + scenario]`. Prose-only evidence is unfalsifiable and rejected.
+- **Non-goals**: what this is *not* solving.
+- **Invariants**: what must remain true. If the change touches async, queues, timers, or state machines, split into safety ("X never happens") + liveness ("Y eventually happens"). Pure synchronous code may use safety-only.
+- **Assumptions**: conditions about the world that MUST be true for this spec to be valid — locking protocols, API stability, caller invariants, deployment context, data shape. World-side failures (assumption violated) are invisible to internal tests and are the most expensive failure class.

-If any answer is missing: `BLOCKED: purpose unclear — [which field is missing]`. Do not invent a plausible purpose to proceed. Surfacing the gap is more valuable than rationalising past it.
+If any field is missing: `BLOCKED: purpose unclear — [which field is missing]`. Do not invent a plausible answer to proceed. Surfacing the gap is more valuable than rationalising past it.

 Treat the contract as a **falsifiable hypothesis**: name the evidence that would prove it wrong before implementation locks in. A contract without a falsifier is half a contract.
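
The gate described in this hunk can be sketched as a mechanical check. The following is an illustration under stated assumptions, not tooling from the repository: the function names `purpose_gate` and `invariants_split_ok`, the dict representation of a packet, and the convention that split invariants carry literal "Safety" and "Liveness" labels are all hypothetical. Only the field names and the `BLOCKED` message format come from the text above.

```python
from typing import Mapping, Optional

# Field names taken from the 8-field Purpose Gate list.
PDD_FIELDS = (
    "Purpose", "Consumer", "Contract", "Failure boundary",
    "Evidence", "Non-goals", "Invariants", "Assumptions",
)


def purpose_gate(packet: Mapping[str, str]) -> Optional[str]:
    """Return a BLOCKED message naming the first missing or empty field.

    Returns None when all eight fields are filled in. The packet-as-dict
    shape is an assumption made for this sketch.
    """
    for name in PDD_FIELDS:
        if not packet.get(name, "").strip():
            # \u2014 is the dash used in the documented BLOCKED message.
            return f"BLOCKED: purpose unclear \u2014 [{name}]"
    return None


def invariants_split_ok(invariants: str, touches_concurrency: bool) -> bool:
    """Check the safety/liveness rule for the Invariants field.

    `touches_concurrency` stands for "touches async, queues, timers, or
    state machines". Matching on the words "safety" and "liveness" is a
    hypothetical labelling convention, not something the documents mandate.
    """
    text = invariants.lower()
    if not text.strip():
        return False
    if not touches_concurrency:
        # Pure synchronous code may state safety-only invariants.
        return True
    return "safety" in text and "liveness" in text


if __name__ == "__main__":
    packet = {name: "filled in" for name in PDD_FIELDS}
    packet["Assumptions"] = ""
    print(purpose_gate(packet))
    print(invariants_split_ok("Safety: no double send.", True))
```

The gate reports the gap and stops; it never supplies a default for a missing field, which mirrors the "do not invent a plausible answer" rule.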

@@ -98,7 +98,7 @@ For non-trivial runtime/provider review items, verification must include:
 Before changing code based on a suggestion, verify it doesn't violate sf's invariants:

 - **Iron Law**: Does the suggestion ask you to skip the failing test → fix → green cycle? Push back. The bug is a missing test; write it first.
-- **Purpose Gate**: Does the suggestion add an exported symbol without a Purpose / Consumer / Value-at-risk? Push back.
+- **Purpose Gate**: Does the suggestion add an exported symbol without a complete PDD packet (Purpose, Consumer, Contract, Failure boundary, Evidence, Non-goals, Invariants, Assumptions — see [`purpose-driven-development`](../purpose-driven-development/SKILL.md))? Push back. A symbol that doesn't earn all 8 fields is unspecified, and "I'll fill them in later" is a scope-creep vector. Evidence must be machine-executable or `[MANUAL: reviewer + scenario]` — prose-only is rejected.
 - **YAGNI**: Does the suggestion add abstraction with zero current callers? Run `rg` for callers — if zero, push back: *"Nothing calls this in production. Adds dead code. Remove per YAGNI?"*
 - **Recent decisions**: If the suggestion contradicts a `.sf/DECISIONS.md` entry or a recent ADR (`docs/dev/ADR-*.md`), check the history before implementing or pushing back.
 - **Self-modification boundary**: If the suggestion edits a protected file (`SPEC.md`, `BUILD_PLAN.md`, `AGENTS.md`, etc.) without explicit human approval, push back — those need human sign-off.
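
The Evidence rule in the Purpose Gate bullet (machine-executable or `[MANUAL: reviewer + scenario]`, otherwise rejected) could be applied per criterion roughly as follows. This is a sketch under a loud assumption: it treats any criterion containing an inline code span (a named test, metric, or command in backticks) as machine-executable. That heuristic, and the names `classify_evidence` and `rejected_criteria`, are hypothetical and not part of the skill.

```python
import re

# A MANUAL tag must carry some non-blank content after the colon.
MANUAL_TAG = re.compile(r"\[MANUAL:\s*[^\]\s][^\]]*\]")
# Assumption: a backticked span names something runnable or queryable.
CODE_SPAN = re.compile(r"`[^`]+`")


def classify_evidence(criterion: str) -> str:
    """Classify one Evidence criterion as manual, machine-executable, or prose-only."""
    if MANUAL_TAG.search(criterion):
        return "manual"
    if CODE_SPAN.search(criterion):
        return "machine-executable"
    return "prose-only"


def rejected_criteria(criteria: list) -> list:
    """Return the criteria a reviewer would push back on as prose-only."""
    return [c for c in criteria if classify_evidence(c) == "prose-only"]


if __name__ == "__main__":
    evidence = [
        "Typecheck: `npm run typecheck:extensions`",
        "[MANUAL: reviewer + scenario]",
        "Works as expected",
    ]
    for criterion in evidence:
        print(classify_evidence(criterion), "|", criterion)
```

A real check would need a stricter notion of "executable" than the presence of backticks; the point is only that each criterion lands in exactly one of three buckets and the third is rejected.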

@@ -64,10 +64,21 @@ sf_search_memories(query="<area of change>", limit=5)
 - Requirement: <Active R-id from REQUIREMENTS.md, if applicable>
 - Decision: <DECISIONS.md entry, if applicable>

-## Purpose & Consumer
+## PDD packet (all 8 fields)
+
+Restate verbatim from `.sf/active/{unit-id}/pdd.md` — do not paraphrase. Same 8 fields filled in by [`purpose-driven-development`](../purpose-driven-development/SKILL.md). If any field is missing or empty, **stop and return to PDD** — do not request review with an incomplete packet.
+
 - **Purpose**: <why this exists>
 - **Consumer**: <production code path that depends on this — file:symbol>
-- **Value at risk**: <what breaks if it returns the wrong answer>
+- **Contract**: <observable behaviour that proves success — what the consumer receives, not what the implementation does internally>
+- **Failure boundary**: <what *correct failure* looks like if the purpose can't be fulfilled — degrade, surface, do not swallow>
+- **Evidence**: see Evidence section below; every criterion must be machine-executable or `[MANUAL: reviewer + scenario]` — prose-only is rejected
+- **Non-goals**: <what this change is *not* solving>
+- **Invariants**: <what must remain true. If the change touches async / queues / timers / state machines: split into Safety ("X never happens") + Liveness ("Y eventually happens"). Pure synchronous code: safety-only is fine.>
+- **Assumptions**: <conditions about the world that MUST be true for this spec to be valid — locking protocols, API stability, caller invariants, deployment context, data shape>

+## Value at risk
+<what breaks if this returns the wrong answer — used by reviewer to set scrutiny level>
+
 ## Evidence
 - Typecheck: ✅ `npm run typecheck:extensions`

@@ -83,10 +94,10 @@ sf_search_memories(query="<area of change>", limit=5)
 <low | medium | high — based on blast radius and change complexity>

 ## Falsifier
-<the observable condition that would prove this contract wrong, and whether it's been checked>
+<the observable condition that would prove this contract wrong, and whether it's been checked — distinct from Failure boundary: Failure boundary defines *correct* failure, Falsifier names the observation that proves the *contract itself* is wrong>

 ## Scope defence
-<one sentence — what tempting adjacent work this slice refused>
+<one sentence — what tempting adjacent work this slice refused. Distinct from Non-goals: Non-goals are static exclusions stated up front; Scope defence is the discipline statement about what was *almost* done but wasn't.>
 ```

 For architecture-heavy or boundary-heavy changes, append:

@@ -155,5 +166,7 @@ sf_save_memory(
 - If blast radius is high (>10 transitive callers), flag it explicitly.
 - For non-trivial runtime/provider changes, include the debug/repro evidence — not just trace summaries.
 - For architecture-heavy changes, include disagreement evidence — what advocate strengthened, what challenger attacked.
+- All 8 PDD packet fields must be present and restated from `.sf/active/{unit-id}/pdd.md`. If any of the 8 is missing, or Evidence contains prose-only criteria, **stop** and return to `purpose-driven-development` — do not request review with an incomplete packet.
+- For changes touching async, queues, timers, or state machines, the Invariants field MUST split safety ("X never happens") and liveness ("Y eventually happens"). A single combined invariants list is a v1 artefact and will be rejected.
 - Mark major claims as `Observed`, `Inferred`, or `Proposed`.
 - Include the strongest reason the change could still be wrong even if tests pass.