fix(sf): generalize M008 leak in systematic-debugging skill

The global skill hardcoded `.sf/milestones/M008/bugs/bug-registry.json`
and `M008-specific:` rules — when M008 closes the skill goes stale and
misleads agents on every other milestone.

Reframed as "Milestone Bug Registry Guidance": the rules apply to any
milestone that ships a `bug-registry.json` + `triage-protocol.md` pair,
with M008 cited as the canonical example for the registry test. When no
registry exists, the section is skipped — agents follow the normal
evidence/repro/fix flow.

triage-protocol-registry test (31 tests) still passes — keeps the
literal `bug-registry.json` reference and HIGH/MEDIUM/LOW + cluster +
update-after-fix assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Mikael Hugo 2026-05-02 21:44:08 +02:00
parent 08859624f8
commit b79ebbf10a

View file

@ -154,6 +154,32 @@ Use only when evidence collection runs in parallel:
Do not delegate the fix — the main thread owns it. Subagents return paragraphs; you apply the change.
## Milestone Bug Registry Guidance
Some milestones (notably systematic bug-hunt milestones such as M008) ship with a curated `bug-registry.json` and a `triage-protocol.md` under `.sf/milestones/<MID>/`. When you're fixing a bug and either artifact is present for the active milestone, follow these additional rules:
1. **Read the registry first** — before proposing a fix, check `.sf/milestones/<MID>/bugs/bug-registry.json` to see if the issue is already documented. If it is, use the `suggestedFix` field as a starting hypothesis, not as gospel. (Canonical example: `.sf/milestones/M008/bugs/bug-registry.json`.)
2. **Follow the triage protocol** — classify severity using the definitions in `.sf/milestones/<MID>/triage-protocol.md`. HIGH and MEDIUM severity bugs require regression tests; LOW severity bugs should be batched with other fixes in the same cluster.
3. **Update the registry** — after fixing, update `bug-registry.json`: set `status` to `"FIXED"`, add `fixedByTaskId`, and remove `assignedTo` if present.
4. **Cluster-aware fixes** — if a fix touches files in multiple clusters, escalate to human review per the triage protocol escalation rules.
5. **Confidence gate for batch fixes** — when fixing multiple LOW-severity bugs in one PR, each item must have a passing regression test and the batch confidence must be ≥ 0.80.
When the active milestone has no `bug-registry.json`, this section does not apply — proceed with the normal evidence/repro/fix flow above.
### Registry Fields Reference
| Field | Meaning |
|---|---|
| `id` | Unique finding identifier (`<cluster>-<file>-<description>-<line>`) |
| `file` | Source file where issue was found |
| `lines` | Line number(s) |
| `category` | Bug category (e.g. `logic`, `race condition`, `resource leak`) |
| `severity` | `HIGH` / `MEDIUM` / `LOW` / `FALSE_POSITIVE` |
| `status` | `CONFIRMED` / `FALSE_POSITIVE` / `FIXED` / `WONTFIX` / `IN_PROGRESS` |
| `cluster` | Which cluster the finding belongs to |
| `suggestedFix` | Recommended fix or `"N/A (intentional design)"` |
| `fixedByTaskId` | Task ID that fixed the finding (set after fix) |
## Rules
- Fix root cause, not symptom.
@ -161,6 +187,7 @@ Do not delegate the fix — the main thread owns it. Subagents return paragraphs
- Keep before/after repro for non-trivial bugs.
- Three failed fix attempts → stop. Question the architecture. Discuss before attempt 4.
- Never claim "fixed" without re-running the original repro path.
- **Registry-backed milestones (e.g., M008)**: Never claim "fixed" without updating `bug-registry.json`.
## Red Flags — STOP
@ -170,3 +197,4 @@ Do not delegate the fix — the main thread owns it. Subagents return paragraphs
- Diagnosis stated without `Observed:` evidence.
- Third fix attempt without questioning architecture.
- Test deleted to make a build green — that's encoding the bug, not fixing it.
- Fixing a bug listed in a milestone bug-registry without reading the registry first.