diff --git a/.sf/REQUIREMENTS.md b/.sf/REQUIREMENTS.md
index fddfc34f8..4cb6e9a68 100644
--- a/.sf/REQUIREMENTS.md
+++ b/.sf/REQUIREMENTS.md
@@ -636,3 +636,14 @@ ADR-0000 declares SF a **purpose-to-software compiler**. R036–R040 codify that
 - Supporting slices: none
 - Validation: unmapped
 - Notes: Two paths to scores: (a) bulk-import published scores from MMLU/HumanEval/SWE-bench for known models, (b) live-measure via SF's eval suite for unknown models (existing `.sf/evals/autonomous-solver/` framework). Doctor surfaces uncovered models; scheduler treats uncovered as "use cautiously, not for high-stakes units."
+
+### R051 — Same-Unit Dispatch-Loop Detection (Ralph Wiggum Safety Net)
+- Class: failure-visibility
+- Status: active
+- Description: When the dispatcher resolves the same `{unitType, unitId}` more than N=3 times in a single autonomous session without that unit's outcome changing the dispatcher's decision, SF must detect the loop, pause the autoLoop, file a self-feedback entry of kind `dispatch:degenerate-loop` with the unit details, and surface it to the operator. Default: stop dispatching that unit and advance to the next-best alternative.
+- Why it matters: 2026-05-17 dogfood: SF dispatched `reassess-roadmap M010/S03` 70+ times because `checkNeedsReassessment` had a suffix-mismatch bug (`ASSESS` vs `ASSESSMENT.md`). Each dispatch took 2-7 minutes, so the repeated no-op task burned hours of compute in total. The Ralph-Wiggum-obvious failure mode ("I keep doing the same thing and nothing changes") needs a first-class detector rather than operator hand-debugging.
+- Source: spec (responds to dogfood evidence 2026-05-17)
+- Primary owning slice: unmapped (future "M038 Dispatch Loop Safety")
+- Supporting slices: none
+- Validation: unmapped
+- Notes: Implementation: per-session counter of `dispatch-resolve` decisions keyed by `${unitType}:${unitId}`. When count > 3 within 30 minutes of wall-clock time AND there is no measurable state change between dispatches (e.g., milestone status, slice status, and artifact set all unchanged), trigger the safety net. Doctor surfaces the loop as a project-level issue.
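
The counter described in the R051 notes could be sketched roughly as follows. This is a minimal illustration, not SF's implementation: `DispatchLoopDetector`, `fingerprintState`, `StateSnapshot`, and `LoopVerdict` are hypothetical names, and reducing "measurable state change" to a string fingerprint of milestone status, slice status, and artifact set is an assumption made here for brevity.

```typescript
// Hypothetical sketch of the R051 safety net; none of these names are SF's API.
type UnitRef = { unitType: string; unitId: string };

// Assumed shape of the state that must change for a dispatch to count as progress.
type StateSnapshot = {
  milestoneStatus: string;
  sliceStatus: string;
  artifacts: string[];
};

type LoopVerdict = { key: string; count: number; degenerate: boolean };

// Order-insensitive fingerprint: the same artifact set yields the same string.
function fingerprintState(s: StateSnapshot): string {
  return JSON.stringify([s.milestoneStatus, s.sliceStatus, [...s.artifacts].sort()]);
}

class DispatchLoopDetector {
  private history = new Map<string, { at: number; fingerprint: string }[]>();
  private maxRepeats: number;
  private windowMs: number;

  constructor(maxRepeats: number = 3, windowMs: number = 30 * 60 * 1000) {
    this.maxRepeats = maxRepeats;
    this.windowMs = windowMs;
  }

  // Record one dispatch-resolve decision and report whether it looks degenerate.
  record(unit: UnitRef, fingerprint: string, now: number = Date.now()): LoopVerdict {
    const key = `${unit.unitType}:${unit.unitId}`;
    const prior = this.history.get(key) ?? [];
    // Keep only entries inside the window that saw the same state;
    // any state change or an expired window restarts the streak.
    const streak = prior.filter(
      (e) => now - e.at < this.windowMs && e.fingerprint === fingerprint,
    );
    streak.push({ at: now, fingerprint });
    this.history.set(key, streak);
    return { key, count: streak.length, degenerate: streak.length > this.maxRepeats };
  }
}

// Usage: four identical dispatches two minutes apart trip the detector on the fourth.
const detector = new DispatchLoopDetector();
const unit: UnitRef = { unitType: "reassess-roadmap", unitId: "M010/S03" };
const state = fingerprintState({
  milestoneStatus: "active",
  sliceStatus: "pending",
  artifacts: ["ROADMAP.md"],
});
let verdict: LoopVerdict = detector.record(unit, state, 0);
for (let i = 1; i < 4; i++) {
  verdict = detector.record(unit, state, i * 120_000);
}
if (verdict.degenerate) {
  // A real caller would pause the autoLoop and file a `dispatch:degenerate-loop` entry here.
  console.log(`degenerate loop on ${verdict.key} after ${verdict.count} dispatches`);
}
```

Taking `now` as a parameter keeps the 30-minute window testable without waiting on a real clock. Resetting the streak on any fingerprint change is one reading of "no measurable state change between dispatches"; a stricter detector might also catch a unit that oscillates between two states.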