From e470939723c5d8c57dc1044d54d86407b4ac2a0c Mon Sep 17 00:00:00 2001
From: Mikael Hugo
Date: Sun, 17 May 2026 04:12:55 +0200
Subject: [PATCH] spec(R051): same-unit dispatch-loop detection (Ralph Wiggum safety net)

When dispatcher resolves the same unit N>3 times in a session without
state-change between dispatches, detect the loop, pause, file
self-feedback.

Targets the 2026-05-17 dogfood pattern where reassess-roadmap M010/S03
ran 70+ times because of the ASSESSMENT suffix mismatch (now fixed in
a737af318). Even after the immediate fix, this safety net prevents
future unknown-bug versions of the same failure mode from burning hours
of compute.

R051 makes the failure first-class detectable instead of
operator-hand-debug. Owning milestone M038 (future).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .sf/REQUIREMENTS.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/.sf/REQUIREMENTS.md b/.sf/REQUIREMENTS.md
index fddfc34f8..4cb6e9a68 100644
--- a/.sf/REQUIREMENTS.md
+++ b/.sf/REQUIREMENTS.md
@@ -636,3 +636,14 @@ ADR-0000 declares SF a **purpose-to-software compiler**. R036–R040 codify that
 - Supporting slices: none
 - Validation: unmapped
 - Notes: Two paths to scores: (a) bulk-import published scores from MMLU/HumanEval/SWE-bench for known models, (b) live-measure via SF's eval suite for unknown models (existing `.sf/evals/autonomous-solver/` framework). Doctor surfaces uncovered models; scheduler treats uncovered as "use cautiously, not for high-stakes units."
+
+### R051 — Same-Unit Dispatch-Loop Detection (Ralph Wiggum Safety Net)
+- Class: failure-visibility
+- Status: active
+- Description: When the dispatcher resolves the same `{unitType, unitId}` more than N=3 times in a single autonomous session without that unit's outcome changing the dispatcher's decision, SF must detect the loop, pause the autoLoop, file a self-feedback entry of kind `dispatch:degenerate-loop` with the unit details, and surface to operator. Default: stop dispatching that unit; advance to the next-best alternative.
+- Why it matters: 2026-05-17 dogfood: SF dispatched `reassess-roadmap M010/S03` 70+ times because `checkNeedsReassessment` had a suffix-mismatch bug (`ASSESS` vs `ASSESSMENT.md`). Each dispatch took 2-7 minutes; cumulatively hours of compute burned on the same no-op task. The Ralph-Wiggum-obvious failure mode — "I keep doing the same thing and nothing changes" — needs to be a first-class detector, not require operator hand-debugging.
+- Source: spec (responds to dogfood evidence 2026-05-17)
+- Primary owning slice: unmapped (future "M038 Dispatch Loop Safety")
+- Supporting slices: none
+- Validation: unmapped
+- Notes: Implementation: per-session counter of `dispatch-resolve` decisions keyed by `${unitType}:${unitId}`. When count > 3 in <30min wall AND no measurable state change between dispatches (e.g., milestone status, slice status, artifact set unchanged), trigger the safety net. Doctor surfaces the loop as a project-level issue.
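
Review note: the counter described in the R051 notes could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not SF's implementation. `DispatchLoopDetector`, `UnitRef`, `LoopVerdict`, `record`, and the `stateFingerprint` string are hypothetical names; only the `${unitType}:${unitId}` key, the threshold of 3, and the 30-minute window come from the spec text. How the fingerprint is derived from milestone status, slice status, and the artifact set is left open here.

```typescript
// Hypothetical sketch of the R051 per-session counter. All type, class,
// and method names are illustrative assumptions, not SF APIs.
type UnitRef = { unitType: string; unitId: string };
type LoopVerdict = { looping: boolean; count: number };
type HistoryEntry = { fingerprint: string; times: number[] };

class DispatchLoopDetector {
  private readonly history = new Map<string, HistoryEntry>();
  private readonly maxRepeats: number;
  private readonly windowMs: number;

  constructor(maxRepeats: number = 3, windowMs: number = 30 * 60 * 1000) {
    this.maxRepeats = maxRepeats; // spec: trigger when count > 3
    this.windowMs = windowMs;     // spec: within 30 minutes of wall time
  }

  // Record one dispatch-resolve decision. `stateFingerprint` is an assumed
  // caller-supplied summary of the state the spec calls "measurable".
  record(unit: UnitRef, stateFingerprint: string, nowMs: number): LoopVerdict {
    const key = `${unit.unitType}:${unit.unitId}`;
    const entry = this.history.get(key);

    // First sighting, or the state changed since the last dispatch:
    // progress was made, so restart the count for this unit.
    if (entry === undefined || entry.fingerprint !== stateFingerprint) {
      this.history.set(key, { fingerprint: stateFingerprint, times: [nowMs] });
      return { looping: false, count: 1 };
    }

    // Same unit, same state: keep only dispatches inside the time window.
    entry.times = entry.times.filter((t) => nowMs - t < this.windowMs);
    entry.times.push(nowMs);
    const count = entry.times.length;
    return { looping: count > this.maxRepeats, count };
  }
}

// Usage: four dispatches of the same unit, three minutes apart, no state change.
const detector = new DispatchLoopDetector();
const unit: UnitRef = { unitType: "reassess-roadmap", unitId: "M010/S03" };
const minute = 60 * 1000;
const verdicts = [0, 3, 6, 9].map((m) =>
  detector.record(unit, "status-unchanged", m * minute),
);
console.log(verdicts.map((v) => v.looping).join(", "));
// prints "false, false, false, true"
```

In this sketch the caller would react to `looping: true` by pausing the loop and filing the `dispatch:degenerate-loop` self-feedback entry. Resetting the count on any fingerprint change is one possible reading of "no measurable state change between dispatches"; a unit that oscillates between two states would evade it and might need a separate check.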