fix(auto/loop): convergence guard breaks the reassess-roadmap redispatch loop
Some checks are pending
CI / detect-changes (push) Waiting to run
CI / docs-check (push) Blocked by required conditions
CI / lint (push) Blocked by required conditions
CI / build (push) Blocked by required conditions
CI / integration-tests (push) Blocked by required conditions
CI / windows-portability (push) Blocked by required conditions
CI / rtk-portability (linux, blacksmith-4vcpu-ubuntu-2404) (push) Blocked by required conditions
CI / rtk-portability (macos, macos-15) (push) Blocked by required conditions
CI / rtk-portability (windows, blacksmith-4vcpu-windows-2025) (push) Blocked by required conditions

Dogfood today: autonomous mode burned $4.95 / 33.5M tokens / 28 min /
500 unproductive iterations on reassess-roadmap M006/S01, redispatching
the SAME unit ≥45 consecutive times before runaway-guard finally
fired. Each cycle: unit dispatches → swarm planner completes → unit
exits "success" → next iteration sees the same doctor slice-ref
health issue → re-queues the same unit. The auto-post-unit
auto-remediate path (insertArtifact for ASSESSMENT files) is wired
correctly, but the reassess-roadmap unit's success doesn't actually
resolve the doctor's slice-reference issues, so the gate keeps
firing.

SF already has detectStuck Rule 2 ("Same unit 3+ consecutive times →
stuck") in auto/detect-stuck.js, but the
doctor-health-reassess-roadmap shortcut in auto/loop.js:1095-1170
bypasses normal pre-dispatch and unshifts directly to sidecarQueue, so
the unit never goes through the phases-dispatch path that pushes to
loopState.recentUnits, and detectStuck never sees the repetition.
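
To make the blind spot concrete, here is a minimal sketch, not the actual SF code. `recordDispatch` and `isStuck` are hypothetical stand-ins for the phases-dispatch bookkeeping and for detectStuck Rule 2; the only things taken from this commit are that detection reads loopState.recentUnits and that the shortcut unshifts straight onto a sidecar queue.

```javascript
// Hypothetical stand-in for the bookkeeping done on the normal dispatch path.
function recordDispatch(loopState, unitType, unitId) {
  loopState.recentUnits.push({ key: `${unitType}:${unitId}` });
}

// Hypothetical stand-in for Rule 2: same key `threshold` times in a row means stuck.
function isStuck(loopState, threshold = 3) {
  const recent = loopState.recentUnits.slice(-threshold);
  return (
    recent.length === threshold &&
    recent.every((u) => u.key === recent[0].key)
  );
}

// Normal path: every dispatch is recorded, so the repetition is visible.
const normal = { recentUnits: [] };
for (let i = 0; i < 3; i++) {
  recordDispatch(normal, "reassess-roadmap", "M006/S01");
}

// Shortcut path: units go straight onto a queue and are never recorded.
const shortcut = { recentUnits: [] };
const sidecarQueue = [];
for (let i = 0; i < 45; i++) {
  sidecarQueue.unshift({ unitType: "reassess-roadmap", unitId: "M006/S01" });
}

console.log(isStuck(normal)); // true
console.log(isStuck(shortcut)); // false
```

Under these assumptions, any detector that only inspects the recorded history stays silent no matter how many times the shortcut fires.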

Convergence guard: before unshifting reassess-roadmap, check whether
the SAME (unitType + unitId) just ran 3+ consecutive times in
loopState.recentUnits. If yes:
- Skip the redispatch (don't unshift, don't finishTurn("retry"))
- File a self-feedback entry kind=engine-loop:non-converging-redispatch
  so triage sees the pattern and can plan a real fix
- Fall through to normal runPreDispatch so the existing detectStuck
  machinery can break the loop the next time the same key is derived.
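
The check itself can be sketched as a small pure function. This is an illustrative extraction, not code from the commit: `shouldSkipRedispatch` and the `plan-slice` unit type are hypothetical names, while the `unitType:unitId` key shape and the "last three entries all match" rule follow the diff.

```javascript
// Returns true when the last `threshold` recorded units all match the unit
// about to be redispatched. Entries are assumed to carry a `key` of the
// form "unitType:unitId".
function shouldSkipRedispatch(recentUnits, unitType, unitId, threshold = 3) {
  const newKey = `${unitType}:${unitId}`;
  const recentKeys = (recentUnits || [])
    .slice(-threshold)
    .map((u) => u?.key);
  return (
    recentKeys.length === threshold &&
    recentKeys.every((k) => k === newKey)
  );
}

const history = [
  { key: "plan-slice:M006/S01" }, // hypothetical earlier unit
  { key: "reassess-roadmap:M006/S01" },
  { key: "reassess-roadmap:M006/S01" },
];

// Only two matching entries in a row so far.
console.log(shouldSkipRedispatch(history, "reassess-roadmap", "M006/S01")); // false

history.push({ key: "reassess-roadmap:M006/S01" });
console.log(shouldSkipRedispatch(history, "reassess-roadmap", "M006/S01")); // true

// A missing history is treated as "not stuck".
console.log(shouldSkipRedispatch(undefined, "reassess-roadmap", "M006/S01")); // false
```

Keeping the check free of side effects would let the notify and self-feedback steps stay in the loop body while the decision itself is unit-tested in isolation.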

This is the user's "Ralph Wiggum loop" pattern: the system observing
its own failure repeatedly without ever escaping. The broader
convergence-detector / solver-handoff / quarantine framework is filed
for slice planning in sf-mp8x32sy-70w298; this commit is the minimum
surgical fix for the specific reassess-roadmap-via-doctor-shortcut path
that actually fired today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 6481e54fec
commit 56e8ec6c53
1 changed file with 68 additions and 25 deletions
@@ -1136,31 +1136,74 @@ export async function autoLoop(ctx, pi, s, deps) {
       const midTitle = sfState.activeMilestone?.title ?? "";
       const sliceId = sfState.activeSlice?.id ?? "reassess";
       if (mid) {
-        ctx.ui.notify(
-          `Health issues detected with slice references — queuing reassess-roadmap instead of pausing.`,
-          "warning",
-          {
-            noticeKind: NOTICE_KIND.SYSTEM_NOTICE,
-            dedupe_key: "doctor-health-reassess-roadmap",
-          },
-        );
-        const { buildReassessRoadmapPrompt } = await import(
-          "../auto-prompts.js"
-        );
-        const reassessPrompt = await buildReassessRoadmapPrompt(
-          mid,
-          midTitle,
-          sliceId,
-          s.basePath,
-        );
-        s.sidecarQueue.unshift({
-          kind: "hook",
-          unitType: "reassess-roadmap",
-          unitId: `${mid}/${sliceId}`,
-          prompt: `## Doctor Health Issues\n\n${healthCheck.issues.map((i) => `- ${i}`).join("\n")}\n\n${reassessPrompt}`,
-        });
-        finishTurn("retry");
-        continue;
+        // Convergence guard (Ralph Wiggum): if the SAME
+        // reassess-roadmap target just ran 3+ consecutive
+        // times the doctor's slice-ref issues evidently
+        // aren't being resolved by reassessment. Skip
+        // the redispatch, file self-feedback, and fall
+        // through to normal pre-dispatch so the existing
+        // detectStuck path (Rule 2) can break the loop
+        // instead of looping forever burning tokens.
+        const newKey = `reassess-roadmap:${mid}/${sliceId}`;
+        const recentKeys = (loopState.recentUnits || [])
+          .slice(-3)
+          .map((u) => u?.key);
+        const stuckOnReassess =
+          recentKeys.length === 3 &&
+          recentKeys.every((k) => k === newKey);
+        if (stuckOnReassess) {
+          ctx.ui.notify(
+            `Convergence guard: ${newKey} succeeded 3 consecutive times but doctor's slice-ref issues persist. Skipping redispatch — running normal pre-dispatch so detectStuck can break the loop.`,
+            "warning",
+            {
+              noticeKind: NOTICE_KIND.SYSTEM_NOTICE,
+              dedupe_key: "convergence-guard-reassess",
+            },
+          );
+          try {
+            recordSelfFeedback(
+              {
+                kind: "engine-loop:non-converging-redispatch",
+                severity: "high",
+                summary: `${newKey} dispatched 3 consecutive times with success exit, but doctor's slice-reference health issues persist. Convergence guard skipped further redispatch.`,
+                evidence: `Doctor health issues persisting after 3 successful reassess-roadmap cycles: ${healthCheck.issues.slice(0, 5).join(" | ")}`,
+              },
+              s.basePath,
+            );
+          } catch {
+            // Filing must never block the loop's recovery path.
+          }
+          // Fall through to normal pre-dispatch (no
+          // unshift, no finishTurn — the next phases
+          // will either advance state via a different
+          // unit or hit detectStuck and bail.
+        } else {
+          ctx.ui.notify(
+            `Health issues detected with slice references — queuing reassess-roadmap instead of pausing.`,
+            "warning",
+            {
+              noticeKind: NOTICE_KIND.SYSTEM_NOTICE,
+              dedupe_key: "doctor-health-reassess-roadmap",
+            },
+          );
+          const { buildReassessRoadmapPrompt } = await import(
+            "../auto-prompts.js"
+          );
+          const reassessPrompt = await buildReassessRoadmapPrompt(
+            mid,
+            midTitle,
+            sliceId,
+            s.basePath,
+          );
+          s.sidecarQueue.unshift({
+            kind: "hook",
+            unitType: "reassess-roadmap",
+            unitId: `${mid}/${sliceId}`,
+            prompt: `## Doctor Health Issues\n\n${healthCheck.issues.map((i) => `- ${i}`).join("\n")}\n\n${reassessPrompt}`,
+          });
+          finishTurn("retry");
+          continue;
+        }
       }
     }
   }