From d80060fec5b0f39913d825812db3681bad67e101 Mon Sep 17 00:00:00 2001 From: Mikael Hugo Date: Sat, 16 May 2026 20:58:07 +0200 Subject: [PATCH] fix(headless-triage): 60s no-output watchdog to cap session.prompt hang MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When session.prompt() hits the deadlock seam sf-mp8e02m1-zpk903 (Promise never resolves pre-LLM-dispatch, 0 syscall activity, blocks until outer abort), the previous triage call had noOutputTimeoutMs=0 — meaning no fast-fail path. The full 8-minute timeoutMs would burn before the parent abort fired, wasting 8 minutes of subscription window per stuck triage attempt. This adds a 60s no-output watchdog: if no meaningful subagent event fires for 60s, abort the prompt. Combined with the diagnostic logs in subagent-runner.ts (commit 67e5ac9db) the operator gets: [subagent:triage-decider] phase=session.prompt-entered ... [subagent:triage-decider] STUCK phase=session.prompt 10001ms ... [forge] triage] apply blocked: triage-decider produced no output for 60000ms ↑ 60s, not 480s Triage failure stays non-fatal (per the existing handleTriage error catch in headless.ts:auto-triage path) — the autonomous loop continues to its main milestone dispatch. Net effect: SF moves forward 8× faster when the triage deadlock fires. Doesn't fix the underlying Promise deadlock (still tracked in sf-mp8e02m1-zpk903 and the new sf-mpmpXXX-... follow-up). This is a "unblock the autonomous loop now, fix the deadlock later" patch. Co-Authored-By: Claude Opus 4.7 (1M context) --- src/headless-triage.ts | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/headless-triage.ts b/src/headless-triage.ts index 85a51bb2c..d97873ea1 100644 --- a/src/headless-triage.ts +++ b/src/headless-triage.ts @@ -443,6 +443,11 @@ async function defaultAgentRunner( tools: options.tools ?? agent.tools, }) ?? agent.systemPrompt; const appendedPrompt = `${composed}\n\n## Task Input\n\n${task}`; + // noOutputTimeoutMs = 60000 (60s): without this, a hung session.prompt + // (sf-mp8e02m1-zpk903 — Promise deadlock pre-LLM-dispatch with 0 syscall + // activity) blocks for the full timeoutMs (8min) before the parent + // abort fires. 60s no-output captures the hang fast enough to keep the + // autonomous loop moving while the root-cause investigation continues. const result = await runSubagent( { systemPrompt: appendedPrompt, @@ -452,7 +457,7 @@ async function defaultAgentRunner( name: agent.name, }, task, - { timeoutMs: DEFAULT_AGENT_TIMEOUT_MS }, + { timeoutMs: DEFAULT_AGENT_TIMEOUT_MS, noOutputTimeoutMs: 60_000 }, ); return { ok: result.ok,