fix(auto): split autonomous solver from executor per ADR-0079

- Lock solver model to kimi-k2.6 independent of unit-type router
- Executor prompt no longer requires checkpoint tool call
- Add dedicated solver pass that reads executor transcript and emits canonical checkpoint
- Classify executor refusals as blocker outcomes (already partially implemented)
- Classify no-op iterations (continue with zero work) as missing-checkpoint-retry
- Add tests for executor prompt block, solver pass prompt, no-op detection, and no-op assessment

Fixes sf-mp34nxb6-27zdx7
2026-05-12 23:55:02 +02:00 · 55229f6604
commit 55229f6604
parent e2f2cb7e2e
5 changed files with 1250 additions and 55 deletions
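
The solver-model fallback described in ADR-0079 can be sketched roughly as follows. This is a minimal illustration only: `solver-model.js` is not shown in this diff, so the function names `sketchSolverCandidates` and `pickFirstAvailable`, the `{ provider, id }` shape of the preference override, and the `anthropic` provider label for the fallback models are all assumptions. Only the chain order (`kimi-k2.6`, then `claude-sonnet-4-6`, then `claude-opus-4-7`) and the `kimi-coding` provider come from the ADR text.

```javascript
// Hypothetical sketch of the pinned solver chain; not the real solver-model.js.
const DEFAULT_SOLVER_CHAIN = [
	{ provider: "kimi-coding", id: "kimi-k2.6" },
	// Provider names for the two fallbacks are assumptions.
	{ provider: "anthropic", id: "claude-sonnet-4-6" },
	{ provider: "anthropic", id: "claude-opus-4-7" },
];

// Build the ordered candidate list. An operator override (assumed shape:
// { provider, id }) goes first; the unit-type router is never consulted.
function sketchSolverCandidates(preferences) {
	const override = preferences?.autonomousSolver?.model;
	if (override?.provider && override?.id) {
		const rest = DEFAULT_SOLVER_CHAIN.filter(
			(m) => !(m.provider === override.provider && m.id === override.id),
		);
		return [override, ...rest];
	}
	return [...DEFAULT_SOLVER_CHAIN];
}

// Return the first candidate present in the registry listing, or null.
// On null the caller is expected to write a blocked checkpoint and pause.
function pickFirstAvailable(candidates, availableModels) {
	for (const candidate of candidates) {
		const match = availableModels.find(
			(m) => m.provider === candidate.provider && m.id === candidate.id,
		);
		if (match) return match;
	}
	return null;
}
```

Returning `null` instead of throwing mirrors the diff's behavior of synthesizing a `blocked` checkpoint when no solver candidate is reachable, rather than silently dropping the protocol layer.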
--- a/docs/adr/0079-autonomous-solver-executor-separation.md
+++ b/docs/adr/0079-autonomous-solver-executor-separation.md
@@ -0,0 +1,159 @@
+# ADR-0079: Autonomous Solver / Executor Separation
+
+**Status:** Proposed
+**Date:** 2026-05-12
+**Stakeholders:** Autonomous mode, model router, checkpoint protocol, runtime safety
+**Related:** `.sf/self-feedback.jsonl` entry `sf-mp34nxb6-27zdx7` (architecture-defect:solver-executor-conflation)
+
+---
+
+## Problem Statement
+
+Today the autonomous loop conflates two distinct roles into a single LLM call:
+
+1. **Executor** — does the unit work (read files, run tests, edit code).
+2. **Autonomous solver** — observes what the executor produced and emits a canonical checkpoint to disk (`outcome`, `completedItems`, `remainingItems`, PDD, verification evidence).
+
+Both roles are filled by the same model, picked by `model-router.js:computeTaskRequirements` from the unit type (`execute-task`, `plan-slice`, …). The router optimizes for the *executor's* job — cost, coding capability, speed — and may select a small coding-tuned model (Codestral, Devstral, Gemini Flash). Those models are *not* required to be agentic, refusal-resistant, or stable at protocol reasoning.
+
+When the chosen model is incapable of the agentic role, the protocol breaks in a way the repair loop cannot fix:
+
+- **2026-05-12 M001-6377a4/S04/T02:** `mistral/codestral-latest` was routed to execute T02 (Align TUI Dashboard with Headless Status Output). It emitted:
+  > "I'm sorry, but I currently don't have the necessary tools to assist with that specific request."
+
+  No tool was called. The runtime logged `Autonomous solver checkpoint missing … repair attempt 1/4 (mentioned-checkpoint-without-tool)`, then prompted the *same* Codestral with stronger "you MUST call the checkpoint tool" wording. Codestral dutifully called `Autonomous Checkpoint` with `outcome=continue` — and produced zero file edits, zero work. The protocol layer reported success; the slice made no progress.
+
+The repair logic at `auto/phases-unit.js:720-890` only enforces **protocol shape** ("did the LLM emit a checkpoint tool call?"). It does not check **outcome** ("did the unit progress?") or **refusal** ("did the executor refuse the task?"). And because executor and solver are the same call, retrying the repair just re-asks the broken model.
+
+## Goals
+
+1. The protocol layer must remain functional even when the executor refuses or is incapable.
+2. Refusals must surface as blockers that can escalate model tier — not silently synthesize forward progress.
+3. No-op iterations (continue with zero work) must not satisfy the repair gate.
+4. Solver model choice must be stable and independent of unit-type routing.
+
+## Non-Goals
+
+- Replacing the model router for executors. Routing per `unitType` remains; cheap/specialized models are still desirable for unit work.
+- Mandating a specific solver vendor. The locked solver model is a pinned default; ops may override via preferences.
+- Reworking the checkpoint schema. The same JSON shape persists; only *who emits it* changes.
+
+## Proposed Architecture
+
+### Two-Layer Loop
+
+```
+                ┌─────────────────────────────────────────┐
+                │ runUnit(ctx, unitType, unitId, prompt)  │
+                └─────────────────────┬───────────────────┘
+                                      │
+              ┌───────────────────────┴───────────────────────┐
+              │                                               │
+              ▼                                               ▼
+  ┌───────────────────────────┐                   ┌───────────────────────────┐
+  │ EXECUTOR PASS             │                   │ SOLVER PASS               │
+  │ model: routed per unit    │   transcript →    │ model: LOCKED kimi-k2.6   │
+  │ (Codestral, Gemini, ...)  │ ────────────────▶ │ reads agent_end messages, │
+  │ does the unit work        │                   │ emits canonical checkpoint│
+  │ NO checkpoint tool needed │                   │ classifies refusal/no-op  │
+  └───────────────────────────┘                   └─────────────┬─────────────┘
+                                                                │
+                                                                ▼
+                                                ┌───────────────────────────┐
+                                                │ appendAutonomousSolver-   │
+                                                │ Checkpoint(basePath, …)   │
+                                                └───────────────────────────┘
+```
+
+### Solver Model Selection
+
+A new helper `resolveSolverModel(preferences)` returns the pinned solver model. It:
+
+- Defaults to `kimi-k2.6` (provider: `kimi-coding`).
+- Allows preference override via `preferences.autonomousSolver.model` (operator escape hatch).
+- **Never** consults the unit-type router, benchmark selector, Bayesian blender, or learning aggregator. The solver's model is a runtime invariant, not an optimization target.
+- Falls back along a small explicit chain (`kimi-k2.6` → `claude-sonnet-4-6` → `claude-opus-4-7`) if the primary is unreachable. Falls back to "synthesize blocker" if none reachable, rather than silently dropping the protocol layer.
+
+### Solver Pass Contract
+
+Input: `{ unitType, unitId, executorTranscript, lastIteration, projection }`.
+
+Output (a checkpoint, written via `appendAutonomousSolverCheckpoint`):
+
+```json
+{
+  "outcome": "continue|complete|blocked",
+  "summary": "...",
+  "completedItems": [...],
+  "remainingItems": [...],
+  "verificationEvidence": [...],
+  "pdd": { "purpose": "...", "consumer": "...", ... },
+  "classification": "executor-refused|executor-noop|progress|complete|blocker-...",
+  "evidence": "string excerpts proving the classification"
+}
+```
+
+The solver's prompt is a deterministic template at `prompts/autonomous-solver.md` that:
+
+1. Embeds the executor transcript.
+2. States the schema and outcome rules.
+3. Includes the refusal/no-op classification rubric.
+4. Instructs the solver to **never** propose code edits — its job is to observe, classify, and write the checkpoint.
+
+### Refusal Classification
+
+`assessAutonomousSolverTurn` (and the new solver-pass) checks executor transcript for:
+
+| Pattern | Classification | Action |
+|---|---|---|
+| "I'm sorry", "I cannot help", "I don't have the necessary tools", "I can't assist with that" | `executor-refused` | Emit `outcome=blocked`; on retry, escalate executor model tier |
+| Zero tool calls, zero file edits, transcript < threshold | `executor-noop` | Emit `outcome=blocked` (or `continue` only if executor explicitly states a wait state); on retry, do not treat synthesized continue as progress |
+| Tool calls + edits + explicit "I'm done" / completion signal | `progress` or `complete` | Emit `outcome=continue` or `complete` as appropriate |
+
+### Model Escalation on Refusal
+
+When solver classifies `executor-refused`, the loop records the executor's model and unit-type into a "no-fly" entry. On the next iteration of the same unit, the router consults this list and selects the next tier up (Sonnet → Opus, or via a model-tier graph). After 2 escalations on the same unit, pause the loop with a hard blocker.
+
+### Backward Compatibility
+
+- The existing checkpoint shape is preserved; downstream consumers (`auto-post-unit.js`, journal events, learning aggregator) are unchanged.
+- The "executor calls the checkpoint tool" path is retained as a **fast path**: if the executor *did* emit a valid checkpoint AND the solver agrees with its classification, the solver pass is a no-op rubber stamp. The solver only synthesizes when the executor failed to checkpoint or classified incorrectly.
+- The `mentioned-checkpoint-without-tool` repair attempts collapse to zero — the solver is now the source of truth, so a missing executor checkpoint is normal, not a defect.
+
+## Migration
+
+### Step 1 — Pin solver model
+
+Add `resolveSolverModel` to `model-router.js` (or a new `solver-model.js`). It does not participate in the router's capability scoring. Wire it into `runUnit`'s solver-pass invocation only.
+
+### Step 2 — Add solver pass
+
+After `runUnit` returns, before `assessAutonomousSolverTurn`, run the solver pass with the executor transcript. The solver pass writes the checkpoint directly. Executor checkpoint tool calls remain accepted but become advisory.
+
+### Step 3 — Refusal classifier
+
+Extend `classifyAutonomousSolverMissingCheckpointFailure` (rename to `classifyExecutorTurn`) to detect refusal patterns. Drive `outcome=blocked` from classification, not from "missing checkpoint."
+
+### Step 4 — Model escalation
+
+Add a per-(unitId, model) no-fly entry on `executor-refused`. Router consults the list during selection.
+
+### Step 5 — Tests
+
+Cover: pinned solver model invariant, refusal pattern detection, no-op detection, solver-pass checkpoint emission when executor is silent, fast-path bypass when executor emits a valid checkpoint, escalation chain.
+
+## Risks
+
+- **Solver-pass cost.** Adds one LLM call per unit. Mitigation: solver pass uses a smaller prompt (transcript summary only) and is skippable when executor emitted a valid checkpoint.
+- **Locked model availability.** If `kimi-k2.6` is unreachable, solver pass fails. Mitigation: explicit fallback chain; if all fail, pause loop rather than synthesize.
+- **Solver hallucination.** Solver could mis-classify and over-emit blockers. Mitigation: deterministic prompt template, classification rubric with example transcripts, and self-feedback when classification flips between iterations.
+
+## Open Questions
+
+1. Should the solver pass run *during* the executor turn (streaming observer) or *after* (post-turn observer)? Post-turn is simpler and proposed here; streaming would catch refusals earlier but adds complexity.
+2. Should the solver pass also re-evaluate the executor's verification evidence (cite tests that actually exist, etc.) — i.e. become a partial verifier — or stay narrowly focused on checkpoint emission?
+3. How does this interact with `keepSession: true` in `runUnit`? The solver pass is a separate session by definition; the executor session remains as-is.
+
+## Decision Outcome (when accepted)
+
+To be filled when the ADR is accepted. Initial cut targets steps 1–3 (pinned solver model + solver pass + refusal classifier). Steps 4–5 (escalation + tests) follow in a subsequent slice.
--- a/src/resources/extensions/sf/auto/phases-unit.js
+++ b/src/resources/extensions/sf/auto/phases-unit.js
@@ -26,17 +26,23 @@ import {
 	appendAutonomousSolverCheckpoint,
 	assessAutonomousSolverTurn,
 	beginAutonomousSolverIteration,
+	buildAutonomousExecutorPromptBlock,
 	buildAutonomousSolverMissingCheckpointRepairPrompt,
 	buildAutonomousSolverPromptBlock,
 	buildAutonomousSolverSteeringPromptBlock,
+	buildSolverPassPrompt,
 	classifyAutonomousSolverMissingCheckpointFailure,
+	classifyExecutorRefusal,
 	consumePendingAutonomousSolverSteering,
 	getConfiguredAutonomousSolverMaxIterations,
+	isNoOpExecutorTranscript,
+	readAutonomousSolverState,
 	recordAutonomousSolverMissingCheckpointRetry,
 } from "../autonomous-solver.js";
 import { resumeAutoAfterProviderDelay } from "../bootstrap/provider-error-resume.js";
 import { debugLog } from "../debug-logger.js";
 import { PROJECT_FILES } from "../detection.js";
+import { getErrorMessage } from "../error-utils.js";
 import { MergeConflictError } from "../git-service.js";
 import { recordLearnedOutcome } from "../learning/runtime.js";
 import { sfRoot } from "../paths.js";
@@ -73,6 +79,14 @@ import {
 } from "../sf-db.js";
 import { getEligibleSlices } from "../slice-parallel-eligibility.js";
 import { startSliceParallel } from "../slice-parallel-orchestrator.js";
+import {
+	clearSliceRoutingForUnit,
+	recordSliceRouting,
+} from "../slice-routing-cache.js";
+import {
+	resolveSolverModel,
+	resolveSolverModelCandidates,
+} from "../solver-model.js";
 import { handleProductAudit } from "../tools/product-audit-tool.js";
 import { parseUnitId } from "../unit-id.js";
 import {
@@ -114,14 +128,17 @@ import {
 	FINALIZE_PRE_TIMEOUT_MS,
 	withTimeout,
 } from "./finalize-timeout.js";
+import {
+	emitCancelledUnitEnd,
+	recordLearningOutcomeForUnit,
+	shouldSkipArtifactVerification,
+} from "./phases-helpers.js";
 import { runUnit } from "./run-unit.js";
-import { getErrorMessage } from "../error-utils.js";
 import {
 	BUDGET_THRESHOLDS,
 	MAX_FINALIZE_TIMEOUTS,
 	MAX_RECOVERY_CHARS,
 } from "./types.js";
-import { emitCancelledUnitEnd, recordLearningOutcomeForUnit, shouldSkipArtifactVerification } from "./phases-helpers.js";

 // ─── Session timeout scheduled resume state ────────────────────────────────────────
 let consecutiveSessionTimeouts = 0;
@@ -458,7 +475,7 @@ export async function runUnitPhase(ic, iterData, loopState, sidecarItem) {
 		if (steeringBlock) {
 			finalPrompt = `${finalPrompt}\n\n---\n\n${steeringBlock}`;
 		}
-		finalPrompt = `${finalPrompt}\n\n---\n\n${buildAutonomousSolverPromptBlock(solverState)}`;
+		finalPrompt = `${finalPrompt}\n\n---\n\n${buildAutonomousExecutorPromptBlock(solverState)}`;
 		deps.emitJournalEvent({
 			ts: new Date().toISOString(),
 			flowId: ic.flowId,
@@ -505,8 +522,7 @@ export async function runUnitPhase(ic, iterData, loopState, sidecarItem) {
 	try {
 		finalPrompt = deps.reorderForCaching(finalPrompt);
 	} catch (reorderErr) {
-		const msg =
-			getErrorMessage(reorderErr);
+		const msg = getErrorMessage(reorderErr);
 		logWarning("engine", "Prompt reorder failed", { error: msg });
 	}
 	// Select and apply model (with tier escalation on retry — normal units only)
@@ -706,13 +722,227 @@ export async function runUnitPhase(ic, iterData, loopState, sidecarItem) {
 	const unitResult = await runUnit(ctx, pi, s, unitType, unitId, finalPrompt);
 	s.lastUnitAgentEndMessages = unitResult.event?.messages ?? null;
 	let currentUnitResult = unitResult;
+	const executorMessages = unitResult.event?.messages ?? [];
+	const refusal =
+		unitResult.status !== "cancelled"
+			? classifyExecutorRefusal(executorMessages)
+			: null;
+
 	// Short-circuit: if runUnit was cancelled (provider not ready, session
-	// failed, timeout) there is no checkpoint to repair — skip the repair loop
+	// failed, timeout) there is no checkpoint to repair — skip the solver pass
 	// entirely and let the cancelled handler below surface the real cause.
 	let solverAssessment =
 		unitResult.status === "cancelled"
 			? { action: "none" }
-			: assessAutonomousSolverTurn(s.basePath, unitType, unitId);
+			: { action: "pending" };
+
+	// Refusal short-circuit: when the executor model returned a generic refusal,
+	// synthesize a blocked checkpoint immediately and skip the solver pass.
+	if (unitResult.status !== "cancelled" && refusal) {
+		const executorModel =
+			s.currentUnitModel?.provider && s.currentUnitModel?.id
+				? `${s.currentUnitModel.provider}/${s.currentUnitModel.id}`
+				: (s.currentUnitModel?.id ?? "unknown");
+		// Evict the sticky-routing entry for this slice — the model attached
+		// to it refused, so future units in the same slice should NOT re-pin
+		// the broken model.
+		try {
+			clearSliceRoutingForUnit(s.basePath, unitId);
+		} catch {
+			// best-effort
+		}
+		try {
+			appendAutonomousSolverCheckpoint(s.basePath, {
+				unitType,
+				unitId,
+				outcome: "blocked",
+				summary: `Executor (${executorModel}) refused the task. Pattern: ${refusal.pattern}. Repair-prompting the same model cannot produce progress; escalate the executor model or unblock this unit manually.`,
+				completedItems: [],
+				remainingItems: [
+					`Re-run ${unitType} ${unitId} with a more capable executor model — current routing selected an incapable model.`,
+				],
+				verificationEvidence: [
+					`executor-refusal-pattern=${refusal.pattern}`,
+					`executor-model=${executorModel}`,
+				],
+				blockerReason: `executor-refused (${refusal.pattern})`,
+				pdd: {
+					purpose:
+						"Surface executor refusals as protocol-level blockers instead of synthesizing fake progress.",
+					consumer: "autonomous loop pause-handler",
+					contract:
+						"On `executor-refused`, the loop pauses and self-feedback is filed; the operator must escalate the executor model.",
+					failureBoundary:
+						"If the operator does not escalate, the same refusal will recur on next dispatch.",
+					evidence: "classifyExecutorRefusal matched a refusal pattern",
+					nonGoals:
+						"This does not retry the unit automatically — capability mismatches require operator judgement (or a future automatic escalation policy).",
+					invariants: "Refusal never silently synthesizes a continue.",
+					assumptions:
+						"The refusal pattern set in classifyExecutorRefusal is conservative — false positives are rare and require operator review.",
+				},
+			});
+		} catch {
+			// best-effort: the solver pass is skipped on refusal, so a failed write is left to assessAutonomousSolverTurn below
+		}
+		try {
+			const feedback = recordSelfFeedback(
+				{
+					kind: "executor-refused",
+					severity: "high",
+					summary: `Executor ${executorModel} refused ${unitType} ${unitId} with pattern ${refusal.pattern}; loop paused to prevent fake-progress synthesis.`,
+					evidence: [
+						`unit=${unitType} ${unitId}`,
+						`executor=${executorModel}`,
+						`refusal-pattern=${refusal.pattern}`,
+						"",
+						refusal.evidence ?? "",
+					].join("\n"),
+					suggestedFix:
+						"Escalate the executor model for this unit (or unit type) — the currently routed model lacks the agentic capabilities required. Long-term: separate the executor and autonomous-solver roles per ADR-0079 and pin the solver to a stable agentic model.",
+					acceptanceCriteria: [
+						"Executor model for this unit type is escalated to a model that passes the refusal-resistant tier.",
+						"Refusal pattern is added to classifyExecutorRefusal if a novel phrasing slipped through.",
+					],
+					occurredIn: { unitType, unitId },
+					source: "runtime",
+				},
+				s.basePath,
+			);
+			deps.emitJournalEvent({
+				ts: new Date().toISOString(),
+				flowId: ic.flowId,
+				seq: ic.nextSeq(),
+				eventType: "executor-refused",
+				data: {
+					unitType,
+					unitId,
+					executorModel,
+					pattern: refusal.pattern,
+					selfFeedbackId: feedback?.entry?.id,
+					blocking: feedback?.blocking,
+				},
+			});
+		} catch {
+			// self-feedback is observability; never block loop progression on it
+		}
+		ctx.ui.notify(
+			`Executor ${executorModel} refused ${unitType} ${unitId} (${refusal.pattern}); autonomous loop pausing instead of synthesizing fake progress. See SELF-FEEDBACK.md for escalation guidance.`,
+			"error",
+		);
+		solverAssessment = assessAutonomousSolverTurn(s.basePath, unitType, unitId);
+	}
+
+	// Solver pass: the stable solver model reads the executor transcript and
+	// emits the canonical checkpoint. This separates the executor role (unit
+	// work) from the solver role (protocol checkpoint) per ADR-0079.
+	if (unitResult.status !== "cancelled" && !refusal) {
+		const executorModel = s.currentUnitModel;
+		const solverCandidates = resolveSolverModelCandidates(prefs);
+		let solverPassResult = null;
+
+		for (const candidate of solverCandidates) {
+			const availableModels = ctx.modelRegistry.getAvailable?.() ?? [];
+			const match = availableModels.find(
+				(m) => m.provider === candidate.provider && m.id === candidate.id,
+			);
+			if (!match) continue;
+
+			const ok = await pi.setModel(match, { persist: false });
+			if (!ok) continue;
+
+			s.currentUnitModel = match;
+			ctx.ui.notify(
+				`Running solver pass for ${unitType} ${unitId} with ${match.provider}/${match.id}`,
+				"info",
+			);
+
+			const solverState = readAutonomousSolverState(s.basePath);
+			const solverPrompt = buildSolverPassPrompt(
+				executorMessages,
+				solverState,
+				unitType,
+				unitId,
+			);
+
+			try {
+				const result = await runUnit(
+					ctx,
+					pi,
+					s,
+					unitType,
+					unitId,
+					solverPrompt,
+					{ keepSession: false },
+				);
+				solverPassResult = result;
+				if (result.status !== "cancelled") {
+					currentUnitResult = result;
+					s.lastUnitAgentEndMessages = result.event?.messages ?? null;
+					break; // Solver pass succeeded
+				}
+			} catch {
+				// Try next fallback
+			}
+		}
+
+		if (!solverPassResult || solverPassResult.status === "cancelled") {
+			ctx.ui.notify(
+				`Solver pass failed for ${unitType} ${unitId} — no solver model was reachable. Synthesizing blocked checkpoint.`,
+				"error",
+			);
+			try {
+				appendAutonomousSolverCheckpoint(s.basePath, {
+					unitType,
+					unitId,
+					outcome: "blocked",
+					summary: `Solver pass failed — no solver model was reachable. The executor transcript could not be classified into a canonical checkpoint.`,
+					completedItems: [],
+					remainingItems: [
+						`Retry ${unitType} ${unitId} after verifying solver model availability.`,
+					],
+					verificationEvidence: ["solver-pass-failed"],
+					blockerReason: "solver-pass-failed",
+					pdd: {
+						purpose:
+							"Surface solver-pass failures as blockers rather than silently dropping the protocol layer.",
+						consumer: "autonomous loop pause-handler",
+						contract:
+							"On solver-pass failure, the loop pauses so the operator can fix model availability.",
+						failureBoundary:
+							"If all solver candidates are unreachable, the protocol layer cannot function.",
+						evidence:
+							"All solver candidates were unreachable or setModel failed.",
+						nonGoals:
+							"This does not retry with a different solver candidate automatically beyond the explicit fallback chain.",
+						invariants:
+							"Solver-pass failure never silently synthesizes a continue.",
+						assumptions:
+							"At least one solver candidate (kimi-k2.6 or fallback) is available in the model registry.",
+					},
+				});
+			} catch {
+				// best-effort
+			}
+		}
+
+		solverAssessment = assessAutonomousSolverTurn(
+			s.basePath,
+			unitType,
+			unitId,
+			executorMessages,
+		);
+
+		// Restore executor model after solver pass and assessment
+		if (executorModel) {
+			try {
+				await pi.setModel(executorModel, { persist: false });
+			} catch {
+				// best-effort restore
+			}
+			s.currentUnitModel = executorModel;
+		}
+	}
 	while (solverAssessment.action === "missing-checkpoint-retry") {
 		const diagnosis = classifyAutonomousSolverMissingCheckpointFailure(
 			currentUnitResult.event?.messages ?? [],
@@ -779,6 +1009,26 @@ export async function runUnitPhase(ic, iterData, loopState, sidecarItem) {
 				remainingCount: solverCheckpoint.remainingItems?.length ?? 0,
 			},
 		});
+		// Record sticky-routing on successful outcomes only. `continue` is the
+		// usual within-iteration progress signal; `complete` is final success.
+		// We deliberately skip `blocked` and `decide` because attaching a model
+		// to a slice when it's known-stuck or known-undecided would defeat the
+		// fallback path.
+		if (
+			solverCheckpoint.outcome === "continue" ||
+			solverCheckpoint.outcome === "complete"
+		) {
+			try {
+				recordSliceRouting(
+					s.basePath,
+					unitType,
+					unitId,
+					s.currentUnitModel ?? ctx.model ?? null,
+				);
+			} catch {
+				// best-effort; routing cache must never break the loop
+			}
+		}
 	}
 	if (solverAssessment.action === "pause") {
 		const isMissingCheckpoint =
@@ -808,7 +1058,7 @@ export async function runUnitPhase(ic, iterData, loopState, sidecarItem) {
 						acceptanceCriteria: [
 							"Missing-checkpoint repair attempts include failure classification in the prompt.",
 							"Repeated repair failures file self-feedback automatically.",
-							"Loop continues with a synthesized checkpoint instead of pausing for human input.",
+							"Loop continues with a synthesized checkpoint instead of pausing for human input — EXCEPT when classifyExecutorRefusal short-circuits with `executor-refused`, in which case the loop emits a `blocked` checkpoint and pauses (synthesizing forward progress over a refusing executor is the bug we are fixing).",
 						],
 						occurredIn: { unitType, unitId },
 						source: "runtime",
@@ -1087,8 +1337,7 @@ export async function runUnitPhase(ic, iterData, loopState, sidecarItem) {
 						resume: allowAutoResume
 							? () => {
 									void resumeAutoAfterProviderDelay(pi, ctx).catch((err) => {
-										const message =
-											getErrorMessage(err);
+										const message = getErrorMessage(err);
 										ctx.ui.notify(
 											`Session timeout recovery failed: ${message}`,
 											"error",
@@ -1280,10 +1529,7 @@ export async function runUnitPhase(ic, iterData, loopState, sidecarItem) {
 			});
 		} catch (err) {
 			/* non-fatal — anchor is advisory */
-			logWarning(
-				"engine",
-				`phase anchor failed: ${getErrorMessage(err)}`,
-			);
+			logWarning("engine", `phase anchor failed: ${getErrorMessage(err)}`);
 		}
 	}
 	if (currentUnitResult.status !== "completed" || !artifactVerified) {
--- a/src/resources/extensions/sf/autonomous-solver.js
+++ b/src/resources/extensions/sf/autonomous-solver.js
@@ -281,7 +281,7 @@ export function beginAutonomousSolverIteration(
 *
 * Consumer: runUnitPhase prompt injection.
 */
-export function buildAutonomousSolverPromptBlock(state) {
+function _buildAutonomousLoopPromptPrefix(state, header) {
 	const phase = getSolverPhase(state.iteration, state.maxIterations);
 	const stalled =
 		Number(state.iterationsSinceProgress) >= STALL_THRESHOLD_ITERATIONS;
@@ -306,7 +306,7 @@ export function buildAutonomousSolverPromptBlock(state) {
 	};

 	const lines = [
-		"## Autonomous Solver Loop Contract",
+		`## ${header}`,
 		"",
 		`You are inside /autonomous iteration ${state.iteration} of ${state.maxIterations} for ${state.unitType} ${state.unitId}.`,
 		"",
@@ -357,6 +357,25 @@ export function buildAutonomousSolverPromptBlock(state) {
 		);
 	}

+	return lines;
+}
+
+/**
+ * Build the PDD autonomous solver prompt block appended to unit prompts.
+ *
+ * Purpose: bind every autonomous unit to bounded iterations, evidence, stop
+ * signals, and the eight PDD fields instead of open-ended hidden retries.
+ * Phase-aware: ORIENT (iters 1-2) focuses on reading and planning; EXECUTE
+ * (middle) on implementation; CLOSE (final 3) on verifying and wrapping up.
+ * Stall/loop signals are injected when the system detects no progress.
+ *
+ * Consumer: runUnitPhase prompt injection (solver pass).
+ */
+export function buildAutonomousSolverPromptBlock(state) {
+	const lines = _buildAutonomousLoopPromptPrefix(
+		state,
+		"Autonomous Solver Loop Contract",
+	);
 	lines.push(
 		"",
 		"## CHECKPOINT REQUIREMENT",
@@ -390,6 +409,142 @@ export function buildAutonomousSolverPromptBlock(state) {
 	return lines.join("\n");
 }

+/**
+ * Build the executor prompt block (no checkpoint requirement).
+ *
+ * Purpose: the executor focuses on doing the unit work. A separate solver pass
+ * reads the executor transcript and emits the canonical checkpoint.
+ *
+ * Consumer: runUnitPhase prompt injection (executor pass).
+ */
+export function buildAutonomousExecutorPromptBlock(state) {
+	const lines = _buildAutonomousLoopPromptPrefix(
+		state,
+		"Autonomous Executor Contract",
+	);
+	lines.push(
+		"",
+		"## EXECUTOR ROLE",
+		"",
+		"Your job is to do the unit work: read files, run tests, edit code, and produce concrete artifacts.",
+		"You do NOT need to call the `checkpoint` tool. A separate solver pass will observe your work and emit the canonical checkpoint.",
+		"Focus entirely on making verifiable progress toward the task goal.",
+		"",
+		"If you are executing an `execute-task` unit and the task is finished, `complete_task` remains mandatory.",
+		"End your turn when the bounded work is done or when you have made meaningful progress and need to wait for the next iteration.",
+	);
+	return lines.join("\n");
+}
+
+/**
+ * Build the solver pass prompt that reads an executor transcript.
+ *
+ * Purpose: give the stable solver model the executor transcript and instruct it
+ * to classify what happened and emit the canonical checkpoint.
+ *
+ * Consumer: runUnitPhase after the executor pass returns.
+ */
+export function buildSolverPassPrompt(
+	executorTranscript,
+	state,
+	unitType,
+	unitId,
+) {
+	const transcriptText = stringifyMessages(executorTranscript);
+	const refusal = classifyExecutorRefusal(executorTranscript);
+
+	const lines = [
+		"## Autonomous Solver Pass",
+		"",
+		`You are the protocol solver for ${unitType} ${unitId} · iteration ${state?.iteration ?? "unknown"} of ${state?.maxIterations ?? "unknown"}.`,
+		"",
+		"Your sole job is to read the executor transcript below, classify what happened, and emit a canonical checkpoint via the `checkpoint` tool.",
+		"Do NOT edit files, run commands, or propose code changes. Observe and classify only.",
+		"",
+		"## Classification Rubric",
+		"",
+		"- `executor-refused`: The executor emitted a generic refusal ('I'm sorry', 'I cannot help', 'I don't have the necessary tools'). → checkpoint outcome=`blocked`, blockerReason=`executor-refused`.",
+		"- `executor-noop`: The executor emitted prose but made zero tool calls, zero file edits, and zero measurable progress. → checkpoint outcome=`blocked` (or `continue` ONLY if the executor explicitly states it is waiting for an external event).",
+		"- `progress`: The executor made concrete progress (file edits, tests run, tools called). → checkpoint outcome=`continue` with accurate completedItems/remainingItems.",
+		"- `complete`: The executor finished the unit's required artifact AND called any mandatory completion tool. → checkpoint outcome=`complete`.",
+		"- `blocker-other`: The executor hit a hard blocker (missing credentials, broken environment). → checkpoint outcome=`blocked` with a precise blockerReason.",
+		"",
+		"## Executor Transcript",
+		"",
+		"```",
+		transcriptText,
+		"```",
+		"",
+	];
+
+	if (refusal) {
+		lines.push(
+			`⚠️  Refusal pattern detected: ${refusal.pattern}.`,
+			"The executor refused the task. Emit outcome='blocked' with blockerReason='executor-refused'.",
+			"",
+		);
+	}
+
+	lines.push(
+		"Call `checkpoint` with all eight PDD fields and accurate completedItems / remainingItems.",
+		"Your final action MUST be the checkpoint tool call.",
+	);
+
+	return lines.join("\n");
+}
+
+/**
+ * Detect whether an executor transcript contains zero meaningful work.
+ *
+ * Purpose: no-op iterations (continue checkpoint with zero file/tool activity)
+ * must not satisfy the repair gate.
+ *
+ * Consumer: assessAutonomousSolverTurn to reject no-op continues.
+ */
+export function isNoOpExecutorTranscript(messages) {
+	if (!Array.isArray(messages) || messages.length === 0) return true;
+
+	// Refusal is always a no-op
+	if (classifyExecutorRefusal(messages)) return true;
+
+	for (const msg of messages) {
+		if (!msg || typeof msg !== "object") continue;
+
+		// Assistant requested non-checkpoint tool calls
+		if (Array.isArray(msg.tool_calls)) {
+			for (const tc of msg.tool_calls) {
+				const name = tc?.function?.name ?? tc?.name ?? "";
+				if (name && name !== "checkpoint") {
+					return false;
+				}
+			}
+		}
+
+		// Tool results from non-checkpoint tools
+		if (msg.role === "tool" || msg.role === "tool_result") {
+			const name = msg.name ?? "";
+			if (name && name !== "checkpoint") {
+				return false;
+			}
+		}
+
+		// Content that shows concrete work was done
+		const content = typeof msg.content === "string" ? msg.content : "";
+		if (
+			content.includes("File edited") ||
+			content.includes("File written") ||
+			content.includes("File created") ||
+			content.includes("```diff") ||
+			content.includes("--- a/") ||
+			content.includes("+++ b/")
+		) {
+			return false;
+		}
+	}
+
+	return true;
+}
+
 /**
 * Record a solver checkpoint and update the markdown projection.
 *
@ -541,6 +696,141 @@ export function recordAutonomousSolverMissingCheckpointRetry(
 	return nextState;
 }

+/**
+ * Detect that the executor model refused the task outright (rather than
+ * attempting and failing the protocol).
+ *
+ * Why: when a routed executor model (e.g. a code-completion model like
+ * Codestral) lacks the agentic capabilities required for the unit, it emits a
+ * generic refusal — "I'm sorry, I currently don't have the necessary tools to
+ * assist with that specific request." The existing missing-checkpoint repair
+ * loop will dutifully re-prompt the same model until it emits a syntactically
+ * valid checkpoint with zero work, fabricating forward progress. Refusal must
+ * be caught earlier and surfaced as a `blocked` outcome so the loop pauses (or
+ * the executor model can be escalated on retry) rather than synthesizing a
+ * `continue` over no work.
+ *
+ * Returns null when no refusal pattern is detected.
+ *
+ * Consumer: runUnitPhase short-circuits the repair loop on a positive match.
+ */
+export function classifyExecutorRefusal(messages) {
+	const text = stringifyMessages(messages);
+	if (!text.trim()) return null;
+	const lower = text.toLowerCase();
+	const patterns = [
+		{
+			id: "apology-no-tools",
+			regex:
+				/\bi(?:'m| am)\s+sorry[^.]{0,80}(?:don't|do not|cannot|can't)\s+have\s+(?:the\s+)?(?:necessary\s+)?tools?/i,
+		},
+		{
+			id: "cannot-assist",
+			regex:
+				/\bi\s+(?:cannot|can't|am unable to|won't be able to)\s+(?:assist|help)\s+with\s+(?:that|this)/i,
+		},
+		{
+			id: "not-able-to-help",
+			regex:
+				/\bi\s+(?:am\s+not\s+able\s+to|do not have the ability to|don't have the ability to)\s+(?:help|assist|complete|perform)/i,
+		},
+		{
+			id: "feel-free-to-ask",
+			// Catches the canonical "I'm sorry … feel free to ask" deflection even
+			// when the apology phrasing doesn't match any of the earlier patterns.
+			regex:
+				/(?:i(?:'m| am)\s+sorry|i\s+apologi[sz]e)[\s\S]{0,200}feel\s+free\s+to\s+ask/i,
+		},
+		{
+			id: "outside-capabilities",
+			regex:
+				/(?:that's|that is|this is)\s+(?:outside|beyond)\s+(?:my|the)\s+(?:capabilities|abilities|scope)/i,
+		},
+	];
+	for (const pattern of patterns) {
+		if (pattern.regex.test(lower) || pattern.regex.test(text)) {
+			return {
+				classification: "executor-refused",
+				pattern: pattern.id,
+				summary:
+					"The executor model refused the task rather than attempting it. This is a capability/routing problem, not a protocol problem — repairing the prompt will not produce progress.",
+				evidence: truncateEvidence(text),
+			};
+		}
+	}
+	return null;
+}
+
+/**
+ * Memoized lookup: is the `checkpoint` tool registered in the SF extension
+ * manifest? Used by `classifyAutonomousSolverMissingCheckpointFailure` to
+ * disambiguate "agent says tool is unavailable" from "agent mentioned a real
+ * tool but did not call it."
+ *
+ * Why memoized: the previous implementation read the manifest from disk on
+ * every classifier call, with CWD-sensitive path probing — surprising hidden
+ * I/O inside what reads like a pure function, and a test-unfriendly coupling.
+ * The manifest does not change while the process is running, so a single
+ * memoized read at first call is correct and fast.
+ *
+ * Callers that want test-time control (or are running in environments where
+ * the manifest can't be located, e.g. CI fixtures) pass an explicit
+ * `checkpointToolRegistered` override to the classifier instead — no need to
+ * stub the filesystem.
+ */
+let _checkpointToolRegisteredCache = null;
+function isCheckpointToolRegisteredFromManifest() {
+	if (_checkpointToolRegisteredCache !== null) {
+		return _checkpointToolRegisteredCache;
+	}
+	try {
+		const manifestPath = join(
+			process.cwd(),
+			"dist",
+			"resources",
+			"extensions",
+			"sf",
+			"extension-manifest.json",
+		);
+		const srcManifestPath = join(
+			process.cwd(),
+			"src",
+			"resources",
+			"extensions",
+			"sf",
+			"extension-manifest.json",
+		);
+		const manifestContent = existsSync(manifestPath)
+			? readFileSync(manifestPath, "utf-8")
+			: existsSync(srcManifestPath)
+				? readFileSync(srcManifestPath, "utf-8")
+				: null;
+		if (!manifestContent) {
+			_checkpointToolRegisteredCache = false;
+			return false;
+		}
+		const manifest = JSON.parse(manifestContent);
+		_checkpointToolRegisteredCache =
+			Array.isArray(manifest?.provides?.tools) &&
+			manifest.provides.tools.includes("checkpoint");
+		return _checkpointToolRegisteredCache;
+	} catch {
+		_checkpointToolRegisteredCache = false;
+		return false;
+	}
+}
+
+/**
+ * Test-only escape hatch to reset the manifest-lookup memoization. Tests that
+ * exercise the classifier under different "is checkpoint registered" assumptions
+ * should prefer the explicit `options.checkpointToolRegistered` override on
+ * `classifyAutonomousSolverMissingCheckpointFailure` — this function exists
+ * only as a safety net for tests that need to clear a polluted module cache.
+ */
+export function _resetCheckpointToolRegisteredCacheForTests() {
+	_checkpointToolRegisteredCache = null;
+}
+
 /**
 * Classify why a solver turn omitted the checkpoint tool.
 *
@ -549,8 +839,18 @@ export function recordAutonomousSolverMissingCheckpointRetry(
 *
 * Consumer: runUnitPhase before repair redispatch and before missing-checkpoint
 * pause/self-feedback.
+ *
+ * @param {Array} messages
+ * @param {object} [options]
+ * @param {boolean} [options.checkpointToolRegistered] - Explicit override for
+ *   the "is the checkpoint tool registered in our manifest?" question. When
+ *   omitted, falls back to the memoized manifest lookup. Tests should pass
+ *   this explicitly so they don't depend on CWD or on-disk dist/ state.
 */
-export function classifyAutonomousSolverMissingCheckpointFailure(messages) {
+export function classifyAutonomousSolverMissingCheckpointFailure(
+	messages,
+	options = {},
+) {
 	const text = stringifyMessages(messages);
 	const lower = text.toLowerCase();
 	if (!text.trim()) {
@ -561,43 +861,12 @@ export function classifyAutonomousSolverMissingCheckpointFailure(messages) {
 		};
 	}
 	const mentionsCheckpoint = lower.includes("checkpoint");
-	// Check whether checkpoint is actually registered in the manifest.
-	// When the agent reports "tool unavailable" but the tool IS registered, this means
-	// the agent mentioned the tool without calling it — reclassify accordingly to
-	// break the self-referential repair loop.
-	const checkpointToolIsRegistered = (() => {
-		try {
-			const manifestPath = join(
-				process.cwd(),
-				"dist",
-				"resources",
-				"extensions",
-				"sf",
-				"extension-manifest.json",
-			);
-			const srcManifestPath = join(
-				process.cwd(),
-				"src",
-				"resources",
-				"extensions",
-				"sf",
-				"extension-manifest.json",
-			);
-			const manifestContent = existsSync(manifestPath)
-				? readFileSync(manifestPath, "utf-8")
-				: existsSync(srcManifestPath)
-					? readFileSync(srcManifestPath, "utf-8")
-					: null;
-			if (!manifestContent) return false;
-			const manifest = JSON.parse(manifestContent);
-			return (
-				Array.isArray(manifest?.provides?.tools) &&
-				manifest.provides.tools.includes("checkpoint")
-			);
-		} catch {
-			return false;
-		}
-	})();
+	// Resolve "is checkpoint registered" — explicit override wins, otherwise
+	// fall back to the memoized manifest lookup.
+	const checkpointToolIsRegistered =
+		typeof options.checkpointToolRegistered === "boolean"
+			? options.checkpointToolRegistered
+			: isCheckpointToolRegisteredFromManifest();
 	const mentionsToolUnavailable =
 		/(unknown|unavailable|not available|not found|no such) tool/.test(lower) ||
 		(lower.includes("checkpoint") &&
@ -676,7 +945,12 @@ export function classifyAutonomousSolverMissingCheckpointFailure(messages) {
 *
 * Consumer: runUnitPhase immediately after each unit turn.
 */
-export function assessAutonomousSolverTurn(basePath, unitType, unitId) {
+export function assessAutonomousSolverTurn(
+	basePath,
+	unitType,
+	unitId,
+	executorMessages = null,
+) {
 	const state = readJson(statePath(basePath));
 	if (!sameUnit(state, unitType, unitId)) {
 		return {
@ -730,6 +1004,34 @@ export function assessAutonomousSolverTurn(basePath, unitType, unitId) {
 			checkpoint,
 		};
 	}
+	// No-op detection: a continue/decide checkpoint over zero executor work is not real progress.
+	if (
+		(checkpoint.outcome === "continue" || checkpoint.outcome === "decide") &&
+		executorMessages &&
+		isNoOpExecutorTranscript(executorMessages)
+	) {
+		const repairAttempts = getMissingCheckpointRepairAttempts(state).filter(
+			(attempt) => Number(attempt.iteration) === Number(state.iteration),
+		).length;
+		if (repairAttempts >= DEFAULT_MISSING_CHECKPOINT_REPAIR_ATTEMPTS) {
+			return {
+				action: "pause",
+				reason: "solver-noop-continue",
+				state,
+				repairAttempts,
+				maxRepairAttempts: DEFAULT_MISSING_CHECKPOINT_REPAIR_ATTEMPTS,
+				checkpoint,
+			};
+		}
+		return {
+			action: "missing-checkpoint-retry",
+			reason: "solver-noop-continue",
+			state,
+			repairAttempt: repairAttempts + 1,
+			maxRepairAttempts: DEFAULT_MISSING_CHECKPOINT_REPAIR_ATTEMPTS,
+			checkpoint,
+		};
+	}
 	// "decide" is treated as "continue": agent reconstructs best-effort and moves on
 	return {
 		action:
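The no-op gate added to `assessAutonomousSolverTurn` in the hunk above can be read as a small decision function with three outcomes. The following is a minimal standalone sketch, not the shipped code: `decideNoOpGate` and `hasExecutorWork` are hypothetical names, the limit of 4 is an assumption taken from the tests (the real value is `DEFAULT_MISSING_CHECKPOINT_REPAIR_ATTEMPTS`), and the work check is deliberately reduced to tool calls only, whereas `isNoOpExecutorTranscript` also inspects tool results, diff markers, and refusals.

```javascript
// Assumed limit; the real code uses DEFAULT_MISSING_CHECKPOINT_REPAIR_ATTEMPTS.
const MAX_REPAIR_ATTEMPTS = 4;

// Simplified stand-in for isNoOpExecutorTranscript: only looks at tool calls.
function hasExecutorWork(messages) {
	if (!Array.isArray(messages)) return false;
	return messages.some(
		(msg) =>
			Array.isArray(msg?.tool_calls) &&
			msg.tool_calls.some((tc) => {
				const name = tc?.function?.name ?? tc?.name ?? "";
				return name !== "" && name !== "checkpoint";
			}),
	);
}

// Hypothetical helper mirroring the gate's three outcomes.
function decideNoOpGate(outcome, executorMessages, repairAttempts) {
	const gated = outcome === "continue" || outcome === "decide";
	if (!gated || !executorMessages || hasExecutorWork(executorMessages)) {
		return { action: "pass-through" }; // the gate does not apply
	}
	if (repairAttempts >= MAX_REPAIR_ATTEMPTS) {
		return { action: "pause", reason: "solver-noop-continue" };
	}
	return {
		action: "missing-checkpoint-retry",
		reason: "solver-noop-continue",
		repairAttempt: repairAttempts + 1,
	};
}

const proseOnly = [
	{ role: "assistant", content: "I think I understand the problem now." },
];
console.log(decideNoOpGate("continue", proseOnly, 0).action);
console.log(decideNoOpGate("continue", proseOnly, 4).action);
```

Passing `null` for the transcript skips the gate entirely, which matches the default of the new `executorMessages` parameter and keeps existing three-argument callers unaffected.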
--- a/src/resources/extensions/sf/solver-model.js
+++ b/src/resources/extensions/sf/solver-model.js
@ -0,0 +1,119 @@
+/**
+ * solver-model.js — pinned model selection for the autonomous solver role.
+ *
+ * Why this exists:
+ *   The "executor" and "autonomous solver" roles were historically conflated
+ *   into a single LLM call selected by the unit-type router. When the router
+ *   picked a coding-tuned or capability-limited model for the executor (e.g.
+ *   `mistral/codestral-latest`, `google-gemini-cli/gemini-3-flash-preview`),
+ *   the same model was expected to (a) do the unit work and (b) emit the
+ *   canonical protocol checkpoint. Models that refuse agentic tasks or fail
+ *   to follow tool-use contracts broke the protocol layer entirely — and the
+ *   missing-checkpoint repair loop could only re-prompt the same broken
+ *   model, synthesizing fake `continue` outcomes over zero progress.
+ *
+ *   The solver role MUST stay on a stable, agentic, refusal-resistant model
+ *   independent of any per-unit routing choices. This module is the single
+ *   place that decision is made.
+ *
+ * Contract:
+ *   - Default solver model is `kimi-k2.6` (provider: `kimi-coding`).
+ *   - Preference override is accepted ONLY when the operator has explicitly
+ *     opted into it via `preferences.autonomousSolver.model`. Router output,
+ *     benchmark scoring, learning blender, and unit-type routing are NEVER
+ *     consulted here.
+ *   - A fallback chain is provided so a brief outage of the primary does not
+ *     take the protocol layer with it.
+ *
+ * Consumers (forthcoming): the solver-pass invocation in auto/phases-unit.js
+ * once the two-layer loop lands (see ADR-0079).
+ */
+
+/**
+ * Default model for the autonomous solver role. Locked. Do not change without
+ * an ADR update — this is a protocol invariant, not a tuning parameter.
+ */
+export const SOLVER_MODEL_DEFAULT = {
+	provider: "kimi-coding",
+	id: "kimi-k2.6",
+};
+
+/**
+ * Explicit fallback chain when the default is unreachable. Ordered by
+ * preference. Each entry must be a stable agentic model that follows tool-use
+ * contracts; nothing on this list is a code-completion-only model.
+ */
+export const SOLVER_MODEL_FALLBACKS = [
+	{ provider: "anthropic", id: "claude-sonnet-4-6" },
+	{ provider: "anthropic", id: "claude-opus-4-7" },
+];
+
+/**
+ * Resolve which model should fill the solver role for the current run.
+ *
+ * @param {object} [preferences] - Operator preferences object. Only consulted
+ *   for `preferences.autonomousSolver.model`. Anything else is ignored.
+ * @returns {{ provider: string, id: string }} the selected solver model
+ */
+export function resolveSolverModel(preferences) {
+	const override = preferences?.autonomousSolver?.model;
+	if (override && typeof override === "object" && override.id) {
+		return {
+			provider: String(override.provider ?? SOLVER_MODEL_DEFAULT.provider),
+			id: String(override.id),
+		};
+	}
+	if (typeof override === "string" && override.trim()) {
+		// Allow "provider/model" short form for ergonomics; default provider when
+		// only a model id is supplied.
+		const trimmed = override.trim();
+		const slash = trimmed.indexOf("/");
+		if (slash > 0) {
+			return {
+				provider: trimmed.slice(0, slash),
+				id: trimmed.slice(slash + 1),
+			};
+		}
+		return { provider: SOLVER_MODEL_DEFAULT.provider, id: trimmed };
+	}
+	return { ...SOLVER_MODEL_DEFAULT };
+}
+
+/**
+ * Resolve the ordered candidate list for the solver role: primary first, then
+ * the fallback chain. Callers iterate until they find a reachable provider.
+ *
+ * @param {object} [preferences]
+ * @returns {Array<{ provider: string, id: string }>}
+ */
+export function resolveSolverModelCandidates(preferences) {
+	const primary = resolveSolverModel(preferences);
+	const candidates = [primary];
+	for (const fallback of SOLVER_MODEL_FALLBACKS) {
+		if (
+			fallback.provider === primary.provider &&
+			fallback.id === primary.id
+		) {
+			continue;
+		}
+		candidates.push({ ...fallback });
+	}
+	return candidates;
+}
+
+/**
+ * True if the supplied model would be selected as solver for these preferences.
+ * Useful for invariants and tests.
+ *
+ * @param {{ provider?: string, id?: string }} model
+ * @param {object} [preferences]
+ * @returns {boolean}
+ */
+export function isSolverModel(model, preferences) {
+	if (!model?.id) return false;
+	const solver = resolveSolverModel(preferences);
+	return (
+		String(model.provider ?? solver.provider) === solver.provider &&
+		String(model.id) === solver.id
+	);
+}
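The module above documents that callers of `resolveSolverModelCandidates` "iterate until they find a reachable provider". A minimal sketch of such a caller follows. It is self-contained, so the constants are re-declared locally and the resolver is simplified to the `provider/model` string override only. `pickReachableSolverModel` and the `isReachable` probe are hypothetical and are not part of the SF extension.

```javascript
// Local copies of the constants in solver-model.js so the sketch runs on its own.
const SOLVER_MODEL_DEFAULT = { provider: "kimi-coding", id: "kimi-k2.6" };
const SOLVER_MODEL_FALLBACKS = [
	{ provider: "anthropic", id: "claude-sonnet-4-6" },
	{ provider: "anthropic", id: "claude-opus-4-7" },
];

// Simplified resolver: handles only the "provider/model" string override.
function resolveSolverModel(preferences) {
	const override = preferences?.autonomousSolver?.model;
	if (typeof override === "string" && override.includes("/")) {
		const slash = override.indexOf("/");
		return {
			provider: override.slice(0, slash),
			id: override.slice(slash + 1),
		};
	}
	return { ...SOLVER_MODEL_DEFAULT };
}

// Primary first, then every fallback that is not the primary itself.
function resolveSolverModelCandidates(preferences) {
	const primary = resolveSolverModel(preferences);
	const rest = SOLVER_MODEL_FALLBACKS.filter(
		(m) => !(m.provider === primary.provider && m.id === primary.id),
	);
	return [primary, ...rest.map((m) => ({ ...m }))];
}

// Hypothetical caller: walk the candidates until one is reachable.
// `isReachable` is an assumed probe supplied by the caller.
function pickReachableSolverModel(preferences, isReachable) {
	for (const candidate of resolveSolverModelCandidates(preferences)) {
		if (isReachable(candidate)) return candidate;
	}
	return null; // nothing reachable: the caller should pause rather than guess
}

const picked = pickReachableSolverModel(
	{ autonomousSolver: { model: "anthropic/claude-sonnet-4-6" } },
	(m) => m.id !== "claude-sonnet-4-6", // pretend the primary is down
);
console.log(picked);
```

Because the override here equals the first fallback, the candidate list has two entries rather than four, which illustrates the de-duplication in `resolveSolverModelCandidates`.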
--- a/src/resources/extensions/sf/tests/autonomous-solver.test.mjs
+++ b/src/resources/extensions/sf/tests/autonomous-solver.test.mjs
@ -7,13 +7,17 @@ import {
 	appendAutonomousSolverSteering,
 	assessAutonomousSolverTurn,
 	beginAutonomousSolverIteration,
+	buildAutonomousExecutorPromptBlock,
 	buildAutonomousSolverMissingCheckpointRepairPrompt,
 	buildAutonomousSolverPromptBlock,
+	buildSolverPassPrompt,
 	classifyAutonomousSolverMissingCheckpointFailure,
+	classifyExecutorRefusal,
 	consumePendingAutonomousSolverSteering,
 	detectSolverLoop,
 	getConfiguredAutonomousSolverMaxIterations,
 	getSolverPhase,
+	isNoOpExecutorTranscript,
 	readAutonomousSolverState,
 	readLatestAutonomousSolverCheckpoint,
 	recordAutonomousSolverMissingCheckpointRetry,
@ -565,3 +569,368 @@ describe("autonomous solver", () => {
 		expect(prompt).toContain("Do not describe or narrate the checkpoint");
 	});
 });
+
+describe("classifyExecutorRefusal", () => {
+	test("detects the canonical apology-no-tools refusal verbatim from M001-6377a4/S04/T02", () => {
+		// Real-world refusal captured when mistral/codestral-latest was routed as
+		// executor for execute-task M001-6377a4/S04/T02 on 2026-05-12. The
+		// classifier must catch this exact phrasing or the repair loop will
+		// re-prompt the same broken model and synthesize fake progress.
+		const refusal = classifyExecutorRefusal([
+			{
+				role: "assistant",
+				content:
+					"I'm sorry, but I currently don't have the necessary tools to assist with that specific request. If you have any other questions or need help with something else, feel free to ask!",
+			},
+		]);
+		expect(refusal).not.toBeNull();
+		expect(refusal.classification).toBe("executor-refused");
+		// The "apology-no-tools" pattern is the most specific; "feel-free-to-ask"
+		// is a fallback that may match the same string. Either is acceptable as
+		// long as the result is a refusal.
+		expect(["apology-no-tools", "feel-free-to-ask"]).toContain(refusal.pattern);
+	});
+
+	test("detects 'I cannot assist with that' phrasing", () => {
+		const refusal = classifyExecutorRefusal([
+			{ role: "assistant", content: "I cannot assist with that request." },
+		]);
+		expect(refusal).not.toBeNull();
+		expect(refusal.pattern).toBe("cannot-assist");
+	});
+
+	test("detects 'outside my capabilities' phrasing", () => {
+		const refusal = classifyExecutorRefusal([
+			{
+				role: "assistant",
+				content:
+					"That's outside my capabilities — I am unable to perform file edits.",
+			},
+		]);
+		expect(refusal).not.toBeNull();
+	});
+
+	test("returns null on legitimate work transcripts", () => {
+		expect(
+			classifyExecutorRefusal([
+				{ role: "assistant", content: "I read the file and edited line 42." },
+			]),
+		).toBeNull();
+		expect(
+			classifyExecutorRefusal([
+				{
+					role: "assistant",
+					content:
+						"Checkpoint recorded: outcome=continue, completed steps 1-3.",
+				},
+			]),
+		).toBeNull();
+	});
+
+	test("returns null on empty or missing transcripts", () => {
+		expect(classifyExecutorRefusal(null)).toBeNull();
+		expect(classifyExecutorRefusal([])).toBeNull();
+		expect(
+			classifyExecutorRefusal([{ role: "assistant", content: "" }]),
+		).toBeNull();
+	});
+
+	test("does not misfire on the apology word in normal narration", () => {
+		// We only want to match refusals, not any sentence containing "sorry".
+		// A model saying "Sorry for the long output below — here's the full
+		// diff" should not be classified as a refusal.
+		const refusal = classifyExecutorRefusal([
+			{
+				role: "assistant",
+				content:
+					"Sorry for the long output below — here is the full diff of the change. I am about to run the tests now.",
+			},
+		]);
+		expect(refusal).toBeNull();
+	});
+
+	test("evidence is truncated for storage", () => {
+		const refusal = classifyExecutorRefusal([
+			{
+				role: "assistant",
+				content:
+					"I'm sorry, I currently don't have the necessary tools. " +
+					"x".repeat(8000),
+			},
+		]);
+		expect(refusal).not.toBeNull();
+		expect(refusal.evidence.length).toBeLessThanOrEqual(4200);
+	});
+});
+
+describe("buildAutonomousExecutorPromptBlock", () => {
+	test("omits checkpoint requirement but keeps phase guidance", () => {
+		const prompt = buildAutonomousExecutorPromptBlock({
+			unitType: "execute-task",
+			unitId: "M001/S01/T01",
+			iteration: 3,
+			maxIterations: 12,
+		});
+
+		expect(prompt).toContain("Autonomous Executor Contract");
+		expect(prompt).toContain("/autonomous iteration 3 of 12");
+		expect(prompt).toContain("EXECUTE PHASE");
+		expect(prompt).not.toContain("CHECKPOINT REQUIREMENT");
+		expect(prompt).not.toContain(
+			"Hard requirement: before ending the turn, call the actual `checkpoint` tool",
+		);
+		expect(prompt).toContain("You do NOT need to call the `checkpoint` tool");
+		expect(prompt).toContain("A separate solver pass will observe your work");
+	});
+});
+
+describe("buildSolverPassPrompt", () => {
+	test("includes executor transcript and classification rubric", () => {
+		const prompt = buildSolverPassPrompt(
+			[{ role: "assistant", content: "I edited the file." }],
+			{ iteration: 2, maxIterations: 10 },
+			"execute-task",
+			"M001/S01/T01",
+		);
+
+		expect(prompt).toContain("Autonomous Solver Pass");
+		expect(prompt).toContain("protocol solver for execute-task M001/S01/T01");
+		expect(prompt).toContain("Classification Rubric");
+		expect(prompt).toContain("executor-refused");
+		expect(prompt).toContain("executor-noop");
+		expect(prompt).toContain("progress");
+		expect(prompt).toContain("I edited the file.");
+		expect(prompt).toContain(
+			"Your final action MUST be the checkpoint tool call",
+		);
+	});
+
+	test("injects refusal warning when refusal is detected", () => {
+		const prompt = buildSolverPassPrompt(
+			[
+				{
+					role: "assistant",
+					content:
+						"I'm sorry, but I currently don't have the necessary tools to assist with that specific request.",
+				},
+			],
+			{ iteration: 1, maxIterations: 10 },
+			"execute-task",
+			"M001/S01/T01",
+		);
+
+		expect(prompt).toContain("Refusal pattern detected");
+		expect(prompt).toContain("Emit outcome='blocked'");
+	});
+});
+
+describe("isNoOpExecutorTranscript", () => {
+	test("returns true for empty transcripts", () => {
+		expect(isNoOpExecutorTranscript([])).toBe(true);
+		expect(isNoOpExecutorTranscript(null)).toBe(true);
+		expect(isNoOpExecutorTranscript(undefined)).toBe(true);
+	});
+
+	test("returns true for refusal transcripts", () => {
+		expect(
+			isNoOpExecutorTranscript([
+				{
+					role: "assistant",
+					content:
+						"I'm sorry, but I currently don't have the necessary tools to assist with that specific request.",
+				},
+			]),
+		).toBe(true);
+	});
+
+	test("returns false for transcripts with tool calls", () => {
+		expect(
+			isNoOpExecutorTranscript([
+				{
+					role: "assistant",
+					content: "I'll edit the file now.",
+					tool_calls: [
+						{
+							id: "tc_1",
+							function: { name: "edit", arguments: "{}" },
+						},
+					],
+				},
+			]),
+		).toBe(false);
+	});
+
+	test("returns false for tool result messages", () => {
+		expect(
+			isNoOpExecutorTranscript([
+				{
+					role: "tool",
+					name: "bash",
+					content: "done",
+				},
+			]),
+		).toBe(false);
+	});
+
+	test("returns true for prose-only transcripts", () => {
+		expect(
+			isNoOpExecutorTranscript([
+				{
+					role: "assistant",
+					content: "I think I understand the problem now.",
+				},
+			]),
+		).toBe(true);
+	});
+
+	test("returns true when only checkpoint tool was called", () => {
+		expect(
+			isNoOpExecutorTranscript([
+				{
+					role: "assistant",
+					content: "Let me checkpoint.",
+					tool_calls: [
+						{
+							id: "tc_1",
+							function: { name: "checkpoint", arguments: "{}" },
+						},
+					],
+				},
+			]),
+		).toBe(true);
+	});
+});
+
+describe("assessAutonomousSolverTurn no-op detection", () => {
+	test("continue_with_no_op_executor_messages_returns_missing_checkpoint_retry", () => {
+		const project = makeProject();
+		beginAutonomousSolverIteration(project, "execute-task", "M001/S01/T01");
+		appendAutonomousSolverCheckpoint(project, {
+			unitType: "execute-task",
+			unitId: "M001/S01/T01",
+			outcome: "continue",
+			summary: "More work remains.",
+			completedItems: ["First pass"],
+			remainingItems: ["Second pass"],
+			verificationEvidence: ["npx vitest run focused.test.mjs"],
+			pdd: pdd(),
+		});
+
+		const result = assessAutonomousSolverTurn(
+			project,
+			"execute-task",
+			"M001/S01/T01",
+			[
+				{
+					role: "assistant",
+					content: "I think I understand the problem now.",
+				},
+			],
+		);
+		expect(result.action).toBe("missing-checkpoint-retry");
+		expect(result.reason).toBe("solver-noop-continue");
+	});
+
+	test("continue_with_real_work_executor_messages_returns_continue", () => {
+		const project = makeProject();
+		beginAutonomousSolverIteration(project, "execute-task", "M001/S01/T01");
+		appendAutonomousSolverCheckpoint(project, {
+			unitType: "execute-task",
+			unitId: "M001/S01/T01",
+			outcome: "continue",
+			summary: "More work remains.",
+			completedItems: ["First pass"],
+			remainingItems: ["Second pass"],
+			verificationEvidence: ["npx vitest run focused.test.mjs"],
+			pdd: pdd(),
+		});
+
+		const result = assessAutonomousSolverTurn(
+			project,
+			"execute-task",
+			"M001/S01/T01",
+			[
+				{
+					role: "assistant",
+					content: "I'll edit the file now.",
+					tool_calls: [
+						{
+							id: "tc_1",
+							function: { name: "edit", arguments: "{}" },
+						},
+					],
+				},
+			],
+		);
+		expect(result.action).toBe("continue");
+		expect(result.reason).toBe("solver-continue");
+	});
+
+	test("no_op_continue_after_max_repairs_returns_pause", () => {
+		const project = makeProject();
+		beginAutonomousSolverIteration(project, "execute-task", "M001/S01/T01");
+		appendAutonomousSolverCheckpoint(project, {
+			unitType: "execute-task",
+			unitId: "M001/S01/T01",
+			outcome: "continue",
+			summary: "More work remains.",
+			completedItems: ["First pass"],
+			remainingItems: ["Second pass"],
+			verificationEvidence: ["npx vitest run focused.test.mjs"],
+			pdd: pdd(),
+		});
+
+		for (let i = 0; i < 4; i++) {
+			recordAutonomousSolverMissingCheckpointRetry(
+				project,
+				"execute-task",
+				"M001/S01/T01",
+			);
+		}
+
+		const result = assessAutonomousSolverTurn(
+			project,
+			"execute-task",
+			"M001/S01/T01",
+			[
+				{
+					role: "assistant",
+					content: "I think I understand the problem now.",
+				},
+			],
+		);
+		expect(result.action).toBe("pause");
+		expect(result.reason).toBe("solver-noop-continue");
+		expect(result.repairAttempts).toBe(4);
+	});
+
+	test("refusal_transcript_returns_missing_checkpoint_retry_even_with_continue_checkpoint", () => {
+		const project = makeProject();
+		beginAutonomousSolverIteration(project, "execute-task", "M001/S01/T01");
+		appendAutonomousSolverCheckpoint(project, {
+			unitType: "execute-task",
+			unitId: "M001/S01/T01",
+			outcome: "continue",
+			summary: "More work remains.",
+			completedItems: ["First pass"],
+			remainingItems: ["Second pass"],
+			verificationEvidence: ["npx vitest run focused.test.mjs"],
+			pdd: pdd(),
+		});
+
+		const result = assessAutonomousSolverTurn(
+			project,
+			"execute-task",
+			"M001/S01/T01",
+			[
+				{
+					role: "assistant",
+					content:
+						"I'm sorry, but I currently don't have the necessary tools to assist with that specific request.",
+				},
+			],
+		);
+		expect(result.action).toBe("missing-checkpoint-retry");
+		expect(result.reason).toBe("solver-noop-continue");
+	});
+});
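The refusal tests above exercise true refusals and one "sorry" false positive, but patterns of the form `i\s+cannot…` are also sensitive to whether the leading `i` sits on a word boundary. The sketch below compares an unanchored and a `\b`-anchored variant of the `cannot-assist` shape. The two sample sentences are invented for illustration and do not come from a captured transcript.

```javascript
// Same shape as the "cannot-assist" pattern, without and with a leading \b.
const unanchored =
	/i\s+(?:cannot|can't|am unable to|won't be able to)\s+(?:assist|help)\s+with\s+(?:that|this)/i;
const anchored =
	/\bi\s+(?:cannot|can't|am unable to|won't be able to)\s+(?:assist|help)\s+with\s+(?:that|this)/i;

// A genuine refusal and a piece of ordinary narration (both invented samples).
const refusalText = "I cannot assist with that request.";
const narration =
	"The wiki cannot help with this, so I read the source instead.";

for (const [label, regex] of [
	["unanchored", unanchored],
	["anchored", anchored],
]) {
	console.log(label, {
		refusal: regex.test(refusalText),
		narration: regex.test(narration),
	});
}
```

The unanchored form matches the trailing `i` of "wiki" followed by "cannot help with this", so narration is classified as a refusal. The anchored form still matches the real refusal and ignores the narration.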