fix(sf): prevent phantom work from stale file paths in task plans

Adds three layers of defense against the M008/S03 failure mode, where bug-hunt
findings referenced .ts files that had been deleted in a prior corrupted
snapshot commit (f712c339b), while .js versions with the fixes survived.

1. Prompt-level safeguards:
   - research-slice.md: researchers must verify file existence before listing paths in findings
   - plan-slice.md: planners must confirm files exist before including them in task plans
   - execute-task.md: executors must verify files exist before editing; escalate as blocker if missing

2. Runtime pre-flight validation:
   - system-context.js: validateTaskPlanFiles() extracts backtick-wrapped paths from task plans and checks existence before dispatch
   - Missing files trigger a warning injected into the execute-task prompt
   - Logs warning for observability

This prevents the research→plan→execute pipeline from propagating stale file
paths that cause phantom work, runaway guard intervention, and flow-audit
failures.

Fixes: sf-moqgvdi7-mxc1sr (flow-audit:repeated-milestone-failure)
Related: M008/S03 bug-hunt cluster
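The runtime pre-flight check described above can be sketched roughly as follows. This is a minimal illustration, not the body of validateTaskPlanFiles() from this commit. The names extractBacktickPaths and findMissingPlanFiles, the path heuristic, and the injected fileExists predicate are all assumptions made for the example.

```javascript
// Hypothetical sketch; the real validateTaskPlanFiles() may differ.
// Matches inline code spans such as `src/foo.ts` (single line, no nested backticks).
const BACKTICK_SPAN = /`([^`\n]+)`/g;
// Assumed heuristic: a span counts as a file path when it has no whitespace
// and ends in a dot-extension, e.g. "src/a.ts" or "README.md".
const LOOKS_LIKE_PATH = /^[\w@.\/-]+\.[A-Za-z0-9]+$/;

// Collect unique path-like spans from a task plan, in order of first appearance.
function extractBacktickPaths(planContent) {
    const seen = new Set();
    for (const match of planContent.matchAll(BACKTICK_SPAN)) {
        const candidate = match[1].trim();
        if (LOOKS_LIKE_PATH.test(candidate))
            seen.add(candidate);
    }
    return [...seen];
}

// Return the referenced paths for which the caller-supplied predicate reports "missing".
function findMissingPlanFiles(planContent, fileExists) {
    return extractBacktickPaths(planContent).filter((p) => !fileExists(p));
}

// Example: the plan mentions a stale .ts file whose .js sibling survived.
const plan = [
    "## Steps",
    "1. Fix the guard in `src/auto/guard.ts`.",
    "2. Update `src/auto/guard.js` and run `npm test`.",
].join("\n");
const onDisk = new Set(["src/auto/guard.js"]);
const missing = findMissingPlanFiles(plan, (p) => onDisk.has(p));
console.log(missing); // [ 'src/auto/guard.ts' ]
```

In a real dispatcher the predicate would presumably be something like `(p) => existsSync(join(basePath, p))`. Injecting it keeps the extraction logic testable without touching the filesystem. The extension heuristic is deliberately crude: it would also accept spans such as `state.phase`, so a production check may need stricter filtering.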
2026-05-04 08:24:04 +02:00 · 8c66c11131
commit 8c66c11131
parent bffd6c22fc
4 changed files with 841 additions and 3 deletions
--- a/src/resources/extensions/sf/bootstrap/system-context.js
+++ b/src/resources/extensions/sf/bootstrap/system-context.js
@@ -0,0 +1,835 @@
+import { existsSync, readFileSync, statSync, unlinkSync } from "node:fs";
+import { homedir } from "node:os";
+import { join } from "node:path";
+import { markCmuxPromptShown, shouldPromptToEnableCmux, } from "../../cmux/index.js";
+import { toPosixPath } from "../../shared/mod.js";
+import { isCanAskUser } from "../auto.js";
+import { getActiveAutoWorktreeContext } from "../auto-worktree.js";
+import { buildCodeIntelligenceContextBlock } from "../code-intelligence.js";
+import { ensureCodebaseMapFresh, readCodebaseMap, } from "../codebase-generator.js";
+import { autoEnableCmuxPreferences } from "../commands-cmux.js";
+import { debugTime } from "../debug-logger.js";
+import { formatOverridesSection, formatShortcut, loadActiveOverrides, loadFile, parseContinue, parseSummary, } from "../files.js";
+import { readForensicsMarker } from "../forensics.js";
+import { formatModelIdentity } from "../model-identity.js";
+import { relSliceFile, relSlicePath, relTaskFile, resolveSfRootFile, resolveSliceFile, resolveSlicePath, resolveTaskFile, resolveTaskFiles, resolveTasksDir, } from "../paths.js";
+import { loadEffectiveSFPreferences, renderPreferencesForSystemPrompt, resolveAllSkillReferences, } from "../preferences.js";
+import { resolveModelWithFallbacksForUnit } from "../preferences-models.js";
+import { resolveSkillReference } from "../preferences-skills.js";
+import { getTemplatesDir, loadPrompt } from "../prompt-loader.js";
+import { buildRepositoryVcsContextBlock } from "../repository-vcs-context.js";
+import { detectNewSkills, formatSkillsXml, hasSkillSnapshot, } from "../skill-discovery.js";
+import { deriveState } from "../state.js";
+import { logWarning } from "../workflow-logger.js";
+import { getActiveWorktreeName, getWorktreeOriginalCwd, } from "../worktree-command.js";
+const sfHome = process.env.SF_HOME || join(homedir(), ".sf");
+const _fileReadCache = new Map();
+/**
+ * Read a file with mtime-based caching. Returns the cached content if the
+ * file's mtime has not changed since the last read, otherwise re-reads.
+ * Returns null if the file does not exist or cannot be read.
+ */
+function cachedReadFile(filePath) {
+    try {
+        const st = statSync(filePath);
+        const mtime = st.mtimeMs;
+        const cached = _fileReadCache.get(filePath);
+        if (cached && cached.mtime === mtime)
+            return cached.content;
+        const content = readFileSync(filePath, "utf-8");
+        _fileReadCache.set(filePath, { mtime, content });
+        return content;
+    }
+    catch {
+        return null;
+    }
+}
+/**
+ * Bundled skill triggers — resolved dynamically at runtime instead of
+ * hardcoding absolute paths in the system prompt template. Only skills
+ * that actually exist on disk are included in the table. (#3575)
+ */
+const BUNDLED_SKILL_TRIGGERS = [
+    {
+        trigger: "Frontend UI - web components, pages, landing pages, dashboards, React/HTML/CSS, styling",
+        skill: "frontend-design",
+    },
+    {
+        trigger: "macOS or iOS apps - SwiftUI, Xcode, App Store",
+        skill: "swiftui",
+    },
+    {
+        trigger: "Debugging - complex bugs, failing tests, root-cause investigation after standard approaches fail",
+        skill: "debug-like-expert",
+    },
+    {
+        trigger: "Repository VCS operations - commit, push, safe-push, Git vs JJ, repo-local version-control rules",
+        skill: "repo-vcs",
+    },
+];
+function buildBundledSkillsTable() {
+    const cwd = process.cwd();
+    const rows = [];
+    for (const { trigger, skill } of BUNDLED_SKILL_TRIGGERS) {
+        const resolution = resolveSkillReference(skill, cwd);
+        if (resolution.method === "unresolved")
+            continue; // skill not installed — omit from prompt
+        rows.push(`| ${trigger} | \`${resolution.resolvedPath}\` |`);
+    }
+    if (rows.length === 0) {
+        return "*No bundled skills found. Install skills to `~/.agents/skills/` or `~/.claude/skills/`.*";
+    }
+    return `| Trigger | Skill to load |\n|---|---|\n${rows.join("\n")}`;
+}
+function warnDeprecatedAgentInstructions() {
+    const paths = [
+        join(sfHome, "agent-instructions.md"),
+        join(process.cwd(), ".sf", "agent-instructions.md"),
+    ];
+    for (const path of paths) {
+        if (existsSync(path)) {
+            console.warn(`[SF] DEPRECATED: ${path} is no longer loaded. ` +
+                `Migrate your instructions to AGENTS.md (or CLAUDE.md) in the same directory. ` +
+                `See https://github.com/singularity-forge/sf-run/issues/1492`);
+        }
+    }
+}
+/**
+ * Build the cross-tier escalation-policy block injected near the top of every
+ * SF system-context section. Autonomous mode (canAskUser=false) receives
+ * stronger language that forbids `ask_user_questions` entirely and instructs
+ * the agent to exit with a structured blocker message instead.
+ *
+ * @param canAskUser - true in auto/step mode; false in autonomous mode.
+ */
+export function buildEscalationPolicyBlock(canAskUser) {
+    const tier3 = canAskUser
+        ? "Ask the user via `ask_user_questions` — but ONLY when (1) and (2) cannot resolve."
+        : "DO NOT call `ask_user_questions`. If (1) and (2) cannot resolve, exit with a structured blocker message naming the unresolved ambiguity.";
+    return `\n\n[INTERACTION POLICY — escalation tiers]
+Before producing any answer to a question or filling in any unknown, escalate
+through these tiers IN ORDER. Skip a tier only when it has been demonstrably
+exhausted, not just because the next tier is faster.
+
+  Tier 1 — Code lookup:
+    - sift / codebase_search for symbols, patterns, prior usages
+    - Read source files (Read tool, file paths from PLAN/CODEBASE)
+    - Inspect .sf/DECISIONS.md, .sf/KNOWLEDGE.md, docs/design-docs/, docs/records/
+    - Check tests for documented behavior
+
+  Tier 2 — External lookup (factual questions):
+    - WebSearch for recent API behavior, version compatibility, RFCs
+    - WebFetch to read a specific page if you have the URL
+    - Context7 (mcp__context7__*) for up-to-date library/framework docs
+    - MCP server tools for the specific service (when configured)
+
+  Tier 3 — User question:
+    ${tier3}
+
+Reserve Tier 3 for genuinely user-only knowledge: preferences, project intent,
+design choices, business priorities. Factual questions (versions, API behavior,
+library defaults, what HTTP 418 means) MUST be answered via Tier 1 or 2.
+`;
+}
+export async function buildBeforeAgentStartResult(event, ctx) {
+    if (!existsSync(join(process.cwd(), ".sf")))
+        return undefined;
+    const stopContextTimer = debugTime("context-inject");
+    const systemContent = loadPrompt("system", {
+        bundledSkillsTable: buildBundledSkillsTable(),
+        templatesDir: getTemplatesDir(),
+        shortcutDashboard: formatShortcut("Ctrl+Alt+G"),
+        shortcutShell: formatShortcut("Ctrl+Alt+B"),
+    });
+    let loadedPreferences = loadEffectiveSFPreferences();
+    if (shouldPromptToEnableCmux(loadedPreferences?.preferences)) {
+        markCmuxPromptShown();
+        if (autoEnableCmuxPreferences()) {
+            loadedPreferences = loadEffectiveSFPreferences();
+            ctx.ui.notify("cmux detected — auto-enabled. Run /sf cmux off to disable.", "info");
+        }
+    }
+    let preferenceBlock = "";
+    if (loadedPreferences) {
+        const cwd = process.cwd();
+        const report = resolveAllSkillReferences(loadedPreferences.preferences, cwd);
+        preferenceBlock = `\n\n${renderPreferencesForSystemPrompt(loadedPreferences.preferences, report.resolutions)}`;
+        if (report.warnings.length > 0) {
+            ctx.ui.notify(`SF skill preferences: ${report.warnings.length} unresolved skill${report.warnings.length === 1 ? "" : "s"}: ${report.warnings.join(", ")}`, "warning");
+        }
+    }
+    const { block: knowledgeBlock, globalSizeKb } = loadKnowledgeBlock(sfHome, process.cwd());
+    const architectureBlock = loadArchitectureBlock(process.cwd());
+    const tacitKnowledgeBlock = loadTacitKnowledgeBlock(process.cwd());
+    if (globalSizeKb > 4) {
+        ctx.ui.notify(`SF: ~/.sf/agent/KNOWLEDGE.md is ${globalSizeKb.toFixed(1)}KB — consider trimming to keep system prompt lean.`, "warning");
+    }
+    let memoryBlock = "";
+    try {
+        const { formatMemoriesForPrompt, getActiveMemoriesRanked } = await import("../memory-store.js");
+        const memories = getActiveMemoriesRanked(30);
+        if (memories.length > 0) {
+            const formatted = formatMemoriesForPrompt(memories, 2000);
+            if (formatted) {
+                memoryBlock = `\n\n${formatted}`;
+            }
+        }
+    }
+    catch (e) {
+        logWarning("bootstrap", `memory block fetch failed: ${e.message}`);
+    }
+    let newSkillsBlock = "";
+    if (hasSkillSnapshot()) {
+        const newSkills = detectNewSkills();
+        if (newSkills.length > 0) {
+            newSkillsBlock = formatSkillsXml(newSkills);
+        }
+    }
+    let codebaseBlock = "";
+    let codeIntelligenceBlock = "";
+    try {
+        const codebaseOptions = loadedPreferences?.preferences?.codebase
+            ? {
+                excludePatterns: loadedPreferences.preferences.codebase.exclude_patterns,
+                maxFiles: loadedPreferences.preferences.codebase.max_files,
+                collapseThreshold: loadedPreferences.preferences.codebase.collapse_threshold,
+            }
+            : undefined;
+        ensureCodebaseMapFresh(process.cwd(), codebaseOptions);
+    }
+    catch (e) {
+        logWarning("bootstrap", `CODEBASE refresh failed: ${e.message}`);
+    }
+    try {
+        const codebasePrefs = loadedPreferences?.preferences?.codebase;
+        codeIntelligenceBlock = buildCodeIntelligenceContextBlock(process.cwd(), codebasePrefs);
+    }
+    catch (e) {
+        logWarning("bootstrap", `code intelligence block failed: ${e.message}`);
+    }
+    const codebasePath = resolveSfRootFile(process.cwd(), "CODEBASE");
+    const rawCodebase = readCodebaseMap(process.cwd());
+    if (existsSync(codebasePath) && rawCodebase) {
+        try {
+            const rawContent = rawCodebase.trim();
+            if (rawContent) {
+                // Cap injection size to ~2 000 tokens to avoid bloating every request.
+                // Full map is always available at .sf/CODEBASE.md.
+                const MAX_CODEBASE_CHARS = 8_000;
+                const generatedMatch = rawContent.match(/Generated: (\S+)/);
+                const generatedAt = generatedMatch?.[1] ?? "unknown";
+                const content = rawContent.length > MAX_CODEBASE_CHARS
+                    ? rawContent.slice(0, MAX_CODEBASE_CHARS) +
+                        "\n\n*(truncated — see .sf/CODEBASE.md for full map)*"
+                    : rawContent;
+                codebaseBlock = `\n\n[PROJECT CODEBASE — File structure and descriptions (generated ${generatedAt}, auto-refreshed when SF detects tracked file changes; use /sf codebase stats for status)]\n\n${content}`;
+            }
+        }
+        catch (e) {
+            logWarning("bootstrap", `CODEBASE file read failed: ${e.message}`);
+        }
+    }
+    warnDeprecatedAgentInstructions();
+    const injection = await buildGuidedExecuteContextInjection(event.prompt, process.cwd());
+    // Re-inject forensics context on follow-up turns (#2941)
+    const forensicsInjection = !injection
+        ? buildForensicsContextInjection(process.cwd(), event.prompt)
+        : null;
+    const worktreeBlock = buildWorktreeContextBlock();
+    const repositoryVcsBlock = buildRepositoryVcsContextBlock(process.cwd());
+    const modelIdentityBlock = ctx.model
+        ? `\n\n## Active Model Identity\n\nCurrent executor model: ${formatModelIdentity(ctx.model)}. Treat the model name as the capability identity and the provider/model route as the wire ID. Do not substitute one Kimi version for another.`
+        : "";
+    const subagentModelConfig = resolveModelWithFallbacksForUnit("subagent");
+    const subagentModelBlock = subagentModelConfig
+        ? `\n\n## Subagent Model\n\nWhen spawning subagents via the \`subagent\` tool, always pass \`model: "${subagentModelConfig.primary}"\` in the tool call parameters. Never omit this — always specify it explicitly.`
+        : "";
+    // Inject cross-tier escalation policy for all SF-managed sessions.
+    // The policy is always-on; autonomous mode (canAskUser=false) gets
+    // stronger language that forbids ask_user_questions entirely.
+    const escalationPolicyBlock = buildEscalationPolicyBlock(isCanAskUser());
+    // Judgment-log instruction for autonomous mode: agent is prompted to call
+    // sf_log_judgment when making non-trivial calls between alternatives.
+    const judgmentLogBlock = !isCanAskUser()
+        ? `\n\n[JUDGMENT LOG — autonomous mode]\nWhen you make a judgment call between alternatives at an ambiguous point, call sf_log_judgment with: decision, alternatives, reasoning, confidence. This lets the user review your reasoning at milestone close. It does NOT delay or block the work.`
+        : "";
+    const selfFeedbackBlock = loadSelfFeedbackBlock(process.cwd());
+    const fullSystem = `${event.systemPrompt}\n\n[SYSTEM CONTEXT — SF]\n\n${escalationPolicyBlock}${systemContent}${preferenceBlock}${knowledgeBlock}${architectureBlock}${tacitKnowledgeBlock}${codebaseBlock}${codeIntelligenceBlock}${memoryBlock}${newSkillsBlock}${selfFeedbackBlock}${worktreeBlock}${repositoryVcsBlock}${modelIdentityBlock}${subagentModelBlock}${judgmentLogBlock}`;
+    stopContextTimer({
+        systemPromptSize: fullSystem.length,
+        injectionSize: injection?.length ?? forensicsInjection?.length ?? 0,
+        hasPreferences: preferenceBlock.length > 0,
+        hasNewSkills: newSkillsBlock.length > 0,
+    });
+    // Determine which context message to inject (guided execute takes priority)
+    const contextMessage = injection
+        ? {
+            customType: "sf-guided-context",
+            content: injection,
+            display: false,
+        }
+        : forensicsInjection
+            ? {
+                customType: "sf-forensics",
+                content: forensicsInjection,
+                display: false,
+            }
+            : null;
+    return {
+        systemPrompt: fullSystem,
+        ...(contextMessage ? { message: contextMessage } : {}),
+    };
+}
+export function loadKnowledgeBlock(sfHomeDir, cwd) {
+    // 1. Global knowledge (~/.sf/agent/KNOWLEDGE.md) — cross-project, user-maintained
+    let globalKnowledge = "";
+    let globalSizeKb = 0;
+    const globalKnowledgePath = join(sfHomeDir, "agent", "KNOWLEDGE.md");
+    if (existsSync(globalKnowledgePath)) {
+        const content = cachedReadFile(globalKnowledgePath)?.trim() ?? "";
+        if (content) {
+            globalSizeKb = Buffer.byteLength(content, "utf-8") / 1024;
+            globalKnowledge = content;
+        }
+    }
+    // 2. Project knowledge (.sf/KNOWLEDGE.md) — project-specific
+    let projectKnowledge = "";
+    const knowledgePath = resolveSfRootFile(cwd, "KNOWLEDGE");
+    if (existsSync(knowledgePath)) {
+        const content = cachedReadFile(knowledgePath)?.trim() ?? "";
+        if (content)
+            projectKnowledge = content;
+    }
+    if (!globalKnowledge && !projectKnowledge) {
+        return { block: "", globalSizeKb: 0 };
+    }
+    const parts = [];
+    if (globalKnowledge)
+        parts.push(`## Global Knowledge\n\n${globalKnowledge}`);
+    if (projectKnowledge)
+        parts.push(`## Project Knowledge\n\n${projectKnowledge}`);
+    return {
+        block: `\n\n[KNOWLEDGE — Rules, patterns, and lessons learned]\n\n${parts.join("\n\n")}`,
+        globalSizeKb,
+    };
+}
+const TACIT_SECTION_MAX_BYTES = 4096;
+// Cap self-feedback entries to prevent context bloat. High/critical entries
+// are always included; medium/low are truncated if needed. Evidence details
+// are stored in jsonl only — the prompt gets compact summaries with IDs.
+// (#sf-moobj36p-ko6snt)
+const SELF_FEEDBACK_MAX_ENTRIES = 20;
+const SELF_FEEDBACK_MAX_CHARS = 4000;
+function loadSelfFeedbackBlock(cwd) {
+    const selfFeedbackPath = join(cwd, ".sf", "SELF-FEEDBACK.md");
+    const legacyBacklogPath = join(cwd, ".sf", "BACKLOG.md");
+    const sourcePath = existsSync(selfFeedbackPath)
+        ? selfFeedbackPath
+        : legacyBacklogPath;
+    if (!existsSync(sourcePath))
+        return "";
+    const raw = cachedReadFile(sourcePath)?.trim() ?? "";
+    if (!raw)
+        return "";
+    // Parse the table rows — skip header lines
+    const lines = raw.split("\n");
+    const entries = [];
+    for (const line of lines) {
+        if (!line.startsWith("| "))
+            continue;
+        if (line.includes("Timestamp"))
+            continue; // header
+        if (line.includes("|---|---|"))
+            continue; // separator
+        const cells = line
+            .split("|")
+            .map((c) => c.trim())
+            .filter(Boolean);
+        if (cells.length >= 7) {
+            entries.push({
+                timestamp: cells[0],
+                kind: cells[1],
+                severity: cells[2],
+                summary: cells[6],
+            });
+        }
+    }
+    if (entries.length === 0)
+        return "";
+    // Sort by severity (high/critical first) then by timestamp (newest first)
+    const severityOrder = {
+        critical: 0,
+        high: 1,
+        medium: 2,
+        low: 3,
+    };
+    entries.sort((a, b) => {
+        const sa = severityOrder[a.severity] ?? 99;
+        const sb = severityOrder[b.severity] ?? 99;
+        if (sa !== sb)
+            return sa - sb;
+        return b.timestamp.localeCompare(a.timestamp);
+    });
+    // Cap entries to prevent context bloat; the sort above means low-priority entries drop first.
+    let kept = entries.slice();
+    // First apply the entry-count cap, trimming from the tail
+    if (kept.length > SELF_FEEDBACK_MAX_ENTRIES) {
+        kept = kept.slice(0, SELF_FEEDBACK_MAX_ENTRIES);
+    }
+    // Render compact summaries — evidence is in jsonl, not injected here
+    const rows = kept
+        .map((e) => `- **${e.severity}** \`${e.kind}\` — ${e.summary}`)
+        .join("\n");
+    let block = `## Self-Feedback Entries (ordered by severity, ${kept.length}/${entries.length} shown)\n\n${rows}`;
+    // If still over char budget, drop from tail (lowest priority first)
+    if (block.length > SELF_FEEDBACK_MAX_CHARS) {
+        while (kept.length > 1 && block.length > SELF_FEEDBACK_MAX_CHARS) {
+            kept = kept.slice(0, -1);
+            block =
+                `## Self-Feedback Entries (ordered by severity, truncated)\n\n` +
+                    kept
+                        .map((e) => `- **${e.severity}** \`${e.kind}\` — ${e.summary}`)
+                        .join("\n");
+        }
+    }
+    // Add note about where to find full evidence
+    if (entries.length > kept.length) {
+        block += `\n\n*(${entries.length - kept.length} more entries hidden to prevent context bloat. Full evidence in .sf/self-feedback.jsonl by entry ID.)*`;
+    }
+    return `\n\n[SELF-FEEDBACK — Recent sf-internal anomalies]\n\n${block}`;
+}
+/**
+ * Load tacit knowledge files (.sf/PRINCIPLES.md, .sf/TASTE.md, .sf/ANTI-GOALS.md)
+ * into a single block injected after the architecture block.
+ *
+ * Each section is capped at 4 KB. Sections are skipped silently when the
+ * corresponding file is missing or empty. Scaffold markers (<!-- sf-scaffold: ... -->)
+ * and YAML frontmatter are stripped so the agent only sees authored content.
+ */
+export function loadTacitKnowledgeBlock(cwd) {
+    const sfDir = join(cwd, ".sf");
+    function readSection(filename) {
+        const filePath = join(sfDir, filename);
+        const raw = cachedReadFile(filePath)?.trim() ?? "";
+        if (!raw)
+            return "";
+        // Strip scaffold markers (HTML comments like <!-- sf-scaffold: ... -->)
+        const stripped = raw.replace(/<!--\s*sf-scaffold:[^>]*-->/g, "").trim();
+        if (!stripped)
+            return "";
+        const bytes = Buffer.byteLength(stripped, "utf-8");
+        if (bytes > TACIT_SECTION_MAX_BYTES) {
+            const truncated = stripped.slice(0, TACIT_SECTION_MAX_BYTES);
+            return (truncated +
+                "\n\n*(truncated — see .sf/" +
+                filename +
+                " for full content)*");
+        }
+        return stripped;
+    }
+    const principles = readSection("PRINCIPLES.md");
+    const taste = readSection("TASTE.md");
+    const antiGoals = readSection("ANTI-GOALS.md");
+    if (!principles && !taste && !antiGoals)
+        return "";
+    const parts = ["[TACIT KNOWLEDGE — read carefully]"];
+    if (principles)
+        parts.push(`\n## Principles\n\n${principles}`);
+    if (taste)
+        parts.push(`\n## Taste\n\n${taste}`);
+    if (antiGoals)
+        parts.push(`\n## Anti-goals\n\n${antiGoals}`);
+    return `\n\n${parts.join("\n")}`;
+}
+/**
+ * Load ARCHITECTURE.md from the project root into context. Capped at 8 000 chars
+ * to avoid bloating every request — full file is always readable on disk.
+ */
+function loadArchitectureBlock(cwd) {
+    const architecturePath = join(cwd, "ARCHITECTURE.md");
+    if (!existsSync(architecturePath))
+        return "";
+    const raw = cachedReadFile(architecturePath)?.trim() ?? "";
+    if (!raw)
+        return "";
+    const MAX_CHARS = 8_000;
+    const content = raw.length > MAX_CHARS
+        ? raw.slice(0, MAX_CHARS) +
+            "\n\n*(truncated — see ARCHITECTURE.md for full map)*"
+        : raw;
+    return `\n\n[ARCHITECTURE — System map and invariants]\n\n${content}`;
+}
+function buildWorktreeContextBlock() {
+    const worktreeName = getActiveWorktreeName();
+    const worktreeMainCwd = getWorktreeOriginalCwd();
+    const autoWorktree = getActiveAutoWorktreeContext();
+    if (worktreeName && worktreeMainCwd) {
+        return [
+            "",
+            "",
+            "[WORKTREE CONTEXT — OVERRIDES CURRENT WORKING DIRECTORY ABOVE]",
+            `IMPORTANT: Ignore the "Current working directory" shown earlier in this prompt.`,
+            `The actual current working directory is: ${toPosixPath(process.cwd())}`,
+            "",
+            `You are working inside a SF worktree.`,
+            `- Worktree name: ${worktreeName}`,
+            `- Worktree path (this is the real cwd): ${toPosixPath(process.cwd())}`,
+            `- Main project: ${toPosixPath(worktreeMainCwd)}`,
+            `- Branch: worktree/${worktreeName}`,
+            "",
+            "All file operations, bash commands, and SF state resolve against the worktree path above.",
+            "Use /worktree merge to merge changes back. Use /worktree return to switch back to the main tree.",
+        ].join("\n");
+    }
+    if (autoWorktree) {
+        return [
+            "",
+            "",
+            "[WORKTREE CONTEXT — OVERRIDES CURRENT WORKING DIRECTORY ABOVE]",
+            `IMPORTANT: Ignore the "Current working directory" shown earlier in this prompt.`,
+            `The actual current working directory is: ${toPosixPath(process.cwd())}`,
+            "",
+            "You are working inside a SF auto-worktree.",
+            `- Milestone worktree: ${autoWorktree.worktreeName}`,
+            `- Worktree path (this is the real cwd): ${toPosixPath(process.cwd())}`,
+            `- Main project: ${toPosixPath(autoWorktree.originalBase)}`,
+            `- Branch: ${autoWorktree.branch}`,
+            "",
+            "All file operations, bash commands, and SF state resolve against the worktree path above.",
+            "Write every .sf artifact in the worktree path above, never in the main project tree.",
+        ].join("\n");
+    }
+    return "";
+}
+/**
+ * Low-entropy resume intent patterns — short phrases a user types to
+ * continue work after a pause, rate limit, or context reset (#3615).
+ * Tested against the trimmed, lowercased prompt with trailing punctuation stripped.
+ */
+const RESUME_INTENT_PATTERNS = /^(continue|resume|ok|go|go ahead|proceed|keep going|carry on|next|yes|yeah|yep|sure|do it|let's go|pick up where you left off)$/;
+async function buildGuidedExecuteContextInjection(prompt, basePath) {
+    const ensureStateDbOpen = async () => {
+        const { ensureDbOpen } = await import("./dynamic-tools.js");
+        await ensureDbOpen();
+    };
+    const executeMatch = prompt.match(/Execute the next task:\s+(T\d+)\s+\("([^"]+)"\)\s+in slice\s+(S\d+)\s+of milestone\s+(M\d+(?:-[a-z0-9]{6})?)/i);
+    if (executeMatch) {
+        const [, taskId, taskTitle, sliceId, milestoneId] = executeMatch;
+        return buildTaskExecutionContextInjection(basePath, milestoneId, sliceId, taskId, taskTitle);
+    }
+    const resumeMatch = prompt.match(/Resume interrupted work\.[\s\S]*?slice\s+(S\d+)\s+of milestone\s+(M\d+(?:-[a-z0-9]{6})?)/i);
+    if (resumeMatch) {
+        const [, sliceId, milestoneId] = resumeMatch;
+        await ensureStateDbOpen();
+        const state = await deriveState(basePath);
+        if (state.activeMilestone?.id === milestoneId &&
+            state.activeSlice?.id === sliceId &&
+            state.activeTask) {
+            return buildTaskExecutionContextInjection(basePath, milestoneId, sliceId, state.activeTask.id, state.activeTask.title);
+        }
+    }
+    // Fallback: low-entropy resume prompt (e.g., "continue", "ok", "go ahead")
+    // during an active executing task — inject task context so the agent
+    // doesn't rebuild from scratch (#3615).
+    // Intent-gated: only fire for short, resume-like prompts to avoid hijacking
+    // control/help/diagnostic prompts with unrelated execution context.
+    // Phase-gated: only fire during "executing" to avoid misrouting during
+    // replanning, gate evaluation, or other non-execution phases.
+    const trimmed = prompt
+        .trim()
+        .toLowerCase()
+        .replace(/[.!?,]+$/g, "");
+    if (RESUME_INTENT_PATTERNS.test(trimmed)) {
+        await ensureStateDbOpen();
+        const state = await deriveState(basePath);
+        if (state.phase === "executing" &&
+            state.activeTask &&
+            state.activeMilestone &&
+            state.activeSlice) {
+            return buildTaskExecutionContextInjection(basePath, state.activeMilestone.id, state.activeSlice.id, state.activeTask.id, state.activeTask.title);
+        }
+    }
+    return null;
+}
+async function buildTaskExecutionContextInjection(basePath, milestoneId, sliceId, taskId, _taskTitle) {
+    const taskPlanPath = resolveTaskFile(basePath, milestoneId, sliceId, taskId, "PLAN");
+    const taskPlanRelPath = relTaskFile(basePath, milestoneId, sliceId, taskId, "PLAN");
+    const taskPlanContent = taskPlanPath ? await loadFile(taskPlanPath) : null;
+    // Pre-flight file validation: extract file paths from task plan and verify existence
+    let fileValidationWarning = "";
+    if (taskPlanContent) {
+        const missingFiles = validateTaskPlanFiles(taskPlanContent, basePath);
+        if (missingFiles.length > 0) {
+            fileValidationWarning = [
+                "",
+                "## ⚠️ FILE VALIDATION WARNING",
+                "",
+                `The following files referenced in the task plan do NOT exist on disk:`,
+                ...missingFiles.map((f) => `- \`${f}\``),
+                "",
+                "**Do not attempt to edit these files.** They may have been deleted, renamed, or moved. Verify the correct path before proceeding, or escalate as a blocker if the task plan is invalid.",
+            ].join("\n");
+            logWarning("bootstrap", `Task plan ${taskPlanRelPath} references ${missingFiles.length} missing file(s): ${missingFiles.join(", ")}`);
+        }
+    }
+    const taskPlanInline = taskPlanContent
+        ? [
+            "## Inlined Task Plan (authoritative local execution contract)",
+            `Source: \`${taskPlanRelPath}\``,
+            "",
+            taskPlanContent.trim(),
+        ].join("\n")
+        : [
+            "## Inlined Task Plan (authoritative local execution contract)",
+            `Task plan not found at dispatch time. Read \`${taskPlanRelPath}\` before executing.`,
+        ].join("\n");
+    const slicePlanPath = resolveSliceFile(basePath, milestoneId, sliceId, "PLAN");
+    const slicePlanRelPath = relSliceFile(basePath, milestoneId, sliceId, "PLAN");
+    const slicePlanContent = slicePlanPath ? await loadFile(slicePlanPath) : null;
+    const slicePlanExcerpt = extractSliceExecutionExcerpt(slicePlanContent, slicePlanRelPath);
+    const priorTaskLines = await buildCarryForwardLines(basePath, milestoneId, sliceId, taskId);
+    const resumeSection = await buildResumeSection(basePath, milestoneId, sliceId);
+    const activeOverrides = await loadActiveOverrides(basePath);
+    const overridesSection = formatOverridesSection(activeOverrides);
+    return [
+        "[SF Guided Execute Context]",
+        "Use this injected context as startup context for guided task execution. Treat the inlined task plan as the authoritative local execution contract. Use source artifacts to verify details and run checks.",
+        overridesSection,
+        "",
+        "",
+        resumeSection,
+        "",
+        "## Carry-Forward Context",
+        ...priorTaskLines,
+        "",
+        taskPlanInline,
+        fileValidationWarning,
+        "",
+        slicePlanExcerpt,
+        "",
+        "## Backing Source Artifacts",
+        `- Slice plan: \`${slicePlanRelPath}\``,
+        `- Task plan source: \`${taskPlanRelPath}\``,
+    ].join("\n");
+}
+async function buildCarryForwardLines(basePath, milestoneId, sliceId, taskId) {
+    const tasksDir = resolveTasksDir(basePath, milestoneId, sliceId);
+    if (!tasksDir)
+        return ["- No prior task summaries in this slice."];
+    const currentNum = parseInt(taskId.replace(/^T/, ""), 10);
+    const sliceRel = relSlicePath(basePath, milestoneId, sliceId);
+    const summaryFiles = resolveTaskFiles(tasksDir, "SUMMARY")
+        .filter((file) => parseInt(file.replace(/^T/, ""), 10) < currentNum)
+        .sort();
+    if (summaryFiles.length === 0)
+        return ["- No prior task summaries in this slice."];
+    const results = await Promise.allSettled(summaryFiles.map(async (file) => {
+        const absPath = join(tasksDir, file);
+        const content = await loadFile(absPath);
+        const relPath = `${sliceRel}/tasks/${file}`;
+        if (!content)
+            return `- \`${relPath}\``;
+        const summary = parseSummary(content);
+        const provided = summary.frontmatter.provides.slice(0, 2).join("; ");
+        const decisions = summary.frontmatter.key_decisions
+            .slice(0, 2)
+            .join("; ");
+        const patterns = summary.frontmatter.patterns_established
+            .slice(0, 2)
+            .join("; ");
+        const diagnostics = extractMarkdownSection(content, "Diagnostics");
+        const parts = [summary.title || relPath];
+        if (summary.oneLiner)
+            parts.push(summary.oneLiner);
+        if (provided)
+            parts.push(`provides: ${provided}`);
+        if (decisions)
+            parts.push(`decisions: ${decisions}`);
+        if (patterns)
+            parts.push(`patterns: ${patterns}`);
+        if (diagnostics)
+            parts.push(`diagnostics: ${oneLine(diagnostics)}`);
+        return `- \`${relPath}\` — ${parts.join(" | ")}`;
+    }));
+    return results.map((r, idx) => {
+        if (r.status === "fulfilled")
+            return r.value;
+        const file = summaryFiles[idx];
+        // A rejection reason is not guaranteed to be an Error instance
+        logWarning("bootstrap", `Failed to load task summary ${sliceRel}/tasks/${file}: ${r.reason?.message ?? r.reason}`);
+        return `- \`${sliceRel}/tasks/${file}\` (load failed)`;
+    });
+}
+/**
+ * Build resume state section from CONTINUE.md or legacy continue.md.
+ * Returns progress, completed work, and next action if available.
+ */
+async function buildResumeSection(basePath, milestoneId, sliceId) {
+    const continueFile = resolveSliceFile(basePath, milestoneId, sliceId, "CONTINUE");
+    const legacyDir = resolveSlicePath(basePath, milestoneId, sliceId);
+    const legacyPath = legacyDir ? join(legacyDir, "continue.md") : null;
+    const continueContent = continueFile ? await loadFile(continueFile) : null;
+    const legacyContent = !continueContent && legacyPath ? await loadFile(legacyPath) : null;
+    const resolvedContent = continueContent ?? legacyContent;
+    const resolvedRelPath = continueContent
+        ? relSliceFile(basePath, milestoneId, sliceId, "CONTINUE")
+        : legacyPath
+            ? `${relSlicePath(basePath, milestoneId, sliceId)}/continue.md`
+            : null;
+    if (!resolvedContent || !resolvedRelPath) {
+        return [
+            "## Resume State",
+            "- No continue file present. Start from the top of the task plan.",
+        ].join("\n");
+    }
+    const cont = parseContinue(resolvedContent);
+    const lines = [
+        "## Resume State",
+        `Source: \`${resolvedRelPath}\``,
+        `- Status: ${cont.frontmatter.status || "in_progress"}`,
+    ];
+    if (cont.frontmatter.step && cont.frontmatter.totalSteps) {
+        lines.push(`- Progress: step ${cont.frontmatter.step} of ${cont.frontmatter.totalSteps}`);
+    }
+    if (cont.completedWork)
+        lines.push(`- Completed: ${oneLine(cont.completedWork)}`);
+    if (cont.remainingWork)
+        lines.push(`- Remaining: ${oneLine(cont.remainingWork)}`);
+    if (cont.decisions)
+        lines.push(`- Decisions: ${oneLine(cont.decisions)}`);
+    if (cont.nextAction)
+        lines.push(`- Next action: ${oneLine(cont.nextAction)}`);
+    return lines.join("\n");
+}
+/**
+ * Extract slice plan excerpt with goal, demo, verification, and observability.
+ * Returns formatted section for task execution context.
+ */
+function extractSliceExecutionExcerpt(content, relPath) {
+    if (!content) {
+        return [
+            "## Slice Plan Excerpt",
+            `Slice plan not found at dispatch time. Read \`${relPath}\` before running slice-level verification.`,
+        ].join("\n");
+    }
+    const lines = content.split("\n");
+    const goalLine = lines.find((line) => line.startsWith("**Goal:**"))?.trim();
+    const demoLine = lines.find((line) => line.startsWith("**Demo:**"))?.trim();
+    const verification = extractMarkdownSection(content, "Verification");
+    const observability = extractMarkdownSection(content, "Observability / Diagnostics");
+    const parts = ["## Slice Plan Excerpt", `Source: \`${relPath}\``];
+    if (goalLine)
+        parts.push(goalLine);
+    if (demoLine)
+        parts.push(demoLine);
+    if (verification)
+        parts.push("", "### Slice Verification", verification.trim());
+    if (observability)
+        parts.push("", "### Slice Observability / Diagnostics", observability.trim());
+    return parts.join("\n");
+}
+/**
+ * Extract a level-2 (`## `) markdown section by heading name from content.
+ * Returns the section body up to the next level-2 heading (deeper
+ * subsections are kept), or null if the heading is not found.
+ */
+function extractMarkdownSection(content, heading) {
+    const match = new RegExp(`^## ${escapeRegExp(heading)}\\s*$`, "m").exec(content);
+    if (!match)
+        return null;
+    const start = match.index + match[0].length;
+    const rest = content.slice(start);
+    const nextHeading = rest.match(/^##\s+/m);
+    const end = nextHeading?.index ?? rest.length;
+    return rest.slice(0, end).trim();
+}
+/**
+ * Escape special regex characters in a string.
+ */
+function escapeRegExp(value) {
+    return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+}
+/**
+ * Collapse each run of whitespace into a single space and trim the result.
+ */
+function oneLine(text) {
+    return text.replace(/\s+/g, " ").trim();
+}
+/**
+ * Extract file paths referenced in a task plan and validate their existence.
+ * Only backtick-wrapped spans that contain a directory separator and end in
+ * a file extension (e.g. `src/file.ts`) are checked; bare names such as
+ * `file.ts` are skipped.
+ *
+ * Returns an array of missing file paths (relative to basePath).
+ */
+function validateTaskPlanFiles(taskPlanContent, basePath) {
+    const missing = [];
+    const seen = new Set();
+    // Match backtick-wrapped spans that end in a file extension.
+    // Candidates: `src/...`, `./...`, `../...`, `dir/file.js` (bare `file.ts` is filtered out below)
+    const backtickPattern = /`([^`]+\.[a-zA-Z0-9]+)`/g;
+    let match;
+    while ((match = backtickPattern.exec(taskPlanContent)) !== null) {
+        const candidate = match[1].trim();
+        // Skip bare names: only spans with a directory separator are treated as paths
+        if (!candidate.includes("/") && !candidate.includes("\\"))
+            continue;
+        // Skip URLs, anchors, version-like prefixes, and spans containing
+        // whitespace (e.g. shell commands such as `node scripts/build.js`)
+        if (/^https?:\/\//.test(candidate) ||
+            candidate.startsWith("#") ||
+            /^\d+\.\d+\.\d+/.test(candidate) || // version numbers
+            /\s/.test(candidate) // commands, not paths
+        )
+            continue;
+        // Normalize and check existence
+        const normalized = candidate.replace(/\\/g, "/");
+        if (seen.has(normalized))
+            continue;
+        seen.add(normalized);
+        const absolutePath = join(basePath, normalized);
+        if (!existsSync(absolutePath)) {
+            missing.push(normalized);
+        }
+    }
+    return missing;
+}
+// ─── Forensics Context Re-injection (#2941) ──────────────────────────────────
+/**
+ * Check for an active forensics session and return the prompt content
+ * so it can be re-injected on follow-up turns.
+ */
+export function buildForensicsContextInjection(basePath, prompt) {
+    const marker = readForensicsMarker(basePath);
+    if (!marker)
+        return null;
+    // Expire markers older than 2 hours to avoid stale context
+    const age = Date.now() - new Date(marker.createdAt).getTime();
+    if (age > 2 * 60 * 60 * 1000) {
+        const hours = (age / (60 * 60 * 1000)).toFixed(1);
+        logWarning("bootstrap", `Forensics marker expired (${hours}h old), clearing stale context.`);
+        clearForensicsMarker(basePath);
+        return null;
+    }
+    const trimmed = prompt
+        .trim()
+        .toLowerCase()
+        .replace(/[.!?,]+$/g, "");
+    if (trimmed && !RESUME_INTENT_PATTERNS.test(trimmed)) {
+        clearForensicsMarker(basePath);
+        return null;
+    }
+    return marker.promptContent;
+}
+/**
+ * Remove the active forensics marker file, e.g. when the investigation
+ * is complete or the session expires.
+ */
+export function clearForensicsMarker(basePath) {
+    const markerPath = join(basePath, ".sf", "runtime", "active-forensics.json");
+    if (existsSync(markerPath)) {
+        try {
+            unlinkSync(markerPath);
+        }
+        catch (e) {
+            logWarning("bootstrap", `unlinkSync forensics marker failed: ${e.message}`);
+        }
+    }
+}
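The backtick scan in `validateTaskPlanFiles()` can be tried in isolation. The following is a minimal sketch, not the shipped function: `findMissingPlanPaths`, the temporary directory, and the sample plan text are all hypothetical, and it is written as a CommonJS script only so that it runs directly with `node`. It assumes the same extraction regex as the diff above and reproduces the M008/S03 situation, where a plan references a `.ts` file that no longer exists while the `.js` version survives.

```javascript
const { existsSync, mkdirSync, mkdtempSync, writeFileSync } = require("node:fs");
const { tmpdir } = require("node:os");
const { join } = require("node:path");

// Hypothetical helper mirroring the commit's backtick scan (illustration only).
function findMissingPlanPaths(planText, basePath) {
    const missing = [];
    const seen = new Set();
    const pattern = /`([^`]+\.[a-zA-Z0-9]+)`/g;
    for (const match of planText.matchAll(pattern)) {
        const candidate = match[1].trim().replace(/\\/g, "/");
        // Only spans with a directory separator are treated as paths.
        if (!candidate.includes("/")) continue;
        // Skip URLs and anchors.
        if (/^https?:\/\//.test(candidate) || candidate.startsWith("#")) continue;
        // Check each distinct path once.
        if (seen.has(candidate)) continue;
        seen.add(candidate);
        if (!existsSync(join(basePath, candidate))) missing.push(candidate);
    }
    return missing;
}

// Throwaway project that contains only the surviving .js file.
const root = mkdtempSync(join(tmpdir(), "plan-check-"));
mkdirSync(join(root, "src"));
writeFileSync(join(root, "src", "handler.js"), "// fixed version\n");

// Sample plan text (invented for this sketch).
const plan = [
    "## Files",
    "- `src/handler.ts` (stale: deleted in an earlier commit)",
    "- `src/handler.js`",
    "- `README.md` (no slash, so it is not checked)",
    "- `https://example.com/spec.html`",
].join("\n");

const missing = findMissingPlanPaths(plan, root);
console.log(missing); // → [ 'src/handler.ts' ]
```

Only the stale `.ts` path is reported. The bare `README.md` and the URL are skipped by the filters, which shows why bare file names in a plan are invisible to this kind of check.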
--- a/src/resources/extensions/sf/prompts/execute-task.md
+++ b/src/resources/extensions/sf/prompts/execute-task.md
@ -37,8 +37,9 @@ Then:
 0a. **Batch independent tool calls in parallel.** When the next step needs to read or grep multiple files/paths that don't depend on each other's results, issue them in a single tool-call message (multiple tool uses in one assistant turn) rather than one-at-a-time. Examples: reading the handler + the test file + the schema file to triangulate a bug; greping for two unrelated symbols. Sequential tool calls are only correct when each call's input genuinely depends on the previous call's output. Talking-then-doing is also dead weight — if the next action is unambiguous, just take it; describe what you found in the result, not what you plan to look at.
 0b. **Swarm opportunity check.** Before implementation, decide whether this task can be split into a 2-3 worker same-model swarm. Swarm only if the shards have disjoint file/directory ownership, no shared-interface or lockfile edits, shard-local verification, and clear wall-clock savings. If it passes, dispatch `subagent({ tasks: [...] })` with explicit write scopes, expected output files, and verification per worker; then inspect `git status --short`, synthesize results, resolve conflicts, and run final task verification yourself. If it does not pass, continue single-agent execution without ceremony.
 1. {{skillActivation}} Follow any activated skills before writing code. If no skills match this task, skip this step.
-2. Execute the steps in the inlined task plan, adapting minor local mismatches when the surrounding code differs from the planner's snapshot
-3. Before any `Write` that creates an artifact or output file, check whether that path already exists. If it does, read it first and decide whether the work is already done, should be extended, or truly needs replacement. "Create" in the plan does **not** mean the file is missing — a prior session may already have started it.
+2. **Verify file existence before editing.** The task plan references specific files. Before reading or editing any existing file mentioned in the plan, confirm it exists with `ls`, `find`, or `existsSync`. If a file the plan expects you to read or modify does NOT exist, stop immediately. Do not recreate it from the plan's description of what "should" be there, because the file may have been deleted, renamed, or moved. Escalate as `blocker_discovered: true` with a clear description of which file is missing and what the plan expected to find. Files the plan explicitly asks you to create are exempt from this check. This prevents phantom work on stale file paths.
+3. Execute the steps in the inlined task plan, adapting minor local mismatches when the surrounding code differs from the planner's snapshot
+4. Before any `Write` that creates an artifact or output file, check whether that path already exists. If it does, read it first and decide whether the work is already done, should be extended, or truly needs replacement. "Create" in the plan does **not** mean the file is missing — a prior session may already have started it.
 4. Build the real thing. If the task plan says "create login endpoint", build an endpoint that actually authenticates against a real store, not one that returns a hardcoded success response. If the task plan says "create dashboard page", build a page that renders real data from the API, not a component with hardcoded props. Stubs and mocks are for tests, not for the shipped feature.
 5. Keep verification artifacts disciplined:
   - If you need a one-off script, scratch file, generated fixture, or temporary helper to understand or verify the work, either delete it before completion or promote it into the durable artifact named by the task plan.
--- a/src/resources/extensions/sf/prompts/plan-slice.md
+++ b/src/resources/extensions/sf/prompts/plan-slice.md
@ -34,6 +34,8 @@ An honest "this is one task with these steps" is more valuable than a synthesize

 Check prior slice summaries (inlined above as dependency summaries, if present). If prior slices discovered constraints, changed approaches, or flagged fragility, adjust your plan accordingly. The roadmap description may be stale — verify it against the current codebase state.

+**Verify file existence before including paths in task plans.** If research references specific files, confirm they exist on disk before decomposing work around them. Use `ls`, `find`, or `existsSync` to validate. If a referenced file does not exist, do not include it in task plans — either find the correct path, scope the task differently, or escalate as a blocker. Phantom file paths in task plans cause downstream execution failures.
+
 ### Explore Slice Scope

 Read the code files relevant to this slice. Confirm the roadmap's description of what exists, what needs to change, and what boundaries apply. Use native `lsp` first for symbol lookup, references, and cross-file navigation. Use `rg`, `find`, and targeted reads for direct text inspection.
--- a/src/resources/extensions/sf/prompts/research-slice.md
+++ b/src/resources/extensions/sf/prompts/research-slice.md
@ -21,7 +21,7 @@ Pay particular attention to **Forward Intelligence** sections — they contain h
 You are the scout. After you finish, a **planner agent** reads your output in a fresh context with no memory of your exploration. It uses your findings to decompose this slice into executable tasks — deciding what files change, what order to build things, how to verify the work. Then **executor agents** build each task in isolated context windows.

 Write for the planner, not for a human. The planner needs:
- **What files exist and what they do** — so it can scope tasks to specific files
+- **What files exist and what they do** — so it can scope tasks to specific files. **CRITICAL: Verify file existence before listing paths.** Use `ls`, `find`, or `existsSync` to confirm each file you reference is actually on disk. If a file from a prior research artifact or your own exploration does not exist, do NOT list it — report it as missing. Stale file paths cause phantom work in downstream tasks.
 - **Where the natural seams are** — where work divides into independent units
 - **What to build or prove first** — what's riskiest, what unblocks everything else
 - **How to verify the result** — what commands, tests, or checks confirm the slice works
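The prompt-level rule for researchers ("do NOT list it, report it as missing") amounts to splitting candidate paths into two groups before writing findings. A small sketch of that split, assuming a hypothetical helper named `partitionByExistence` and a temporary directory standing in for the repository (CommonJS, so it runs directly with `node`):

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: split candidate paths into those on disk and those not.
function partitionByExistence(basePath, relPaths) {
    const present = [];
    const absent = [];
    for (const rel of relPaths) {
        const bucket = fs.existsSync(path.join(basePath, rel)) ? present : absent;
        bucket.push(rel);
    }
    return { present, absent };
}

// Demo against a temporary directory with one real file.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "research-check-"));
fs.writeFileSync(path.join(workDir, "index.js"), "");

const report = partitionByExistence(workDir, ["index.js", "index.ts"]);
console.log(report); // → { present: [ 'index.js' ], absent: [ 'index.ts' ] }
```

Paths in `present` can be listed as findings; paths in `absent` belong in a separate "missing" note so the planner does not scope tasks around them.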