singularity/singularity-forge

Author	SHA1	Message	Date
Jeremy McSpadden	17471ea280	feat(model-routing): enable dynamic routing by default (#3120 ) * feat(model-routing): enable dynamic routing by default Change defaultRoutingConfig().enabled from false to true so that dynamic model routing (tier-based downgrading for light/standard tasks) is active out of the box. Users can still disable it via dynamic_routing.enabled: false in PREFERENCES.md. This is a behavioral change: sessions that previously used the configured model for all tasks will now automatically downgrade to cheaper models for light and standard complexity tasks. * test(model-routing): verify dynamic routing enabled by default Tests that defaultRoutingConfig returns enabled: true and all routing features are active.	2026-03-31 11:47:38 -06:00
Tom Boucher	081c5dc52f	fix: surface nativeCommit errors in reconcileMergeState instead of silently swallowing (#3052 ) The catch block in reconcileMergeState silently swallowed all nativeCommit exceptions, including real failures (permissions, corrupt git state, hook rejections). This caused auto-mode to report success and return true (dirty, re-derive) even when the merge commit actually failed, leading to an infinite loop where auto-mode repeatedly attempted worktree finalization. Now the catch block logs the error via ctx.ui.notify at "error" level and returns false to signal that reconciliation failed, allowing upstream logic to react appropriately. The nativeCommit return value is also checked — a null return (nothing to commit) gets its own info notification distinct from a successful commit SHA. Closes #2542 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:50:28 -06:00
Tom Boucher	46d798a1bf	fix(parallel): scope commits to milestone boundaries in parallel mode (#3047 ) When GSD_MILESTONE_LOCK is set (parallel worker mode), smartStage() now excludes .gsd/milestones/<M>/ directories for all milestones other than the locked one. This prevents a parallel worker (e.g., M033) from staging and committing fabricated artifacts for a milestone it does not own (e.g., M032). Previously, smartStage() ran `git add -A` with only runtime path exclusions, allowing cross-milestone pollution when workers share the same .gsd/ directory (git.isolation: "none"). The GSD_MILESTONE_LOCK env var only filtered what deriveState() sees but did not prevent file staging. Closes #1991 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:50:21 -06:00
Tom Boucher	de9ba8aeb7	fix: add windowsHide to all web-mode subprocess spawns (#2628 ) (#3046 ) On Windows, child_process.spawn() and execFile() open a visible console window by default. The web server spawn, RPC bridge, browser opener, and all 15 web service subprocess calls were missing windowsHide: true, causing constant console window flashing when running gsd --web. Closes #2628 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:50:13 -06:00
Tom Boucher	466c7dea18	fix: skip auto-mode pause on empty-content aborted messages (#2695 ) (#3045 ) When the LLM sends an assistant message with empty content[] and stopReason "aborted", this is a non-fatal agent stop — not a crash. The abort handler now checks for empty content and missing errorMessage before deciding to pause. Empty-content aborts are routed to resolveAgentEnd instead, breaking the stuck re-dispatch loop. Closes #2695 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:50:05 -06:00
Tom Boucher	0b36977804	fix: detect and remove nested .git dirs in worktree cleanup to prevent data loss (#3044 ) Scaffolding tools (create-next-app, cargo init, etc.) create nested .git directories inside worktrees. Git records these as gitlinks (mode 160000) without .gitmodules, so worktree cleanup destroys the only copy of the nested object database — causing permanent silent data loss. Added findNestedGitDirs() helper that recursively scans worktree for nested .git directories (skipping node_modules and other non-project dirs). The removeWorktree() function now calls this before cleanup and removes any nested .git dirs so files are tracked as regular content. Closes #2616 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:49:54 -06:00
Tom Boucher	9384641b25	fix: prevent data loss when git isolation default changes (#2625 ) (#3043 ) When the default isolation mode flipped from "worktree" to "none" between versions, mergeAndExit() returned early for mode "none" without checking whether the session was physically inside an active worktree. This silently skipped the merge, orphaning committed work on the milestone branch. The fix moves the worktree-presence check (isInAutoWorktree + originalBasePath) before the mode-none early return. If we are inside a worktree, mergeAndExit proceeds with the worktree merge path regardless of the configured mode. Also fixes the misleading JSDoc on GitPreferences.isolation that claimed "worktree" was the default when the runtime default is actually "none". Closes #2625 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:49:03 -06:00
Tom Boucher	893c525578	fix(read-tool): clamp offset to file bounds instead of throwing (#3007 ) (#3042 ) When an agent requests read(file, offset: 30) on a 13-line file, the read tool threw "Offset 30 is beyond end of file" which propagated as invalid JSON downstream during milestone completion. Now clamps the offset to the last line and prepends a notice, allowing the agent to continue with valid content. Fixes both read.ts and hashline-read.ts variants. Closes #3007 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:48:01 -06:00
Tom Boucher	70e3d9d6c2	fix(gsd): preserve queued milestones with worktrees in ghost detection (#3041 ) isGhostMilestone() now checks for DB rows and worktree directories before falling back to content-file detection. A milestone with a DB row or a worktree is a legitimate milestone that hasn't been populated yet, not a ghost from a killed session. Fixes #2921 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:47:53 -06:00
Tom Boucher	3a1cedd7de	fix(compaction): add chunked fallback when messages exceed model context window (#3038 ) When a session grows beyond the context window of available models, generateSummary() now detects the overflow and falls back to chunked summarization: split messages into context-fitting chunks, summarize the first chunk, then iteratively merge subsequent chunks using the existing UPDATE_SUMMARIZATION_PROMPT path. Closes #2932 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:47:41 -06:00
Tom Boucher	dfb4fbecef	fix: preserve interactive terminal across tab switches and project changes (#3055 ) Two root causes destroyed terminal state during normal navigation: 1. The pagehide handler fired a shutdown beacon unconditionally, but on mobile/Safari tab switches pagehide fires with event.persisted=true (bfcache entry). This killed the server and all PTY sessions when the user merely switched browser tabs. Fix: check event.persisted and skip the beacon when the page is being cached, not unloaded. 2. ShellTerminal used project-agnostic session IDs ("default"), so switching projects and switching back either collided with the old session or spawned a new one, losing terminal state. Fix: scope session IDs by project path (e.g. "default:/path/to/project") so the server's getOrCreateSession returns the existing live PTY on reconnect. Closes #2701 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:46:09 -06:00
Tom Boucher	fb10141e9b	fix: call cleanupQuickBranch on turn_end to squash-merge quick branch back (#3054 ) cleanupQuickBranch() was exported from quick.ts but never called anywhere. After a /gsd quick task completed, the user was left on the quick branch with orphaned state in quick-return.json. Register a turn_end hook in register-hooks.ts that calls cleanupQuickBranch() after each agent turn. The function is already idempotent (no-op when no quick-return state is pending), so it is safe to call on every turn. Closes #2668 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:46:03 -06:00
Tom Boucher	dfb18c6e62	fix: align run-uat artifact path to ASSESSMENT, preventing false stuck retries (#3053 ) The run-uat prompt instructs the agent to save results via gsd_summary_save with artifact_type: "ASSESSMENT", which writes S##-ASSESSMENT.md. But resolveExpectedArtifactPath and diagnoseExpectedArtifact expected S##-UAT.md, causing artifact verification to fail and auto-mode to retry indefinitely. Align all three contract points (prompt uatResultPath, artifact resolution, and diagnostic message) to use ASSESSMENT as the canonical artifact type. Closes #2873 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:45:57 -06:00
Tom Boucher	fb0fb5582e	fix: replace invalid Discord invite links with canonical URL (#3056 ) Closes #2699 The Discord badge in README.md pointed to https://discord.gg/gsd (expired vanity URL) and the Pi ecosystem doc used an old invite code. Both now use the canonical invite https://discord.com/invite/nKXTsAcmbT that was established in commit 0a1dad9a. Adds a regression test that validates all Discord invite links in user-facing files match the canonical URL. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:45:32 -06:00
Tom Boucher	fad23944e7	fix: add Windows shell guard to remaining spawn sites (#3058 ) Three spawn call sites were missing `shell: process.platform === "win32"`, causing ENOENT/EINVAL errors on Windows where npm-installed tools are .cmd batch scripts that require shell resolution: - exec.ts: hardcoded `shell: false` -> platform-guarded - lsp/index.ts: missing shell option on project-type command spawn - lsp/lspmux.ts: missing shell option on lspmux binary spawn Adds a structural regression test that scans all spawn sites invoking user-facing binaries and asserts the Windows shell guard is present. Closes #2854 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:44:20 -06:00
Tom Boucher	05b7cb95cb	fix: route `gsd auto` to headless runner to prevent hang on piped stdin/stdout (#3057 ) `gsd auto` was not handled as a subcommand — it fell through to the interactive TUI, which hangs indefinitely when stdin/stdout are piped (non-TTY). Add `auto` as a recognized subcommand that rewrites argv and delegates to `runHeadless(parseHeadlessArgs(...))`, matching the existing `gsd headless auto` behavior. Also adds `gsd auto` to TTY error hints and help text. Closes #2732 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:44:04 -06:00
Tom Boucher	df9e06cfa5	fix: respect .gitignore for .gsd/ in rethink prompt (#3059 ) * fix: respect .gitignore for .gsd/ in rethink prompt (#2570) The rethink.md prompt template hardcoded `git add .gsd/` which caused the executing agent to force-add .gsd/ files (via `git add -f`) when .gsd was listed in .gitignore. This silently overrode the user's gitignore configuration, tracking planning artifacts they explicitly excluded. - Add `isGsdGitignored()` utility that uses `git check-ignore` to detect when .gsd is covered by .gitignore rules - Replace hardcoded `git add .gsd/` in rethink.md with the `{{commitInstruction}}` template variable (consistent with all other prompt templates) - Pass gitignore-aware commit instruction from rethink.ts: skip commit when .gsd is gitignored, include git add only when it is not Closes #2570 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: re-trigger checks --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:43:56 -06:00
Tom Boucher	e71de432ab	fix: migrate unit ownership from JSON to SQLite to eliminate read-modify-write race (#3061 ) The JSON-based unit-claims storage had a lost-update race under concurrent multi-agent use: two agents could both read the file as unclaimed, then both write their claim, with the second silently overwriting the first. Replace with a SQLite-backed store using INSERT OR IGNORE on a PRIMARY KEY constraint for atomic first-writer-wins claim semantics. claimUnit() now returns boolean (true = claimed, false = already claimed by another agent). Closes #2728 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:43:44 -06:00
Tom Boucher	e78dca41d4	fix(roadmap): handle numbered, bracketed, and indented prose H3 headers in slice parser (#3063 ) The prose slice header fallback parser failed to extract slices when LLMs generated common formatting variants: numbered prefixes (### 1. S01), parenthetical numbering (### (1) S01), bracketed IDs (### [S01]), or indented headings ( ### S01). This caused auto-mode to permanently block with "No slice eligible" when the plan-milestone prompt produced these formats inside a ## Slices section. Broadened the parseProseSliceHeaders regex to accept optional leading whitespace, numeric prefixes, parenthetical numbering, and square brackets around slice IDs. Closes #2567 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:43:33 -06:00
Tom Boucher	8b680179e2	fix: add worktree-merge to resolveModelWithFallbacksForUnit switch and update KNOWN_UNIT_TYPES (#3066 ) Closes #2900 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:43:22 -06:00
Tom Boucher	571b382075	fix: clean up MERGE_HEAD on all error paths in mergeMilestoneToMain (#2912 ) (#3068 ) libgit2's merge implementation creates MERGE_HEAD even for squash merges, unlike CLI git. When the merge fails with conflicts, the error paths in mergeMilestoneToMain cleaned SQUASH_MSG and MERGE_MSG but left MERGE_HEAD on disk. This blocked all subsequent merge attempts and caused doctor to report corrupt merge state. Add MERGE_HEAD cleanup (via nativeMergeAbort + explicit unlink) to: - The code-conflict error path (before MergeConflictError throw) - The dirty-working-tree error path (defensive) - The success path (alongside existing SQUASH_MSG cleanup) Closes #2912 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:43:02 -06:00
Tom Boucher	9dc6a6a97d	fix: prevent LLM from confusing background task output with user input (#3069 ) * fix: wrap custom messages with system notification prefix in LLM context Background job completion notifications (delivered as custom messages via sendMessage with deliverAs: "followUp") were converted to plain role: "user" messages in convertToLlm(), so the LLM could not distinguish them from actual human input. This caused the agent to confuse background task output with user messages, responding to job completions as if the user had typed them. Wrap all custom messages with a clear system notification prefix that includes the customType and an explicit instruction that the content is an automated system event, not user input. This follows the same pattern used by branchSummary and compactionSummary messages, which already use structured prefixes/suffixes. Closes #3026 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve TS import extension and type errors in messages test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:42:56 -06:00
Tom Boucher	43ece11be7	fix: add openai-codex provider and modern OpenAI models to MODEL_CAPABILITY_TIER and cost tables (#3070 ) Closes #2885 The MODEL_CAPABILITY_TIER map in model-router.ts and the BUNDLED_COST_TABLE in model-cost-table.ts were missing all openai-codex provider models (gpt-5.1, gpt-5.2, gpt-5.3-codex, gpt-5.4, etc.) and modern OpenAI models (o4-mini, gpt-4.1, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-pro). This caused dynamic routing to treat these models as unknown (falling back to the isKnownModel guard) and cost comparisons to assign them 999 (the "unknown, assume expensive" fallback). Added 17 new model entries to MODEL_CAPABILITY_TIER across all three tiers, matching the tier assignments from the issue. Added corresponding entries to both MODEL_COST_PER_1K_INPUT (model-router.ts) and BUNDLED_COST_TABLE (model-cost-table.ts). Updated the #2192 test fixture that used gpt-5.4 as an "unknown" model since it is now known. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:41:13 -06:00
Tom Boucher	cb26d71483	fix: preserve active tab when switching projects (#3071 ) Closes #2711 Two changes fix the tab-reset-to-dashboard bug: 1. Remove the forced `gsd:navigate-view` dispatch to "dashboard" in ProjectsPanel.handleSelectProject — this was unconditionally resetting the view on every project switch. 2. Add a useEffect in WorkspaceChrome that resets `viewRestored` when `projectPath` changes, so the per-project sessionStorage view restore fires for the newly-selected project. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:41:05 -06:00
Tom Boucher	501fb83606	fix: include project name in desktop notifications (#3072 ) Desktop notifications now display "GSD — projectName" instead of just "GSD", making it clear which project a notification belongs to when multiple projects are active. Closes #2708 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:40:58 -06:00
Tom Boucher	5f660bf3ce	fix: recover from many-image dimension overflow by stripping older images (#3075 ) When a session accumulates many images (screenshots, file reads), the Anthropic API enforces a 2000px dimension limit for "many-image requests" and returns a 400 error. Previously this error was not classified as retryable, causing the session to get permanently stuck in an error loop with no recovery path. This adds automatic recovery: detect the specific "image dimensions exceed max allowed size for many-image requests" error, strip older images from the conversation history (keeping the 5 most recent), and auto-retry. Also handles manual retry (continue/retry) by downsizing before retrying. Closes #2874 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:40:35 -06:00
Tom Boucher	a3fc400c09	fix: resolve bare model IDs to anthropic over claude-code provider (#3076 ) When the claude-code-cli extension is active and the session provider is claude-code, bare model IDs in PREFERENCES.md (e.g. "claude-sonnet-4-6") silently resolve to claude-code/* instead of anthropic/*, routing all dispatch through the Claude Code CLI subprocess with different context, tool visibility, and cost characteristics. The fix introduces provider precedence in resolveModelId(): extension providers like claude-code are deprioritized for bare ID resolution. First-class API providers (anthropic, bedrock, openai, azure, etc.) retain the existing current-provider preference behavior. When the session provider is an extension, resolution falls through to prefer anthropic, then any non-extension provider, preserving backward compatibility for all existing provider combinations. Closes #2905 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:40:26 -06:00
Tom Boucher	e7351fbd75	fix(auto): move selectAndApplyModel before updateProgressWidget (#3079 ) Closes #2907 selectAndApplyModel was called after updateProgressWidget and the prompt injection block, so the dashboard showed the previous unit's model label and a second call (if added by a future #2899 fix) would overwrite the first result. Move the single selectAndApplyModel call to before updateProgressWidget so the model is resolved before the widget renders and there is exactly one call per unit dispatch. Adds a structural regression test that asserts selectAndApplyModel appears exactly once in runUnitPhase and before updateProgressWidget. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:40:03 -06:00
Tom Boucher	cbb9c2edd9	fix: detect project relocation and recover state without data loss (#3080 ) * fix: detect project relocation and recover state without data loss For repos with a remote URL, compute identity as SHA256(remoteUrl) only, dropping the git root path from the hash. This makes the identity stable across directory moves/renames -- the most common cause of silent data loss. For local-only repos, write a .gsd-id marker file in the project root that records the identity hash. After a move, ensureGsdSymlink reads the marker, finds the orphaned state directory, and migrates data to the new identity path automatically. Also handles the upgrade migration: when an existing .gsd symlink points to a valid state dir under the old hash format, data is transparently migrated to the new remote-only hash path. Closes #2750 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: handle existing symlink in project-relocation recovery test Add defensive unlinkSync calls before symlinkSync in ensureGsdSymlinkCore to prevent EEXIST race conditions when a dangling or residual symlink exists at the .gsd path during project relocation recovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:39:50 -06:00
Tom Boucher	6ac30ae9d9	fix: add free-text input to ask-user-questions when "None of the above" is selected (#3081 ) Fixes #2715 The ask-user-questions UI trapped users in a re-asking loop when the agent needed a free-text explanation rather than a fixed choice. Two paths were affected: - RPC fallback (ctx.ui.select): selecting "None of the above" recorded the label but never prompted for a free-text explanation. Now follows up with ctx.ui.input() so the user can type their answer. - Custom interview UI: selecting "None of the above" advanced to the next question without opening the notes editor. Now auto-focuses the notes field so the user can immediately type a free-text response. Also updated the "None of the above" description from "Press TAB to add optional notes." to "Select to type your own answer." for discoverability. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:39:35 -06:00
Tom Boucher	a09ae1de26	fix: block work execution during /gsd queue mode (#2545 ) (#3082 ) When /gsd queue was active, the agent had unrestricted access to all tools and would execute described work instead of creating milestones. The queue prompt instructed milestone-only behavior, but the system prompt's "execute with full commitment" directive dominated. Add a mechanical tool gate (shouldBlockQueueExecution) that blocks write/edit to non-.gsd/ paths and mutating bash commands when queue phase is active. Read-only tools, discussion tools, and .gsd/ artifact writes remain allowed. This enforces the queue contract at the tool layer rather than relying solely on prompt compliance. Closes #2545 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:39:26 -06:00
Tom Boucher	eb40f74cfe	fix: detect worktree basePath in gsdRoot() to prevent escaping to project root (#3083 ) When gsdRoot() is called with a basePath inside .gsd/worktrees/<name>/, the git-root probe and walk-up logic can escape to the project root's .gsd directory. This causes ensurePreconditions() to create slice directories in the wrong location and deriveState() to read stale project-root state instead of worktree-local state. Add isInsideGsdWorktree() guard that detects the .gsd/worktrees/<name>/ pattern in the basePath before the git rev-parse probe runs. When detected, return the worktree-local .gsd path immediately. Also check the symlink-resolved path for the pattern (handles macOS /tmp -> /private/tmp). Closes #2594 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:39:15 -06:00
Tom Boucher	a9d881ad8c	fix: invalidate stale quick-task captures across milestone boundaries (#3084 ) Closes #2872 Quick-task captures resolved in a prior milestone were re-executed in subsequent sessions because loadActionableCaptures() used the Executed flag as its sole staleness gate. Milestone completion never marked captures as executed, so captures whose issues were fixed by planned work remained permanently actionable. Three changes fix this: 1. Track which milestone a capture was resolved in (new Milestone: field in CAPTURES.md, written by markCaptureResolved and the triage prompt). loadActionableCaptures() now accepts an optional currentMilestoneId and excludes captures from prior milestones. 2. Add a verification step to buildQuickTaskPrompt() instructing the agent to confirm the issue still exists before making changes. 3. Add stampCaptureMilestone() as a reconciliation safety net -- executeTriageResolutions() stamps actionable captures that are missing the Milestone field with the current milestone ID. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:39:07 -06:00
Tom Boucher	6e22a20580	fix: defer model validation until after extensions register (#3089 ) * fix: defer model validation until after extensions register (#2626) Extension-provided models (e.g. claude-code/claude-sonnet-4-6) were silently overwritten on every startup because the model validation ran before createAgentSession(), which is where extensions register their models in the ModelRegistry. At validation time, extension models did not exist in the registry, so the user's valid choice was replaced with a built-in fallback. Extract validation into validateConfiguredModel() and call it after createAgentSession() in both print-mode and interactive-mode paths. Closes #2626 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: align MinimalSettingsManager interface with SettingsManager The MinimalSettingsManager interface used `string` for thinking level types, but SettingsManager uses a specific union type and returns `undefined`. This caused TS2345 at cli.ts lines 448 and 587. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:38:10 -06:00
Tom Boucher	2f3ffbfc10	fix: repair YAML bullet lists in malformed tool-call JSON (#3090 ) * fix: repair YAML bullet lists in malformed tool-call JSON (#2660) When LLMs copy YAML template formatting into tool-call arguments, they produce `"key": - item` instead of `"key": ["item"]`, causing JSON parse errors that block milestone completion. Add a repairToolJson() utility that detects and converts YAML-style bullet lists into JSON arrays before parsing. Integrated into both the PartialMessageBuilder (claude-code-cli) and the anthropic-shared streaming provider, with fallback in parseStreamingJson for all other providers. Closes #2660 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use .js import extension in repair-tool-json test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:37:09 -06:00
Tom Boucher	c924f9f1f8	fix: unify SUMMARY.md render paths for projection fidelity (#3091 ) * fix: unify SUMMARY.md render paths for projection fidelity Closes #2720 renderSummaryMarkdown (complete-task.ts) and renderSummaryContent (workflow-projections.ts) produced structurally different output for the same data — different frontmatter format, different sections, different formatting. Deleting a SUMMARY.md and regenerating it via projection yielded a different file than the original. Fix: make renderSummaryContent the single source of truth. complete-task now builds a TaskRow from params and delegates to renderSummaryContent. The projection renderer passes verification evidence from the DB so both paths produce identical output including the Verification Evidence table, Files Created/Modified section, and YAML-format frontmatter. Added getVerificationEvidence() to gsd-db for projection-time evidence retrieval, and a 22-assertion parity test that prevents future drift. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: safe type assertion for verification evidence query result Cast through `unknown` to satisfy TS2352 — better-sqlite3's `.all()` returns `Record<string, unknown>[]` which doesn't directly overlap with `VerificationEvidenceRow[]`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:36:37 -06:00
Tom Boucher	3e78270cad	fix: chat mode misrepresents terminal output, looks stuck, omits user messages (#3092 ) Three root causes addressed: 1. PtyChatParser: user input echoed after a bare prompt line (e.g. "❯ \n" followed by "hello\n") was misclassified as assistant content. Added _awaitingInput flag that flips true on prompt boundary and classifies the next content line as role=user. 2. Chat mode "looks stuck": when the session is idle (connected, not streaming, has timeline content), no visual cue indicated GSD was waiting for input. Added a "Ready for your input" indicator with a pulsing dot. 3. Transcript overflow misalignment: chatUserMessages was not trimmed when liveTranscript/completedTurnSegments overflowed MAX_TRANSCRIPT_BLOCKS, causing index-based interleaving to pair user messages with wrong assistant responses. Also exposed isAwaitingInput() on PtyChatParser so chat UIs can query whether the session is waiting for user input, and widened the > and $ prompt marker regexes to match bare prompts after trimEnd strips trailing whitespace. Closes #2707 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:36:21 -06:00
Tom Boucher	4327b4bb3f	fix: resolve 4 state corruption bugs in milestone/slice completion (#2945 ) (#3093 ) * fix: resolve 4 state corruption bugs in milestone/slice completion workflow Closes #2945 Bug 1 - ROADMAP corrupted by inline UAT content: renderRoadmapContent and renderPlanContent used slice.full_uat_md as a fallback when the demo field was empty. This injected multi-line UAT content (preconditions, steps, expected results) into table cells, corrupting the markdown table and making subsequent slices invisible to the parser. Fix: use "TBD" fallback instead of full_uat_md. Bug 2 - complete-milestone accepts pending slices via event replay: workflow-reconcile's replayEvents blindly called updateSliceStatus("done") for complete_slice events without validating that all tasks in the slice were actually complete. During API overload or partial execution, this allowed slices with pending tasks to be marked done, which then let complete-milestone succeed. Fix: extract replaySliceComplete function that validates task completion before updating slice status. Bug 3 - Worktree directory not cleaned up after merge: WorktreeResolver._mergeWorktreeMode delegated worktree cleanup to mergeMilestoneToMain's internal best-effort removeWorktree call, which can silently fail. Fix: add secondary teardownAutoWorktree call after successful merge to ensure cleanup. Bug 4 - Quality gate records not written by validate-milestone: handleValidateMilestone wrote to the assessments table and rendered VALIDATION.md to disk, but never persisted quality_gates records in the DB. Fix: insert milestone-level quality gates (MV01-MV04) alongside the assessment record. Extended GateScope and GateId types to support milestone-level validation. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: align test type literals with MilestoneRow, SliceRow, and AutoSession Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update tests for removed full_summary_md fallback and FK constraints - workflow-projections: renderPlanContent test now expects TBD fallback instead of full_summary_md (removed in #2945 to prevent corruption) - validate-milestone: insert slice rows before validation so quality_gates FK constraint (milestone_id, slice_id) is satisfied - worktree-resolver: update teardownAutoWorktree assertion from 0 to 1 to account for secondary cleanup added in #2945 Bug 3 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:36:07 -06:00
Tom Boucher	a02b140f61	fix: isolate guided-flow session state and key discussion milestone queries (#2985 ) (#3094 ) * fix: resolve 4 correctness bugs in GSD extension core (#2985) Bug 1 — preferences.ts process.cwd() side-channel: loadEffectiveGSDPreferences() and loadProjectGSDPreferences() now accept an optional projectRoot parameter. When provided, preferences are loaded from the specified project directory instead of relying on process.cwd(). All 37+ callers continue to work unchanged (parameter defaults to cwd). Bug 2 — state.ts DB writes inside read functions (CQS violation): Extracted disk-to-DB milestone reconciliation into a new exported function reconcileDiskMilestonesToDb(). deriveState() and deriveStateFromDb() no longer write to the DB as a side effect of reading state. Callers that need reconciliation (auto-start.ts, guided-flow.ts, register-hooks.ts) now call it explicitly before reading state. Bug 3 — guided-flow.ts module-level session state: Converted pendingAutoStart from a module-level singleton to a Map keyed by basePath. Concurrent discuss sessions for different projects are now independent — the second session no longer silently overwrites the first. Bug 4 — getDiscussionMilestoneId() unkeyed query: getDiscussionMilestoneId() now accepts an optional basePath parameter for keyed lookup. When multiple sessions exist and no basePath is provided, it returns null instead of an arbitrary entry. Single-session backward compatibility is preserved. Closes #2985 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: narrow scope to complement igouss PRs #2986 and #2987 Revert changes to preferences.ts (Bug 1), state.ts, auto-start.ts, register-hooks.ts (Bug 2), and their test files. Those fixes are covered by @igouss in PRs #2986 and #2987. 
This PR now only contains: - Bug 3: guided-flow.ts pendingAutoStart singleton → Map (session isolation) - Bug 4: getDiscussionMilestoneId() keyed by basePath - Supporting unitType additions in preferences-models.ts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: align source code with test expectations after scope narrowing The refactor commit (6972c97c) reverted source changes to state.ts, preferences.ts, and auto-start.ts but left their corresponding test assertions in place, causing 8 CI failures: - isValidationTerminal: treat any extracted verdict as terminal (#2769) - parseHeadingListFormat: handle raw YAML blocks under headings (#2794) - bootstrapAutoSession: snapshot ctx.model before guided-flow (#2829) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: open project DB before initial deriveState on cold bootstrap (#2841) When auto-mode starts cold (no prior DB handle), deriveState silently falls back to markdown-only data for DB-backed helpers (queue-order, task status), producing stale or incomplete state. Add openProjectDbIfPresent() helper that resolves the project-root DB path and opens it before the first deriveState call, ensuring full data visibility from the start. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:33:30 -06:00
Tom Boucher	45bd2572ac	fix(guided-flow): route dispatchWorkflow through dynamic routing pipeline (#3153) Closes #2958 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:33:23 -06:00
Tom Boucher	46dff43e21	fix: skip external state migration inside git worktrees (#2970) (#3227) Add isInsideWorktree() guard at the top of migrateToExternalState() so migration never runs when basePath is a git worktree. Worktrees share the same repoIdentity hash as the main repo, so migration would create a junction to the wrong target and orphan .gsd.migrating. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:33:16 -06:00
Tom Boucher	fb2ef25250	fix: coerce non-numeric strings in DB columns during manifest serialization (#2962) (#3229) SQLite can store string placeholders like "-", "N/A", or "" in INTEGER columns after schema migrations or manual inserts. snapshotState() was passing these through as-is via type assertions, producing JSON that fails to parse on round-trip. Add toNumeric() helper and apply it to all numeric columns (exit_code, duration_ms, sequence, seq). Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:32:58 -06:00
Tom Boucher	9cebf19559	fix: route allDiscussed and zero-slices paths to queued milestone discussion (#3150) (#3230) The allDiscussed early-return and pendingSlices.length===0 guard in showDiscuss() both hard-returned without checking for queued milestones, blocking users from discussing pending milestones when the active milestone's slices were all discussed or complete. Now both paths check for pending milestones and route to showDiscussQueuedMilestone() first. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:32:49 -06:00
Tom Boucher	8b26ec7803	fix: use loose equality for null checks in secure_env_collect (#2997) (#3231) When ctx.ui.custom() returns undefined instead of null, strict equality checks (=== null / !== null) let the undefined value pass through to writeEnvKey, which crashes on .replace(). Switch to loose equality (== null / != null) in both the manifest orchestrator and extension execute paths so both null and undefined are treated as "skipped". Also add a type guard in writeEnvKey for defense in depth. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:32:40 -06:00
Tom Boucher	872b70bf71	fix: prevent prompt explosion from $' in template replacement values (#2968) (#3232) Replace `replaceAll` with `split/join` in loadPrompt to avoid JavaScript's special replacement patterns ($', $`, $&) being interpreted in variable values. This caused exponential prompt expansion when values contained patterns like `grep -q '^0$'`. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:32:33 -06:00
Tom Boucher	d41ce60530	fix: resolve OAuth API key in buildMemoryLLMCall via modelRegistry (#2959) (#3233) buildMemoryLLMCall called completeSimple without passing an API key, which routed to streamSimpleAnthropic -> getEnvApiKey (env vars only). OAuth users (Claude Max/Pro) store tokens in auth.json, so getEnvApiKey returned undefined, the call threw, and memory extraction silently failed. Now resolves the key eagerly via ctx.modelRegistry.getApiKey() which checks auth.json through authStorage, matching how streamAnthropic and the compaction orchestrator resolve credentials. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:32:22 -06:00
Tom Boucher	dae746b905	fix(forensics): read completion status from DB instead of legacy file (#3129) (#3234) The forensics command showed "Completed Keys: 0" because it read from completed-units.json, which is never populated during normal auto-mode completion. Now queries the DB (milestones/slices/tasks tables) for authoritative completion counts, falling back to the legacy file only when DB is unavailable. Also fixes STATE.md showing "Active Milestone" for completed milestones — now shows "Last Completed Milestone" when phase is complete. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:31:57 -06:00
Tom Boucher	dff73009c8	fix: use camelCase parameter names in execute-task and complete-slice prompts (#2933) (#3236) The prompts told the LLM to pass snake_case params (milestone_id, slice_id, task_id) but the TypeBox schemas expect camelCase (milestoneId, sliceId, taskId), causing "Missing named parameter" validation errors. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:31:38 -06:00
Tom Boucher	4a82bc01dc	fix: check bootstrap completeness in init wizard gate, not just .gsd/ existence (#2942) (#3237) A zombie .gsd/ state (symlink exists but missing PREFERENCES.md and milestones/) caused the init wizard to be skipped entirely, resulting in an uninitialized project session. - guided-flow.ts: Replace bare `!existsSync(gsdRoot(basePath))` with a compound check for PREFERENCES.md or milestones/ bootstrap artifacts - auto-start.ts: Check milestones/ path directly instead of .gsd/ which ensureGsdSymlink already created (was dead code) - Add zombie-gsd-state.test.ts verifying both fixes Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:31:10 -06:00
Tom Boucher	0cedaf5fb9	fix: specify write tool for PROJECT.md in milestone/slice prompts (#3238) * fix: specify write tool for PROJECT.md in complete-milestone/slice prompts (#2946) The prompts for complete-milestone step 11 and complete-slice step 13 gave ambiguous instructions to "update PROJECT.md" without naming which tool to use. This caused LLMs to call `edit` with only `newText`, missing the required `path` and `oldText` parameters. Now both prompts explicitly instruct the LLM to use the `write` tool with the correct parameters, since PROJECT.md updates are full-document refreshes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: preserve 'refresh current state if needed' phrase in complete-slice step 13 The PR's rewrite of step 13 to specify the write tool accidentally removed the phrase that an existing test asserts on. Re-add it while keeping the explicit write-tool instruction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 14:31:04 -06:00
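
Note on 872b70bf71: the replacement-pattern behaviour that commit describes can be reproduced with a small sketch. The helpers `fillWithReplace` and `fillWithSplitJoin` and the `{{cmd}}` placeholder syntax are hypothetical and are not taken from the project's `loadPrompt`. The sketch uses `String.prototype.replace` on a single placeholder, which handles `$'`, `` $` `` and `$&` in a string replacement the same way `replaceAll` does.

```typescript
// Illustrative sketch only: these helpers and the {{key}} placeholder syntax
// are assumptions, not the project's loadPrompt implementation.

// Substitute a {{key}} placeholder using String.prototype.replace.
// A string replacement argument is scanned for special patterns such as
// $& (the match), $` (text before the match) and $' (text after the match).
function fillWithReplace(template: string, key: string, value: string): string {
  return template.replace("{{" + key + "}}", value);
}

// Substitute every {{key}} placeholder using split/join.
// The value is joined in as literal text, so "$'" stays "$'".
function fillWithSplitJoin(template: string, key: string, value: string): string {
  return template.split("{{" + key + "}}").join(value);
}

const template = "Run: {{cmd}} | then report.";
const value = "grep -q '^0$' out.txt"; // contains the $' sequence

const viaReplace = fillWithReplace(template, "cmd", value);
const viaSplitJoin = fillWithSplitJoin(template, "cmd", value);

// $' expands to the text after the placeholder, duplicating the template tail:
// "Run: grep -q '^0 | then report. out.txt | then report."
console.log(viaReplace);

// The value is inserted unchanged:
// "Run: grep -q '^0$' out.txt | then report."
console.log(viaSplitJoin);
```

In this sketch the template tail is duplicated once. With a long prompt and several such values, each substitution can copy large parts of the template again, which is consistent with the expansion the commit message reports.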


2410 commits