2560 commits

Author SHA1 Message Date
Tom Boucher
ba71db3b28 fix(web): use safePackageRootFromImportUrl for cross-platform package root (#1881) (#1893)
Replace raw fileURLToPath in getDefaultPackageRoot with safePackageRootFromImportUrl
which returns null instead of throwing when the URL is not a valid local file URL.
This prevents the standalone bundle from crashing on Windows when import.meta.url
is baked in at build time with a Linux file:// path.

Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 07:43:46 -04:00
Tom Boucher
b155c92708 fix: isolate CmuxClient stdio to prevent TUI hangs in CMUX (#3306)
* fix(cmux): isolate CmuxClient stdio to prevent TUI hangs (#1922)

Replace execFileAsync (promisify) with spawn in runAsync to allow explicit
stdio isolation. Both runSync and runAsync now set stdio: ["ignore", "pipe",
"pipe"] so the cmux CLI child process cannot inherit the parent's stdin/stderr
and steal keyboard input or corrupt TUI rendering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: retrigger after integration flake

---------

Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 07:43:21 -04:00
Tom Boucher
a81701979f feat(parallel): slice-level parallelism with dependency-aware dispatch (#3315)
* feat(parallel): add slice-level parallelism with dependency-aware dispatch

Fixes #2340

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(parallel): handle missing slice lock, add worktree cleanup, remove dead code

- state.ts: When GSD_SLICE_LOCK is set but the locked slice ID is not
  found in activeMilestoneSlices, log a warning and return a blocked
  state with a clear error message instead of silently continuing with
  activeSlice=undefined. Applied in both DB-backed and legacy paths.

- slice-parallel-orchestrator.ts: Add worktree cleanup via removeWorktree
  in stopSliceParallel (after killing workers) and in the catch block of
  startSliceParallel (for partially created worktrees). Store basePath in
  SliceOrchestratorState so stopSliceParallel can reference it.

- status-guards.ts: isInactiveStatus does not exist on this branch
  (only isClosedStatus is defined), so no removal needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(state): remove duplicate logWarning import after rebase conflict resolution

The rebase merge left two import lines for logWarning from workflow-logger.
Consolidated into a single import including logError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:55:20 -04:00
Tom Boucher
56617891d9 fix: worktree health check walks parent dirs for monorepo support (#3313)
* fix: worktree health check walks parent dirs for monorepo support (#2347)

The health check only looked for project markers (package.json,
Cargo.toml, etc.) in s.basePath directly. In monorepos, these files
live in a parent directory, causing false rejections.

Now walks up parent directories to find project markers before
triggering the greenfield warning.

Fixes #2347

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(health-check): cap parent walk at git repo boundary

The parent directory walk in the worktree health check was unbounded,
walking all the way to the filesystem root. Ancestor directories like ~
or /usr/local may contain unrelated package.json files, causing false
positive health checks. Now stops at the .git boundary.
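
A minimal sketch of the bounded parent walk, assuming absolute POSIX-style paths. The names findProjectMarkerDir and PROJECT_MARKERS and the injected exists predicate are hypothetical and only keep the example free of filesystem calls; the actual health check may be structured differently.

```typescript
// Hypothetical predicate: returns true when a file or directory exists at `path`.
type Exists = (path: string) => boolean;

// Illustrative subset of project markers named in the commit message.
const PROJECT_MARKERS = ["package.json", "Cargo.toml"];

function joinPath(dir: string, name: string): string {
  return dir === "/" ? `/${name}` : `${dir}/${name}`;
}

// Walk upward from basePath looking for a project marker, but never
// look above the directory that contains .git (the repository boundary).
function findProjectMarkerDir(basePath: string, exists: Exists): string | null {
  let dir = basePath.replace(/\/+$/, "") || "/";
  while (true) {
    if (PROJECT_MARKERS.some((marker) => exists(joinPath(dir, marker)))) {
      return dir;
    }
    if (exists(joinPath(dir, ".git"))) {
      return null; // repo root reached without finding a marker
    }
    const parent = dir.slice(0, dir.lastIndexOf("/")) || "/";
    if (parent === dir) {
      return null; // filesystem root reached
    }
    dir = parent;
  }
}
```

With this shape, a package.json sitting in a home directory above the repository is never consulted, because the walk ends at the directory holding .git.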

Also adds a test assertion verifying the .git boundary guard exists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ts): remove duplicate imports introduced during rebase

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:28:05 -04:00
Tom Boucher
5fa34d438b fix(gsd): promote milestone status from queued to active in plan-milestone (#3317)
* fix(gsd): promote milestone status from queued to active in plan-milestone

upsertMilestonePlanning() did not include title or status in its UPDATE
statement. When a milestone row was pre-created by ensureMilestoneDbRow
with status "queued", the INSERT OR IGNORE in insertMilestone() silently
skipped the row, and upsertMilestonePlanning() never updated the status.
This left the milestone permanently stuck as "queued", preventing proper
state-machine phase transitions during milestone completion.

Add title and status columns to the upsertMilestonePlanning() UPDATE
statement and pass them from handlePlanMilestone(). Uses COALESCE with
NULLIF to preserve existing values when empty strings are passed.

Closes #3022

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ts): remove extra title arg from upsertMilestonePlanning call

* fix(test): move title into planning object for 2-param upsertMilestonePlanning

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:27:25 -04:00
Tom Boucher
1c95b94720 fix(worktree): correct merge failure notification command from /complete-milestone to /gsd dispatch complete-milestone (#1901)
Fixes #1891

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:05:12 -04:00
Tom Boucher
2a40f3f35d feat(mcp-client): add OAuth auth provider for HTTP transport (#3295)
* feat(mcp-client): add OAuth auth provider for HTTP transport

MCP HTTP transport was created without any auth options, causing 401
errors when connecting to remote servers (Sentry, Linear, etc.) that
require OAuth or bearer token authentication.

Add a new auth module that builds StreamableHTTPClientTransport options
from two config fields:
- `headers`: static headers with ${VAR} env resolution (for bearer tokens)
- `oauth`: full OAuthClientProvider for servers implementing MCP OAuth

The config is parsed from .mcp.json / .gsd/mcp.json and passed through
to the SDK transport constructor.
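
The ${VAR} resolution for static headers could look roughly like the sketch below. The function name resolveHeaderEnv is hypothetical, and replacing an unset variable with an empty string is an assumption; the real module may warn or fail instead.

```typescript
type Env = Record<string, string | undefined>;

// Replace every ${NAME} placeholder in each header value with env[NAME].
// Assumption: an unset variable resolves to an empty string.
function resolveHeaderEnv(
  headers: Record<string, string>,
  env: Env,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    resolved[name] = headers[name].replace(
      /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
      (_match: string, varName: string) => env[varName] ?? "",
    );
  }
  return resolved;
}
```

A config entry such as { "Authorization": "Bearer ${SENTRY_TOKEN}" } would then pick up the token from the environment when the transport options are built.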

Fixes #2160

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:10 -04:00
Tom Boucher
d9cea627bf fix: detect and block Gemini CLI OAuth tokens used as API keys (#3296)
* fix: detect and block Gemini CLI OAuth tokens used as API keys

Users who install Google's standalone Gemini CLI may inadvertently set
GEMINI_API_KEY to an OAuth access token (ya29.*) instead of an AI Studio
API key (AIza*). These tokens fail at the Google API with a confusing
error. This adds early detection at three entry points:

- AuthStorage.set(): throws when storing ya29.* as api_key for "google"
- AuthStorage.getApiKey(): blocks ya29.* from runtime overrides (--api-key)
- AuthStorage.getApiKey(): blocks ya29.* from environment variables

Each path provides a clear error message explaining the issue and
directing users to either get an API key from aistudio.google.com or
use /login google-gemini-cli for OAuth-based access.
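
The prefix check itself is small. A sketch under the assumption that the ya29. prefix alone identifies an OAuth access token; the helper names are hypothetical and are not the AuthStorage API.

```typescript
// OAuth access tokens from Google start with "ya29."; AI Studio keys start with "AIza".
function looksLikeGoogleOAuthToken(key: string): boolean {
  return /^ya29\./.test(key);
}

// Hypothetical guard that could run before a key is stored or returned.
function assertUsableGoogleApiKey(key: string): void {
  if (looksLikeGoogleOAuthToken(key)) {
    throw new Error(
      "GEMINI_API_KEY looks like an OAuth access token (ya29.*), not an API key. " +
        "Get an API key from aistudio.google.com or use /login google-gemini-cli.",
    );
  }
}
```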

Fixes #2157

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:08 -04:00
Tom Boucher
290b8e9104 fix(auto): break retry loop on tool invocation errors (malformed JSON) (#3298)
* fix(auto): break retry loop on tool invocation errors (malformed JSON)

When a GSD tool (e.g. gsd_complete_slice) is invoked with truncated or
malformed JSON arguments, the tool execution fails but postUnitPreVerification
only checks artifact existence — it cannot distinguish "tool never ran" from
"tool ran but forgot to write artifact". This causes a stuck retry loop where
the same malformed invocation is re-dispatched indefinitely.

Add isToolInvocationError() classifier to detect JSON validation/parse errors
in tool results. Track the last tool invocation error on AutoSession via the
tool_execution_end hook. In postUnitPreVerification, when artifact verification
fails AND a tool invocation error was recorded, pause auto-mode with a
diagnostic message instead of setting up pendingVerificationRetry.

Closes #2883

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:05 -04:00
Tom Boucher
fa3ca6206e fix(git): use git add -u in symlink .gsd fallback to prevent hang (#3299)
* fix(git): use git add -u in symlink .gsd fallback to prevent hang on large repos (#1977)

When .gsd is a symlink, nativeAddAllWithExclusions falls back from
pathspec exclusions (which git rejects with "beyond a symbolic link").
The previous fallback used nativeAddAll (git add -A), which traverses
the entire working tree — hanging indefinitely on repos with large
untracked data directories (444GB+ observed).

Change the fallback to git add -u, which only stages changes to
already-tracked files. This is O(tracked) instead of O(filesystem),
making it safe for repos with massive untracked trees.

The tradeoff — new untracked files are not staged in the symlink
fallback path — is acceptable because auto-commit primarily captures
changes the agent made to existing tracked files.

Regression from #1712, which introduced the symlink fallback but chose
git add -A. That fix assumed .gitignore covered all large directories,
but scientific/ML repos often have large untracked data dirs that are
not listed in .gitignore.

Fixes #1977

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

* fix(test): update symlink autoCommit test to use tracked file modification

The PR changed the symlink fallback from git add -A to git add -u.
git add -u only stages changes to tracked files; untracked files are
skipped. The existing test created a new untracked file which git add -u
ignores, causing autoCommit to return null (nothing to commit).

Pre-commit the source file before the scenario and modify it so git add
-u can stage the tracked change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:03 -04:00
Tom Boucher
c91b6eca9c fix: handle complete-slice context exhaustion to unblock downstream slices (#3300)
* fix: handle complete-slice context exhaustion to unblock downstream slices

Closes #2653

Three interacting bugs caused complete-slice to silently fail on context
exhaustion, permanently blocking all downstream slices:

1. Zero-tool-call guard only covered execute-task — extended to all unit
   types so any unit completing with 0 tool calls (context exhaustion) is
   treated as failed and retried in a fresh context.

2. Verification retry had no max-attempt limit — added a cap of 3 retries
   before escalating to writeBlockerPlaceholder, preventing infinite retry
   loops when the unit never produces its artifact.

3. writeBlockerPlaceholder only updated DB for execute-task — extended to
   also update slice status for complete-slice, breaking the circular
   dependency where verification needs DB status "complete" but only
   gsd_slice_complete sets that.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: import diagnoseExpectedArtifact from auto-recovery

* fix(auto): replace empty catch comments with logWarning in auto-recovery

The silent-catch-diagnostics test requires migrated files to have no
empty catch bodies. The /* non-fatal */ inline catches in auto-recovery
were flagged as empty since comments don't count as executable code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:00 -04:00
Tom Boucher
db7a6372a6 fix: cap consecutive tool validation failures to prevent stuck-loop (#3301)
* fix: cap consecutive tool validation failures to prevent stuck-loop (#2783)

When the LLM repeatedly emits tool calls with arguments that fail schema
validation, the agent loop retries indefinitely — each failed validation
returns an error tool result, the LLM retries with the same broken args,
and the cycle burns budget with no progress.

Add a consecutive-failure counter in runLoop that tracks turns where ALL
tool calls fail. After MAX_CONSECUTIVE_VALIDATION_FAILURES (3) consecutive
all-error turns, the loop emits a diagnostic stop message and terminates
cleanly. The counter resets whenever any tool call in a turn succeeds, so
intermittent failures do not trigger early termination.
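
The counter logic can be sketched as follows. ValidationFailureTracker and the ToolResult shape are hypothetical stand-ins for whatever runLoop tracks internally, and treating a turn with no tool calls as a reset is an assumption.

```typescript
const MAX_CONSECUTIVE_VALIDATION_FAILURES = 3;

interface ToolResult {
  isError: boolean;
}

class ValidationFailureTracker {
  private consecutive = 0;

  // Record one turn. Returns true when the loop should stop.
  recordTurn(results: ToolResult[]): boolean {
    // Only a turn in which every tool call failed counts toward the cap.
    const allFailed = results.length > 0 && results.every((r) => r.isError);
    // Any success (or, by assumption, a turn without tool calls) resets the counter.
    this.consecutive = allFailed ? this.consecutive + 1 : 0;
    return this.consecutive >= MAX_CONSECUTIVE_VALIDATION_FAILURES;
  }
}
```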

Closes #2783

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

* fix(test): repair agent-loop.test.ts — close unclosed blocks, merge imports

Two test suites were concatenated without closing the first suite's
it+describe blocks, placing the second suite's imports inside a function
body and triggering 'Unexpected "{" ' from esbuild. Merged into a single
well-structured file with consolidated imports and proper closings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:58 -04:00
Tom Boucher
4f6b3433d6 fix: make enrichment tool params optional for limited-toolcall models (#3302)
* fix: make enrichment tool params optional for limited-toolcall models

Models with limited tool-calling capability (kimi-k2.5, glm-5-turbo)
cannot populate 23 top-level parameters in a single tool call, causing
stuck-loop retries that burn tokens. This reduces required params to
only core identification and content fields (7 for slice_complete, 4
for plan_milestone, 6 for task_complete, 5 for complete_milestone, 4
for plan_slice) by making enrichment/metadata arrays optional with
sensible defaults in the handlers.

Closes #2771

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(complete-task): add null-coalescing defaults to renderSummaryMarkdown

Add regression test verifying handleCompleteTask does not crash when
optional enrichment fields (keyFiles, keyDecisions, verificationEvidence,
blockerDiscovered) are omitted. The code paths already apply ?? defaults
after the #2720 refactor, but no test covered the minimal-params case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

* fix: add null coalescing for optional string params in complete-task

params.deviations and params.knownIssues are optional (string | undefined)
but the task row expects string. Add ?? "" to satisfy TS2322.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:55 -04:00
Tom Boucher
7bbc0dd621 fix: add filesystem safety guard to complete-slice.md (#3304)
* fix: add filesystem safety guard to complete-slice.md (closes #2935)

Port the EISDIR prevention instruction from complete-milestone.md to
complete-slice.md so the LLM never passes a directory path to the
read tool when task summaries are truncated by the 30k-char cap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI (flaky #2912 MERGE_HEAD test)

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:52 -04:00
Tom Boucher
34e73b01d1 fix(extensions): use bundledExtensionKeys for conflict detection instead of broken path heuristic (#3305)
Fixes #2075

The isBuiltIn check in detectExtensionConflicts used a path-based
heuristic that excluded paths containing /.gsd/agent/extensions/.
Since initResources() syncs all bundled extensions into that same
directory, the heuristic could never return true for bundled extensions,
preventing the "supersedes" hint from being appended. This caused
downstream cli.ts to display "Extension load error" instead of the
intended "Extension conflict" for benign precedence collisions.

Replace the path heuristic with an explicit bundledExtensionKeys set
(already computed by buildResourceLoader) threaded through to
detectExtensionConflicts. The set is matched against the extension
directory name extracted from the owner path.
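
A rough sketch of matching by directory name rather than by path shape. The helper extensionDirName and the regex are assumptions made for illustration; only bundledExtensionKeys and the idea of extracting the directory name from the owner path come from the change described above.

```typescript
// Extract the directory name that follows ".../extensions/" in an owner path.
function extensionDirName(ownerPath: string): string | null {
  const match = /\/extensions\/([^/]+)/.exec(ownerPath.replace(/\\/g, "/"));
  return match?.[1] ?? null;
}

// An extension is built-in only if its directory name is in the bundled set,
// regardless of where on disk it was synced to.
function isBuiltIn(ownerPath: string, bundledExtensionKeys: ReadonlySet<string>): boolean {
  const name = extensionDirName(ownerPath);
  return name !== null && bundledExtensionKeys.has(name);
}
```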

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:04:50 -04:00
Tom Boucher
85ece7ea69 fix: scope tools during discuss flows to prevent grammar overflow (#3307)
* fix: scope tools during discuss flows to prevent grammar overflow (#2949)

xAI/Grok (and other providers using grammar-based constrained decoding)
reject tool schemas that exceed their grammar complexity limit with HTTP
400 "Grammar is too complex." The full GSD tool set registers ~33 tools
with deeply nested schemas; discuss flows only need a small subset.

Add DISCUSS_TOOLS_ALLOWLIST to constants.ts listing the 10 GSD tools
(5 canonical + 5 aliases) actually referenced by discuss prompts. In
dispatchWorkflow, when unitType starts with "discuss-", filter active
tools to exclude heavy planning/execution/completion tools before
dispatching.
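
The filtering step might be sketched like this. The allowlist contents are placeholders (the real list has 10 entries), scopeToolsForUnit is a hypothetical name, and keeping non-GSD tools untouched is an assumption.

```typescript
// Placeholder entries; the actual allowlist lists 10 GSD tools.
const DISCUSS_TOOLS_ALLOWLIST: ReadonlySet<string> = new Set([
  "gsd_decision_save",
  "gsd_example_discuss_tool", // hypothetical name
]);

// For discuss-* units, drop GSD tools that are not on the allowlist.
// Assumption: tools without the gsd_ prefix are left in place.
function scopeToolsForUnit(unitType: string, activeTools: string[]): string[] {
  if (!unitType.startsWith("discuss-")) {
    return activeTools;
  }
  return activeTools.filter(
    (tool) => !tool.startsWith("gsd_") || DISCUSS_TOOLS_ALLOWLIST.has(tool),
  );
}
```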

Closes #2949

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

* fix: remove resolveModelWithFallbacksForUnit import from guided-flow.ts

This import was incorrectly added as part of the discuss grammar changes.
The regression guard test (#2958) requires this function not be imported
in guided-flow.ts.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:48 -04:00
Tom Boucher
1fe316477d fix(preferences): warn on silent parse failure for non-frontmatter files (#3310)
* fix(preferences): warn when preferences file lacks YAML frontmatter delimiters

parsePreferencesMarkdown() silently returned null for non-frontmatter
content, causing all preferences (including git.isolation: none) to be
ignored. The system fell back to default worktree isolation with no
indication to the user.

Now emits a stderr warning when a non-empty preferences file cannot be
parsed due to missing --- fences, so users know their file was skipped.

Fixes #2036

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ci: retrigger after perf test fix

* fix(test): update preferences test to expect heading+list parser result

The heading+list fallback parser (parseHeadingListFormat) now handles
non-frontmatter markdown content like "## Git\n- isolation: none",
so the test should expect a parsed object instead of null.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

* fix: add missing closing brace for test block in preferences.test.ts

The 'unrecognized format warning' test block was missing its closing });
after the finally clause, causing TS1005 syntax error at line 535.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(preferences): include 'unrecognized format' in warn-once message (#2373)

The test filters for "unrecognized format" but the message only said
"does not use YAML frontmatter delimiters". Add the phrase so the
warn-once regression test can find its expected output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:46 -04:00
Tom Boucher
5c2d8988bb fix: track remote-questions in managed-resources manifest (#3312)
* fix: track remote-questions extension in managed-resources manifest

writeManagedResourceManifest only checked for index.js/index.ts when
deciding if a subdirectory is an extension. remote-questions uses mod.ts
as its entry point and was missed, causing it to be pruned on upgrades.

Also check for extension-manifest.json which is the canonical marker for
bundled extensions.

Fixes #2367

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:44 -04:00
Tom Boucher
5d194d8701 fix(auto): add timeout guard for postUnitPostVerification in runFinalize (#3314)
* fix(auto): add timeout guard for postUnitPostVerification in runFinalize (#2344)

After plan-slice completes, the auto-loop can hang indefinitely if
postUnitPostVerification() never resolves (e.g., module import deadlock,
SQLite transaction hang). The terminal becomes unresponsive with no
error, no notification, and no recovery path.

Changes:
- Add withTimeout() utility in auto/finalize-timeout.ts that races a
  promise against a configurable timeout, returning a discriminated
  result instead of throwing
- Wrap the postUnitPostVerification() call in runFinalize() with a 60s
  timeout guard — on timeout, log a warning and force-return "next" so
  the loop continues to the next iteration
- Emit iteration-end journal event in the catch block of autoLoop() so
  the journal always records iteration completion, even on error
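
A sketch of a promise-versus-timer race that reports a timeout as a value. The TimeoutResult shape is an assumption about what "discriminated result" means here; in this sketch a rejection of the wrapped promise still propagates to the caller.

```typescript
type TimeoutResult<T> = { timedOut: false; value: T } | { timedOut: true };

function withTimeout<T>(promise: Promise<T>, ms: number): Promise<TimeoutResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<TimeoutResult<T>>((resolve) => {
    timer = setTimeout(() => resolve({ timedOut: true }), ms);
  });
  const settled = promise.then(
    (value): TimeoutResult<T> => ({ timedOut: false, value }),
  );
  // Whichever settles first wins; the timer is cleared so it cannot keep the process alive.
  return Promise.race([settled, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

A caller can then branch on result.timedOut, log a warning, and continue with the next iteration instead of hanging.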

Fixes #2344

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

* fix(ts): remove duplicate imports introduced during rebase

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:41 -04:00
Tom Boucher
ff5015431e fix(gsd): handle large markdown parameters in complete-milestone JSON parsing (#3316)
* fix(gsd): sanitize complete-milestone params to handle LLM JSON quirks

When smaller models (e.g. claude-haiku) generate tool-call JSON with large
markdown parameters, the deserialized params can arrive with wrong types:
numbers instead of strings, null instead of arrays, string "true" instead
of boolean true. This caused crashes in handleCompleteMilestone.

Add sanitizeCompleteMilestoneParams() that normalizes all fields before
the handler runs, preventing TypeError crashes on type mismatches.
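
The normalization could be sketched as below. Only followUps, deviations, and toStr are named in this commit series; the keyFiles and verified fields, along with toBool and toStrArray, are hypothetical and exist only to show the array and boolean cases.

```typescript
interface CompleteMilestoneParams {
  title: string;
  followUps: string;
  deviations: string;
  keyFiles: string[]; // hypothetical field
  verified: boolean; // hypothetical field
}

// Numbers become strings; null and undefined become "".
function toStr(value: unknown): string {
  return value === null || value === undefined ? "" : String(value);
}

// Accept the boolean true as well as the string "true".
function toBool(value: unknown): boolean {
  return value === true || value === "true";
}

// null or any non-array value becomes an empty array.
function toStrArray(value: unknown): string[] {
  return Array.isArray(value) ? value.map((item) => toStr(item)) : [];
}

function sanitizeCompleteMilestoneParams(
  raw: Record<string, unknown>,
): CompleteMilestoneParams {
  return {
    title: toStr(raw.title),
    followUps: toStr(raw.followUps),
    deviations: toStr(raw.deviations),
    keyFiles: toStrArray(raw.keyFiles),
    verified: toBool(raw.verified),
  };
}
```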

Closes #3013

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add type narrowing for optional fields in sanitize-complete-milestone

followUps and deviations are required strings in CompleteMilestoneParams,
so use toStr() directly (which returns "" for nullish values) instead of
conditionally returning undefined.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update test assertion to match toStr empty string default

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:38 -04:00
Tom Boucher
3e7a8d8556 fix(metrics): deduplicate idle-watchdog entries and fix forensics false-positives (#1973)
* fix(metrics): deduplicate idle-watchdog entries on ledger load and fix forensics false-positives

The idle watchdog creates duplicate metrics entries with the same
(type, id, startedAt) triple, inflating reported cost by ~35% and
causing false-positive stuck-loop anomalies in forensics.

Add a deduplicateUnits() pass in loadLedger() that collapses entries
sharing the same (type, id, startedAt) key, keeping the one with the
highest finishedAt. The cleaned ledger is persisted back to disk so
duplicates do not re-accumulate across sessions.
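
The deduplication pass can be illustrated with a keyed map. The UnitEntry fields beyond type, id, startedAt, and finishedAt are assumptions, as is the use of numeric timestamps.

```typescript
interface UnitEntry {
  type: string;
  id: string;
  startedAt: number;
  finishedAt: number;
  cost: number; // hypothetical field
}

// Collapse entries sharing (type, id, startedAt), keeping the highest finishedAt.
function deduplicateUnits(units: UnitEntry[]): UnitEntry[] {
  const byKey = new Map<string, UnitEntry>();
  for (const unit of units) {
    const key = JSON.stringify([unit.type, unit.id, unit.startedAt]);
    const kept = byKey.get(key);
    if (kept === undefined || unit.finishedAt > kept.finishedAt) {
      byKey.set(key, unit);
    }
  }
  return Array.from(byKey.values());
}
```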

Fix detectStuckLoops() in forensics.ts to count distinct startedAt
values per type/id instead of raw entry count, so idle-watchdog
snapshots of the same dispatch are not flagged as stuck loops.

Fixes #1943

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: export ForensicAnomaly type and use in test

The test declared a local ForensicAnomaly interface with `type: string`
which was incompatible with the real union literal type, causing CI
typecheck failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:54:44 -04:00
Tom Boucher
c8f11019f5 fix: prevent milestone/slice artifact rendering corruption (#3293)
* fix: prevent milestone/slice artifact rendering corruption (#2630)

Three renderer bugs caused markdown artifact corruption after slice
completion and milestone closeout:

1. Milestone title double-prefixing: when params.title already contained
   the milestone ID prefix (e.g. "M001: Topic"), renderers produced
   "M001: M001: Topic". Added stripIdPrefix() helper and applied it in
   renderRoadmapContent, renderStateContent, and
   renderMilestoneSummaryMarkdown.

2. full_uat_md demo fallback: renderPlanContent and renderRoadmapContent
   fell back to raw full_uat_md (multi-line UAT documents) when
   slice.demo was empty, injecting entire UAT bodies into PLAN.md and
   ROADMAP.md table cells. Changed fallback to "TBD" instead.

3. STATE.md active milestone/slice/registry entries also double-prefixed
   titles. Applied stripIdPrefix() to all title renderings in
   renderStateContent.
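
A minimal sketch of the prefix stripping, assuming the helper takes the ID explicitly. The two-argument signature and renderHeading are hypothetical; the actual stripIdPrefix may detect the ID by pattern instead.

```typescript
// Remove a leading "<id>:" from the title if present, so the renderer can
// add the prefix exactly once.
function stripIdPrefix(title: string, id: string): string {
  const prefix = `${id}:`;
  return title.startsWith(prefix)
    ? title.slice(prefix.length).replace(/^\s+/, "")
    : title;
}

// Hypothetical renderer fragment showing the intended use.
function renderHeading(id: string, title: string): string {
  return `${id}: ${stripIdPrefix(title, id)}`;
}
```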

Closes #2630

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct Phase type and null/undefined in artifact-corruption test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(workflow-projections): apply stripIdPrefix in renderPlanContent

renderPlanContent was missed in the original fix — sliceRow.title
containing an ID prefix (e.g. "S04: Title") produced double-prefixed
PLAN.md headings ("# S04: S04: Title"). Apply stripIdPrefix consistent
with renderRoadmapContent and renderStateContent.

Adds two test cases covering prefixed and non-prefixed slice titles.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:53:02 -04:00
Tom Boucher
4d9b8acadb fix(doctor): strip --fix flag before positional parse (#1919) (#1926)
The flag-stripping regex in handleDoctor() removed --json, --dry-run,
--build, and --test but not --fix. When a user ran `/gsd doctor --fix`,
"--fix" leaked into requestedScope, mode stayed "doctor", and fix was
never enabled -- silently suppressing all issues and fixes.

Extract parseDoctorArgs() as a pure, exported function. Add --fix to
the stripping regex and propagate fixFlag into the fix option passed
to runGSDDoctor.
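
A pure parser along these lines might look like the sketch below. The DoctorArgs shape is an assumption (the real function probably returns more fields); the flag list matches the one described above.

```typescript
interface DoctorArgs {
  fixFlag: boolean;
  requestedScope: string;
}

function parseDoctorArgs(raw: string): DoctorArgs {
  // Detect --fix as a whole token, so "--fixture" does not count.
  const fixFlag = /(^|\s)--fix(\s|$)/.test(raw);
  // Strip every known flag before reading the positional scope.
  const requestedScope = raw
    .replace(/(^|\s)--(?:json|dry-run|build|test|fix)(?=\s|$)/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  return { fixFlag, requestedScope };
}
```

Keeping the parser pure makes the regression easy to cover in a unit test without invoking the doctor itself.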

Fixes #1919

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:52:11 -04:00
Tom Boucher
c609d813d3 fix: resolve external-state worktree DB path (#2952) (#3303)
* fix: resolve external-state worktree DB path in resolveProjectRootDbPath (#2952)

Add regex check for ~/.gsd/projects/<hash>/worktrees/<MID> path layout
so DB writes from external-state worktrees target the canonical project
DB instead of creating an isolated local DB that is lost on teardown.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI (flaky #2912 MERGE_HEAD test)

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:51:46 -04:00
Tom Boucher
b6fcc1bb49 fix(gsd): worktree teardown path validation prevents data loss (#3311)
* fix(gsd): validate worktree paths before removal to prevent data loss

Fixes #2365

removeWorktree() overrides its computed path with whatever git reports
from `git worktree list`. When .gsd/ is a symlink, git resolves it at
creation time, so the registered path can point to an external directory.
Teardown with --force on that path destroys user data.

Add isInsideWorktreesDir() containment check that resolves ".." traversals
and validates the target is under .gsd/worktrees/ before any destructive
operation (nativeWorktreeRemove --force, rmSync). Paths outside containment
get a non-force git worktree remove only, with a warning logged.
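
The containment check can be sketched with plain segment normalization. This version assumes absolute POSIX-style paths and does not resolve symlinks; the real isInsideWorktreesDir may rely on node:path and have a different signature.

```typescript
// Split a path into segments, resolving "." and ".." lexically.
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return out;
}

// True only when target lies strictly below <basePath>/.gsd/worktrees.
function isInsideWorktreesDir(basePath: string, target: string): boolean {
  const root = normalizeSegments(`${basePath}/.gsd/worktrees`);
  const candidate = normalizeSegments(target);
  return (
    candidate.length > root.length &&
    root.every((segment, i) => candidate[i] === segment)
  );
}
```

Destructive calls would then be gated on this check, with anything outside the directory limited to a non-force removal.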

Also guard the fallback rmSync in teardownAutoWorktree() with the same
containment check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: convert worktree-teardown-safety test to node:test format

Replace main() function wrapper pattern with proper describe/it blocks
using node:test, matching the project's standard test conventions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: replace empty catch blocks with logWarning in worktree-manager

* fix(test): increase removeWorktree slice window from 3000 to 6000 chars

The PR's path-validation additions expanded removeWorktree beyond the
3000-char snapshot window, pushing the submodule safety section out of
scope and causing Tests 3 and 4 to see empty strings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:51:19 -04:00
Tom Boucher
fb63c15bc9 fix: prevent auto-mode from dispatching deferred slices (#3309)
* fix: prevent auto-mode from dispatching deferred slices (#2661)

When a decision deferred a slice via gsd_decision_save, the deferral was
recorded in DECISIONS.md but the slice status in the DB remained "active".
The dispatcher reads slice status (not DECISIONS.md) to choose work, so it
kept dispatching the deferred slice — burning tokens on the wrong work.

Three changes fix the split-brain:

1. status-guards.ts: add isDeferredStatus() and isInactiveStatus() predicates
2. state.ts: skip slices with "deferred" status in active-slice selection
3. db-writer.ts: when saveDecisionToDb detects a deferral decision referencing
   a slice (M###/S## pattern in scope/choice/decision), update that slice's DB
   status to "deferred" so the state machine sees it immediately

Closes #2661

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(db-writer): consistent deferral regex + add extractDeferredSliceRef tests

The scope regex was missing "ring" and "s" variants (deferring, defers)
while choice/decision had the complete pattern. Unified all three to use
/\bdefer(?:ral|red|ring|s)?\b/i.
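
With the unified regex, the extraction could be sketched as follows. The DecisionFields shape, the return type, and the exact digit counts in the M###/S## pattern are assumptions; returning the first matching reference is also an assumption.

```typescript
const DEFER_RE = /\bdefer(?:ral|red|ring|s)?\b/i;
// Assumes three digits for the milestone and two for the slice.
const SLICE_REF_RE = /\b(M\d{3})\/(S\d{2})\b/;

interface DecisionFields {
  scope: string;
  choice: string;
  decision: string;
}

function extractDeferredSliceRef(
  fields: DecisionFields,
): { milestoneId: string; sliceId: string } | null {
  const texts = [fields.scope, fields.choice, fields.decision];
  // The same deferral pattern is applied to all three fields.
  if (!texts.some((text) => DEFER_RE.test(text))) return null;
  for (const text of texts) {
    const match = SLICE_REF_RE.exec(text);
    if (match) return { milestoneId: match[1], sliceId: match[2] };
  }
  return null;
}
```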

Added 8 unit tests for extractDeferredSliceRef covering: scope/choice/
decision deferral detection, missing M###/S## patterns, "deferring" and
"defers" variants, multiple pattern matches, and non-deferral keywords.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:50:44 -04:00
Tom Boucher
e0a5f6866e fix: preserve completed slice status on plan-milestone re-plan (#3318)
* fix: preserve completed slice status on plan-milestone re-plan (#2558)

When plan-milestone re-plans a milestone that has already-completed slices,
the handler now checks existing slice status before inserting. Completed
slices retain their status instead of being reset to "pending".

Three changes:

1. handlePlanMilestone() checks getSlice() before insertSlice() and passes
   the existing completed/done status instead of hardcoding "pending".

2. insertSlice() changed from INSERT OR IGNORE to INSERT ... ON CONFLICT
   upsert that updates non-status fields (title, risk, depends, demo,
   planning metadata) but preserves completed/done status at the DB layer.

3. reconcileWorktreeDb() slice and task merges now use LEFT JOIN to detect
   existing completed rows in the main DB and never downgrade their status
   when merging stale worktree data.

Closes #2558

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: relax completed-slice guard to allow re-plan when slices are retained

The #2960 guard blocked re-planning entirely when any completed slices
existed, conflicting with the #2558 preserve-completed-status logic.
Now only blocks when the new plan would drop completed slices.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(gsd-db): prevent insertSlice ON CONFLICT from wiping populated fields

The ON CONFLICT clause unconditionally overwrote all non-status fields with
excluded values. Callers like complete-task.ts and complete-slice.ts use
insertSlice as an idempotent "ensure row exists" guard with only id and
milestoneId, causing defaults (empty strings, 0) to silently destroy
populated titles, demos, goals, and planning data.

Fix: use raw sentinel bind params (NULL when caller omitted the field) in
CASE guards so the ON CONFLICT UPDATE only overwrites fields the caller
actually provided. Initial INSERTs still get proper defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:49:59 -04:00
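
Note on the upsert rules in the commit above: the actual fix is expressed as SQL CASE guards inside insertSlice. The sketch below is only an in-memory model of the two rules the message describes: a finished status is never downgraded, and a field is overwritten only when the caller actually supplied it. The names upsertSlice, SliceRow, and SliceInput, the status values, and the use of undefined in place of the NULL sentinel are assumptions for illustration.

```typescript
// Hypothetical in-memory model of the guard rules; the real logic lives in SQL.
type SliceStatus = "pending" | "active" | "complete" | "done";

interface SliceRow {
  id: string;
  milestoneId: string;
  title: string;
  demo: string;
  status: SliceStatus;
}

// Optional fields model the sentinel: undefined means "caller omitted this field".
interface SliceInput {
  id: string;
  milestoneId: string;
  title?: string;
  demo?: string;
  status?: SliceStatus;
}

const FINISHED: ReadonlySet<SliceStatus> = new Set<SliceStatus>(["complete", "done"]);

function upsertSlice(table: Map<string, SliceRow>, input: SliceInput): SliceRow {
  const key = `${input.milestoneId}/${input.id}`;
  const existing = table.get(key);
  const row: SliceRow = existing
    ? {
        ...existing,
        // Conflict path: keep populated fields unless the caller provided a value.
        title: input.title ?? existing.title,
        demo: input.demo ?? existing.demo,
        // Never downgrade a finished slice, whatever the caller passes.
        status: FINISHED.has(existing.status)
          ? existing.status
          : (input.status ?? existing.status),
      }
    : {
        // Initial insert: omitted fields still get their defaults.
        id: input.id,
        milestoneId: input.milestoneId,
        title: input.title ?? "",
        demo: input.demo ?? "",
        status: input.status ?? "pending",
      };
  table.set(key, row);
  return row;
}
```

With this shape, an "ensure row exists" call that passes only id and milestoneId leaves an existing row untouched, which is the behavior the third fix in the commit restores.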
Tom Boucher
0908adcb4e fix: reopen DB on cold resume, recognize heavy check mark (#3319)
* fix: reopen DB on cold resume and recognize U+2714 check mark

The paused-session resume path in auto.ts called rebuildState/deriveState
without first opening the project database, causing state derivation to
fall back to markdown parsing. The fallback misparsed roadmap table rows that
use glyph done markers and could redispatch the wrong slices.

Export openProjectDbIfPresent from auto-start.ts and call it in the
resume path before rebuildState, matching the fresh bootstrap ordering.

Also add U+2714 (heavy check mark) to the table parser done-detection
regex alongside the existing U+2705/U+2611/U+2713 glyphs.

Closes #2940

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove duplicate openProjectDbIfPresent from rebase conflict

The rebase onto main introduced a duplicate `openProjectDbIfPresent`
function declaration (one from the PR, one from main), causing TS2393.
Keep the exported version that uses the static import.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use logWarning instead of process.stderr in openProjectDbIfPresent

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:49:29 -04:00
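
Note on the done-marker glyphs in the commit above: a minimal sketch of glyph-based done detection for markdown table rows. The four code points are the ones listed in the commit message. The helper names, the row layout (slice ID in the first cell), and the rule that any cell containing a glyph marks the row as done are assumptions, not the project's table parser.

```typescript
// Hypothetical sketch of done detection for roadmap table rows.
// U+2705 (white heavy check mark), U+2611 (ballot box with check),
// U+2713 (check mark), U+2714 (heavy check mark, the glyph this commit adds).
const DONE_GLYPH_RE = /[\u2705\u2611\u2713\u2714]/;

function isDoneCell(cell: string): boolean {
  return DONE_GLYPH_RE.test(cell);
}

// Assumes rows like "| S01 | Title | <marker> |" with the slice ID in the first cell.
function doneSliceIds(tableRows: string[]): string[] {
  const ids: string[] = [];
  for (const row of tableRows) {
    const cells = row
      .split("|")
      .map((cell) => cell.trim())
      .filter((cell) => cell.length > 0);
    if (cells.length > 1 && cells.slice(1).some(isDoneCell)) {
      ids.push(cells[0]);
    }
  }
  return ids;
}
```

Header and separator rows contain none of the glyphs, so they fall through without special handling.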
Tom Boucher
b1ae782876 fix: dashboard model label shows dispatched model, not stale previous unit (#3320)
* fix: dashboard model label shows dispatched model, not stale previous unit model

Move updateProgressWidget and ensurePreconditions after selectAndApplyModel
in phases.ts so the widget's first render tick reads the correct model.
Store currentDispatchedModelId in session state after model selection + hook
overrides, expose it via widgetStateAccessors, and update auto-dashboard.ts
to prefer the dispatched model ID over cmdCtx.model which can be stale.

Closes #2899

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): widen runUnitPhase slice in ordering test to accommodate grown function body

The structural test for selectAndApplyModel ordering sliced only 8000
chars of runUnitPhase, but the function grew past that limit after
rebase, causing updateProgressWidget to fall outside the window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:48:50 -04:00
Tom Boucher
9b6ff01471 docs: add provider setup guide for third-party LLM providers (#3294)
* docs: add provider setup guide and improve onboarding hints

Fixes #2161

Add docs/providers.md with step-by-step setup instructions for every
supported LLM provider: OpenRouter, Ollama, LM Studio, vLLM, SGLang,
and all built-in providers. Includes env var names, example configs,
common pitfalls, and verification steps.

Improve onboarding wizard:
- Add URL hints to provider selection list
- Show common local endpoints when choosing Custom (OpenAI-compatible)
- Add post-setup guidance for OpenRouter and custom endpoints
- Reference docs/providers.md for compat troubleshooting

Update cross-references in getting-started.md, troubleshooting.md,
docs/README.md, and help-text.ts to link to the new guide.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: verify config help mentions OpenRouter, Ollama, and docs/providers.md

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 00:48:19 -04:00
github-actions[bot]
647954098a release: v2.63.0 2026-04-05 03:48:56 +00:00
Jeremy McSpadden
ab38f60cc2 Merge pull request #3525 from jeremymcs/fix/diagnostic-messaging
fix(gsd): enrich vague diagnostic messages with root-cause context
2026-04-04 21:20:43 -05:00
Jeremy
53639aee5d fix(gsd): enrich vague diagnostic messages with root-cause context
Closes #3524
2026-04-04 21:04:26 -05:00
github-actions[bot]
b8cb665ba5 release: v2.62.1 2026-04-05 01:33:15 +00:00
github-actions[bot]
546c93cae7 release: v2.62.1 2026-04-05 01:33:15 +00:00
Jeremy McSpadden
9e3ee58619 Merge pull request #3514 from jeremymcs/fix/ask-user-questions-dedup
fix(gsd): prevent duplicate ask_user_questions dispatches
2026-04-04 20:12:48 -05:00
Jeremy
ca55b5c269 fix(test): reset dedup cache between ask-user-freetext tests
The per-turn dedup cache introduced in the parent commit persists across
test cases since they all use the same question signature. Test 1 populates
the cache, causing tests 2 and 3 to get cached results instead of exercising
their intended code paths.
2026-04-04 19:59:54 -05:00
Jeremy McSpadden
74d5200cfd Merge pull request #3516 from jeremymcs/feat/mcp-server-readonly-tools
feat(mcp-server): add 6 read-only tools for project state queries
2026-04-04 19:56:49 -05:00
Jeremy McSpadden
34baf57e40 Merge pull request #3518 from jeremymcs/fix/bootstrap-prefer-gsd-preferences
fix(gsd): prefer PREFERENCES.md over settings.json for bootstrap model
2026-04-04 19:50:09 -05:00
Jeremy McSpadden
56153c9096 Merge pull request #3520 from jeremymcs/fix/worktree-orphaned-wal-shm
fix(db): delete orphaned WAL/SHM files alongside empty gsd.db
2026-04-04 19:47:14 -05:00
Jeremy
e4987f5337 fix(db): delete orphaned WAL/SHM files alongside empty gsd.db (#2478)
syncProjectRootToWorktree deleted empty gsd.db but left companion
-wal and -shm files on disk. On Node 24, node:sqlite attempts WAL
recovery from orphaned files, triggering a synchronous CPU spin loop
(227% CPU, 1.4GB RSS). Now deletes gsd.db-wal and gsd.db-shm when
the main DB is deleted or already missing.
2026-04-04 19:36:43 -05:00
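
Note on the WAL/SHM cleanup in the commit above: a minimal sketch of the described behavior using Node's fs module. When the main database file is empty or already missing, the -wal and -shm companions are removed as well. The helper name removeEmptyDbAndCompanions and the scratch-directory usage are assumptions; the real logic lives in syncProjectRootToWorktree.

```typescript
import { existsSync, mkdtempSync, rmSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper; returns true when cleanup ran, false when the DB has content.
function removeEmptyDbAndCompanions(dbPath: string): boolean {
  if (existsSync(dbPath) && statSync(dbPath).size > 0) {
    return false; // A populated database and its companion files are left alone.
  }
  // force: true makes rmSync ignore paths that do not exist.
  for (const path of [dbPath, `${dbPath}-wal`, `${dbPath}-shm`]) {
    rmSync(path, { force: true });
  }
  return true;
}

// Usage: an empty gsd.db with orphaned companions in a scratch directory.
const scratchDir = mkdtempSync(join(tmpdir(), "gsd-demo-"));
const scratchDb = join(scratchDir, "gsd.db");
writeFileSync(scratchDb, "");
writeFileSync(`${scratchDb}-wal`, "stale");
writeFileSync(`${scratchDb}-shm`, "stale");
const cleaned = removeEmptyDbAndCompanions(scratchDb);
```

Treating a missing gsd.db the same as an empty one covers the case where an earlier run deleted the main file but left the companions behind.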
Jeremy McSpadden
b0697f24f6 Merge pull request #3519 from jeremymcs/fix/auto-wrapup-inflight-interrupt
fix(gsd): prevent auto-wrapup from interrupting in-flight tool calls
2026-04-04 19:26:02 -05:00
Jeremy
fd96a1a30b fix(gsd): prevent auto-wrapup from interrupting in-flight tool calls (#3512)
Gate triggerTurn behind getInFlightToolCount() === 0 for both soft
timeout and context-pressure wrapup messages. Add clearQueue() to
stopAuto() and pauseAuto() to flush late async follow-ups.
2026-04-04 19:14:23 -05:00
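
Note on the in-flight gate in the commit above: a minimal sketch of gating a wrap-up message behind an in-flight tool count of zero, with a clearQueue that drops anything still waiting. The commit message names getInFlightToolCount, triggerTurn, and clearQueue. Everything else here, including the createWrapupGate factory and the choice to hold a gated message until a later flush rather than drop it, is an assumption for illustration.

```typescript
// Hypothetical stand-ins for the session hooks named in the commit message.
interface WrapupDeps {
  getInFlightToolCount(): number;
  triggerTurn(message: string): void;
}

function createWrapupGate(deps: WrapupDeps) {
  const pending: string[] = [];

  // Deliver at most one queued message, and only when no tool call is running.
  function flush(): void {
    if (deps.getInFlightToolCount() !== 0) return;
    const next = pending.shift();
    if (next !== undefined) deps.triggerTurn(next);
  }

  return {
    // Queue a wrap-up message (soft timeout, context pressure) and try to send it.
    request(message: string): void {
      pending.push(message);
      flush();
    },
    flush,
    // Called on stop/pause so late follow-ups never reach a stopped session.
    clearQueue(): void {
      pending.length = 0;
    },
  };
}
```

In this sketch a caller would invoke flush again whenever a tool call completes, so the wrap-up arrives once the turn is safe to interrupt.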
Jeremy McSpadden
246e706991 Merge pull request #3495 from Tibsfox/fix/tool-argument-json-robustness
fix(pi-ai): extend repairToolJson to handle XML tags and truncated numbers
2026-04-04 19:02:28 -05:00
Jeremy McSpadden
12ef95024c Merge pull request #3451 from deseltrus/fix/stale-retries-after-model-switch
fix: cancel stale retries after model switch
2026-04-04 18:27:33 -05:00
Jeremy McSpadden
dbaf37ae78 Merge pull request #3491 from Tibsfox/fix/claude-code-skill-directory-support
fix(gsd): add Claude Code official skill directories to skill resolution
2026-04-04 18:24:51 -05:00
Jeremy McSpadden
677ca806df Merge pull request #3494 from Tibsfox/fix/decision-save-transaction-race
fix(gsd): wrap decision and requirement saves in transaction to prevent ID races
2026-04-04 18:23:15 -05:00
Jeremy
eed833138d chore: untrack .repowise/ and add to .gitignore
Machine-local indexing state (LanceDB, sync cursors, job files) was
being tracked in Git, causing merge conflicts and stale cursor
propagation across branches. Gitignore alone doesn't affect
already-tracked files, so this removes them from the index while
keeping them on disk.
2026-04-04 18:22:39 -05:00
Jeremy
70c76d9a1a fix(gsd): handle bare model IDs in resolveDefaultSessionModel (#3517)
resolveDefaultSessionModel() previously only returned a result for
provider/model format strings, silently ignoring valid bare model IDs
like "gpt-5.4". This meant preferences could fail to override stale
settings.json defaults when users configured models without explicit
provider prefixes.

Now accepts sessionProvider param (ctx.model?.provider) to resolve bare
IDs. Also handles object configs without explicit provider field.
2026-04-04 18:10:50 -05:00
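
Note on bare model IDs in the commit above: a minimal sketch of resolving a configured model that may be a "provider/model" string, a bare ID such as "gpt-5.4", or an object without a provider field. The function name resolveModelPreference and the ModelRef and ModelPreference shapes are assumptions for illustration, not the signature of resolveDefaultSessionModel.

```typescript
// Hypothetical shapes; the project's real config types may differ.
interface ModelRef {
  provider: string;
  id: string;
}

type ModelPreference = string | { provider?: string; model: string };

function resolveModelPreference(
  pref: ModelPreference,
  sessionProvider?: string,
): ModelRef | null {
  if (typeof pref !== "string") {
    // Object config: fall back to the session provider when none is given.
    const provider = pref.provider ?? sessionProvider;
    return provider && pref.model ? { provider, id: pref.model } : null;
  }
  const slash = pref.indexOf("/");
  if (slash === -1) {
    // Bare ID: only resolvable when the session already knows a provider.
    return sessionProvider && pref ? { provider: sessionProvider, id: pref } : null;
  }
  // "provider/model": split on the first slash so the model ID may itself contain "/".
  const provider = pref.slice(0, slash);
  const id = pref.slice(slash + 1);
  return provider && id ? { provider, id } : null;
}
```

Returning null when no provider can be determined leaves the fallback decision to the caller instead of guessing one.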
Tibsfox
c70eacea89 fix(gsd): wrap decision and requirement saves in transaction to prevent ID races
nextDecisionId() and nextRequirementId() compute the next ID via
SELECT MAX then pass it to a separate upsertDecision/upsertRequirement
call. When parallel tool calls hit these functions concurrently, both
read the same MAX value and produce the same ID — the second insert
silently overwrites the first.

Move the SELECT MAX + INSERT into a single transaction() call from
gsd-db.ts, which uses BEGIN/COMMIT/ROLLBACK and works on both
better-sqlite3 and node:sqlite providers. The transaction is
re-entrant safe (nested calls skip the BEGIN).

Same fix applied to saveRequirementToDb for consistency.

Closes #3326, closes #3339, closes #3459
2026-04-04 15:16:52 -07:00
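
Note on the transaction wrapper in the commit above: a minimal sketch of a re-entrant BEGIN/COMMIT/ROLLBACK helper of the kind the message describes, where nested calls skip the BEGIN. It assumes only an exec(sql) method, which both better-sqlite3 and node:sqlite database objects provide. The names SqlExecutor and createTransactionRunner and the depth counter are assumptions, not the transaction() implementation in gsd-db.ts.

```typescript
// Minimal surface assumed from a synchronous SQLite driver.
interface SqlExecutor {
  exec(sql: string): void;
}

// Hypothetical factory returning a re-entrant transaction runner for one database.
function createTransactionRunner(db: SqlExecutor) {
  let depth = 0;
  return function transaction<T>(fn: () => T): T {
    // Nested call: the outermost transaction owns BEGIN/COMMIT/ROLLBACK.
    if (depth > 0) return fn();
    db.exec("BEGIN");
    depth += 1;
    try {
      const result = fn();
      db.exec("COMMIT");
      return result;
    } catch (error) {
      // Undo everything done inside fn, then let the caller see the failure.
      db.exec("ROLLBACK");
      throw error;
    } finally {
      depth -= 1;
    }
  };
}
```

Running the SELECT MAX read and the insert inside one such callback keeps the two steps in a single transaction, so a second save cannot compute the same ID between them.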