When merging a milestone back to main, `git checkout main` fails if
untracked .gsd/ state files (STATE.md, completed-units.json, auto.lock)
in the working tree conflict with tracked files on the branch.
Remove these known GSD-managed state files before checkout. They are
runtime artifacts regenerated by doctor/rebuildState and are not
meaningful in the main working tree — the worktree had the real state.
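A minimal sketch of the pre-checkout cleanup described above. The file list comes from the commit text; the function names are illustrative, not GSD's actual implementation:

```typescript
import { existsSync, rmSync } from "node:fs";
import { join } from "node:path";

// The known GSD-managed runtime files that can block `git checkout main`.
const GSD_STATE_FILES = ["STATE.md", "completed-units.json", "auto.lock"];

function gsdStatePaths(projectRoot: string): string[] {
  return GSD_STATE_FILES.map((name) => join(projectRoot, ".gsd", name));
}

function removeGsdStateFiles(projectRoot: string): string[] {
  const removed: string[] = [];
  for (const p of gsdStatePaths(projectRoot)) {
    if (existsSync(p)) {
      rmSync(p); // safe: regenerated by doctor/rebuildState
      removed.push(p);
    }
  }
  return removed;
}
```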
OAuthSelectorComponent invoked its onSelect callback synchronously (no
await), but the callback was async: it called showLoginDialog, which
throws 'Login cancelled' on Escape. The resulting unhandled rejection
bubbled up to the uncaughtException handler and crashed GSD.
Wrap the async work in a named function with .catch() so cancellation
errors are swallowed gracefully. showLoginDialog already handles its
own error display internally.
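The shape of the fix can be sketched as follows. showLoginDialog and the cancellation-on-Escape behavior are from the commit text; the wrapper and handler names are illustrative:

```typescript
type LoginDialog = (provider: string) => Promise<void>;

// The component's handler stays synchronous, but the async work is wrapped
// in a named function whose rejection is caught.
function onProviderSelect(provider: string, showLoginDialog: LoginDialog): void {
  const runLogin = async () => {
    await showLoginDialog(provider);
  };
  runLogin().catch(() => {
    // Cancellation (Escape) is expected; showLoginDialog already renders
    // its own error display, so the rejection is swallowed here.
  });
}
```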
Worktree initialization only copied DECISIONS.md, REQUIREMENTS.md,
PROJECT.md, and QUEUE.md. The missing STATE.md caused the pre-dispatch
health check in doctor-proactive.ts to block dispatch with
'STATE.md missing'.
Add STATE.md, KNOWLEDGE.md, and OVERRIDES.md to the copy list so
worktrees start with complete planning state.
Every new pi session writes a fresh syncedAt timestamp to
managed-resources.json, causing a running auto-mode session to falsely
detect a GSD update and stop. The actual version (gsdVersion) only
changes on real upgrades.
Switch the staleness check from syncedAt (timestamp) to gsdVersion
(semver string) so that launching a second session no longer triggers
a false positive.
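A sketch of the switched check, assuming managed-resources.json has roughly this shape (field names mirror the commit text; the actual file may carry more fields):

```typescript
interface ManagedResources {
  gsdVersion: string; // semver string, changes only on real upgrades
  syncedAt?: string;  // timestamp, rewritten by every new session
}

// Compare version strings, not timestamps: a second session rewriting
// syncedAt no longer looks like an upgrade.
function isGsdUpdated(atStart: ManagedResources, now: ManagedResources): boolean {
  return atStart.gsdVersion !== now.gsdVersion;
}
```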
When multiple tool calls (e.g. concurrent gsd_save_decision) target the
same markdown file, the deterministic .tmp suffix caused ENOENT on
rename() because one caller renamed the shared temp file away before the
other's rename() could run.
Replace the static `.tmp` suffix with a per-call random suffix so each
concurrent writer gets its own temp file. Also clean up orphaned temp
files on rename failure.
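A minimal sketch of the per-call random suffix and cleanup-on-failure described above (the real GSD write path may differ in details):

```typescript
import { randomBytes } from "node:crypto";
import { promises as fs } from "node:fs";

// Atomic write: each concurrent caller gets its own temp file, so no two
// writers can race on the same rename() source.
async function atomicWrite(path: string, content: string): Promise<void> {
  const tmp = `${path}.${randomBytes(6).toString("hex")}.tmp`;
  await fs.writeFile(tmp, content, "utf8");
  try {
    await fs.rename(tmp, path);
  } catch (err) {
    await fs.rm(tmp, { force: true }); // don't leave orphaned temp files
    throw err;
  }
}
```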
The .planning → .gsd migration creates roadmaps and summaries but not
VALIDATION files. deriveState() requires a terminal validation file
(verdict: pass) to consider a milestone complete. Without it, every
migrated milestone enters validating-milestone phase, blocking progress
to the actual current milestone.
For milestones where all slices are done, write a pass-through
VALIDATION.md (verdict: pass, migrated: true) and SUMMARY.md so
deriveState() skips them correctly.
Updated the integration test to verify that VALIDATION/SUMMARY files are
written and that deriveState returns the 'complete' phase with
activeMilestone pointing to the last completed entry.
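The pass-through file might look like this. The frontmatter keys (verdict, migrated) come from the commit text; everything else is an assumed sketch:

```typescript
// Build the pass-through VALIDATION.md content for an already-complete
// migrated milestone, so deriveState() sees verdict: pass and skips it.
function passThroughValidation(milestoneId: string): string {
  return [
    "---",
    "verdict: pass",
    "migrated: true",
    "---",
    "",
    `Auto-generated during .planning → .gsd migration for ${milestoneId}.`,
    "All slices were already complete; no validation was re-run.",
  ].join("\n");
}
```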
When unique_milestone_ids is enabled, the LLM cannot generate random
suffixes itself. Previously only the first milestone got a correct ID
(pre-generated in TS), while subsequent milestones in multi-milestone
projects got bare M002/M003 without suffixes.
Added a gsd_generate_milestone_id tool that the LLM calls to get each
milestone ID. The tool scans disk for existing milestones and respects
the unique_milestone_ids preference, making it impossible to produce
wrong-format IDs.
Updated discuss, discuss-headless, and queue prompts to instruct the
LLM to use the tool instead of inventing milestone IDs.
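The core of such a tool could be sketched as below, assuming IDs look like M002 (plain) or M002-a1b2c3 (with unique_milestone_ids). The suffix format is an assumption, not necessarily GSD's actual scheme:

```typescript
import { randomBytes } from "node:crypto";

// Given the milestone IDs already on disk, produce the next ID in the
// correct format so the LLM never has to invent one.
function nextMilestoneId(existing: string[], uniqueIds: boolean): string {
  const max = existing.reduce((m, id) => {
    const n = parseInt(id.slice(1, 4), 10); // "M002-abc" -> 2
    return Number.isFinite(n) && n > m ? n : m;
  }, 0);
  const base = `M${String(max + 1).padStart(3, "0")}`;
  return uniqueIds ? `${base}-${randomBytes(3).toString("hex")}` : base;
}
```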
* feat: worker NDJSON monitoring, budget enforcement, PID-based stop fallback
Closes three gaps in parallel orchestration:
1. **Worker stdout monitoring** — Workers now run with `--mode json` so
they emit NDJSON events. The coordinator parses stdout line-by-line,
extracting cost/token data from `message_end` events. This keeps
per-worker cost tracking in sync with actual API spend and updates
session status files for live dashboard visibility.
2. **Budget enforcement before spawn** — `startParallel()` now checks
`isBudgetExceeded()` before each worker spawn. When the aggregate
cost across all workers reaches the configured ceiling, no new
workers are started.
3. **PID-based stop fallback** — `stopParallel()` now falls back to
`process.kill(pid, "SIGTERM")` when the ChildProcess handle is null
(e.g., after coordinator restart when handles aren't available).
Previously, orphaned workers could not be stopped.
Includes 11 new tests covering NDJSON format validation, cost
aggregation, budget ceiling comparison, and PID-based kill patterns.
All 54 existing parallel-orchestration tests still pass.
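Item 1's stdout parsing can be sketched like this. The event shape (`type`, `usage.cost`) is assumed for illustration, not GSD's actual NDJSON schema:

```typescript
// Line-by-line NDJSON parser for a worker's stdout stream. Chunks may
// split a JSON line in the middle, so a trailing partial line is buffered.
function createNdjsonParser(onCost: (cost: number) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the trailing partial line
    for (const line of lines) {
      if (!line.trim()) continue;
      try {
        const ev = JSON.parse(line);
        if (ev.type === "message_end" && typeof ev.usage?.cost === "number") {
          onCost(ev.usage.cost);
        }
      } catch {
        // ignore malformed lines rather than crash the coordinator
      }
    }
  };
}
```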
Relates to #672
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: currentUnit type must match SessionStatus interface (object | null, not string)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: render native web search tool calls in TUI
The Anthropic streaming parser silently dropped server_tool_use and
web_search_tool_result content blocks, making native web search
invisible. Add ServerToolUseContent and WebSearchResultContent types,
handle both block types in the streaming parser and conversation replay,
and render them as ToolExecutionComponent in the interactive TUI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add PREFER_BRAVE_SEARCH env var to bypass native web search
Set PREFER_BRAVE_SEARCH=1 to keep Brave/custom search tools active
on Anthropic models instead of injecting native server-side web search.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: skip non-toolCall blocks in Mistral provider conversation replay
The ServerToolUseContent and WebSearchResultContent types added for
native web search don't have id/name/arguments properties, causing
TypeScript errors when the Mistral provider tried to push them as
tool calls.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: meaningful commit messages from task summaries (#785, #784)
Post-task commits now derive messages from the task summary one-liner,
inferred type, and key files. Planning prompts respect commit_docs: false.
Commit type inference expanded with perf type and oneLiner parameter.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: replace invalidateStateCache with invalidateAllCaches in crash recovery
PR #799 reintroduced invalidateStateCache() calls in the phantom skip
loop crash recovery paths. These should use invalidateAllCaches() which
is the renamed function.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve CI failures from main merge conflicts
- Replace invalidateStateCache() → invalidateAllCaches() in crash
recovery paths (reintroduced by PR #799 merge)
- Expand smart-entry-draft test chunk window from 3000 to 4000 chars
to accommodate commitInstruction additions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picked from #797 — adds clearArtifacts() to gsd-db.ts and wires
it into invalidateAllCaches() so the DB artifact table is cleared
alongside state/path/parse caches.
Co-Authored-By: 0xLeathery (PR #797)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a crash lock references a unit from a fully-completed milestone,
crash recovery injects stale context that fights the skip-loop breaker,
creating an infinite evict/repair cycle with selfHealRuntimeRecords.
Fix 1: Validate recovered unit's milestone before synthesizing recovery
context. If the milestone has a SUMMARY file (complete), discard the
stale recovery context and clear the lock without injection.
Fix 2: Skip-loop breaker cross-checks whether the evicted unit belongs
to a completed milestone. If so, the eviction is counterproductive —
clear the skip counter and re-dispatch from fresh state instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: add .audits/ to .gitignore
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add auto-learned project memory system
Extracts durable knowledge from completed unit activity logs via
background LLM calls (Haiku-preferred) and injects ranked memories
into the system prompt. Includes DB schema v3 migration, memory store
CRUD with confidence/hit-count ranking, secret redaction, decay, and
cap enforcement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add timestamp to UserMessage in memory extractor
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update schema version assertion in md-importer test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause:
The skip-loop breaker in dispatchNextUnit() called invalidateStateCache()
which only clears the in-memory _stateCache. Path caches (nativeTreeCache,
dirEntryCache) and parse caches (_parseCache) remained warm. On the next
deriveState() call, the nativeBatchParseGsdFiles/cachedLoadFile path used
the stale cache to build fileContentCache, returning the same stale unit
— looping forever even though disk had the correct [x] state.
rebuildState() in doctor.ts had the same bug: it called deriveState()
without invalidating caches first, writing stale state to STATE.md.
Fix:
1. auto.ts: replace all invalidateStateCache() calls in dispatch hot paths
with invalidateAllCaches() so path, parse, and state caches are all
cleared before the next deriveState() call.
2. doctor.ts: call invalidateAllCaches() at the top of rebuildState() so
STATE.md is always written from fresh disk reads.
3. Remove now-unused invalidateStateCache import from auto.ts.
Test:
Added #793 test to auto-recovery.test.ts: creates a fixture with T01
active, simulates task completion on disk (plan [x] + summary written),
calls invalidateAllCaches(), then asserts deriveState() returns T02 as
active (not T01 again). 30 passed, 0 failed.
Root cause:
When auto-mode stops via crash (not graceful stop), the milestone branch
HEAD lags behind the filesystem state at the project root. This is because
syncStateToProjectRoot() runs after every task completion and writes [x]
checkboxes to the project root, but the auto-commit that would record
those changes to the milestone branch may not have fired before the crash.
On restart, createAutoWorktree() re-attaches the worktree to the milestone
branch HEAD (which still has [ ] for the crashed task). When dispatchNextUnit()
then runs verifyExpectedArtifact(), it reads the plan from the worktree and
finds [ ], treats the completion record as stale, evicts the key, and
re-dispatches — entering an infinite skip/dispatch loop.
Fix:
Add reconcilePlanCheckboxes(projectRoot, wtPath, milestoneId) which is called
when re-attaching to an existing branch. It walks every markdown file in the
milestone directory at the project root and forward-applies any [x] checkbox
states that are ahead of the worktree version (never downgrades [x] → [ ]).
It also forward-merges completed-units.json.
The project root is the correct source of truth here because
syncStateToProjectRoot() is called after every task completion and keeps
it up-to-date. If the branch HEAD is behind due to a crash, the root
filesystem copy is the last known good state.
This is safe: we only ever advance checkbox state ([ ] → [x]), never
regress it. The milestone branch content is preserved for all other files.
Test:
Added test case to auto-worktree.test.ts covering the exact scenario:
- T01 [x] committed to milestone branch, T02 [x] written to project root
(by syncStateToProjectRoot) but commit never fired
- Worktree re-created from milestone branch HEAD (T02 still [ ])
- After reconciliation, worktree plan has T02 [x], T03 stays [ ]
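The forward-only checkbox merge can be sketched at the file-content level (the real reconcilePlanCheckboxes walks every markdown file in the milestone directory; this illustrative helper handles one plan):

```typescript
// For each line, a project-root "[x]" wins over a worktree "[ ]" on the
// otherwise-identical line — never the reverse, so state only advances.
function mergeCheckboxes(rootPlan: string, worktreePlan: string): string {
  const rootLines = rootPlan.split("\n");
  return worktreePlan
    .split("\n")
    .map((line, i) => {
      const rootLine = rootLines[i];
      if (
        rootLine !== undefined &&
        line.includes("[ ]") &&
        rootLine.replace("[x]", "[ ]") === line
      ) {
        return rootLine; // advance [ ] -> [x]; never downgrade
      }
      return line;
    })
    .join("\n");
}
```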
Issues addressed:
1. guided-flow.ts: Remove 12 unnecessary 'ctx as any' casts
- ctx is already ExtensionCommandContext, matching showNextAction/showConfirm signatures
- The casts masked type-checking with no benefit
2. triage-ui.ts: Remove 1 unnecessary 'ctx as any' cast (same issue as #1)
3. migrate/command.ts: Remove 2 unnecessary 'ctx as any' casts (same issue as #1)
4. models-resolver.ts: Remove dead exports hasBothModelsFiles() and getModelsPaths()
- Never imported outside the module or in any test file
- resolveModelsJsonPath() (the only consumer) remains
5. resource-loader.ts: Remove dead export readManagedResourceSyncedAt()
- Exported but never imported anywhere in the entire codebase
6. bg-shell/overlay.ts: Extract processStatusHeader() helper
- DRYs the duplicated status icon + name + uptime + tab indicator
construction shared between renderOutput() and renderEvents()
7. get-secrets-from-user.ts: Merge duplicate vercel/convex deployment blocks
- Both had identical exec → check result code → push applied/errors pattern
- Merged into single conditional with destination-specific command string
Documented but not changed (boundary constraints):
- src/mcp-server.ts ↔ src/resources/extensions/gsd/mcp-server.ts
(compiled/jiti boundary prevents sharing)
- src/remote-questions-config.ts ↔ remote-questions/remote-command.ts
(same compiled/jiti boundary per #592)
- cli.ts internal duplication of session setup (structural, different resource loader configs)
npm ≥7 suppresses lifecycle script output by default, so the clack
banner/spinner was invisible during `npm install -g`. The user-facing
onboarding experience already lives at first `gsd` launch (onboarding.ts),
making the postinstall UI redundant dead code.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow-up to #774. When GSD runs in worktree isolation mode,
completed-units.json can fragment across project root and worktree
locations. If a session crashes or the worktree is removed after
milestone merge, keys written to the worktree are lost — causing
already-completed units to be re-dispatched.
Two fixes:
1. syncStateToProjectRoot() now performs a set-union merge of
completed-units.json from worktree into project root.
2. After worktree entry at startup, loadPersistedKeys() runs against
both project root and worktree so the in-memory completedKeySet
contains the union of both locations.
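Fix 1's merge is a plain set union. Assuming completed-units.json holds an array of completion keys (the actual shape may differ):

```typescript
// Union the completion keys from both locations so no completed unit is
// lost when either copy lags behind.
function mergeCompletedUnits(rootKeys: string[], worktreeKeys: string[]): string[] {
  return [...new Set([...rootKeys, ...worktreeKeys])].sort();
}
```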
Co-authored-by: Lex Christopherson <lex@glittercowboy.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: interactive update prompt on startup (#770)
When a newer version of gsd-pi is available, show an interactive
prompt at startup with two options:
[1] Update now (runs npm install -g gsd-pi@latest)
[2] Skip
- Adds checkAndPromptForUpdates() to update-check.ts
- Reuses existing 24h cache so the registry is hit at most once/day
- Shows a boxed banner with current → latest versions
- Runs npm install -g gsd-pi@latest if the user picks [1]
- Exits after a successful update so the user relaunches with the new build
- Cleans up stdin state (listeners + raw mode) so the TUI starts cleanly
- Updates cli.ts to call checkAndPromptForUpdates() instead of the
fire-and-forget checkForUpdates() in interactive mode
- Skipped in print/RPC/MCP/headless modes (isPrintMode guard)
* fix: update-check prompt cleanup and robustness (#770)
- Remove duplicate NPM_PACKAGE constant (was shadowing NPM_PACKAGE_NAME)
- Fix hardcoded box width: measure visible text width dynamically so the
border aligns correctly for any version string length
- Add 30s timeout to rl.question so the prompt auto-skips in non-TTY
or piped-stdin edge cases that slip past the isPrintMode guard
* fix: address review feedback on update prompt (#770)
Three issues from @glittercowboy's review:
1. Box rendering bug: mid line was built as '║' + content + '║' then
sliced with .slice(1,-1) which cuts into ANSI escape sequences.
Fix: build midContent without delimiters and wrap with chalk.yellow('║')
directly, keeping a separate plain-text midVisible for width measurement.
2. Missing TTY guard: !isPrintMode alone isn't sufficient — a piped
stdin without --print would sit waiting 30s silently.
Fix: gate checkAndPromptForUpdates() on process.stdin.isTTY; fall back
to the passive checkForUpdates() banner for non-TTY interactive mode.
3. Dead import: checkForUpdates was imported but unused after the
previous refactor. Now used again as the non-TTY fallback — no
dead code.
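Issue 1's fix can be sketched like this: measure width on a plain-text copy and add the colored border whole, instead of slicing a string that contains ANSI escapes. The strip regex is a stand-in for whatever GSD actually uses:

```typescript
// Minimal ANSI escape stripper for width measurement.
const ANSI = /\x1b\[[0-9;]*m/g;

// Build one box line: pad based on *visible* width, then wrap with the
// border characters — never slice into a colored string.
function boxLine(content: string, innerWidth: number): string {
  const visible = content.replace(ANSI, "");
  const pad = " ".repeat(Math.max(0, innerWidth - visible.length));
  return `║${content}${pad}║`;
}
```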
* fix: downgrade missing_tasks_dir to warning for completed slices (#726)
When a worktree is removed and artifacts are rebuilt, tasks/ directories
aren't recreated. For completed slices this is cosmetic scaffolding, not
a structural error. Downgrade severity from "error" to "warning" so
completed milestones can render in /gsd visualize.
Also skip the missing_slice_plan warning entirely for completed slices,
since a plan file serves no purpose after completion.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use slice.done instead of non-existent frontmatter.status
The SummaryFrontmatter type doesn't have a `status` property.
Use `slice.done` from the roadmap parser instead, which is the
canonical completion signal already available in scope.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When auto-mode pauses due to a rate limit, schedule automatic resumption
after the rate limit window elapses. Shows a countdown notification so
the user knows what's happening. Non-rate-limit errors still pause
indefinitely for manual intervention.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes for the worktree isolation stuck-state bug:
1. selfHealRuntimeRecords on initial start used the function parameter
`base` (main project root) instead of `basePath` (worktree path after
entry). This meant stale runtime records in the worktree were never
found or healed, leaving dispatched records that block auto-mode.
2. syncStateToProjectRoot now copies runtime/units/ records alongside
milestone data. This provides defense-in-depth: even if selfHeal runs
before worktree re-entry, stale records from a prior sync are visible.
3. initMetrics and initRoutingHistory also corrected from `base` to
`basePath` — same class of bug (stale function parameter after
worktree entry).
Adds test verifying selfHealRuntimeRecords resolves artifacts and clears
records correctly when pointed at a worktree base path.
Anthropic rate limit reset windows are typically 60-120s. The previous 60s
default, combined with the +1s buffer in extractRetryAfterMs(), meant that
virtually all rate limit retries were immediately abandoned.
300s (5 min) covers the vast majority of rate limit windows and lets the
built-in retry logic work as intended.
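The interaction between the +1s buffer and the ceiling can be illustrated as below. The real extractRetryAfterMs signature is not shown in the commit; this sketch only demonstrates why a 60s cap abandoned nearly every retry while 300s does not:

```typescript
const MAX_RETRY_AFTER_MS = 300_000; // previously 60_000

// Parse a Retry-After header (seconds) into a wait in ms with a +1s
// buffer; returns null (abandon the retry) when absent or over the cap.
function extractRetryAfterMs(retryAfterHeader: string | null): number | null {
  const secs = retryAfterHeader ? Number(retryAfterHeader) : NaN;
  if (!Number.isFinite(secs)) return null;
  const ms = secs * 1000 + 1000;
  return ms <= MAX_RETRY_AFTER_MS ? ms : null;
}
```

With the old 60_000 cap, a typical 60–120s window (e.g. "90") produced 91000 ms, exceeded the cap, and was abandoned; under the 300s cap it is honored.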
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nothing reads ~/.gsd/agent/AGENTS.md, and the script was incorrectly
pointing it at agents/researcher.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>