The .planning → .gsd migration creates roadmaps and summaries but not
VALIDATION files. deriveState() requires a terminal validation file
(verdict: pass) to consider a milestone complete. Without it, every
migrated milestone enters validating-milestone phase, blocking progress
to the actual current milestone.
For milestones where all slices are done, write a pass-through
VALIDATION.md (verdict: pass, migrated: true) and SUMMARY.md so
deriveState() skips them correctly.
Updated integration test to verify VALIDATION/SUMMARY files are written
and deriveState returns 'complete' phase with activeMilestone pointing
to the last completed entry (expected behavior).
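
A minimal sketch of the pass-through write described above. The `MigratedSlice` shape, both function names, and the file body below the frontmatter are illustrative assumptions, not the migration's actual code; only the `verdict: pass` and `migrated: true` fields come from this message.

```typescript
// Hypothetical slice shape; the real migration types may differ.
interface MigratedSlice {
  id: string;
  done: boolean;
}

// A milestone qualifies for a pass-through validation only when every
// slice is done. Requiring at least one slice is an assumption made here.
function allSlicesDone(slices: MigratedSlice[]): boolean {
  return slices.length > 0 && slices.every((s) => s.done);
}

// Build a VALIDATION.md body carrying the two frontmatter fields named
// in this commit message.
function buildPassThroughValidation(milestoneId: string): string {
  return [
    "---",
    "verdict: pass",
    "migrated: true",
    "---",
    "",
    `# ${milestoneId} Validation`,
    "",
    "Pass-through validation written during the .planning to .gsd migration.",
    "",
  ].join("\n");
}
```

A caller would write this content only when `allSlicesDone()` returns true, leaving partially complete milestones to go through normal validation.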
When unique_milestone_ids is enabled, the LLM cannot generate random
suffixes itself. Previously only the first milestone got a correct ID
(pre-generated in TS), while subsequent milestones in multi-milestone
projects got bare M002/M003 without suffixes.
Added a gsd_generate_milestone_id tool that the LLM calls to get each
milestone ID. The tool scans disk for existing milestones and respects
the unique_milestone_ids preference, making it impossible to produce
wrong-format IDs.
Updated discuss, discuss-headless, and queue prompts to instruct the
LLM to use the tool instead of inventing milestone IDs.
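
The ID allocation behind such a tool could be sketched roughly as follows. The suffix format (`M002-ab12cd`), the regex, and the `nextMilestoneId` name are assumptions for illustration; the actual tool scans disk, whereas this sketch takes the existing IDs as an argument.

```typescript
// Assumed ID shapes: "M002" (bare) or "M002-ab12cd" (with random suffix).
const MILESTONE_ID = /^M(\d{3})(?:-[a-z0-9]+)?$/;

function nextMilestoneId(
  existing: string[],
  uniqueIds: boolean,
  // Injected so callers (and tests) can control the suffix.
  randomSuffix: () => string = () => Math.random().toString(36).slice(2, 8),
): string {
  // Find the highest sequence number among existing milestone IDs.
  let max = 0;
  for (const id of existing) {
    const m = MILESTONE_ID.exec(id);
    if (m) max = Math.max(max, Number(m[1]));
  }
  const base = `M${String(max + 1).padStart(3, "0")}`;
  // Only append a suffix when the unique_milestone_ids preference is on.
  return uniqueIds ? `${base}-${randomSuffix()}` : base;
}
```

Because the suffix is produced in code rather than by the LLM, every milestone in a multi-milestone project gets the same format as the first.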
* feat: worker NDJSON monitoring, budget enforcement, PID-based stop fallback
Closes three gaps in parallel orchestration:
1. **Worker stdout monitoring** — Workers now run with `--mode json` so
they emit NDJSON events. The coordinator parses stdout line-by-line,
extracting cost/token data from `message_end` events. This keeps
per-worker cost tracking in sync with actual API spend and updates
session status files for live dashboard visibility.
2. **Budget enforcement before spawn** — `startParallel()` now checks
`isBudgetExceeded()` before each worker spawn. When the aggregate
cost across all workers reaches the configured ceiling, no new
workers are started.
3. **PID-based stop fallback** — `stopParallel()` now falls back to
`process.kill(pid, "SIGTERM")` when the ChildProcess handle is null
(e.g., after coordinator restart when handles aren't available).
Previously, orphaned workers could not be stopped.
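
Items 1 and 2 can be sketched as below. The event shape (a numeric `cost` field on `message_end`), the `NdjsonCostTracker` class, and the `isBudgetExceeded` signature are assumptions for illustration; the real coordinator may read different fields.

```typescript
// Assumed event shape: only `type` and a numeric `cost` are read here.
interface WorkerEvent {
  type?: string;
  cost?: number;
}

class NdjsonCostTracker {
  private buffer = "";
  totalCost = 0;

  // Feed a raw stdout chunk. Complete lines are parsed; a trailing
  // partial line stays buffered until the next chunk arrives.
  push(chunk: string): void {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? "";
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      let parsed: unknown;
      try {
        parsed = JSON.parse(trimmed);
      } catch {
        continue; // ignore non-JSON output lines
      }
      if (typeof parsed !== "object" || parsed === null) continue;
      const event = parsed as WorkerEvent;
      if (event.type === "message_end" && typeof event.cost === "number") {
        this.totalCost += event.cost;
      }
    }
  }
}

// Budget check before a spawn: true once aggregate cost reaches the ceiling.
function isBudgetExceeded(workerCosts: number[], ceiling: number): boolean {
  const total = workerCosts.reduce((sum, c) => sum + c, 0);
  return total >= ceiling;
}
```

Buffering the trailing partial line matters because stdout chunks are not guaranteed to end on a newline.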
Includes 11 new tests covering NDJSON format validation, cost
aggregation, budget ceiling comparison, and PID-based kill patterns.
All 54 existing parallel-orchestration tests still pass.
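
The PID-based fallback from item 3 might look roughly like this. `WorkerRecord`, `stopWorker`, and the injected `killByPid` callback are hypothetical names; injection keeps the sketch testable without real processes.

```typescript
// Minimal stand-in for the part of ChildProcess used here.
interface KillableHandle {
  kill(signal: string): boolean;
}

interface WorkerRecord {
  pid: number;
  handle: KillableHandle | null; // null after a coordinator restart
}

type KillByPid = (pid: number, signal: string) => void;
type StopResult = "handle" | "pid" | "not-running";

function stopWorker(worker: WorkerRecord, killByPid: KillByPid): StopResult {
  // Prefer the live handle when the coordinator still has one.
  if (worker.handle !== null) {
    worker.handle.kill("SIGTERM");
    return "handle";
  }
  try {
    // Fallback: signal the process directly by PID.
    killByPid(worker.pid, "SIGTERM");
    return "pid";
  } catch {
    // In Node, process.kill throws when the target does not exist
    // or cannot be signalled.
    return "not-running";
  }
}
```

In a Node process the callback would typically be `(pid, signal) => process.kill(pid, signal)`.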
Relates to #672
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: currentUnit type must match SessionStatus interface (object | null, not string)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: render native web search tool calls in TUI
The Anthropic streaming parser silently dropped server_tool_use and
web_search_tool_result content blocks, making native web search
invisible. Add ServerToolUseContent and WebSearchResultContent types,
handle both block types in the streaming parser and conversation replay,
and render them as ToolExecutionComponent in the interactive TUI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add PREFER_BRAVE_SEARCH env var to bypass native web search
Set PREFER_BRAVE_SEARCH=1 to keep Brave/custom search tools active
on Anthropic models instead of injecting native server-side web search.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: skip non-toolCall blocks in Mistral provider conversation replay
The ServerToolUseContent and WebSearchResultContent types added for
native web search don't have id/name/arguments properties, causing
TypeScript errors when the Mistral provider tried to push them as
tool calls.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: meaningful commit messages from task summaries (#785, #784)
Post-task commits now derive messages from the task summary one-liner,
inferred type, and key files. Planning prompts respect commit_docs: false.
Commit type inference expanded with perf type and oneLiner parameter.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: replace invalidateStateCache with invalidateAllCaches in crash recovery
PR #799 reintroduced invalidateStateCache() calls in the phantom skip
loop crash recovery paths. These should use invalidateAllCaches() which
is the renamed function.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve CI failures from main merge conflicts
- Replace invalidateStateCache() → invalidateAllCaches() in crash
recovery paths (reintroduced by PR #799 merge)
- Expand smart-entry-draft test chunk window from 3000 to 4000 chars
to accommodate commitInstruction additions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picked from #797 — adds clearArtifacts() to gsd-db.ts and wires
it into invalidateAllCaches() so the DB artifact table is cleared
alongside state/path/parse caches.
Co-Authored-By: 0xLeathery (PR #797)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a crash lock references a unit from a fully-completed milestone,
crash recovery injects stale context that fights the skip-loop breaker,
creating an infinite evict/repair cycle with selfHealRuntimeRecords.
Fix 1: Validate recovered unit's milestone before synthesizing recovery
context. If the milestone has a SUMMARY file (complete), discard the
stale recovery context and clear the lock without injection.
Fix 2: Skip-loop breaker cross-checks whether the evicted unit belongs
to a completed milestone. If so, the eviction is counterproductive —
clear the skip counter and re-dispatch from fresh state instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: add .audits/ to .gitignore
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add auto-learned project memory system
Extracts durable knowledge from completed unit activity logs via
background LLM calls (Haiku-preferred) and injects ranked memories
into the system prompt. Includes DB schema v3 migration, memory store
CRUD with confidence/hit-count ranking, secret redaction, decay, and
cap enforcement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add timestamp to UserMessage in memory extractor
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update schema version assertion in md-importer test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause:
The skip-loop breaker in dispatchNextUnit() called invalidateStateCache()
which only clears the in-memory _stateCache. Path caches (nativeTreeCache,
dirEntryCache) and parse caches (_parseCache) remained warm. On the next
deriveState() call, the nativeBatchParseGsdFiles/cachedLoadFile path used
the stale cache to build fileContentCache, returning the same stale unit
— looping forever even though disk had the correct [x] state.
rebuildState() in doctor.ts had the same bug: it called deriveState()
without invalidating caches first, writing stale state to STATE.md.
Fix:
1. auto.ts: replace all invalidateStateCache() calls in dispatch hot paths
with invalidateAllCaches() so path, parse, and state caches are all
cleared before the next deriveState() call.
2. doctor.ts: call invalidateAllCaches() at the top of rebuildState() so
STATE.md is always written from fresh disk reads.
3. Remove now-unused invalidateStateCache import from auto.ts.
Test:
Added #793 test to auto-recovery.test.ts: creates a fixture with T01
active, simulates task completion on disk (plan [x] + summary written),
calls invalidateAllCaches(), then asserts deriveState() returns T02 as
active (not T01 again). 30 passed, 0 failed.
Root cause:
When auto-mode stops via crash (not graceful stop), the milestone branch
HEAD lags behind the filesystem state at the project root. This is because
syncStateToProjectRoot() runs after every task completion and writes [x]
checkboxes to the project root, but the auto-commit that would record
those changes to the milestone branch may not have fired before the crash.
On restart, createAutoWorktree() re-attaches the worktree to the milestone
branch HEAD (which still has [ ] for the crashed task). When dispatchNextUnit()
then runs verifyExpectedArtifact(), it reads the plan from the worktree and
finds [ ], treats the completion record as stale, evicts the key, and
re-dispatches — entering an infinite skip/dispatch loop.
Fix:
Add reconcilePlanCheckboxes(projectRoot, wtPath, milestoneId) which is called
when re-attaching to an existing branch. It walks every markdown file in the
milestone directory at the project root and forward-applies any [x] checkbox
states that are ahead of the worktree version (never downgrades [x] → [ ]).
It also forward-merges completed-units.json.
The project root is the correct source of truth here because
syncStateToProjectRoot() is called after every task completion and keeps
it up-to-date. If the branch HEAD is behind due to a crash, the root
filesystem copy is the last known good state.
This is safe: we only ever advance checkbox state ([ ] → [x]), never
regress it. The milestone branch content is preserved for all other files.
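
The forward-only checkbox rule can be illustrated on two file contents. This is a sketch under assumptions: `reconcileCheckboxes` is a hypothetical helper, and matching items by their text after the checkbox is a simplification that may differ from how reconcilePlanCheckboxes() pairs lines.

```typescript
// Captures: list prefix, checkbox state, and the rest of the item line.
const CHECKBOX = /^(\s*[-*]\s+)\[( |x|X)\](\s+.*)$/;

function reconcileCheckboxes(rootContent: string, worktreeContent: string): string {
  // Collect the text of every item that is checked at the project root.
  const checkedAtRoot = new Set<string>();
  for (const line of rootContent.split("\n")) {
    const m = CHECKBOX.exec(line);
    if (m && m[2].toLowerCase() === "x") checkedAtRoot.add(m[3].trim());
  }
  return worktreeContent
    .split("\n")
    .map((line) => {
      const m = CHECKBOX.exec(line);
      // Leave non-checkbox lines and already-checked items untouched,
      // so a [x] in the worktree is never downgraded.
      if (!m || m[2] !== " ") return line;
      return checkedAtRoot.has(m[3].trim()) ? `${m[1]}[x]${m[3]}` : line;
    })
    .join("\n");
}
```

Only unchecked worktree lines are ever rewritten, which is what makes the operation safe to run on every re-attach.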
Test:
Added test case to auto-worktree.test.ts covering the exact scenario:
- T01 [x] committed to milestone branch, T02 [x] written to project root
(by syncStateToProjectRoot) but commit never fired
- Worktree re-created from milestone branch HEAD (T02 still [ ])
- After reconciliation, worktree plan has T02 [x], T03 stays [ ]
Issues addressed:
1. guided-flow.ts: Remove 12 unnecessary 'ctx as any' casts
- ctx is already ExtensionCommandContext, matching showNextAction/showConfirm signatures
- The casts masked type-checking with no benefit
2. triage-ui.ts: Remove 1 unnecessary 'ctx as any' cast (same issue as item 1)
3. migrate/command.ts: Remove 2 unnecessary 'ctx as any' casts (same issue as item 1)
4. models-resolver.ts: Remove dead exports hasBothModelsFiles() and getModelsPaths()
- Never imported outside the module or in any test file
- resolveModelsJsonPath() (the only consumer) remains
5. resource-loader.ts: Remove dead export readManagedResourceSyncedAt()
- Exported but never imported anywhere in the entire codebase
6. bg-shell/overlay.ts: Extract processStatusHeader() helper
- DRYs the duplicated status icon + name + uptime + tab indicator
construction shared between renderOutput() and renderEvents()
7. get-secrets-from-user.ts: Merge duplicate vercel/convex deployment blocks
- Both had identical exec → check result code → push applied/errors pattern
- Merged into single conditional with destination-specific command string
Documented but not changed (boundary constraints):
- src/mcp-server.ts ↔ src/resources/extensions/gsd/mcp-server.ts
(compiled/jiti boundary prevents sharing)
- src/remote-questions-config.ts ↔ remote-questions/remote-command.ts
(same compiled/jiti boundary per #592)
- cli.ts internal duplication of session setup (structural, different resource loader configs)
npm ≥7 suppresses lifecycle script output by default, so the clack
banner/spinner was invisible during `npm install -g`. The user-facing
onboarding experience already lives at first `gsd` launch (onboarding.ts),
making the postinstall UI redundant dead code.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow-up to #774. When GSD runs in worktree isolation mode,
completed-units.json can fragment across project root and worktree
locations. If a session crashes or the worktree is removed after
milestone merge, keys written to the worktree are lost — causing
already-completed units to be re-dispatched.
Two fixes:
1. syncStateToProjectRoot() now performs a set-union merge of
completed-units.json from worktree into project root.
2. After worktree entry at startup, loadPersistedKeys() runs against
both project root and worktree so the in-memory completedKeySet
contains the union of both locations.
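
The set-union merge can be sketched as follows, assuming completed-units.json holds a JSON array of string keys (the file format is not stated in this message). `parseKeys` and `mergeCompletedKeys` are hypothetical names.

```typescript
// Tolerant parse: a missing or malformed file yields an empty key list.
function parseKeys(json: string | null): string[] {
  if (json === null) return [];
  try {
    const value: unknown = JSON.parse(json);
    return Array.isArray(value)
      ? value.filter((k): k is string => typeof k === "string")
      : [];
  } catch {
    return [];
  }
}

// Union of both locations; Set keeps first-seen order and drops duplicates.
function mergeCompletedKeys(projectRoot: string[], worktree: string[]): string[] {
  return Array.from(new Set([...projectRoot, ...worktree]));
}
```

Because the merge is a union, a key is never lost regardless of which location recorded it.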
Co-authored-by: Lex Christopherson <lex@glittercowboy.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: interactive update prompt on startup (#770)
When a newer version of gsd-pi is available, show an interactive
prompt at startup with two options:
[1] Update now (runs npm install -g gsd-pi@latest)
[2] Skip
- Adds checkAndPromptForUpdates() to update-check.ts
- Reuses existing 24h cache so the registry is hit at most once/day
- Shows a boxed banner with current → latest versions
- Runs npm install -g gsd-pi@latest if the user picks [1]
- Exits after a successful update so the user relaunches with the new build
- Cleans up stdin state (listeners + raw mode) so the TUI starts cleanly
- Updates cli.ts to call checkAndPromptForUpdates() instead of the
fire-and-forget checkForUpdates() in interactive mode
- Skipped in print/RPC/MCP/headless modes (isPrintMode guard)
* fix: update-check prompt cleanup and robustness (#770)
- Remove duplicate NPM_PACKAGE constant (was shadowing NPM_PACKAGE_NAME)
- Fix hardcoded box width: measure visible text width dynamically so the
border aligns correctly for any version string length
- Add 30s timeout to rl.question so the prompt auto-skips in non-TTY
or piped-stdin edge cases that slip past the isPrintMode guard
* fix: address review feedback on update prompt (#770)
Three issues from @glittercowboy's review:
1. Box rendering bug: mid line was built as '║' + content + '║' then
sliced with .slice(1,-1) which cuts into ANSI escape sequences.
Fix: build midContent without delimiters and wrap with chalk.yellow('║')
directly, keeping a separate plain-text midVisible for width measurement.
2. Missing TTY guard: !isPrintMode alone isn't sufficient — a piped
stdin without --print would sit waiting 30s silently.
Fix: gate checkAndPromptForUpdates() on process.stdin.isTTY; fall back
to the passive checkForUpdates() banner for non-TTY interactive mode.
3. Dead import: checkForUpdates was imported but unused after the
previous refactor. Now used again as the non-TTY fallback — no
dead code.
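
The width-measurement fix in item 1 can be sketched like this. `visibleWidth` and `boxLine` are hypothetical helpers, and the `paint` callback stands in for a colorizer such as chalk.yellow. Note that `.length` counts UTF-16 code units, so wide or combining characters would need extra handling.

```typescript
// Matches SGR color/style sequences such as "\x1b[33m" and "\x1b[0m".
const ANSI_SGR = /\x1b\[[0-9;]*m/g;

// Width of the text as displayed, ignoring color escape sequences.
function visibleWidth(text: string): number {
  return text.replace(ANSI_SGR, "").length;
}

// Build one box row: pad by visible width, then wrap with the borders.
// The content is never sliced, so escape sequences stay intact.
function boxLine(
  content: string,
  innerWidth: number,
  paint: (s: string) => string = (s) => s,
): string {
  const pad = Math.max(0, innerWidth - visibleWidth(content));
  return paint("║") + content + " ".repeat(pad) + paint("║");
}
```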
* fix: downgrade missing_tasks_dir to warning for completed slices (#726)
When a worktree is removed and artifacts are rebuilt, tasks/ directories
aren't recreated. For completed slices this is cosmetic scaffolding, not
a structural error. Downgrade severity from "error" to "warning" so
completed milestones can render in /gsd visualize.
Also skip the missing_slice_plan warning entirely for completed slices,
since a plan file serves no purpose after completion.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use slice.done instead of non-existent frontmatter.status
The SummaryFrontmatter type doesn't have a `status` property.
Use `slice.done` from the roadmap parser instead, which is the
canonical completion signal already available in scope.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When auto-mode pauses due to a rate limit, schedule automatic resumption
after the rate limit window elapses. Shows a countdown notification so
the user knows what's happening. Non-rate-limit errors still pause
indefinitely for manual intervention.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes for the worktree isolation stuck-state bug:
1. selfHealRuntimeRecords on initial start used the function parameter
`base` (main project root) instead of `basePath` (worktree path after
entry). This meant stale runtime records in the worktree were never
found or healed, leaving dispatched records that block auto-mode.
2. syncStateToProjectRoot now copies runtime/units/ records alongside
milestone data. This provides defense-in-depth: even if selfHeal runs
before worktree re-entry, stale records from a prior sync are visible.
3. initMetrics and initRoutingHistory also corrected from `base` to
`basePath` — same class of bug (stale function parameter after
worktree entry).
Adds test verifying selfHealRuntimeRecords resolves artifacts and clears
records correctly when pointed at a worktree base path.
Anthropic rate limit reset windows are typically 60-120s. The previous 60s
default, combined with the +1s buffer in extractRetryAfterMs(), meant that
virtually all rate limit retries were immediately abandoned.
Raising the default to 300s (5 min) covers the vast majority of rate limit
windows and lets the built-in retry logic work as intended.
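
The arithmetic can be checked with a small model. `shouldRetry`, its parameters, and the "abandon when the delay exceeds the cap" rule are assumptions about how the limit is applied, not the actual retry code; only the +1s buffer and the 60s and 300s values come from this message.

```typescript
// Hypothetical model: a retry is attempted only when the server-requested
// wait plus the buffer fits within the configured maximum delay.
function shouldRetry(
  retryAfterSeconds: number,
  maxDelayMs: number,
  bufferMs: number = 1000,
): boolean {
  const delayMs = retryAfterSeconds * 1000 + bufferMs;
  return delayMs <= maxDelayMs;
}

// A 60s reset window becomes 61000 ms with the buffer, which exceeds a
// 60000 ms cap but fits easily within 300000 ms.
const abandonedUnderOldCap = !shouldRetry(60, 60000);
const retriedUnderNewCap = shouldRetry(60, 300000) && shouldRetry(120, 300000);
```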
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nothing reads ~/.gsd/agent/AGENTS.md, and the script was incorrectly
pointing it at agents/researcher.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add three remaining features:
1. Dashboard multi-session view: New worker registry
(subagent/worker-registry.ts) tracks active parallel subagent sessions
with batch grouping and status lifecycle. Dashboard overlay now renders
a "Parallel Workers" section showing per-batch worker status with
agent names, task previews, and elapsed time.
2. Budget approach notification at 80%: Added 80% threshold to the
existing 75/90/100 budget alert levels. Fires an "Approaching budget
ceiling" notification with desktop alert at the 80% mark, giving
users earlier warning before hitting enforcement thresholds.
3. End-to-end testing across milestones: New E2E test validates parallel
worker lifecycle across M001/M002 milestones, metrics accumulation,
full budget alert progression (0→75→80→90→100), cost prediction with
multi-milestone data, and combined worker+budget scenarios.
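
The alert progression in items 2 and 3 can be sketched as a threshold lookup. `alertLevel` and `newAlert` are hypothetical names; only the 75/80/90/100 levels come from this message.

```typescript
const ALERT_LEVELS = [75, 80, 90, 100] as const;

// Highest alert level reached at the given spend, or 0 if none.
// Cross-multiplying avoids floating-point error from computing a percentage.
function alertLevel(spent: number, ceiling: number): number {
  if (ceiling <= 0) return 0;
  let level = 0;
  for (const threshold of ALERT_LEVELS) {
    if (spent * 100 >= threshold * ceiling) level = threshold;
  }
  return level;
}

// Report a level only when it rises past the last one already notified,
// so each threshold fires at most once.
function newAlert(prevLevel: number, spent: number, ceiling: number): number | null {
  const level = alertLevel(spent, ceiling);
  return level > prevLevel ? level : null;
}
```

Tracking the previously notified level is one way to get the 0→75→80→90→100 progression without repeated notifications at the same level.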
Worker registry unit tests cover registration, batch grouping, status
updates, and edge cases.
Worker spawning (parallel-orchestrator.ts):
- spawnWorker() creates child processes via spawn() with
GSD_MILESTONE_LOCK env var for state isolation
- GSD_PARALLEL_WORKER env var prevents nested parallel sessions
- Workers run `gsd --print "/gsd auto"` in their worktree cwd
- Exit handler updates worker state on completion/crash
- Graceful error handling for spawn failures (ENOENT, etc.)
- SIGTERM sent on stopParallel for immediate process termination
Worktree creation:
- createMilestoneWorktree() creates git worktrees using
milestone/<MID> branch naming without chdir (coordinator stays put)
- Reuses existing milestone branches to preserve prior work
- Runs post-create hooks for user scripts (.env copy, etc.)
GSD_MILESTONE_LOCK in state.ts:
- deriveState() filters to only the locked milestone
- getActiveMilestoneId() short-circuits when lock is set
- Complete worker isolation — each process sees one milestone
Signal consumption in auto.ts:
- handleAgentEnd() checks for coordinator signals between units
- Responds to "stop" and "pause" signals immediately
/gsd parallel merge command:
- Merge specific or all completed milestones back to main
976/976 full test suite passing, zero regressions.
GSD_MILESTONE_LOCK in state.ts:
- deriveState() filters milestoneIds to only the locked milestone
- getActiveMilestoneId() short-circuits when lock is set
- Each parallel worker sees only its assigned milestone
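
The lock filter could be sketched as a pure function. `filterByMilestoneLock` is a hypothetical name, and passing the lock value as a parameter (rather than reading the environment inside the function) is a choice made here for testability.

```typescript
// With no lock set, all milestones are visible. With a lock, only the
// locked milestone survives, so a worker never sees its siblings.
function filterByMilestoneLock(
  milestoneIds: string[],
  lock: string | undefined,
): string[] {
  if (!lock) return milestoneIds;
  return milestoneIds.filter((id) => id === lock);
}
```

A worker would pass the value of the GSD_MILESTONE_LOCK environment variable as `lock`; the coordinator, which has no lock set, sees every milestone.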
Signal consumption in auto.ts:
- handleAgentEnd() checks for coordinator signals before dispatching
- Responds to "stop" (calls stopAuto) and "pause" (calls pauseAuto)
- Only active when GSD_MILESTONE_LOCK env var is set
/gsd parallel merge command:
- /gsd parallel merge [mid] — merge specific or all completed milestones
- Wired into commands.ts with argument completions
Worker spawning stub:
- spawnWorker() validates state and documents the implementation plan
- Actual process forking deferred to auto-mode integration
976/976 full test suite passing, zero regressions.