Three fixes for the worktree isolation stuck-state bug:
1. selfHealRuntimeRecords on initial start used the function parameter
`base` (main project root) instead of `basePath` (worktree path after
entry). This meant stale runtime records in the worktree were never
found or healed, leaving dispatched records that block auto-mode.
2. syncStateToProjectRoot now copies runtime/units/ records alongside
milestone data. This provides defense-in-depth: even if selfHeal runs
before worktree re-entry, stale records from a prior sync are visible.
3. initMetrics and initRoutingHistory are also corrected from `base` to
`basePath`: the same class of bug (a stale function parameter after
worktree entry).
Adds a test verifying that selfHealRuntimeRecords resolves artifacts and
clears records correctly when pointed at a worktree base path.
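The `base` versus `basePath` mix-up behind fixes 1 and 3 can be illustrated with a minimal sketch. The worktree path layout, `enterWorktree`, `selfHeal`, and the in-memory record store are hypothetical stand-ins, not the project's actual code.

```typescript
// Hypothetical in-memory stand-in for runtime records on disk, keyed by root path.
const runtimeRecords = new Map<string, string[]>();

// Module-level path that changes once auto-mode enters a worktree.
let basePath = "";

// Assumed worktree layout; the real path scheme may differ.
function enterWorktree(base: string, milestoneId: string): void {
  basePath = `${base}/.gsd/worktrees/${milestoneId}`;
}

// Returns the stale records found under the given root and clears them.
function selfHeal(root: string): string[] {
  const stale = runtimeRecords.get(root) ?? [];
  runtimeRecords.delete(root);
  return stale;
}

function startBuggy(base: string, milestoneId: string): string[] {
  enterWorktree(base, milestoneId);
  return selfHeal(base); // bug: `base` is still the main project root
}

function startFixed(base: string, milestoneId: string): string[] {
  enterWorktree(base, milestoneId);
  return selfHeal(basePath); // fix: use the worktree path set on entry
}

runtimeRecords.set("/repo/.gsd/worktrees/M001", ["unit-1 dispatched"]);
const healedByBuggy = startBuggy("/repo", "M001"); // looks in /repo, finds nothing
const healedByFixed = startFixed("/repo", "M001"); // looks in the worktree
```

In this sketch the buggy variant leaves the dispatched record in place, which is the stuck state described above.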
Anthropic rate limit reset windows are typically 60-120s. The previous 60s
default, combined with the +1s buffer in extractRetryAfterMs(), meant that
virtually all rate limit retries were immediately abandoned.
300s (5 min) covers the vast majority of rate limit windows and lets the
built-in retry logic work as intended.
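A rough sketch of the arithmetic, assuming the default acts as an upper bound on how long a retry may wait. `resolveRetryDelayMs` and the constant names are hypothetical; only the +1s buffer and the 60s/300s values come from the description above.

```typescript
// The +1s buffer mentioned above, applied on top of the server-provided wait.
const RETRY_BUFFER_MS = 1000;

// Assumed semantics: the default is a cap on the allowed retry delay.
const OLD_MAX_DELAY_MS = 60 * 1000;
const NEW_MAX_DELAY_MS = 300 * 1000;

/**
 * Returns the delay to wait in milliseconds, or null when the delay
 * exceeds the cap and the retry would be abandoned.
 */
function resolveRetryDelayMs(retryAfterSeconds: number, maxDelayMs: number): number | null {
  const delayMs = retryAfterSeconds * 1000 + RETRY_BUFFER_MS;
  return delayMs <= maxDelayMs ? delayMs : null;
}

const oldCapAt60s = resolveRetryDelayMs(60, OLD_MAX_DELAY_MS);   // 61000 > 60000, abandoned
const newCapAt60s = resolveRetryDelayMs(60, NEW_MAX_DELAY_MS);   // waits 61000 ms
const newCapAt120s = resolveRetryDelayMs(120, NEW_MAX_DELAY_MS); // waits 121000 ms
```

Under these assumptions even a 60s reset window overshoots the old cap by the 1s buffer, while the 300s cap leaves room for the whole 60-120s range.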
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nothing reads ~/.gsd/agent/AGENTS.md, and the script was incorrectly
pointing it at agents/researcher.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add three remaining features:
1. Dashboard multi-session view: New worker registry
(subagent/worker-registry.ts) tracks active parallel subagent sessions
with batch grouping and status lifecycle. Dashboard overlay now renders
a "Parallel Workers" section showing per-batch worker status with
agent names, task previews, and elapsed time.
2. Budget approach notification at 80%: Added 80% threshold to the
existing 75/90/100 budget alert levels. Fires an "Approaching budget
ceiling" notification with desktop alert at the 80% mark, giving
users earlier warning before hitting enforcement thresholds.
3. End-to-end testing across milestones: New E2E test validates parallel
worker lifecycle across M001/M002 milestones, metrics accumulation,
full budget alert progression (0→75→80→90→100), cost prediction with
multi-milestone data, and combined worker+budget scenarios.
Worker registry unit tests cover registration, batch grouping, status
updates, and edge cases.
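The alert progression can be sketched as a threshold-crossing check. This is an illustration only: `newlyCrossedLevels` is a hypothetical helper, and the real alert code may track fired levels differently.

```typescript
// Alert levels after this change: 80 is the newly added threshold.
const BUDGET_ALERT_LEVELS = [75, 80, 90, 100] as const;

/**
 * Returns the alert levels crossed when spend moves from previousPct to
 * currentPct (both expressed as a percentage of the budget ceiling).
 */
function newlyCrossedLevels(previousPct: number, currentPct: number): number[] {
  return BUDGET_ALERT_LEVELS.filter(
    (level) => previousPct < level && currentPct >= level,
  );
}

const firstAlert = newlyCrossedLevels(0, 76);    // crosses 75 only
const approaching = newlyCrossedLevels(76, 82);  // crosses the new 80% level
const bigJump = newlyCrossedLevels(70, 95);      // crosses 75, 80, and 90 at once
```

Comparing against the previous percentage keeps each level from firing twice as spend continues to grow.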
Worker spawning (parallel-orchestrator.ts):
- spawnWorker() creates child processes via spawn() with
GSD_MILESTONE_LOCK env var for state isolation
- GSD_PARALLEL_WORKER env var prevents nested parallel sessions
- Workers run `gsd --print "/gsd auto"` in their worktree cwd
- Exit handler updates worker state on completion/crash
- Graceful error handling for spawn failures (ENOENT, etc.)
- SIGTERM sent on stopParallel for immediate process termination
Worktree creation:
- createMilestoneWorktree() creates git worktrees using
milestone/<MID> branch naming without chdir (coordinator stays put)
- Reuses existing milestone branches to preserve prior work
- Runs post-create hooks for user scripts (.env copy, etc.)
GSD_MILESTONE_LOCK in state.ts:
- deriveState() filters to only the locked milestone
- getActiveMilestoneId() short-circuits when lock is set
- Complete worker isolation — each process sees one milestone
Signal consumption in auto.ts:
- handleAgentEnd() checks for coordinator signals between units
- Responds to "stop" and "pause" signals immediately
/gsd parallel merge command:
- Merge specific or all completed milestones back to main
976/976 full test suite passing, zero regressions.
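A minimal sketch of how the two environment variables could provide isolation. The helper names are hypothetical, and the value `"1"` for GSD_PARALLEL_WORKER is an assumption; only the variable names and their purpose come from the notes above.

```typescript
type Env = Record<string, string | undefined>;

// Environment a coordinator might pass to spawn() for one worker.
function buildWorkerEnv(parentEnv: Env, milestoneId: string): Env {
  return {
    ...parentEnv,
    GSD_MILESTONE_LOCK: milestoneId, // state isolation: one milestone per process
    GSD_PARALLEL_WORKER: "1",        // assumed marker value; blocks nested sessions
  };
}

// A process that is already a parallel worker must not start its own workers.
function canStartParallel(env: Env): boolean {
  return !env["GSD_PARALLEL_WORKER"];
}

// Mirrors the filtering described for deriveState(): with a lock set,
// only the locked milestone is visible.
function visibleMilestones(allIds: string[], env: Env): string[] {
  const lock = env["GSD_MILESTONE_LOCK"];
  return lock ? allIds.filter((id) => id === lock) : allIds;
}

const workerEnv = buildWorkerEnv({ PATH: "/usr/bin" }, "M002");
const workerView = visibleMilestones(["M001", "M002", "M003"], workerEnv);
const coordinatorView = visibleMilestones(["M001", "M002", "M003"], {});
```

Passing the lock through the environment means the worker needs no extra arguments: any code path that derives state sees the restriction.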
GSD_MILESTONE_LOCK in state.ts:
- deriveState() filters milestoneIds to only the locked milestone
- getActiveMilestoneId() short-circuits when lock is set
- Each parallel worker sees only its assigned milestone
Signal consumption in auto.ts:
- handleAgentEnd() checks for coordinator signals before dispatching
- Responds to "stop" (calls stopAuto) and "pause" (calls pauseAuto)
- Only active when GSD_MILESTONE_LOCK env var is set
/gsd parallel merge command:
- /gsd parallel merge [mid] — merge specific or all completed milestones
- Wired into commands.ts with argument completions
Worker spawning stub:
- spawnWorker() validates state and documents the implementation plan
- Actual process forking deferred to auto-mode integration
976/976 full test suite passing, zero regressions.
Two compounding bugs caused auto-mode to loop infinitely after stopping
and restarting when a worktree with committed progress existed:
Bug 1: copyPlanningArtifacts overwrites worktree state on restart
When auto-mode restarts and the milestone branch exists (worktree dir was
removed but branch preserved), createAutoWorktree re-attaches the worktree
to the existing branch — git correctly checks out the committed state with
[x] checkboxes. But then copyPlanningArtifacts unconditionally copies the
project root's .gsd/milestones/ into the worktree, overwriting the correct
[x] with stale [ ] from the root (which isn't always fully synced).
Fix: Skip copyPlanningArtifacts when branchExists is true. The branch
checkout already has the correct artifacts from committed work.
Bug 2: deriveState reads stale content from SQLite DB
deriveState had a DB-first content loading path that read artifact content
from the SQLite artifacts table. This table was populated once during
migrateFromMarkdown and never updated when files changed on disk (roadmap
checkbox updates, plan changes, etc.). Even after fixing files on disk,
deriveState returned stale DB content, keeping the state machine stuck.
Fix: Remove the DB content loading path from deriveState entirely. The
native Rust batch parser (nativeBatchParseGsdFiles) reads all .md files
in one call and is fast enough. The DB is still used for structured queries
(decisions, requirements) but no longer as a content cache for state
derivation.
Updated derive-state-db.test.ts Test 5 to write requirements to disk
instead of testing the now-removed DB-only content path.
The reason parameter was added to stopAuto() but the reasonSuffix
variable derived from it was never declared, causing TS2304 errors.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the version-match early return in initResources() that skipped
resource sync when versions matched. That early return let the runtime at
~/.gsd/agent/extensions/ drift from the bundled resources when
individual files were manually copied or left over from a newer version.
Also adds rmSync of bundled subdirectories before each cpSync to remove
stale files that exist only in the runtime. User-created extension
directories are preserved.
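The remove-then-copy step can be sketched as follows. `syncBundledDirs` and the temp-directory demo are hypothetical; the sketch assumes that "bundled subdirectories" means the top-level directories present in the bundled resources.

```typescript
import { cpSync, existsSync, mkdirSync, mkdtempSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// For each bundled subdirectory, delete the runtime copy and then copy it
// fresh, so files that exist only in the runtime copy do not survive.
// Runtime directories with no bundled counterpart are never touched.
function syncBundledDirs(bundledRoot: string, runtimeRoot: string): void {
  mkdirSync(runtimeRoot, { recursive: true });
  for (const entry of readdirSync(bundledRoot, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const target = join(runtimeRoot, entry.name);
    rmSync(target, { recursive: true, force: true });
    cpSync(join(bundledRoot, entry.name), target, { recursive: true });
  }
}

// Demo in a throwaway temp directory.
const root = mkdtempSync(join(tmpdir(), "gsd-sync-"));
const bundled = join(root, "bundled");
const runtime = join(root, "runtime");
mkdirSync(join(bundled, "ext-a"), { recursive: true });
writeFileSync(join(bundled, "ext-a", "index.js"), "// bundled");
mkdirSync(join(runtime, "ext-a"), { recursive: true });
writeFileSync(join(runtime, "ext-a", "stale.js"), "// left over");
mkdirSync(join(runtime, "user-ext"), { recursive: true });
writeFileSync(join(runtime, "user-ext", "mine.js"), "// user-created");

syncBundledDirs(bundled, runtime);

const freshCopied = existsSync(join(runtime, "ext-a", "index.js"));
const staleRemoved = !existsSync(join(runtime, "ext-a", "stale.js"));
const userPreserved = existsSync(join(runtime, "user-ext", "mine.js"));
```

Because only directories named in the bundle are removed, a user-created extension directory survives the sync in this sketch.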
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
stopAuto() now accepts an optional `reason` parameter that is included
in the session summary — every stop is self-documenting instead of
showing a generic "Auto-mode stopped" message.
Also replaces the catch-all `!mid` check with registry-aware logic that
distinguishes "all complete" from "blocked" and "unexpected no active
milestone" (with diagnostic output). Adds a midTitle recovery fallback
for when the title regex strips the title to an empty string.
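A minimal sketch of the registry-aware distinction, under assumed types. The status values, the registry shape, and the message wording are hypothetical; only the three outcomes and the optional `reason` come from the description above.

```typescript
// Hypothetical status values and registry shape; the real registry may differ.
type MilestoneStatus = "complete" | "blocked" | "pending";

interface RegistryEntry {
  id: string;
  status: MilestoneStatus;
}

// Called when no active milestone id is available: explain why.
function explainNoActiveMilestone(registry: RegistryEntry[]): string {
  if (registry.length > 0 && registry.every((m) => m.status === "complete")) {
    return "All milestones complete";
  }
  const blocked = registry.filter((m) => m.status === "blocked").map((m) => m.id);
  if (blocked.length > 0) {
    return `Blocked: ${blocked.join(", ")}`;
  }
  // Unexpected case: include the registry contents as diagnostic output.
  const summary = registry.map((m) => `${m.id}=${m.status}`).join(", ");
  return `Unexpected: no active milestone (registry: ${summary || "empty"})`;
}

// The optional reason is appended to the otherwise generic stop message.
function stopMessage(reason?: string): string {
  return reason ? `Auto-mode stopped: ${reason}` : "Auto-mode stopped";
}

const doneMessage = stopMessage(
  explainNoActiveMilestone([{ id: "M001", status: "complete" }]),
);
```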
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When mergeMilestoneToMain runs from a worktree context, main is already
checked out at the project root. The unconditional `git checkout main`
fails with "already used by worktree" because git refuses to check out a
branch that is active in another worktree.
Skip the checkout when the integration branch is already current at the
project root, which is always the case in worktree-mode merges.
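The conditional checkout can be sketched as a small command planner. `planMergeCommands` is a hypothetical helper and the plain `git merge` arguments are an assumption; the actual merge strategy used by mergeMilestoneToMain is not described here.

```typescript
/**
 * Builds the git commands to run at the project root. The checkout is
 * included only when the integration branch is not already current there.
 * The current branch could be read with `git branch --show-current`.
 */
function planMergeCommands(
  currentBranchAtRoot: string,
  integrationBranch: string,
  milestoneBranch: string,
): string[][] {
  const commands: string[][] = [];
  if (currentBranchAtRoot !== integrationBranch) {
    commands.push(["git", "checkout", integrationBranch]);
  }
  commands.push(["git", "merge", milestoneBranch]); // assumed merge invocation
  return commands;
}

// Worktree mode: main is already checked out at the root, so no checkout.
const worktreePlan = planMergeCommands("main", "main", "milestone/M001");

// Root on another branch: the checkout is still needed.
const otherBranchPlan = planMergeCommands("feature", "main", "milestone/M001");
```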
Resolve conflicts between #699 (empty scaffold rejection) and #739
(task plan file verification) in auto-dispatch.ts imports and
auto-recovery.test.ts tests.
- auto-dispatch.ts: merged imports from both branches (resolveTaskFile
from #739, resolveMilestonePath/buildMilestoneFileName from main)
- auto-recovery.test.ts: included all tests from both #699 (empty
scaffold, actual tasks, completed tasks) and #739 (all task plans
exist, missing task plan, no tasks). Updated #699 tests to create
task plan files alongside slice plans to satisfy #739's verification.
Updated #739 "no tasks" test to expect false per #699's requirement
that plans must have task entries.
- auto-recovery.ts: auto-merged cleanly, both checks coexist
All 26 recovery tests pass. Full build clean.
Add a `validating-milestone` phase that runs BEFORE `completing-milestone`
to reconcile planned work against delivered work. The validator checks
success criteria, slice deliverables, cross-slice integration, and
requirement coverage before allowing milestone completion.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>