The subagent extension imported parseBundledExtensionPaths via a
relative path (../../../bundled-extension-paths.js) that resolves
correctly in the source tree but breaks when extensions are synced
to ~/.gsd/agent/extensions/ at runtime. Inline the trivial split
logic so the extension is self-contained.
Add 9 missing fields to preferences-reference.md: skill_staleness_days,
git.manage_gitignore, dynamic_routing, auto_visualize, auto_report,
parallel, verification_commands, verification_auto_fix, and
verification_max_retries. Add examples for dynamic routing, parallel
execution, and verification. Update the preferences template to include
all fields from the schema.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migrated unique resolvePluginRoot and inspectPlugin tests from the older
file into the comprehensive contract test file at src/tests/, then
deleted the duplicate.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When --all is passed alongside --html, generates a report snapshot for
every milestone that doesn't already have one in the reports index.
This fills the progression timeline with cards for completed milestones
that were finished before the HTML report feature existed.
- Skips milestones that already have a reports.json entry, so no card is added twice
- Tags completed milestones with kind "milestone", active with "manual"
- Tracks cumulative slice/milestone progress per snapshot for the index
- Adds --html and --html --all to export autocomplete suggestions
- Updates help text to show [--all] flag
Remove the unused copy at src/resources/extensions/gsd/mcp-server.ts.
The canonical implementation lives at src/mcp-server.ts and is the only
one imported by cli.ts and tested by mcp-server.test.ts. The extension
copy had zero imports and was dead code.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The update banner already referenced `/gsd:update` but the command
didn't exist. This adds `/gsd update` as a proper subcommand that
checks the npm registry and runs `npm install -g gsd-pi@latest`
when a newer version is available.
- Register `update` in subcommand completions and help text
- Add `handleUpdate()` that reuses `compareSemver` from update-check
- Fix banner text from `/gsd:update` to `/gsd update` (space, not colon)
- Add tests for completion registration and help description
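The version check can be sketched as follows. This is a minimal illustration, not the project's code: `compareSemverSketch` and `updateAvailable` are hypothetical names standing in for the real `compareSemver` helper, and the sketch assumes plain `major.minor.patch` strings without pre-release suffixes.

```typescript
// Hypothetical stand-in for compareSemver; the real helper lives in update-check.
// Assumes plain "major.minor.patch" strings with no pre-release suffix.
function compareSemverSketch(a: string, b: string): number {
  const pa = a.split(".").map((part) => Number.parseInt(part, 10) || 0);
  const pb = b.split(".").map((part) => Number.parseInt(part, 10) || 0);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    // Missing components count as 0, so "1.0" equals "1.0.0".
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff > 0 ? 1 : -1;
  }
  return 0;
}

// True when the registry version is strictly newer than the installed one.
function updateAvailable(installed: string, latest: string): boolean {
  return compareSemverSketch(latest, installed) > 0;
}
```

Comparing components numerically rather than as strings matters here: a string comparison would rank 2.9.9 above 2.10.0.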
* feat: add `gsd headless query` for structured state inspection
Add read-only query commands that return parseable JSON without
spawning an LLM session. Decouples orchestrators from .gsd/ internals.
Targets: phase, cost, progress, next
* simplify: single `query` command returning full snapshot
Replace 4 query targets (phase/cost/progress/next) with one command
that returns everything in a single JSON object. Caller uses jq.
Also document query in README.md and docs/commands.md.
* docs: update gsd-headless skill and references
- SKILL.md: add missing flags (--supervised, --max-restarts, --response-timeout)
- references/commands.md: add query, discuss, remote, inspect, forensics
- references/multi-session.md: fix spawning syntax, use query for budget
* fix: remove integration tests that entered via merge
These files belong to the feat/headless-orchestration-skill branch
and were accidentally included during the upstream/main merge.
They contain TS errors (sessionTerminated scope issue) that break CI.
* fix: restore headless-command.ts deleted by accident
* fix: prevent data loss on crash with atomic writes, file locking, and error handling
Wave 1 of failure recovery safeguards:
1. Atomic session file rewrites (tmp+rename) — _rewriteFile() and forkFrom()
now use atomicWriteFileSync to prevent session file corruption on crash
2. Atomic auto.lock writes — crash-recovery.ts writeLock() uses tmp+rename
so the crash detection system itself can't be corrupted
3. unhandledRejection handler — catches silent process death from unhandled
promise rejections in OAuth, extensions, LSP, or MCP connections
4. try/catch in emitToolCall — matches pattern used by emitUserBash,
emitContext, and emitToolResult to prevent extension handler crashes
from killing the entire agent turn
5. File locking on session appends — prevents concurrent pi instances from
interleaving partial JSON lines in session JSONL files using the same
proper-lockfile pattern established in auth-storage.ts and settings-manager.ts
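The tmp+rename pattern from items 1 and 2 can be sketched roughly like this. `atomicWriteSketch` is a hypothetical helper written for illustration, not the project's `atomicWriteFileSync`, and the temp-file naming is an assumption.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical sketch of tmp+rename; not the project's atomicWriteFileSync.
function atomicWriteSketch(targetPath: string, contents: string): void {
  // Write to a sibling temp file so the rename stays on the same filesystem.
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  writeFileSync(tmpPath, contents, "utf8");
  // The rename swaps the finished file into place in a single step, so a
  // crash leaves either the old contents or the new ones, never a partial file.
  renameSync(tmpPath, targetPath);
}

// Usage: rewrite a file twice in a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "atomic-demo-"));
const demoFile = join(demoDir, "session.jsonl");
atomicWriteSketch(demoFile, '{"turn":1}\n');
atomicWriteSketch(demoFile, '{"turn":2}\n');
const finalContents = readFileSync(demoFile, "utf8");
```

A crash during `writeFileSync` only damages the temp file; the target is replaced only after the new contents are fully written.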
* fix: add OAuth timeouts, RPC exit detection, and command context guards
Wave 2 of failure recovery safeguards:
1. OAuth fetch timeouts — all fetch() calls across all OAuth providers
(Anthropic, OpenAI Codex, Google Antigravity, Google Gemini CLI,
GitHub Copilot) now have 30-second AbortSignal.timeout() to prevent
indefinite hangs when OAuth servers are unresponsive
2. RPC subprocess exit detection — pending requests are now rejected
when the agent subprocess exits unexpectedly, preventing indefinite
hangs in the RPC client
3. Extension command context guards — default handlers for newSession,
fork, navigateTree, switchSession, and reload now throw explicit
errors instead of silently returning success when called before
bindCommandContext()
4. OAuth error detail preservation — token refresh errors now preserve
the original error as `cause` for better diagnostics
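Item 2 (rejecting pending requests on subprocess exit) might look like the sketch below. The class and method names are hypothetical and not taken from the RPC client; only the idea of failing every in-flight request when the child exits comes from the description above.

```typescript
type PendingSketch = {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
};

// Hypothetical sketch; names are illustrative, not the RPC client's API.
class PendingRequestsSketch {
  private pending = new Map<number, PendingSketch>();
  private nextId = 1;

  // Register a request; the promise settles on a response or on exit.
  send(): { id: number; result: Promise<unknown> } {
    const id = this.nextId++;
    const result = new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    return { id, result };
  }

  // Call from the subprocess "exit" handler: fail everything still waiting.
  rejectAll(exitCode: number | null): number {
    const count = this.pending.size;
    this.pending.forEach(({ reject }) => {
      reject(new Error(`agent subprocess exited (code ${exitCode}) before responding`));
    });
    this.pending.clear();
    return count;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Without the `rejectAll` step, a caller awaiting `result` would wait forever once the subprocess is gone.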
* fix: resource cleanup, LSP retry, and crash detection on session resume
Wave 3 of failure recovery safeguards:
1. Atomic completed-units.json cleanup — milestone completion writes
now use tmp+rename pattern for consistency with auto-recovery.ts
2. Bash temp file cleanup — track temp files created for large output
and register a process exit handler to clean them up
3. Settings write queue flush on shutdown — call settingsManager.flush()
during interactive mode shutdown so queued writes aren't lost
4. LSP initialization retry — wrap getOrCreateClient with up to 2 retries
with exponential backoff (1s, 2s) for transient spawn failures
5. Crash detection on session resume — wasInterrupted() checks if last
assistant turn had tool calls without results, shows warning on resume
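The check in item 5 can be illustrated with a small sketch. The entry shapes below are hypothetical (the real session entry types are not shown in this log), and `wasInterruptedSketch` is an illustrative name rather than the actual `wasInterrupted()` implementation.

```typescript
// Hypothetical message shapes; the real session entry types are not shown here.
type SketchEntry =
  | { role: "user"; text: string }
  | { role: "assistant"; toolCallIds: string[] }
  | { role: "toolResult"; toolCallId: string };

// True when the last assistant turn issued tool calls that never got results.
function wasInterruptedSketch(entries: SketchEntry[]): boolean {
  let lastAssistant = -1;
  for (let i = entries.length - 1; i >= 0; i--) {
    if (entries[i]?.role === "assistant") {
      lastAssistant = i;
      break;
    }
  }
  const assistant = entries[lastAssistant];
  if (!assistant || assistant.role !== "assistant") return false;

  // Collect the tool calls that were answered after that turn.
  const answered = new Set<string>();
  for (const entry of entries.slice(lastAssistant + 1)) {
    if (entry.role === "toolResult") answered.add(entry.toolCallId);
  }
  return assistant.toolCallIds.some((id) => !answered.has(id));
}
```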
* fix: blob garbage collection and LSP debug logging
Wave 4 of failure recovery safeguards:
1. Blob garbage collection — BlobStore.gc(referencedHashes) removes
orphaned blobs not referenced by any session file, plus totalSize()
for monitoring blob directory growth
2. LSP JSON parse error logging — malformed LSP messages are now logged
at debug level (when DEBUG env is set) instead of being silently dropped
README.md:
- Added provider error recovery section (item 5) covering transient
vs permanent error classification and auto-resume behavior
- Updated crash recovery to mention headless auto-restart with backoff
- Fixed numbered list (was 1-11 with duplicate 7, now 1-12 sequential)
docs/parallel-orchestration.md:
- Updated worker crash recovery section to reflect v2.27 persistent
state — workers now survive crashes via disk state recovery
When completing a /gsd subcommand via autocomplete (e.g. selecting 'auto'
after typing '/gsd '), ENTER now submits immediately instead of requiring
a second press.
The selectConfirm handler already fell through to submit when the
autocomplete prefix started with '/' (completing the command name itself).
Now it also falls through when the cursor is in a slash command context
(completing an argument like 'auto', 'status', 'help').
Non-slash completions (@file references, paths) still require explicit
ENTER to submit — only slash command arguments auto-submit.
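The decision can be sketched as a pure function. The input shape and the function name are hypothetical; the real selectConfirm handler reads this state from the editor. The sketch also assumes that "slash command context" means the line starts with '/' and the token under the cursor is not an @file reference.

```typescript
// Hypothetical inputs; the real selectConfirm handler reads these from editor state.
interface ConfirmContextSketch {
  completionPrefix: string; // token being completed, e.g. "/gs", "au", "@src"
  textBeforeCursor: string; // whole input line up to the cursor
}

// Decide whether confirming a completion should also submit the line.
function submitsOnConfirm(ctx: ConfirmContextSketch): boolean {
  // Existing behavior: completing the command name itself.
  const completingCommandName = ctx.completionPrefix.startsWith("/");
  // New behavior (assumed definition): completing an argument of a slash command.
  const completingCommandArgument =
    ctx.textBeforeCursor.trimStart().startsWith("/") &&
    !ctx.completionPrefix.startsWith("@");
  return completingCommandName || completingCommandArgument;
}
```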
When set to false in .gsd/preferences.md, GSD will not modify .gitignore
at all — no baseline patterns added, no self-healing, no untracking.
Usage in preferences.md:
  git:
    manage_gitignore: false
Files changed:
- git-service.ts: Add manage_gitignore to GitPreferences interface
- gitignore.ts: Early return when manageGitignore is false
- auto.ts: Pass manage_gitignore preference to ensureGitignore
- preferences.ts: Parse and validate manage_gitignore in git config
When the LLM calls the same web search query 4+ times consecutively,
return an error telling it to stop and use existing results instead of
silently returning cached results that the LLM ignores.
Tracks consecutive duplicate searches via a simple counter keyed on
the normalized query + parameters. Resets when a different query is
searched. The guard fires on the 3rd consecutive duplicate, i.e. the 4th identical call.
File changed: search-the-web/tool-search.ts
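A counter-based guard of this kind could look like the sketch below. The class name, the normalization rules, and the error wording are assumptions made for illustration; only the counter keyed on query + parameters and the threshold of 3 come from the description above.

```typescript
// Hypothetical sketch of a consecutive-duplicate guard; names are illustrative.
const MAX_CONSECUTIVE_DUPLICATES = 3;

class DuplicateSearchGuardSketch {
  private lastKey: string | null = null;
  private duplicates = 0;

  // Returns an error message when the guard fires, otherwise null.
  check(query: string, params: Record<string, unknown> = {}): string | null {
    // Assumed normalization: trim, lowercase, collapse whitespace.
    const normalized = query.trim().toLowerCase().replace(/\s+/g, " ");
    const key = `${normalized}|${JSON.stringify(params)}`;
    if (key === this.lastKey) {
      this.duplicates += 1;
    } else {
      // A different query resets the counter.
      this.lastKey = key;
      this.duplicates = 0;
    }
    if (this.duplicates >= MAX_CONSECUTIVE_DUPLICATES) {
      return "This query was already searched; stop and use the existing results.";
    }
    return null;
  }
}
```

The first call and its first two repeats pass through; the third repeat (fourth identical call) returns the error. Note that `JSON.stringify` is sensitive to key order, so a real implementation may want to sort parameter keys first.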
* refactor: encapsulate auto.ts state into AutoSession class (#898)
Follow-up to PR #906 (7 module extractions). All ~40 mutable module-level
variables in auto.ts are replaced with properties on a single AutoSession
class instance (s).
Changes:
- auto/session.ts: 200-line AutoSession class with typed properties,
clearTimers(), resetDispatchCounters(), completeCurrentUnit(), reset(),
and toJSON() for diagnostics.
- auto.ts: ~700 variable references renamed from bare names to s.xxx.
All module-level let/const state declarations removed. Constants
(MAX_UNIT_DISPATCHES, etc.) re-exported from session.ts.
- Tests updated: milestone-transition-worktree.test.ts and
triage-dispatch.test.ts source-grep patterns updated for s.xxx names.
Benefits:
- 40 scattered declarations → 1 class with typed properties
- Manual reset of 25+ variables in stopAuto → s.reset()
- s.toJSON() for state snapshots and diagnostics
- grep 's\.' shows every state access
No behavioral changes. 1224 tests pass.
* fix: import constants locally for tsconfig.extensions.json compatibility
The extensions tsconfig couldn't resolve re-exported constants from
auto/session.js. Fix: import them explicitly in addition to re-exporting.
Also remove leftover DISPATCH_GAP_TIMEOUT_MS local declaration.
Previously, any provider error during auto-mode immediately triggered the
model fallback chain. This meant providers with occasional network flakiness
(e.g. zai-coding-plan) would get abandoned after a single transient error,
barely getting used before the fallback took over.
Now, transient network errors (ECONNRESET, ETIMEDOUT, socket hang up, DNS
failures, etc.) are retried up to 2 times with linear backoff (3s, 6s)
before falling back to the next model. Permanent errors (auth, quota,
billing) still trigger immediate fallback.
Changes:
- index.ts: Add network retry loop before fallback chain in agent_end error
handler. Track retry counts per model in networkRetryCounters map.
Clear counters on successful unit completion and model switches.
- preferences.ts: Extract isTransientNetworkError() as testable utility.
Matches network signals while excluding permanent auth/billing errors.
- network-error-fallback.test.ts: Add 12 tests for transient error detection
covering all signal patterns and exclusion cases.
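The retry-then-fallback decision can be sketched as below. The pattern lists and function names are hypothetical and the real `isTransientNetworkError()` may match a different set; the retry limit (2) and the linear 3s/6s backoff come from the description above.

```typescript
// Hypothetical patterns; the real isTransientNetworkError() may differ.
const TRANSIENT_SIGNALS = [/ECONNRESET/i, /ETIMEDOUT/i, /socket hang up/i, /ENOTFOUND/i, /EAI_AGAIN/i];
const PERMANENT_SIGNALS = [/auth/i, /quota/i, /billing/i];

function isTransientSketch(message: string): boolean {
  // Permanent errors win even when a network signal is also present.
  if (PERMANENT_SIGNALS.some((re) => re.test(message))) return false;
  return TRANSIENT_SIGNALS.some((re) => re.test(message));
}

const MAX_NETWORK_RETRIES = 2;
const BACKOFF_STEP_MS = 3000;

// Returns the delay before the next retry, or null when the caller should
// move on to the model fallback chain.
function nextRetryDelayMs(message: string, retriesSoFar: number): number | null {
  if (!isTransientSketch(message) || retriesSoFar >= MAX_NETWORK_RETRIES) return null;
  return BACKOFF_STEP_MS * (retriesSoFar + 1); // 3s, then 6s
}
```

Checking the permanent patterns first keeps an expired-credential error from being retried just because its message also mentions a timeout.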
* fix: parallel worker PID tracking, spawn-status race, exit persistence
Three bugs in parallel-orchestrator.ts that cause workers to appear
permanently stuck in "running" or silently lose state on exit:
1. Worker PID initialized to coordinator's process.pid instead of 0.
Session status files recorded wrong PID, breaking stale detection
(isPidAlive returns true for the coordinator, not the dead worker).
2. Session status written with "running" BEFORE spawn attempt. If spawn
fails, status file stays "running" indefinitely. Now spawns first,
then writes status with actual state (running or error).
3. Worker exit handler updated session status but didn't call
persistState(), so orchestrator.json got out of sync. Next
coordinator restart could adopt already-dead workers.
Closes #672 (partial — worker lifecycle hardening)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: adapt lifecycle tests for spawn-aware session status
Tests now handle both outcomes: when spawnWorker() succeeds (running
state) and when it fails in CI (error state, no GSD binary available).
The lifecycle logic under test — session status writes, stop, pause,
resume — works correctly in both cases.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes for the discuss picker loop:
1. Recommend first undiscussed slice instead of always recommending
the first pending slice (i === 0). The recommended flag now checks
discussion state via CONTEXT file existence.
2. Exit with a summary notification when all pending slices have been
discussed, instead of looping back to a picker where everything
is already done.
3. Invalidate deriveState cache after each discuss session completes
so subsequent state reads pick up the newly-written CONTEXT files.
The regex required exactly '## Slices' with nothing after. If an agent
renamed it (e.g. '## Slices (generate flow — first batch)'), the parser
returned zero slices, blocking auto-mode.
Changed /^## Slices\s*$/m to /^## Slices\b.*$/m — word boundary ensures
'Slices' is complete, .* allows any trailing text.
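The difference between the two patterns can be checked directly. The sample roadmap text is made up for illustration; the two regular expressions are the ones quoted above.

```typescript
const OLD_SLICES_HEADER = /^## Slices\s*$/m;
const NEW_SLICES_HEADER = /^## Slices\b.*$/m;

// Sample input with a renamed heading (illustrative text only).
const renamedRoadmap = "# Roadmap\n\n## Slices (generate flow, first batch)\n- S01\n";

// The old pattern rejects any trailing text after "Slices".
const oldMatchesRenamed = OLD_SLICES_HEADER.test(renamedRoadmap);
// The new pattern accepts trailing text once "Slices" ends at a word boundary.
const newMatchesRenamed = NEW_SLICES_HEADER.test(renamedRoadmap);
// The word boundary still rejects headings that merely start with "Slices".
const newMatchesLookalike = NEW_SLICES_HEADER.test("## SlicesExtra\n");
// A plain heading keeps matching under both patterns.
const newMatchesPlain = NEW_SLICES_HEADER.test("## Slices\n- S01\n");
```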
Reorganize the 10-tab TUI visualizer for better workflow:
- Core project views: Progress, Timeline, Deps
- Analytics/monitoring: Metrics, Health, Agent
- Content: Changes, Knowledge, Captures
- Utility: Export (moved to last position)
Updated all tab index references (switch cases, export key
handling, export status display) and corrected help text
from "7-tab" to "10-tab" with accurate tab listing.
* test: parallel merge reconciliation + budget atomicity coverage (G5/G6)
27 new tests covering two gaps identified in #672:
G5 — Merge Reconciliation (parallel-merge.test.ts, 17 tests):
- determineMergeOrder: sequential, by-completion, filtering, defaults
- formatMergeResults: success, conflict, empty, mixed output
- mergeCompletedMilestone: clean merge with session cleanup, missing
roadmap error, conflict detection with structured file list
- mergeAllCompleted: sequential order, stop-on-first-conflict,
by-completion order (integration tests with real git repos)
G6 — Budget Atomicity (parallel-budget-atomicity.test.ts, 10 tests):
- Ceiling enforcement: exceeded, not exceeded, exact boundary
- Cost aggregation: correct sum, incremental updates
- No double-counting: 5 rapid refreshes produce correct total
- Budget reset: resetOrchestrator clears all state
- No ceiling: unlimited spending when budget_ceiling unset
- Worker state sync: refreshWorkerStatuses picks up disk changes
All tests use node:test + node:assert/strict. No production code changes.
Relates to #672
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use double quotes in git commit messages for Windows compatibility
Single-quoted commit messages in test helpers fail on Windows CMD
(pathspec errors). Switch to double quotes which work cross-platform.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: TUI dashboard cleanup, dedup, and feature improvements
- Extract shared format-utils.ts: formatDuration, padRight, joinColumns,
centerLine, fitColumns, sparkline, stripAnsi — eliminating 3× duplication
across dashboard-overlay, visualizer-views, and auto-dashboard
- Use shared STATUS_GLYPH/STATUS_COLOR from ui.ts consistently across all
overlay and view files instead of hardcoded Unicode glyphs
- Fix redundant dynamic import('node:fs') in visualizer-data.ts (statSync
already imported at top level)
- Replace (entry as any) casts with proper SessionMessageEntry type narrowing
- Add mtime-based file content cache for visualizer data loader to avoid
re-parsing unchanged roadmap/plan files on every refresh
- Increase visualizer refresh interval from 2s to 5s (with mtime cache,
unchanged files are effectively free)
- Fix sparkline to use loop-based max instead of Math.max(...values) to
avoid stack overflow on large arrays
- Add ETA/time-remaining estimate to progress widget and dashboard overlay
based on average unit duration from metrics ledger
- Show warning glyph for budget-pressured units in completed units list
(continueHereFired units now show ⚠ instead of ✓)
- Add terminal resize (SIGWINCH) handling to both overlays — invalidates
cache and re-renders on window size change
- Fix dispose race in dashboard overlay close path — now calls dispose()
before onClose() to prevent timer callbacks firing after teardown
- Add 23 unit tests for format-utils.ts (including 100k-element sparkline)
- Add 2 tests for estimateTimeRemaining
- Add source-contract tests for resize handler and shared imports
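The sparkline fix above can be illustrated with a short sketch. `sparklineSketch` is a hypothetical function, not the one in format-utils.ts, and the glyph set and scaling are assumptions. It shows why a loop is used: `Math.max(...values)` passes every element as a separate argument, which can throw a RangeError on very large arrays.

```typescript
// Eight block glyphs from lowest to highest (assumed glyph set).
const SPARK_BARS = "▁▂▃▄▅▆▇█";

// Hypothetical sketch; not the sparkline() from format-utils.ts.
function sparklineSketch(values: number[]): string {
  if (values.length === 0) return "";
  // Loop-based max: no argument spreading, so array size is not a problem.
  let max = 0;
  for (const v of values) {
    if (v > max) max = v;
  }
  if (max <= 0) return SPARK_BARS.charAt(0).repeat(values.length);
  return values
    .map((v) => {
      const level = Math.round((Math.max(0, v) / max) * 7);
      return SPARK_BARS.charAt(Math.min(7, level));
    })
    .join("");
}
```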
* fix: use STATUS_GLYPH.warning instead of STATUS_GLYPH.statusWarning
STATUS_GLYPH is keyed by ProgressStatus ("warning"), not by GLYPH
property name ("statusWarning"). Fixes typecheck failure in CI.
The startup model validation overwrote the user's configured model when
it was 'not available' (API key missing, OAuth token expired, rate
limited). This silently changed the model to a fallback like
google/gemini-1.5-flash or openai/gpt-5.4.
Fix: Only trigger the fallback when the configured model doesn't exist
in the registry at all (removed/unknown). A model that exists but is
temporarily unavailable (credential issue) keeps its setting — the
session-level fallback resolver handles it at prompt time.
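The narrowed condition can be sketched as follows. The registry shape, the function name, and the model ids in the usage are hypothetical; the point is only that availability no longer influences the startup decision.

```typescript
// Hypothetical registry entry; the real registry type is not shown in this log.
interface ModelInfoSketch {
  id: string;
  available: boolean; // false for missing API key, expired token, rate limit
}

// Keep the configured model unless it is entirely unknown to the registry.
function resolveStartupModelSketch(
  configured: string,
  registry: ModelInfoSketch[],
  fallback: string,
): string {
  const known = registry.some((model) => model.id === configured);
  // `available` is deliberately ignored: a temporarily unavailable model keeps
  // its setting and is handled later by the session-level fallback resolver.
  return known ? configured : fallback;
}
```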
After the skip-loop breaker evicts a completion key, the fallback path
at the bottom of dispatchNextUnit re-persists it because the expected
artifact exists on disk. This recreates the exact loop the breaker was
trying to break:
evict key → dispatch → verifyArtifact(true) → re-persist key → skip → evict → repeat
Fix: Track recently-evicted keys in a Set. The fallback artifact-check
path skips re-persistence for keys that were just evicted by the
skip-loop breaker. Set is cleared on stopAuto.
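A minimal sketch of the eviction tracking, with hypothetical class and method names (the real logic lives inside dispatchNextUnit and stopAuto):

```typescript
// Hypothetical sketch; names are illustrative, not the auto-mode API.
class CompletionKeysSketch {
  readonly persisted = new Set<string>();
  private recentlyEvicted = new Set<string>();

  // Skip-loop breaker: drop the key and remember that it was evicted.
  evict(key: string): void {
    this.persisted.delete(key);
    this.recentlyEvicted.add(key);
  }

  // Fallback path: persist when the artifact exists, unless the breaker just
  // evicted this key. Returns true when the key was persisted.
  persistIfArtifactExists(key: string, artifactExists: boolean): boolean {
    if (!artifactExists || this.recentlyEvicted.has(key)) return false;
    this.persisted.add(key);
    return true;
  }

  // Equivalent of the cleanup on stopAuto.
  reset(): void {
    this.recentlyEvicted.clear();
  }
}
```

Because the evicted key is no longer re-persisted, the next dispatch actually runs the unit instead of skipping it again.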
Two fixes:
1. lsp/config.ts: Use `where.exe` instead of `which` on Windows.
MSYS's `which` returns POSIX paths (/c/Users/...) that Node's
spawn() can't execute. `where.exe` returns native Windows paths.
2. lsp/client.ts: Handle spawn ENOENT error gracefully. When the LSP
server binary doesn't exist, the error event now triggers a clean
exit instead of bubbling up and crashing auto-mode.
When a slice plan (S03-PLAN.md) was pre-created during roadmapping
but plan-slice never ran to generate per-task files (tasks/T01-PLAN.md),
deriveState returned 'executing' phase. execute-task then failed because
the task plan didn't exist, creating an infinite restart loop.
Fix: In deriveState, when the tasks directory exists but has zero .md
files and the slice plan references tasks, return 'planning' phase
instead of 'executing'. This causes plan-slice to dispatch and generate
the missing task plans.
Tests updated: 6 test files that create synthetic state fixtures now
include a stub task plan file so their 'executing' phase assertions
remain valid.
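The new branch in deriveState can be approximated by the sketch below. The function name and its inputs are hypothetical simplifications; the real deriveState reads the tasks directory and slice plan from disk and returns more phases than the two shown.

```typescript
type PhaseSketch = "planning" | "executing";

// Hypothetical reduction of the deriveState check to its two inputs.
function phaseForSliceSketch(
  taskDirFiles: string[], // file names found in the slice's tasks directory
  planReferencesTasks: boolean, // whether the slice plan lists any tasks
): PhaseSketch {
  const hasTaskPlans = taskDirFiles.some((file) => file.endsWith(".md"));
  // Slice plan exists and names tasks, but plan-slice never wrote the per-task
  // files: go back to planning so they get generated.
  if (planReferencesTasks && !hasTaskPlans) return "planning";
  return "executing";
}
```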
* fix: reduce CPU usage on long auto-mode sessions
Seven targeted fixes for compounding process/timer/I/O issues that cause
high CPU during multi-hour /gsd auto sessions:
1. Wrap idle watchdog and hard timeout async callbacks in try-catch to
prevent unhandled rejections from orphaning intervals
2. Cache nativeHasChanges fallback (10s TTL) to avoid spawning a new
git process every 15 seconds when native module is unavailable
3. Call clearUnitTimeout() before dispatchNextUnit() in all recovery
paths to prevent stale idle watchdog from firing alongside new timers
4. Add 10-second timeout to subagent worktree cleanup to prevent hangs
when git worktree remove blocks indefinitely
5. Prune dead bg-shell processes after each unit completion to free
retained output buffers (~500KB-1MB per dead process)
6. Throttle STATE.md rebuilds to at most once per 30 seconds (was every
unit completion at 100-400ms each)
7. Increase progress widget refresh interval from 5s to 15s to reduce
synchronous file I/O on the hot path
* fix: reset nativeHasChanges cache in worktree test
The 10s TTL cache on nativeHasChanges was causing the worktree test
to return stale "no changes" when checking a freshly dirtied repo
within the cache window. Reset the cache before the dirty-repo
assertion so the test correctly detects new changes.
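The interaction between the TTL cache and the test can be reproduced with a small sketch. `TtlCacheSketch` is a hypothetical class with an injectable clock, not the actual nativeHasChanges cache; only the 10s TTL comes from the description above.

```typescript
// Hypothetical TTL cache with an injectable clock for testing.
class TtlCacheSketch<T> {
  private ttlMs: number;
  private now: () => number;
  private hasValue = false;
  private value: T | undefined;
  private expiresAt = 0;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise recompute it.
  get(compute: () => T): T {
    const t = this.now();
    if (this.hasValue && t < this.expiresAt) return this.value as T;
    this.value = compute();
    this.hasValue = true;
    this.expiresAt = t + this.ttlMs;
    return this.value;
  }

  // What the test fix does before the dirty-repo assertion.
  reset(): void {
    this.hasValue = false;
    this.value = undefined;
    this.expiresAt = 0;
  }
}
```

Within the TTL window the cache keeps answering with the earlier "no changes" result, which is exactly the stale read the test hit until it reset the cache.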
Conflicts arose because main added continueHereHandle cleanup and
buildSnapshotOpts (with continueHereFired) while the PR extracted
inline closeout code into closeoutUnit(). Resolution: use closeoutUnit()
with buildSnapshotOpts() to pass all fields including continueHereFired.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>