Commit graph

1161 commits

Author SHA1 Message Date
Jeremy McSpadden
a2a701b129 fix: inline bundled extension path parsing in subagent
The subagent extension imported parseBundledExtensionPaths via a
relative path (../../../bundled-extension-paths.js) that resolves
correctly in the source tree but breaks when extensions are synced
to ~/.gsd/agent/extensions/ at runtime. Inline the trivial split
logic so the extension is self-contained.
2026-03-17 17:45:48 -05:00
Jeremy McSpadden
d5161fddb9 feat: add project onboarding detection and init wizard
Replace the silent .gsd/ bootstrap with an interactive init wizard that
detects project state, offers v1 migration, and guides users through
per-project configuration before their first milestone.

New commands:
- /gsd init — project init wizard (detect, configure, bootstrap .gsd/)
- /gsd setup — global setup status and routing to existing config commands

Detection engine (detection.ts):
- detectProjectState() identifies none/v1-planning/v2-gsd/v2-gsd-empty
- detectProjectSignals() scans for language, monorepo, CI, tests, package manager
- Auto-detects verification commands from package.json, Cargo.toml, go.mod, etc.
- isFirstEverLaunch() / hasGlobalSetup() for global state checks

Init wizard (init-wizard.ts):
- 8-step wizard: git → mode → verification → git prefs → instructions → advanced → bootstrap
- Every step skippable with sensible defaults
- offerMigration() when .planning/ detected (migrate/fresh/cancel)
- handleReinit() for safe re-init on existing projects
- Writes preferences.md from wizard answers + seeds CONTEXT.md with detected signals

Smart entry integration (guided-flow.ts):
- showSmartEntry() now runs detection before any bootstrap
- v1 .planning/ → migration offer before anything else
- No .gsd/ → init wizard instead of silent bootstrap
- Existing .gsd/ → unchanged behavior (zero regression)
2026-03-17 17:31:52 -05:00
TÂCHES
ecedbfe9df Merge pull request #967 from jeremymcs/feat/export-html-all
feat: add /gsd export --html --all for retrospective milestone reports
2026-03-17 16:25:42 -06:00
TÂCHES
4f3ca74c3a Merge pull request #970 from gsd-build/fix/headless-dist-import
fix: export RPC utilities from pi-coding-agent public API
2026-03-17 16:23:34 -06:00
TÂCHES
7b2feb64e4 Merge pull request #969 from gsd-build/fix/duplicate-marketplace-test
fix: remove duplicate marketplace-discovery test
2026-03-17 16:23:27 -06:00
TÂCHES
23bc77fc56 Merge pull request #968 from gsd-build/fix/duplicate-bundled-extension-paths
fix: consolidate duplicate bundled-extension-paths.ts
2026-03-17 16:23:11 -06:00
TÂCHES
afeb6f5e3b Merge pull request #966 from gsd-build/fix/duplicate-mcp-server
fix: consolidate duplicate mcp-server.ts
2026-03-17 16:23:04 -06:00
TÂCHES
e944d690ed Merge pull request #965 from gsd-build/fix/canonical-package-manager
fix: establish npm as canonical package manager
2026-03-17 16:22:40 -06:00
Lex Christopherson
a8b9030560 fix: export RPC utilities from pi-coding-agent public API
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 16:20:13 -06:00
Lex Christopherson
6114ede489 fix: remove duplicate marketplace-discovery test
Migrated unique resolvePluginRoot and inspectPlugin tests from the older
file into the comprehensive contract test file at src/tests/, then
deleted the duplicate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 16:18:45 -06:00
Lex Christopherson
12e65afd36 fix: consolidate duplicate bundled-extension-paths.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 16:17:53 -06:00
Jeremy McSpadden
c9f63f8e93 feat: add /gsd export --html --all for retrospective milestone reports
When --all is passed alongside --html, generates a report snapshot for
every milestone that doesn't already have one in the reports index.
This fills the progression timeline with cards for completed milestones
that were finished before the HTML report feature existed.

- Skips milestones already present in reports.json to avoid duplicate entries
- Tags completed milestones with kind "milestone", active with "manual"
- Tracks cumulative slice/milestone progress per snapshot for the index
- Adds --html and --html --all to export autocomplete suggestions
- Updates help text to show [--all] flag
2026-03-17 17:17:50 -05:00
Lex Christopherson
e50e6458b6 fix: consolidate duplicate mcp-server.ts
Remove the unused copy at src/resources/extensions/gsd/mcp-server.ts.
The canonical implementation lives at src/mcp-server.ts and is the only
one imported by cli.ts and tested by mcp-server.test.ts. The extension
copy had zero imports and was dead code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 16:17:39 -06:00
Lex Christopherson
e8e3e2fbb9 fix: establish npm as canonical package manager
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 16:17:22 -06:00
Jeremy McSpadden
add957ed31 feat: add /gsd update slash command for in-session self-update (#964)
The update banner already referenced `/gsd:update` but the command
didn't exist. This adds `/gsd update` as a proper subcommand that
checks the npm registry and runs `npm install -g gsd-pi@latest`
when a newer version is available.

- Register `update` in subcommand completions and help text
- Add `handleUpdate()` that reuses `compareSemver` from update-check
- Fix banner text from `/gsd:update` to `/gsd update` (space, not colon)
- Add tests for completion registration and help description
2026-03-17 16:13:02 -06:00
Juan Francisco Lebrero
99c3375f18 feat: add gsd headless query for instant state inspection (#951)
* feat: add `gsd headless query` for structured state inspection

Add read-only query commands that return parseable JSON without
spawning an LLM session. Decouples orchestrators from .gsd/ internals.

Targets: phase, cost, progress, next

* simplify: single `query` command returning full snapshot

Replace 4 query targets (phase/cost/progress/next) with one command
that returns everything in a single JSON object. Caller uses jq.

Also document query in README.md and docs/commands.md.

* docs: update gsd-headless skill and references

- SKILL.md: add missing flags (--supervised, --max-restarts, --response-timeout)
- references/commands.md: add query, discuss, remote, inspect, forensics
- references/multi-session.md: fix spawning syntax, use query for budget

* fix: remove integration tests that entered via merge

These files belong to the feat/headless-orchestration-skill branch
and were accidentally included during the upstream/main merge.
They contain TS errors (sessionTerminated scope issue) that break CI.

* fix: restore headless-command.ts deleted by accident
2026-03-17 16:03:59 -06:00
Jeremy McSpadden
0e0f47ef9f fix: failure recovery & resume safeguards (all 4 waves) (#956)
* fix: prevent data loss on crash with atomic writes, file locking, and error handling

Wave 1 of failure recovery safeguards:

1. Atomic session file rewrites (tmp+rename) — _rewriteFile() and forkFrom()
   now use atomicWriteFileSync to prevent session file corruption on crash
2. Atomic auto.lock writes — crash-recovery.ts writeLock() uses tmp+rename
   so the crash detection system itself can't be corrupted
3. unhandledRejection handler — catches silent process death from unhandled
   promise rejections in OAuth, extensions, LSP, or MCP connections
4. try/catch in emitToolCall — matches pattern used by emitUserBash,
   emitContext, and emitToolResult to prevent extension handler crashes
   from killing the entire agent turn
5. File locking on session appends — prevents concurrent pi instances from
   interleaving partial JSON lines in session JSONL files using the same
   proper-lockfile pattern established in auth-storage.ts and settings-manager.ts

* fix: add OAuth timeouts, RPC exit detection, and command context guards

Wave 2 of failure recovery safeguards:

1. OAuth fetch timeouts — all fetch() calls across all OAuth providers
   (Anthropic, OpenAI Codex, Google Antigravity, Google Gemini CLI,
   GitHub Copilot) now have 30-second AbortSignal.timeout() to prevent
   indefinite hangs when OAuth servers are unresponsive
2. RPC subprocess exit detection — pending requests are now rejected
   when the agent subprocess exits unexpectedly, preventing indefinite
   hangs in the RPC client
3. Extension command context guards — default handlers for newSession,
   fork, navigateTree, switchSession, and reload now throw explicit
   errors instead of silently returning success when called before
   bindCommandContext()
4. OAuth error detail preservation — token refresh errors now preserve
   the original error as `cause` for better diagnostics

* fix: resource cleanup, LSP retry, and crash detection on session resume

Wave 3 of failure recovery safeguards:

1. Atomic completed-units.json cleanup — milestone completion writes
   now use tmp+rename pattern for consistency with auto-recovery.ts
2. Bash temp file cleanup — track temp files created for large output
   and register a process exit handler to clean them up
3. Settings write queue flush on shutdown — call settingsManager.flush()
   during interactive mode shutdown so queued writes aren't lost
4. LSP initialization retry — wrap getOrCreateClient with up to 2 retries
   with exponential backoff (1s, 2s) for transient spawn failures
5. Crash detection on session resume — wasInterrupted() checks if last
   assistant turn had tool calls without results, shows warning on resume

* fix: blob garbage collection and LSP debug logging

Wave 4 of failure recovery safeguards:

1. Blob garbage collection — BlobStore.gc(referencedHashes) removes
   orphaned blobs not referenced by any session file, plus totalSize()
   for monitoring blob directory growth
2. LSP JSON parse error logging — malformed LSP messages are now logged
   at debug level (when DEBUG env is set) instead of being silently dropped
2026-03-17 16:03:49 -06:00
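The tmp+rename pattern that Wave 1 of commit 0e0f47ef9f describes can be sketched as follows. This is a minimal illustration, not the repository's atomicWriteFileSync: the helper name atomicWriteSketch, the temp-file suffix, and the demo file are assumptions made for the example. The point is that readers of the target path see either the old contents or the new contents, never a half-written file, because the rename happens only after the write has finished.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not the project's atomicWriteFileSync):
// write the full contents to a sibling temp file, then rename it over the target.
function atomicWriteSketch(targetPath: string, contents: string): void {
  // Keeping the temp file in the same directory keeps the rename on one filesystem.
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  writeFileSync(tmpPath, contents, "utf8");
  renameSync(tmpPath, targetPath);
}

// Demo: write a lock-style JSON file into a throwaway directory and read it back.
const demoDir = mkdtempSync(join(tmpdir(), "atomic-write-"));
const demoFile = join(demoDir, "auto.lock");
atomicWriteSketch(demoFile, JSON.stringify({ pid: process.pid }));
const roundTrip = JSON.parse(readFileSync(demoFile, "utf8")) as { pid: number };
```

A crash between the write and the rename leaves a stray temp file behind, which is harmless to readers of the target path.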
Tom Boucher
87122e0b7a docs: update documentation for v2.27.0 release (#959)
README.md:
- Added provider error recovery section (item 5) covering transient
  vs permanent error classification and auto-resume behavior
- Updated crash recovery to mention headless auto-restart with backoff
- Fixed numbered list (was 1-11 with duplicate 7, now 1-12 sequential)

docs/parallel-orchestration.md:
- Updated worker crash recovery section to reflect v2.27 persistent
  state — workers now survive crashes via disk state recovery
2026-03-17 15:50:55 -06:00
Tom Boucher
d2a9ee6024 docs: update documentation for v2.26 features (#958)
Updated 6 files with 114 lines covering new v2.26 features:

auto-mode.md:
- Headless auto-restart on crash with exponential backoff
- Provider error classification and auto-resume (rate limit + server errors)
- Incremental memory system (KNOWLEDGE.md)
- Context pressure monitor (70% wrap-up signal)
- Meaningful commit messages from task summaries
- Verification enforcement with auto-fix retries
- Slice discussion gate (require_slice_discussion)
- HTML report generation (auto_report)

configuration.md:
- git.manage_gitignore preference (opt out of .gitignore changes)
- verification_commands, verification_auto_fix, verification_max_retries
- auto_report preference

troubleshooting.md:
- Provider error recovery table (transient vs permanent classification)
- Headless auto-restart for overnight unattended execution

commands.md:
- /gsd export --html command
- --max-restarts flag for headless mode

visualizer.md:
- HTML export and auto_report preference

README.md:
- git.manage_gitignore in preferences table
- git.isolation updated to include 'branch' option
2026-03-17 15:42:18 -06:00
Lex Christopherson
57774033b7 2.27.0
2026-03-17 15:30:43 -06:00
Lex Christopherson
5a70a3dc4e docs: update changelog for v2.27.0
2026-03-17 15:30:31 -06:00
Tom Boucher
188e7a67c4 fix: single ENTER submits slash command argument autocomplete (#944) (#953)
When completing a /gsd subcommand via autocomplete (e.g. selecting 'auto'
after typing '/gsd '), ENTER now submits immediately instead of requiring
a second press.

The selectConfirm handler already fell through to submit when the
autocomplete prefix started with '/' (completing the command name itself).
Now it also falls through when the cursor is in a slash command context
(completing an argument like 'auto', 'status', 'help').

Non-slash completions (@file references, paths) still require explicit
ENTER to submit — only slash command arguments auto-submit.
2026-03-17 15:28:40 -06:00
Tom Boucher
614012f38a fix: add git.manage_gitignore preference to opt out of .gitignore changes (#950) (#952)
When set to false in .gsd/preferences.md, GSD will not modify .gitignore
at all — no baseline patterns added, no self-healing, no untracking.

Usage in preferences.md:
  git:
    manage_gitignore: false

Files changed:
- git-service.ts: Add manage_gitignore to GitPreferences interface
- gitignore.ts: Early return when manageGitignore is false
- auto.ts: Pass manage_gitignore preference to ensureGitignore
- preferences.ts: Parse and validate manage_gitignore in git config
2026-03-17 15:27:34 -06:00
Tom Boucher
aa224b5944 fix: break web search loop with consecutive duplicate guard (#949) (#955)
When the LLM calls the same web search query 4+ times consecutively,
return an error telling it to stop and use existing results instead of
silently returning cached results that the LLM ignores.

Tracks consecutive duplicate searches via a simple counter keyed on
the normalized query + parameters. Resets when a different query is
searched. Threshold is 3 consecutive duplicates before the guard fires.

File changed: search-the-web/tool-search.ts
2026-03-17 15:27:18 -06:00
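A consecutive-duplicate guard of the kind commit aa224b5944 describes might look like the sketch below. The class name, the key normalization, and the reading of the threshold (the first call counts as zero duplicates, so the guard fires on the fourth identical call) are assumptions for illustration. The actual logic lives in search-the-web/tool-search.ts and may differ.

```typescript
// Assumed reading of the commit message: 3 consecutive duplicates after the first call.
const DUPLICATE_THRESHOLD = 3;

class DuplicateSearchGuardSketch {
  private lastKey: string | null = null;
  private duplicates = 0;

  // Returns true when the caller should reject the search with an error.
  shouldBlock(query: string, params: Record<string, string> = {}): boolean {
    // Hypothetical normalization: trimmed, lower-cased query plus sorted parameters.
    const key = JSON.stringify([query.trim().toLowerCase(), Object.entries(params).sort()]);
    if (key === this.lastKey) {
      this.duplicates += 1;
    } else {
      // A different query resets the counter.
      this.lastKey = key;
      this.duplicates = 0;
    }
    return this.duplicates >= DUPLICATE_THRESHOLD;
  }
}
```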
Tom Boucher
1b1df58749 refactor: encapsulate auto.ts state into AutoSession class (#898) (#948)
* refactor: encapsulate auto.ts state into AutoSession class (#898)

Follow-up to PR #906 (7 module extractions). All ~40 mutable module-level
variables in auto.ts are replaced with properties on a single AutoSession
class instance (s).

Changes:
- auto/session.ts: 200-line AutoSession class with typed properties,
  clearTimers(), resetDispatchCounters(), completeCurrentUnit(), reset(),
  and toJSON() for diagnostics.
- auto.ts: ~700 variable references renamed from bare names to s.xxx.
  All module-level let/const state declarations removed. Constants
  (MAX_UNIT_DISPATCHES, etc.) re-exported from session.ts.
- Tests updated: milestone-transition-worktree.test.ts and
  triage-dispatch.test.ts source-grep patterns updated for s.xxx names.

Benefits:
- 40 scattered declarations → 1 class with typed properties
- Manual reset of 25+ variables in stopAuto → s.reset()
- s.toJSON() for state snapshots and diagnostics
- grep 's.' shows every state access

No behavioral changes. 1224 tests pass.

* fix: import constants locally for tsconfig.extensions.json compatibility

The extensions tsconfig couldn't resolve re-exported constants from
auto/session.js. Fix: import them explicitly in addition to re-exporting.
Also remove leftover DISPATCH_GAP_TIMEOUT_MS local declaration.
2026-03-17 14:59:42 -06:00
Tom Boucher
1e979ff626 fix: retry transient network errors before model fallback (#941) (#945)
Previously, any provider error during auto-mode immediately triggered the
model fallback chain. This meant providers with occasional network flakiness
(e.g. zai-coding-plan) would get abandoned after a single transient error,
barely getting used before the fallback took over.

Now, transient network errors (ECONNRESET, ETIMEDOUT, socket hang up, DNS
failures, etc.) are retried up to 2 times with linear backoff (3s, 6s)
before falling back to the next model. Permanent errors (auth, quota,
billing) still trigger immediate fallback.

Changes:
- index.ts: Add network retry loop before fallback chain in agent_end error
  handler. Track retry counts per model in networkRetryCounters map.
  Clear counters on successful unit completion and model switches.
- preferences.ts: Extract isTransientNetworkError() as testable utility.
  Matches network signals while excluding permanent auth/billing errors.
- network-error-fallback.test.ts: Add 12 tests for transient error detection
  covering all signal patterns and exclusion cases.
2026-03-17 14:59:22 -06:00
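The retry policy in commit 1e979ff626 can be sketched as two pure functions: one that classifies an error message, and one that returns the linear backoff delay. The function names and the exact signal lists are assumptions for illustration. The commit only names ECONNRESET, ETIMEDOUT, socket hang up, and DNS failures as transient, and auth, quota, and billing as permanent.

```typescript
// Assumed signal lists; the real isTransientNetworkError() may match different patterns.
const TRANSIENT_SIGNALS = ["econnreset", "etimedout", "socket hang up", "enotfound", "eai_again"];
const PERMANENT_SIGNALS = ["auth", "quota", "billing"];

const MAX_NETWORK_RETRIES = 2;
const BASE_DELAY_MS = 3000;

function looksTransient(message: string): boolean {
  const text = message.toLowerCase();
  // Permanent errors win: they should fall back immediately, never retry.
  if (PERMANENT_SIGNALS.some((signal) => text.includes(signal))) return false;
  return TRANSIENT_SIGNALS.some((signal) => text.includes(signal));
}

// Linear backoff: attempt 1 waits 3s, attempt 2 waits 6s.
// Returns null once the retries are exhausted, meaning "fall back to the next model".
function retryDelayMs(attempt: number): number | null {
  return attempt >= 1 && attempt <= MAX_NETWORK_RETRIES ? attempt * BASE_DELAY_MS : null;
}
```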
deseltrus
5bae521af0 fix: parallel worker PID tracking, spawn-status race, exit persistence (#932)
* fix: parallel worker PID tracking, spawn-status race, exit persistence

Three bugs in parallel-orchestrator.ts that cause workers to appear
permanently stuck in "running" or silently lose state on exit:

1. Worker PID initialized to coordinator's process.pid instead of 0.
   Session status files recorded wrong PID, breaking stale detection
   (isPidAlive returns true for the coordinator, not the dead worker).

2. Session status written with "running" BEFORE spawn attempt. If spawn
   fails, status file stays "running" indefinitely. Now spawns first,
   then writes status with actual state (running or error).

3. Worker exit handler updated session status but didn't call
   persistState(), so orchestrator.json got out of sync. Next
   coordinator restart could adopt already-dead workers.

Closes #672 (partial — worker lifecycle hardening)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: adapt lifecycle tests for spawn-aware session status

Tests now handle both outcomes: when spawnWorker() succeeds (running
state) and when it fails in CI (error state, no GSD binary available).
The lifecycle logic under test — session status writes, stop, pause,
resume — works correctly in both cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 14:59:05 -06:00
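The first bug in commit 5bae521af0 is easier to see with a liveness check in hand. The sketch below assumes the check is done by sending signal 0 with process.kill, which is a common Node.js approach. The repository's isPidAlive may be implemented differently, and the name isPidAliveSketch is hypothetical. With this kind of check, a worker record that carries the coordinator's own PID always looks alive, while a placeholder of 0 is reported as dead.

```typescript
// Hypothetical liveness check; assumes signal 0 is used to probe for existence.
function isPidAliveSketch(pid: number): boolean {
  // Treat 0 and negative values as "no process": on POSIX they address
  // process groups rather than a single process.
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0); // signal 0 performs the existence check without sending a signal
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

// A worker record initialized with the coordinator's PID can never look stale.
const coordinatorLooksAlive = isPidAliveSketch(process.pid);
const placeholderLooksAlive = isPidAliveSketch(0);
```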
Tom Boucher
a82092eb63 fix: /gsd discuss now recommends next undiscussed slice (#935) (#939)
Three fixes for the discuss picker loop:

1. Recommend first undiscussed slice instead of always recommending
   the first pending slice (i === 0). The recommended flag now checks
   discussion state via CONTEXT file existence.

2. Exit with a summary notification when all pending slices have been
   discussed, instead of looping back to a picker where everything
   is already done.

3. Invalidate deriveState cache after each discuss session completes
   so subsequent state reads pick up the newly-written CONTEXT files.
2026-03-17 14:10:15 -06:00
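The first two fixes in commit a82092eb63 amount to choosing the recommendation by discussion state rather than by position. A minimal sketch, assuming each pending slice carries a flag for whether its CONTEXT file exists (the PendingSlice shape and function name are hypothetical):

```typescript
// Hypothetical shape: one entry per pending slice.
interface PendingSlice {
  id: string;
  hasContextFile: boolean; // true once the slice has been discussed
}

// Returns the index of the slice to recommend, or -1 when every pending
// slice has been discussed and the picker should exit with a summary.
function recommendedSliceIndex(slices: PendingSlice[]): number {
  return slices.findIndex((slice) => !slice.hasContextFile);
}
```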
Jeremy McSpadden
b44e43b841 refactor: reorder HTML report sections to match visualizer tab order (#937)
Mirror the visualizer's logical grouping in the HTML report:
- Core project views: Summary, Progress, Timeline, Dependencies
- Analytics/monitoring: Metrics, Health
- Content: Changelog, Knowledge, Captures
- Report-only: Artifacts, Planning

Updated sections array, TOC nav, and file header comment.
2026-03-17 14:06:24 -06:00
Tom Boucher
87352c9a4f fix: allow suffix text after '## Slices' heading in roadmap parser (#924) (#936)
The regex required exactly '## Slices' with nothing after. If an agent
renamed it (e.g. '## Slices (generate flow — first batch)'), the parser
returned zero slices, blocking auto-mode.

Changed /^## Slices\s*$/m to /^## Slices\b.*$/m — word boundary ensures
'Slices' is complete, .* allows any trailing text.
2026-03-17 14:06:10 -06:00
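The two regular expressions quoted in commit 87352c9a4f can be compared directly. The sample roadmap strings below are made up for the demonstration; only the two patterns come from the commit message.

```typescript
// Patterns copied from the commit message.
const strictHeading = /^## Slices\s*$/m;
const relaxedHeading = /^## Slices\b.*$/m;

// Made-up roadmap fragments for illustration.
const plainRoadmap = "# Roadmap\n\n## Slices\n- S01\n";
const renamedRoadmap = "# Roadmap\n\n## Slices (first batch)\n- S01\n";
const lookalikeRoadmap = "# Roadmap\n\n## Slicesx\n- S01\n";

const results = {
  strictPlain: strictHeading.test(plainRoadmap),
  strictRenamed: strictHeading.test(renamedRoadmap), // the case that blocked auto-mode
  relaxedPlain: relaxedHeading.test(plainRoadmap),
  relaxedRenamed: relaxedHeading.test(renamedRoadmap),
  relaxedLookalike: relaxedHeading.test(lookalikeRoadmap), // \b rejects "Slicesx"
};
```

The word boundary is what keeps the relaxed pattern from matching a heading that merely starts with the same letters.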
Jeremy McSpadden
43776b68c6 refactor: reorder visualizer tabs into logical groupings (#934)
Reorganize the 10-tab TUI visualizer for better workflow:
- Core project views: Progress, Timeline, Deps
- Analytics/monitoring: Metrics, Health, Agent
- Content: Changes, Knowledge, Captures
- Utility: Export (moved to last position)

Updated all tab index references (switch cases, export key
handling, export status display) and corrected help text
from "7-tab" to "10-tab" with accurate tab listing.
2026-03-17 14:05:38 -06:00
deseltrus
9fe805b1d3 test: parallel merge reconciliation + budget atomicity (G5/G6) (#933)
* test: parallel merge reconciliation + budget atomicity coverage (G5/G6)

27 new tests covering two gaps identified in #672:

G5 — Merge Reconciliation (parallel-merge.test.ts, 17 tests):
- determineMergeOrder: sequential, by-completion, filtering, defaults
- formatMergeResults: success, conflict, empty, mixed output
- mergeCompletedMilestone: clean merge with session cleanup, missing
  roadmap error, conflict detection with structured file list
- mergeAllCompleted: sequential order, stop-on-first-conflict,
  by-completion order (integration tests with real git repos)

G6 — Budget Atomicity (parallel-budget-atomicity.test.ts, 10 tests):
- Ceiling enforcement: exceeded, not exceeded, exact boundary
- Cost aggregation: correct sum, incremental updates
- No double-counting: 5 rapid refreshes produce correct total
- Budget reset: resetOrchestrator clears all state
- No ceiling: unlimited spending when budget_ceiling unset
- Worker state sync: refreshWorkerStatuses picks up disk changes

All tests use node:test + node:assert/strict. No production code changes.

Relates to #672

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use double quotes in git commit messages for Windows compatibility

Single-quoted commit messages in test helpers fail on Windows CMD
(pathspec errors). Switch to double quotes which work cross-platform.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 14:05:16 -06:00
Jeremy McSpadden
1ea653b5fc refactor: TUI dashboard cleanup, dedup, and feature improvements (#931)
* refactor: TUI dashboard cleanup, dedup, and feature improvements

- Extract shared format-utils.ts: formatDuration, padRight, joinColumns,
  centerLine, fitColumns, sparkline, stripAnsi — eliminating 3× duplication
  across dashboard-overlay, visualizer-views, and auto-dashboard
- Use shared STATUS_GLYPH/STATUS_COLOR from ui.ts consistently across all
  overlay and view files instead of hardcoded Unicode glyphs
- Fix redundant dynamic import('node:fs') in visualizer-data.ts (statSync
  already imported at top level)
- Replace (entry as any) casts with proper SessionMessageEntry type narrowing
- Add mtime-based file content cache for visualizer data loader to avoid
  re-parsing unchanged roadmap/plan files on every refresh
- Increase visualizer refresh interval from 2s to 5s (with mtime cache,
  unchanged files are effectively free)
- Fix sparkline to use loop-based max instead of Math.max(...values) to
  avoid stack overflow on large arrays
- Add ETA/time-remaining estimate to progress widget and dashboard overlay
  based on average unit duration from metrics ledger
- Show warning glyph for budget-pressured units in completed units list
  (continueHereFired units now show ⚠ instead of ✓)
- Add terminal resize (SIGWINCH) handling to both overlays — invalidates
  cache and re-renders on window size change
- Fix dispose race in dashboard overlay close path — now calls dispose()
  before onClose() to prevent timer callbacks firing after teardown
- Add 23 unit tests for format-utils.ts (including 100k-element sparkline)
- Add 2 tests for estimateTimeRemaining
- Add source-contract tests for resize handler and shared imports

* fix: use STATUS_GLYPH.warning instead of STATUS_GLYPH.statusWarning

STATUS_GLYPH is keyed by ProgressStatus ("warning"), not by GLYPH
property name ("statusWarning"). Fixes typecheck failure in CI.
2026-03-17 14:02:26 -06:00
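The sparkline fix in commit 1ea653b5fc replaces a spread call with a loop because spreading a very large array passes every element as a separate argument. A minimal sketch of the loop-based approach follows. The bar glyphs, the scaling, and the function names are assumptions, not the contents of format-utils.ts.

```typescript
// Assumed glyph set for the sketch.
const BARS = "▁▂▃▄▅▆▇█";

// Loop-based maximum: no argument-count limit, unlike Math.max(...values).
function maxOf(values: number[]): number {
  let max = -Infinity;
  for (const value of values) {
    if (value > max) max = value;
  }
  return max;
}

function sparklineSketch(values: number[]): string {
  if (values.length === 0) return "";
  const max = maxOf(values);
  let out = "";
  for (const value of values) {
    // Scale each value into a glyph index; non-positive maxima map to the lowest bar.
    const ratio = max > 0 ? value / max : 0;
    const index = Math.min(BARS.length - 1, Math.max(0, Math.round(ratio * (BARS.length - 1))));
    out += BARS.charAt(index);
  }
  return out;
}
```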
Jeremy McSpadden
58fd9cf0c1 docs: update README for v2.25-v2.26 features (#929)
Add documentation for HTML report generator, verification enforcement,
parallel orchestrator crash recovery, headless multi-session orchestration,
milestone validation gate, require_slice_discussion option, meaningful
commit messages, and new keyboard shortcuts. Update comparison table,
preferences reference, suggested .gitignore, and commands table.
2026-03-17 14:02:06 -06:00
Tom Boucher
3b18711524 fix: don't overwrite user's model choice when API key is temporarily unavailable (#910) (#928)
The startup model validation overwrote the user's configured model when
it was 'not available' (API key missing, OAuth token expired, rate
limited). This silently changed the model to a fallback like
google/gemini-1.5-flash or openai/gpt-5.4.

Fix: Only trigger the fallback when the configured model doesn't exist
in the registry at all (removed/unknown). A model that exists but is
temporarily unavailable (credential issue) keeps its setting — the
session-level fallback resolver handles it at prompt time.
2026-03-17 14:01:51 -06:00
Tom Boucher
2cba5bc072 fix: break reassess-roadmap skip loop by preventing re-persistence of evicted keys (#912) (#927)
After the skip-loop breaker evicts a completion key, the fallback path
at the bottom of dispatchNextUnit re-persists it because the expected
artifact exists on disk. This recreates the exact loop the breaker was
trying to break:

  evict key → dispatch → verifyArtifact(true) → re-persist key → skip → evict → repeat

Fix: Track recently-evicted keys in a Set. The fallback artifact-check
path skips re-persistence for keys that were just evicted by the
skip-loop breaker. Set is cleared on stopAuto.
2026-03-17 14:01:36 -06:00
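The fix in commit 2cba5bc072 can be sketched with two sets: one for persisted completion keys and one for keys the skip-loop breaker has just evicted. The class and method names here are hypothetical; the commit only states that a Set of recently evicted keys gates the fallback re-persistence and is cleared on stopAuto.

```typescript
// Hypothetical model of the completion-key bookkeeping.
class CompletionKeysSketch {
  private readonly completed = new Set<string>();
  private readonly recentlyEvicted = new Set<string>();

  // Called by the skip-loop breaker.
  evict(key: string): void {
    this.completed.delete(key);
    this.recentlyEvicted.add(key);
  }

  // Fallback path: persist the key when its artifact exists on disk,
  // unless the breaker evicted it, which would recreate the loop.
  persistIfArtifactExists(key: string, artifactExists: boolean): boolean {
    if (!artifactExists || this.recentlyEvicted.has(key)) return false;
    this.completed.add(key);
    return true;
  }

  isCompleted(key: string): boolean {
    return this.completed.has(key);
  }

  // Mirrors "Set is cleared on stopAuto".
  clearOnStop(): void {
    this.recentlyEvicted.clear();
  }
}
```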
Tom Boucher
2306e6bb34 fix: LSP command resolution and ENOENT crash on Windows/MSYS (#901) (#925)
Two fixes:

1. lsp/config.ts: Use `where.exe` instead of `which` on Windows.
   MSYS's `which` returns POSIX paths (/c/Users/...) that Node's
   spawn() can't execute. `where.exe` returns native Windows paths.

2. lsp/client.ts: Handle spawn ENOENT error gracefully. When the LSP
   server binary doesn't exist, the error event now triggers a clean
   exit instead of bubbling up and crashing auto-mode.
2026-03-17 14:01:16 -06:00
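Both fixes in commit 2306e6bb34 can be sketched in a few lines. The function names and the onMissing callback are assumptions for illustration; the platform check via process.platform and the "error" event carrying code ENOENT are standard Node.js behavior, but the actual code in lsp/config.ts and lsp/client.ts may be structured differently.

```typescript
import { spawn, type ChildProcess } from "node:child_process";

// Pick the lookup tool by platform so the resolved path is one spawn() can execute.
function lookupCommandFor(platform: string): string {
  return platform === "win32" ? "where.exe" : "which";
}

// Hypothetical launcher: a missing binary surfaces as an "error" event with
// code ENOENT, which is handled here instead of becoming an uncaught exception.
function startServerSketch(command: string, args: string[], onMissing: () => void): ChildProcess {
  const child = spawn(command, args, { stdio: "pipe" });
  child.on("error", (err: Error & { code?: string }) => {
    if (err.code === "ENOENT") onMissing();
  });
  return child;
}

const lookupForThisMachine = lookupCommandFor(process.platform);
```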
Tom Boucher
7869312769 fix: dispatch plan-slice when task plan files are missing (#909) (#923)
When a slice plan (S03-PLAN.md) was pre-created during roadmapping
but plan-slice never ran to generate per-task files (tasks/T01-PLAN.md),
deriveState returned 'executing' phase. execute-task then failed because
the task plan didn't exist, creating an infinite restart loop.

Fix: In deriveState, when the tasks directory exists but has zero .md
files and the slice plan references tasks, return 'planning' phase
instead of 'executing'. This causes plan-slice to dispatch and generate
the missing task plans.

Tests updated: 6 test files that create synthetic state fixtures now
include a stub task plan file so their 'executing' phase assertions
remain valid.
2026-03-17 13:59:34 -06:00
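The phase decision described in commit 7869312769 reduces to one extra condition. The sketch below models it as a pure function over a hypothetical snapshot of the slice; the real deriveState reads the filesystem and handles many more phases.

```typescript
type Phase = "planning" | "executing";

// Hypothetical snapshot of what deriveState would find on disk.
interface SliceSnapshot {
  taskPlanFiles: string[]; // file names found in the slice's tasks directory
  planReferencesTasks: boolean; // whether the slice plan lists any tasks
}

function phaseForSliceSketch(slice: SliceSnapshot): Phase {
  const hasTaskPlans = slice.taskPlanFiles.some((name) => name.endsWith(".md"));
  // The plan names tasks but no per-task files exist yet: plan-slice must run first.
  if (slice.planReferencesTasks && !hasTaskPlans) return "planning";
  return "executing";
}
```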
Jeremy McSpadden
e0420f5981 fix: reduce CPU usage on long auto-mode sessions (#921)
* fix: reduce CPU usage on long auto-mode sessions

Seven targeted fixes for compounding process/timer/I/O issues that cause
high CPU during multi-hour /gsd auto sessions:

1. Wrap idle watchdog and hard timeout async callbacks in try-catch to
   prevent unhandled rejections from orphaning intervals
2. Cache nativeHasChanges fallback (10s TTL) to avoid spawning a new
   git process every 15 seconds when native module is unavailable
3. Call clearUnitTimeout() before dispatchNextUnit() in all recovery
   paths to prevent stale idle watchdog from firing alongside new timers
4. Add 10-second timeout to subagent worktree cleanup to prevent hangs
   when git worktree remove blocks indefinitely
5. Prune dead bg-shell processes after each unit completion to free
   retained output buffers (~500KB-1MB per dead process)
6. Throttle STATE.md rebuilds to at most once per 30 seconds (was every
   unit completion at 100-400ms each)
7. Increase progress widget refresh interval from 5s to 15s to reduce
   synchronous file I/O on the hot path

* fix: reset nativeHasChanges cache in worktree test

The 10s TTL cache on nativeHasChanges was causing the worktree test
to return stale "no changes" when checking a freshly dirtied repo
within the cache window. Reset the cache before the dirty-repo
assertion so the test correctly detects new changes.
2026-03-17 13:58:14 -06:00
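The TTL cache from fix 2 of commit e0420f5981, along with the reset that the follow-up test fix needed, can be sketched generically. The helper name cachedWithTtl and the injectable clock are assumptions made so the behavior can be shown without waiting; the commit only specifies a 10s TTL on the nativeHasChanges fallback.

```typescript
// Hypothetical generic helper: caches the result of compute() for ttlMs milliseconds.
function cachedWithTtl<T>(compute: () => T, ttlMs: number, now: () => number = Date.now) {
  let cached: { value: T; at: number } | null = null;
  return {
    get(): T {
      const t = now();
      // Recompute on first use or once the entry is at least ttlMs old.
      if (cached === null || t - cached.at >= ttlMs) {
        cached = { value: compute(), at: t };
      }
      return cached.value;
    },
    // Lets a caller (or a test that just dirtied the repo) force a fresh check.
    reset(): void {
      cached = null;
    },
  };
}
```

Within the TTL window the cached answer can be stale, which is exactly why the worktree test had to reset the cache before asserting on a freshly dirtied repo.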
Lex Christopherson
6becff186e Merge pull request #906 from jeremymcs/issue-898-auto-refactor
refactor: extract 8 focused modules from auto.ts
2026-03-17 13:20:39 -06:00
Lex Christopherson
2e013d70b5 merge: resolve 12 conflicts with main — integrate continueHere feature into refactored closeoutUnit
Conflicts arose because main added continueHereHandle cleanup and
buildSnapshotOpts (with continueHereFired) while the PR extracted
inline closeout code into closeoutUnit(). Resolution: use closeoutUnit()
with buildSnapshotOpts() to pass all fields including continueHereFired.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:20:20 -06:00
Colin Johnson
7733e12413 fix: reap orphan-prone child processes across session churn (#920)
* fix: reap orphan-prone child processes across session churn

* test: make bg-shell cleanup test shell-safe
2026-03-17 13:14:51 -06:00
Jeremy McSpadden
3167e9fbf4 feat: HTML report generator with progression index (#876)
2026-03-17 11:51:54 -06:00
Tom Boucher
dc7f5e8da5 fix: resolve symlinked skill directories in always_use_skills (#911) (#919)
2026-03-17 11:49:11 -06:00
Goran Cabarkapa
6ed5e23603 fix: resolve symlinked skill directories in preferences (#913)
2026-03-17 11:48:46 -06:00
deseltrus
10200c43f3 feat: crash recovery for parallel orchestrator (#873)
2026-03-17 11:47:26 -06:00
Tom Boucher
5d86159ea8 fix: replan-slice infinite loop, non-standard finish_reason crash, fork-resilient test (#866)
2026-03-17 11:47:07 -06:00
Jeremy McSpadden
2dd7e256f0 fix: skip slice plan commit when commit_docs is false (#784) (#802)
2026-03-17 11:46:53 -06:00
Jeremy McSpadden
2ee0e5ee17 fix: dispatch plan-slice when task plans missing instead of hard-stop (#909) (#915)
2026-03-17 11:44:42 -06:00
Jeremy McSpadden
f5e9b00f47 fix: wire continue-here context-pressure monitor to send wrap-up signal at 70% (#916)
2026-03-17 11:44:12 -06:00