899 commits

Author SHA1 Message Date
Tom Boucher
f0fe4b2443 fix: emit agent_end after abort during tool execution (#1414) (#1417)
* fix: sync worktree completion artifacts back to external state before merge (#1412)

When a worktree's .gsd/ was a real directory (not symlinked to external
state), milestone completion artifacts (SUMMARY, VALIDATION, updated
ROADMAP) were written locally but never synced back. The project root's
deriveState() read from external state and found no SUMMARY — reporting
the milestone as incomplete.

Changes:
- auto-worktree.ts: Added syncWorktreeStateBack() that copies milestone
  and slice .md files from worktree .gsd/ to the main external state dir
- auto.ts: Call syncWorktreeStateBack() in tryMergeMilestone before the
  git merge, ensuring artifacts are visible from the project root

Fixes #1412

* fix: emit agent_end after abort during tool execution (#1414)

When a user aborts a turn while a tool call is running, the abort RPC
succeeds but agent_end was never emitted. RPC consumers tracking turn
lifecycle via events got stuck in a 'streaming' state permanently.

Fix: After abort() + waitForIdle(), emit a synthetic agent_end if the
agent is no longer streaming. This ensures consumers always see the
turn-complete signal regardless of how the turn ended.

Fixes #1414
2026-03-19 10:24:39 -06:00
Juan Francisco Lebrero
e6bbd035ba fix: auto-discard bootstrap crash locks and clean auto.lock on exit (#1397)
Two root causes for the false "Interrupted Session Detected" prompt
that appears every time /gsd is run after a normal exit:

1. guided-flow.ts showed the crash recovery menu even for bootstrap
   crashes (unitType="starting", unitId="bootstrap", completedUnits=0)
   where no work was lost. Now these are silently discarded — the menu
   only appears when real auto-mode work was interrupted.

2. session-lock.ts exit handler cleaned the OS lock directory
   (.gsd.lock/) but not the auto.lock metadata file. On next startup,
   readCrashLock() found the stale file and triggered false recovery.
   Now the exit handler also removes auto.lock.
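
The bootstrap check in point 1 can be sketched as a small predicate. This is a minimal illustration, assuming the lock metadata exposes the three fields named above; the CrashLock interface and the crashLockAction helper are hypothetical names, not taken from guided-flow.ts.

```typescript
// Shape of the crash-lock metadata. Field names come from the commit
// message; the interface name itself is hypothetical.
interface CrashLock {
  unitType: string;
  unitId: string;
  completedUnits: number;
}

// True when the lock was written during startup, before any unit ran,
// so no auto-mode work can have been lost.
function isBootstrapCrash(lock: CrashLock): boolean {
  return (
    lock.unitType === "starting" &&
    lock.unitId === "bootstrap" &&
    lock.completedUnits === 0
  );
}

// Hypothetical helper: decide what startup does with a leftover lock.
function crashLockAction(lock: CrashLock | null): "none" | "discard" | "prompt" {
  if (lock === null) return "none";
  return isBootstrapCrash(lock) ? "discard" : "prompt";
}
```

Keeping the predicate separate from the menu logic makes the "silently discard" rule easy to cover in a unit test.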
2026-03-19 08:31:15 -06:00
deseltrus
2dc804a485 fix: harden quick-task branch lifecycle — disk recovery + integration branch guard (#1342) 2026-03-19 07:39:54 -06:00
deseltrus
e6ceb8dfe8 fix: skip verification retry on spawn infra errors (ETIMEDOUT, ENOENT) (#1340) 2026-03-19 07:39:13 -06:00
Jeremy McSpadden
d7bf3d4e72 Improve startup performance with lazy extension loading (#1336) 2026-03-19 07:38:50 -06:00
dan bachelder
b67101c51b fix: keep external GSD state stable in worktrees (#1334) 2026-03-19 07:37:25 -06:00
Tom Boucher
d121c8e3b2 fix: stop excluding all .gsd/ from commits — only exclude runtime files (#1326) (#1328)
smartStage() was excluding the entire .gsd/ directory from git staging,
which is correct when .gsd/ is symlinked to external state. But on
Windows (junction links) or projects where .gsd/ is git-tracked (not
gitignored), this caused a mid-milestone behavioral discontinuity:

1. One-time cleanup removes runtime files from the index
2. After cleanup, nativeAddAll() + nativeResetPaths('.gsd/') causes ALL
   .gsd/ files to be unstaged — including milestone artifacts
3. autoCommit returns null (nothing staged) for the rest of the milestone
4. Work continues silently with no commits, no errors, no warnings
5. Worktree teardown loses all uncommitted .gsd/ artifacts

Fix: replace the blanket '.gsd/' exclusion with targeted RUNTIME_EXCLUSION_PATHS.
Milestone artifacts (.gsd/milestones/, preferences.md, DECISIONS.md, etc.)
are now committed normally when they're tracked. When .gsd/ is in .gitignore
(the default), git add -A already skips it — the reset is a harmless no-op.
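
A rough sketch of the targeted exclusion follows. The commit names RUNTIME_EXCLUSION_PATHS but does not list its entries, so the entries below are an assumption borrowed from the runtime files named in a1168cba13 (#1252); isRuntimePath and pathsToUnstage are hypothetical helpers.

```typescript
// Assumed contents: the real constant may differ.
const RUNTIME_EXCLUSION_PATHS: readonly string[] = [
  ".gsd/STATE.md",
  ".gsd/completed-units.json",
  ".gsd/auto.lock",
  ".gsd/gsd.db",
  ".gsd/runtime/",
];

// Entries ending in "/" match a whole directory; others match one file.
function isRuntimePath(filePath: string): boolean {
  return RUNTIME_EXCLUSION_PATHS.some((entry) =>
    entry.endsWith("/") ? filePath.startsWith(entry) : filePath === entry,
  );
}

// Only runtime files are unstaged; milestone artifacts stay staged.
function pathsToUnstage(stagedPaths: string[]): string[] {
  return stagedPaths.filter(isRuntimePath);
}
```

With a list like this, a staged .gsd/milestones/ file is left alone while .gsd/auto.lock is reset, which is the behavior the updated test is described as verifying.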

Updated git-service.test.ts to verify the new behavior: runtime files
excluded, milestone artifacts committed.

Fixes #1326
2026-03-18 22:06:41 -06:00
Jeremy McSpadden
fc56cdf93e fix: handle ECOMPROMISED in uncaughtException guard and align retry onCompromised (#1322) (#1332)
When a GSD session crashes hard (SIGKILL, OOM, etc.) without running its
exit handler, the proper-lockfile OS lock directory (.gsd.lock/) is left
stranded. On the next /gsd auto resume, acquireSessionLock detects the dead
PID, cleans up the stale directory, and re-acquires via the retry path.

10 seconds later, proper-lockfile's update timer fires. Due to a subtle
interaction between the synchronous fs adapter (lockSync / toSyncOptions)
and the setTimeout boundary in Node.js v25+, the ECOMPROMISED error
propagates up through the synchronous callback chain and becomes an
uncaught exception — even though the onCompromised callback sets
_lockCompromised = true without throwing.

The _gsdEpipeGuard uncaughtException handler only handled EPIPE, so it
re-threw ECOMPROMISED, crashing the process. Each crash wrote a new
"interrupted session" record, causing an infinite crash loop on resume.

Two fixes:

1. index.ts: Handle ECOMPROMISED in _gsdEpipeGuard. Exit with code 1
   (non-zero to signal failure) so the process.once("exit") handler runs
   and removes the lock directory, allowing the next session to start clean.

2. session-lock.ts: The retry path's onCompromised was missing
   `_releaseFunction = null`, unlike the primary path. This left the
   release function pointer live after compromise, causing validateSessionLock
   to return true and preventing graceful stop detection. Now matches primary.
2026-03-18 22:06:03 -06:00
Jeremy McSpadden
15a8807eb3 fix: clean up stale numbered lock files and harden signal/exit handling (#1315) (#1323) 2026-03-18 21:15:47 -06:00
Tom Boucher
7537e30815 fix: worktree sync and home-directory safety check (#1311, #1317) (#1322) 2026-03-18 21:15:36 -06:00
Tom Boucher
68e0672dda test: add regression harness for auto-mode dispatch loop (125 assertions) (#1319) 2026-03-18 21:14:59 -06:00
Jeremy McSpadden
805c7718c4 chore: remove orphaned mcporter extension manifest (#1318) 2026-03-18 21:14:50 -06:00
Tom Boucher
0418458cf9 refactor: extract tryMergeMilestone to eliminate 4 duplicate merge paths in auto.ts (#1314) 2026-03-18 20:04:10 -06:00
Tom Boucher
583e84e932 refactor: dispatch loop hardening — defensive guards, regression tests, lock alignment (#1310) 2026-03-18 20:03:59 -06:00
TÂCHES
e6ab3b6722 refactor: extract parseUnitId() to centralize unit ID parsing (#1282)
Replaces 30+ inline `unitId.split("/")` + destructuring patterns across
16 production files with a single `parseUnitId()` helper that returns
`{ milestone, slice?, task? }`. If the unit ID format ever changes,
only one function needs updating.
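
A minimal sketch of such a helper, assuming unit IDs are slash-separated with up to three segments. The return shape follows the description above; the ParsedUnitId name and the sample ID "M001/S01/T02" are illustrative assumptions.

```typescript
interface ParsedUnitId {
  milestone: string;
  slice?: string;
  task?: string;
}

// Split once, in one place, instead of repeating unitId.split("/")
// and destructuring at every call site.
function parseUnitId(unitId: string): ParsedUnitId {
  const [milestone = "", slice, task] = unitId.split("/");
  const parsed: ParsedUnitId = { milestone };
  if (slice !== undefined) parsed.slice = slice;
  if (task !== undefined) parsed.task = task;
  return parsed;
}

// Example call with a hypothetical task-level ID.
const example = parseUnitId("M001/S01/T02");
```

Omitting absent segments, rather than setting them to undefined, keeps the result safe to compare structurally in tests.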

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 19:20:08 -06:00
Tom Boucher
afb438164e fix: align retry lock path with primary lock settings to prevent ECOMPROMISED (#1307)
The retry lock acquisition path (from stale lock recovery in #1251)
used a 5-minute stale threshold and no onCompromised handler, while
the primary path used 30 minutes and a graceful flag-based handler.

This mismatch meant locks acquired via the retry path would throw
ECOMPROMISED (uncaught, crashes process) if the event loop stalled
for >5 minutes — which happens during long LLM operations.

Fixed:
- Stale timeout: 300_000 → 1_800_000 (matches primary)
- Added onCompromised handler (sets _lockCompromised flag)
- Added process.on('exit') safety net (matches primary)
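
One way to keep the two paths from drifting again is to build their options in a single place. The sketch below is an assumption about structure, not the code in session-lock.ts: `stale` and `onCompromised` are real proper-lockfile option names, while sessionLockOptions and isLockCompromised are hypothetical helpers, and the library itself is not imported here.

```typescript
// Flag name taken from the commit message.
let _lockCompromised = false;

interface SessionLockOptions {
  stale: number;
  onCompromised: (err: Error) => void;
}

// Hypothetical single source of truth for both acquisition paths.
function sessionLockOptions(): SessionLockOptions {
  return {
    stale: 1_800_000, // 30 minutes; the retry path previously used 300_000
    onCompromised: () => {
      // Record the condition instead of throwing from a timer callback.
      _lockCompromised = true;
    },
  };
}

function isLockCompromised(): boolean {
  return _lockCompromised;
}

const primaryOptions = sessionLockOptions();
const retryOptions = sessionLockOptions();
```

Both option objects would then be passed to the lock call on their respective paths, so a future change to the threshold applies to both.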

Also: reporter is on Node v25.6.1 which is unsupported — GSD requires
Node >=22.0.0 with 24 LTS recommended.

Fixes #1304
2026-03-18 19:15:47 -06:00
Tom Boucher
150575957d fix: skip symlinks in makeTreeWritable to prevent EPERM on NixOS/nix-darwin (#1303)
makeTreeWritable used statSync which follows symlinks. On NixOS and
nix-darwin, ~/.gsd/agent/bin/ contains symlinks to the immutable Nix
store (/run/current-system/sw/bin/). Attempting to chmod those targets
crashed GSD on startup with EPERM.

Changes:
- Use lstatSync instead of statSync — detects symlinks without
  following them
- Skip symlinks entirely (they don't carry their own permissions, and
  their targets may be immutable)
- Added try/catch around chmodSync as safety net for any remaining
  permission errors on unusual filesystems
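
The three changes above can be sketched as follows. This is an illustration under assumptions: the real makeTreeWritable may set different mode bits or walk the tree differently; only the lstatSync check, the symlink skip, and the guarded chmodSync come from the commit message.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Recursively add the owner-write bit, never following symlinks.
function makeTreeWritable(target: string): void {
  // lstatSync describes the link itself; statSync would follow it.
  const info = fs.lstatSync(target);
  if (info.isSymbolicLink()) return; // target may live in an immutable store

  try {
    fs.chmodSync(target, (info.mode & 0o7777) | 0o200);
  } catch {
    // Safety net: ignore permission errors on unusual filesystems.
  }

  if (info.isDirectory()) {
    for (const entry of fs.readdirSync(target)) {
      makeTreeWritable(path.join(target, entry));
    }
  }
}
```

Because the symlink check happens before any chmod or directory read, a link into a read-only store is never touched, and a dangling link does not throw.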

Secondary analysis: rmSync with force:true already handles symlinks
correctly (removes the link, not the target). cpSync with force:true
replaces symlinks with regular files (desired behavior for resource
sync).

Fixes #1298
2026-03-18 19:15:33 -06:00
TÂCHES
2a2056bcd7 refactor: extract getErrorMessage() helper to eliminate 65 inline duplicates (#1280)
Consolidate the repeated `err instanceof Error ? err.message : String(err)`
pattern into a single `getErrorMessage(err)` utility. Reduces visual noise in
catch blocks across 20 files in the GSD extension.
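
The utility amounts to wrapping the quoted expression in a function. A minimal sketch follows; the describeFailure caller is a hypothetical example, not code from the extension.

```typescript
// Normalize any thrown value to a printable message.
function getErrorMessage(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Hypothetical call site: the catch block no longer repeats the ternary.
function describeFailure(action: () => void): string | null {
  try {
    action();
    return null;
  } catch (err) {
    return getErrorMessage(err);
  }
}
```

Typing the parameter as unknown matters because JavaScript allows throwing any value, not only Error instances.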

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 19:12:44 -06:00
TÂCHES
922826ba8a refactor: consolidate DB-fallback inline functions in auto-prompts (#1276)
* refactor: consolidate DB-fallback inline functions in auto-prompts

Extract shared inlineFromDbOrFile() helper that encapsulates the
repeated pattern of checking DB availability, dynamically importing
context-store, running a query, formatting results, and falling back
to the filesystem. The three public functions (inlineDecisionsFromDb,
inlineRequirementsFromDb, inlineProjectFromDb) become thin wrappers
that pass only the differing query/format logic as a callback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update source-level test to match refactored DB-fallback function name

The context-compression test greps auto-prompts.ts source for
`inlineGsdRootFile(base, "project.md"` which was replaced by
`inlineProjectFromDb(base)` in the consolidation refactor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 19:11:01 -06:00
Tom Boucher
3ae1d54759 fix: handle Windows EPERM on .gsd migration rename with copy+delete fallback (#1296) 2026-03-18 18:57:06 -06:00
Tom Boucher
5660100c66 fix: add actionable recovery guidance to crash info messages (#1295) 2026-03-18 18:56:55 -06:00
Tom Boucher
5351e776d9 fix: resolve main repo root in worktrees for stable identity hash (#1294) 2026-03-18 18:56:43 -06:00
Tom Boucher
f3cf03c067 fix: merge quick-task branch back to original after completion (#1293) 2026-03-18 18:56:30 -06:00
Jeremy McSpadden
5e7f42c686 feat: always-on health widget and visualizer health tab expansion (#1286) 2026-03-18 18:04:21 -06:00
Jeremy McSpadden
652ac385b7 feat: environment health checks, progress score, and status integration (#1263) 2026-03-18 18:04:14 -06:00
Juan Francisco Lebrero
d4d8d1a81e fix: skip crash recovery when auto.lock was written by current process (#1289) 2026-03-18 18:03:28 -06:00
Juan Francisco Lebrero
a0de87929b fix: load worktree-cli extension modules via jiti instead of static ESM imports (#1285) 2026-03-18 17:51:27 -06:00
Jeremy McSpadden
f6c980db81 fix(gsd): prevent concurrent dispatch during skip chains (#1272) (#1283) 2026-03-18 17:27:05 -06:00
TÂCHES
2f729fe051 refactor: deduplicate knownUnitTypes and STATE_REBUILD_MIN_INTERVAL_MS constants (#1281) 2026-03-18 17:26:06 -06:00
TÂCHES
0141ea21af refactor: extract prompt builder helpers for inlined context and source file lists (#1279) 2026-03-18 17:25:38 -06:00
TÂCHES
1e4ea36f1f refactor: extract createGitService() factory, remove debug logs (#1278) 2026-03-18 17:23:26 -06:00
TÂCHES
a647a2fcb6 refactor: extract dispatchUnit helper, inline dead buildDocsCommitInstruction (#1275) 2026-03-18 17:22:54 -06:00
TÂCHES
0b4510398d refactor: unify unit-type switch statements into lookup map (#1273) 2026-03-18 17:22:41 -06:00
TÂCHES
84b556908c fix: skip non-artifact UAT dispatch in auto-mode (#1277)
* fix(gsd): skip non-artifact UAT dispatch in auto-mode

Non-artifact-driven UATs (human-experience, live-runtime, mixed) were
dispatched only to write a "surfaced-for-human-review" verdict, which
then blocked the verdict gate and killed auto-mode progression. Auto
now only dispatches artifact-driven UATs it can actually execute.

- checkNeedsRunUat returns null for non-artifact-driven UAT types
- Remove pauseAfterDispatch flag (always artifact-driven now)
- Strip human-review template path from run-uat prompt
- Remove dead pause-after-UAT logic from auto.ts
- Add test for non-artifact UAT skip + stale replay guard

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update buildRunUatPrompt call in direct dispatch after signature change

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 17:22:20 -06:00
TÂCHES
ce550b2423 fix(gsd): stop replaying completed run-uat units (#1270) 2026-03-18 16:24:50 -06:00
jpmarques19
ad9d7a1815 fix: prevent false-positive 'Session lock lost' during auto-mode (#1257)
- Add onCompromised handler to prevent uncaught throw in setTimeout
- Increase stale threshold from 5min to 30min for laptop sleep safety
- Release OS lock explicitly in SIGTERM handler
2026-03-18 15:44:14 -06:00
Lex Christopherson
69efb57355 fix: remove stale git-commit assertion in worktree test after commit_docs removal
The test asserted that captureIntegrationBranch commits metadata to git,
but #1258 intentionally stopped committing .gsd/ artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:26:04 -06:00
Lex Christopherson
d4d3938e94 fix: remove commit_docs test that broke CI after type removal (#1258)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:18:01 -06:00
TÂCHES
981b2e4b17 refactor: remove dead commit_docs preference (incompatible with external .gsd/ state) (#1258)
PR #1242 moved .gsd/ state to ~/.gsd/projects/<hash>/ with a symlink.
Git refuses to track files through symlinks, making commit_docs: true
fundamentally broken. Remove the preference and all conditional logic:

- .gsd/ is always gitignored (blanket ignore, no runtime-pattern approach)
- smartStage() always excludes .gsd/ from commits
- Prompt builders always say "do not commit planning artifacts"
- writeIntegrationBranch() writes metadata to disk without committing
- Init wizard no longer asks about commit_docs or bootstrap-commits .gsd/
- Validation emits a deprecation warning if commit_docs is still set

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:12:16 -06:00
Jean-Dominique Stepek
acec5b5fda feat: add aws-auth extension for automatic Bedrock credential refresh (#1253)
Adds a new bundled extension that proactively checks and refreshes AWS
credentials for Bedrock model users.

Startup (session_start):
- Runs 'aws sts get-caller-identity' with the profile extracted from
  the configured awsAuthRefresh command
- If credentials are expired, runs the refresh command (e.g. aws sso login)
  before the user sends their first prompt
- Shows 'AWS Bedrock login confirmed ✓' when credentials are valid

Mid-session (before_provider_request):
- Re-verifies credentials every 15 minutes before Bedrock API calls
- Catches credential expiry during long sessions without needing retry logic
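
The 15-minute re-verification can be pictured as a simple time gate in front of each provider request. This sketch only shows the timing decision; the function name and the idea of passing timestamps in explicitly are assumptions, and the actual credential check (aws sts get-caller-identity) is not reproduced here.

```typescript
// Interval taken from the commit message.
const RECHECK_INTERVAL_MS = 15 * 60 * 1000;

// Returns true when credentials have never been checked in this session
// or the last check is at least 15 minutes old.
function shouldRecheckCredentials(lastCheckedAt: number | null, now: number): boolean {
  return lastCheckedAt === null || now - lastCheckedAt >= RECHECK_INTERVAL_MS;
}
```

Passing the current time as an argument, instead of calling Date.now() inside, keeps the gate deterministic under test.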

Zero changes to base files — the entire feature is a single extension file.
Only activates when awsAuthRefresh is set in settings.json and the current
model uses bedrock-converse-stream.
2026-03-18 15:07:10 -06:00
Tom Boucher
a1168cba13 fix: replace blanket git clean .gsd/ with targeted runtime file removal (#1252)
The git clean -fd .gsd/ added in #1239 was too aggressive — it could
remove untracked milestone and planning files on projects where .gsd/
isn't fully gitignored (e.g., manage_gitignore: false).

Replaced with explicit removal of only runtime state files:
- STATE.md, completed-units.json, auto.lock, gsd.db
- .gsd/runtime/ directory

Milestone directories, DECISIONS.md, REQUIREMENTS.md, PROJECT.md and
all other planning artifacts are never touched.

Fixes #1250
2026-03-18 15:05:30 -06:00
Tom Boucher
4cb1374755 fix: invalidate caches inside discuss loop to detect newly written slice context (#1249)
After discussing a slice, the LLM writes S0x-CONTEXT.md. The discuss
loop re-evaluates but hits stale parse caches, showing the slice as
'not discussed' even though the context file exists on disk.

Added invalidateAllCaches() at the top of each loop iteration.

Fixes #1244
2026-03-18 15:03:27 -06:00
Tom Boucher
7e2ca68161 fix: robust prose slice header parsing — handle H1-H4, bold, dots, no-separator variants (#1248)
The prose fallback parser only matched H2 (## S01:) headers with
colon/dash separators. LLMs produce many variants that silently
produced 0 slices, permanently blocking auto-mode.

Expanded the regex from:
  /^##\s+(?:Slice\s+)?(S\d+)[:\s—–-]+\s*(.+)/gm
to:
  /^#{1,4}\s+\*{0,2}(?:Slice\s+)?(S\d+)\*{0,2}[:\s.—–-]*\s*(.+)/gm

Now handles:
  - H1 through H4 headers (# ## ### ####)
  - Bold-wrapped: **S01: Title**, **S01**: Title
  - Dot separator: S01. Title
  - Space-only separator: S01 Title (no punctuation)
  - Non-zero-padded IDs: S1, S01, S001
  - No-space after colon: S01:Title
  - All previous separators: colon, hyphen, em dash, en dash

Also strips trailing bold markers from titles and skips matches
with empty titles.
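
To see the expanded pattern at work, the sketch below applies it to a small roadmap. The regex is the one quoted above, with the em dash and en dash written as \u2014 and \u2013 escapes; the parseProseSliceHeaders name, the SliceHeader type, and the exact title clean-up are assumptions about how the parser might be wired.

```typescript
interface SliceHeader {
  id: string;
  title: string;
}

function parseProseSliceHeaders(markdown: string): SliceHeader[] {
  // A fresh literal per call avoids shared lastIndex state on the /g regex.
  const header =
    /^#{1,4}\s+\*{0,2}(?:Slice\s+)?(S\d+)\*{0,2}[:\s.\u2014\u2013-]*\s*(.+)/gm;
  const slices: SliceHeader[] = [];
  let match: RegExpExecArray | null;
  while ((match = header.exec(markdown)) !== null) {
    const id = match[1] ?? "";
    // Strip trailing bold markers, then drop headers left with no title.
    const title = (match[2] ?? "").replace(/\*+\s*$/, "").trim();
    if (title.length === 0) continue;
    slices.push({ id, title });
  }
  return slices;
}

const sampleRoadmap = [
  "# Roadmap",
  "",
  "## S01: Core types",
  "### **S02: Auth flow**",
  "#### S3. Deploy",
  "## Slice S04 Docs",
  "## Notes",
].join("\n");

const parsedSlices = parseProseSliceHeaders(sampleRoadmap);
```

On this sample the sketch yields four slices (S01, S02, S3, S04), while "# Roadmap" and "## Notes" are ignored because neither contains an S-number.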

Fixes #1243
2026-03-18 15:02:51 -06:00
Tom Boucher
de1256d352 fix: clean up stranded .gsd.lock/ directory to prevent false lock conflicts (#1251)
* fix: clean up stranded .gsd.lock/ directory to prevent false lock conflicts

Three fixes for stranded proper-lockfile lock directories:

1. releaseSessionLock: explicitly removes .gsd.lock/ after releasing
   the OS lock and deleting auto.lock

2. acquireSessionLock: when lock acquisition fails, checks if auto.lock
   is missing or the owning PID is dead. If so, removes the stale
   .gsd.lock/ dir and retries acquisition instead of failing.

3. process.on('exit') handler: registered at lock acquisition time as
   a safety net — cleans up .gsd.lock/ on normal process exit if
   releaseSessionLock wasn't called.

Fixes #1245

* fix: move gsdDir declaration before try/catch to fix TS2304 scope error

gsdDir was declared inside the try block but referenced in the catch
block's retry logic, causing 'Cannot find name gsdDir' build failures.
2026-03-18 15:02:37 -06:00
TÂCHES
4d9aef5705 Revert "fix: Two-column dashboard layout with task checklist (#1195)" (#1254)
This reverts commit d3780c9bdb.
2026-03-18 15:02:01 -06:00
TÂCHES
35cee7b05f feat: add -w/--worktree CLI flag for isolated worktree sessions (#1247)
* feat: add -w/--worktree CLI flag to start in an isolated worktree

Enables `gsd -w` to auto-create a randomly-named worktree (adjective-verbing-noun
pattern) and `gsd -w my-feature` for named worktrees. Reuses existing worktree
infrastructure under .gsd/worktrees/ with worktree/<name> branches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: full worktree lifecycle — subcommands, auto-commit on exit, status banners

Major improvements to the -w/--worktree system:

- `gsd worktree list` — show worktrees with status (files changed, commits, dirty)
- `gsd worktree merge [name]` — squash-merge into main and clean up
- `gsd worktree clean` — remove all merged/empty worktrees
- `gsd worktree remove <name>` — remove with --force safety gate
- `gsd -w` (no name) resumes the only active worktree instead of creating a new one
- `gsd -w` with multiple active worktrees shows a picker
- Auto-commit dirty work on session exit (session_shutdown hook)
- Status banner on normal `gsd` launch when unmerged worktrees exist
- Full help text with lifecycle documentation (`gsd worktree --help`)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:57:25 -06:00
TÂCHES
92f0b15268 feat: add extension manifest + registry for user-managed enable/disable (#1238)
Every extension gets a declarative extension-manifest.json (id, tier,
provides, dependencies). A persistent registry at ~/.gsd/extensions/registry.json
tracks enabled/disabled state. `gsd extensions` command family (list, enable,
disable, info) lets users manage extensions without touching source code.

Registry gate filters disabled extensions in loader.ts and resource-loader.ts
before paths reach loadExtensions(). Zero breakage: extensions without manifests
default to enabled, fresh installs have an empty registry.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:12:19 -06:00
TÂCHES
3102831db9 refactor: move .gsd/ to external state directory with symlink (ADR-002) (#1242)
Move mutable .gsd/ state from inside the project directory to
~/.gsd/projects/<repo-hash>/, replacing it with a symlink. All worktrees
share the same external state — eliminating the entire bidirectional
sync layer (~370 lines) that was the source of 15+ bug fixes.

Key changes:
- repo-identity.ts: repoIdentity(), externalGsdRoot(), ensureGsdSymlink()
- gsdRoot() resolves through symlinks via realpathSync
- migrate-external.ts: automatic migration with atomic rollback
- resource-version.ts: kept utilities from deleted sync module
- Worktree detection uses git metadata (.git file) instead of path parsing
- gitignore simplified to single .gsd entry
- Doctor checks for failed_migration and broken_symlink

Deleted: auto-worktree-sync.ts, copyWorktreeDb, reconcileWorktreeDb,
reconcilePlanCheckboxes, copyPlanningArtifacts, dual state derivation.

Net: -1271 lines across 38 files. 0 sync ops per dispatch cycle.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:05:10 -06:00
Jeremy McSpadden
6e727092ff fix: align react-best-practices skill name with directory name (#1234)
The `name` field in `src/resources/skills/react-best-practices/SKILL.md`
was set to `vercel-react-best-practices`, which does not match the parent
directory `react-best-practices`. This triggers a validation warning on
startup: 'name "vercel-react-best-practices" does not match parent
directory "react-best-practices"'.

Updated the name field to `react-best-practices` to match the directory.
2026-03-18 14:04:23 -06:00
Tom Boucher
1947e8e8e3 fix: gate slice progression on UAT verdict, not just file existence (#1241)
Two changes:

1. checkNeedsRunUat now reads the verdict from UAT-RESULT. Only skips
   re-running UAT when verdict is PASS/passed. Non-PASS verdicts (FAIL,
   surfaced-for-human-review) no longer silently advance.

2. Added uat-verdict-gate dispatch rule between run-uat and
   reassess-roadmap. When uat_dispatch is enabled, scans all completed
   slices for non-PASS UAT verdicts and stops auto-mode if found.
   This prevents advancing to the next slice when a UAT failed or
   needs human review.
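
Both changes reduce to one question: is a recorded verdict a pass? A minimal sketch follows, assuming the verdict has already been read from UAT-RESULT as a string. All names here are hypothetical, the case-insensitive comparison is an assumption, and treating slices with no verdict as non-blocking is also an assumption.

```typescript
// null means no UAT-RESULT verdict was found for the slice.
type Verdict = string | null;

function isPassingVerdict(verdict: Verdict): boolean {
  if (verdict === null) return false;
  const normalized = verdict.trim().toLowerCase();
  return normalized === "pass" || normalized === "passed";
}

// Change 1: UAT is skipped only when a passing verdict is on record.
function uatStillNeeded(verdict: Verdict): boolean {
  return !isPassingVerdict(verdict);
}

interface CompletedSlice {
  id: string;
  verdict: Verdict;
}

// Change 2: the gate collects completed slices whose recorded verdict
// is anything other than a pass; a non-empty result stops auto-mode.
function findBlockingSlices(slices: CompletedSlice[]): string[] {
  return slices
    .filter((s) => s.verdict !== null && !isPassingVerdict(s.verdict))
    .map((s) => s.id);
}
```

Sharing isPassingVerdict between the two checks keeps the skip rule and the gate from disagreeing about what counts as a pass.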

Fixes #1231
2026-03-18 14:00:36 -06:00