1426 commits

Author SHA1 Message Date
Jeremy McSpadden
fc56cdf93e fix: handle ECOMPROMISED in uncaughtException guard and align retry onCompromised (#1322) (#1332)
When a GSD session crashes hard (SIGKILL, OOM, etc.) without running its
exit handler, the proper-lockfile OS lock directory (.gsd.lock/) is left
stranded. On the next /gsd auto resume, acquireSessionLock detects the dead
PID, cleans up the stale directory, and re-acquires via the retry path.

Ten seconds later, proper-lockfile's update timer fires. Due to a subtle
interaction between the synchronous fs adapter (lockSync / toSyncOptions)
and the setTimeout boundary in Node.js v25+, the ECOMPROMISED error
propagates up through the synchronous callback chain and becomes an
uncaught exception, even though the onCompromised callback only sets
_lockCompromised = true and does not throw.

The _gsdEpipeGuard uncaughtException handler only handled EPIPE, so it
re-threw ECOMPROMISED, crashing the process. Each crash wrote a new
"interrupted session" record, causing an infinite crash loop on resume.

Two fixes:

1. index.ts: Handle ECOMPROMISED in _gsdEpipeGuard. Exit with code 1
   (non-zero to signal failure) so the process.once("exit") handler runs
   and removes the lock directory, allowing the next session to start clean.

2. session-lock.ts: The retry path's onCompromised was missing
   `_releaseFunction = null`, unlike the primary path. This left the
   release function pointer live after compromise, causing validateSessionLock
   to return true and preventing graceful stop detection. Now matches primary.
2026-03-18 22:06:03 -06:00
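The guard change described in fc56cdf93e could look roughly like the sketch below. It is not the project's index.ts: the names `classifyUncaught` and `installGuard` are hypothetical, and treating EPIPE as "swallow and continue" is an assumption, since the message only says the old guard handled EPIPE and re-threw everything else.

```typescript
// Hypothetical sketch of an uncaughtException guard that treats
// ECOMPROMISED as a controlled failure instead of re-throwing it.
type GuardAction = "ignore" | "exit" | "rethrow";

function classifyUncaught(err: unknown): GuardAction {
  const code =
    typeof err === "object" && err !== null
      ? (err as { code?: unknown }).code
      : undefined;
  if (code === "EPIPE") return "ignore"; // assumption: broken pipes are swallowed
  if (code === "ECOMPROMISED") return "exit"; // lock compromised: stop cleanly
  return "rethrow";
}

function installGuard(): void {
  process.on("uncaughtException", (err: Error) => {
    const action = classifyUncaught(err);
    if (action === "ignore") return;
    if (action === "exit") {
      // process.exit() emits the "exit" event, so cleanup registered with
      // process.once("exit", ...) still runs and can remove the lock directory.
      process.exit(1);
    }
    throw err; // anything else keeps the default crash behaviour
  });
}
```

Keeping the classification separate from the handler makes the decision testable without actually crashing or exiting a process.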
github-actions[bot]
d25c174f8b release: v2.33.1 2026-03-19 04:02:21 +00:00
Jeremy McSpadden
15a8807eb3 fix: clean up stale numbered lock files and harden signal/exit handling (#1315) (#1323) 2026-03-18 21:15:47 -06:00
Tom Boucher
7537e30815 fix: worktree sync and home-directory safety check (#1311, #1317) (#1322) 2026-03-18 21:15:36 -06:00
Tom Boucher
0e4db4b709 docs: update README and docs for v2.33.0 release (#1320) 2026-03-18 21:15:11 -06:00
Tom Boucher
68e0672dda test: add regression harness for auto-mode dispatch loop (125 assertions) (#1319) 2026-03-18 21:14:59 -06:00
Jeremy McSpadden
805c7718c4 chore: remove orphaned mcporter extension manifest (#1318) 2026-03-18 21:14:50 -06:00
github-actions[bot]
106f5d8d32 release: v2.33.0 2026-03-19 02:40:54 +00:00
Tom Boucher
6b61b75f3d feat: add live regression test harness for post-build pipeline validation (#1316)
10 tests that run against the installed gsd binary after npm publish:

1. headless query returns valid JSON
2. Empty project → pre-planning phase
3. Milestone with roadmap → planning phase
4. All tasks done → summarizing phase
5. Complete milestone → complete phase
6. Stale auto.lock doesn't block --version
7. Crash recovery query works with stale lock
8. Non-TTY exits quickly with clean error
9. Version skew detected before TTY check
10. --help works (native addon loads or falls back)

Wired into pipeline.yml test-verify job after fixture tests
and before @next promotion.

These catch the state machine / infrastructure bugs from #1308
that unit tests can't reach — they exercise deriveState through
the real gsd binary with real .gsd/ directory structures.

Part of #1308
2026-03-18 20:22:54 -06:00
Tom Boucher
0418458cf9 refactor: extract tryMergeMilestone to eliminate 4 duplicate merge paths in auto.ts (#1314) 2026-03-18 20:04:10 -06:00
Tom Boucher
583e84e932 refactor: dispatch loop hardening — defensive guards, regression tests, lock alignment (#1310) 2026-03-18 20:03:59 -06:00
TÂCHES
e6ab3b6722 refactor: extract parseUnitId() to centralize unit ID parsing (#1282)
Replaces 30+ inline `unitId.split("/")` + destructuring patterns across
16 production files with a single `parseUnitId()` helper that returns
`{ milestone, slice?, task? }`. If the unit ID format ever changes,
only one function needs updating.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 19:20:08 -06:00
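A minimal sketch of what a helper like `parseUnitId()` from e6ab3b6722 might look like, based only on the `unitId.split("/")` pattern and the `{ milestone, slice?, task? }` return shape named in the message. The `ParsedUnitId` type name and the sample IDs (`M001/S01/T03`) are assumptions for illustration.

```typescript
// Hypothetical shape; the commit message only names the three fields.
interface ParsedUnitId {
  milestone: string;
  slice?: string;
  task?: string;
}

function parseUnitId(unitId: string): ParsedUnitId {
  const [milestone = "", slice, task] = unitId.split("/");
  const parsed: ParsedUnitId = { milestone };
  // Only set the optional fields when the ID actually contains them.
  if (slice !== undefined) parsed.slice = slice;
  if (task !== undefined) parsed.task = task;
  return parsed;
}

// Usage with made-up IDs:
const taskUnit = parseUnitId("M001/S01/T03");
const milestoneUnit = parseUnitId("M001");
```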
Tom Boucher
afb438164e fix: align retry lock path with primary lock settings to prevent ECOMPROMISED (#1307)
The retry lock acquisition path (from stale lock recovery in #1251)
used a 5-minute stale threshold and no onCompromised handler, while
the primary path used 30 minutes and a graceful flag-based handler.

This mismatch meant locks acquired via the retry path would throw
ECOMPROMISED (uncaught, crashing the process) if the event loop stalled
for more than 5 minutes, which happens during long LLM operations.

Fixed:
- Stale timeout: 300_000 → 1_800_000 (matches primary)
- Added onCompromised handler (sets _lockCompromised flag)
- Added process.on('exit') safety net (matches primary)

Also: the reporter is on Node v25.6.1, which is unsupported. GSD requires
Node >=22.0.0, with 24 LTS recommended.

Fixes #1304
2026-03-18 19:15:47 -06:00
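One way to keep the primary and retry paths from drifting apart again is to build both option objects from a single factory. The sketch below assumes that approach; it is not how session-lock.ts is known to be written. `stale` and `onCompromised` are real proper-lockfile options, while `buildSessionLockOptions`, `isLockCompromised`, and the module-level flag are hypothetical names.

```typescript
// Subset of proper-lockfile's lock options used by this sketch.
interface SessionLockOptions {
  stale: number;
  onCompromised: (err: Error) => void;
}

const SESSION_LOCK_STALE_MS = 1_800_000; // 30 minutes, per the commit message

let lockCompromised = false;

function buildSessionLockOptions(): SessionLockOptions {
  return {
    stale: SESSION_LOCK_STALE_MS,
    // Record the compromise instead of throwing from the timer callback.
    onCompromised: () => {
      lockCompromised = true;
    },
  };
}

function isLockCompromised(): boolean {
  return lockCompromised;
}

// Both acquisition paths get identical settings by construction.
const primaryOptions = buildSessionLockOptions();
const retryOptions = buildSessionLockOptions();
```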
Tom Boucher
150575957d fix: skip symlinks in makeTreeWritable to prevent EPERM on NixOS/nix-darwin (#1303)
makeTreeWritable used statSync which follows symlinks. On NixOS and
nix-darwin, ~/.gsd/agent/bin/ contains symlinks to the immutable Nix
store (/run/current-system/sw/bin/). Attempting to chmod those targets
crashed GSD on startup with EPERM.

Changes:
- Use lstatSync instead of statSync — detects symlinks without
  following them
- Skip symlinks entirely (they don't carry their own permissions, and
  their targets may be immutable)
- Added try/catch around chmodSync as safety net for any remaining
  permission errors on unusual filesystems

Secondary analysis: rmSync with force:true already handles symlinks
correctly (removes the link, not the target). cpSync with force:true
replaces symlinks with regular files (desired behavior for resource
sync).

Fixes #1298
2026-03-18 19:15:33 -06:00
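The three changes listed in 150575957d can be sketched as follows. This is an illustrative re-creation, not the project's `makeTreeWritable`: adding only the owner write bit and the recursion shape are assumptions, and the demo tree at the end is made up.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical symlink-safe walk; the real function may set different bits.
function makeTreeWritable(target: string): void {
  const stats = fs.lstatSync(target); // lstat reports on the link itself
  if (stats.isSymbolicLink()) return; // never chmod through a link
  try {
    fs.chmodSync(target, (stats.mode & 0o7777) | 0o200); // add owner write bit
  } catch {
    // Safety net: ignore permission errors on unusual filesystems.
  }
  if (stats.isDirectory()) {
    for (const entry of fs.readdirSync(target)) {
      makeTreeWritable(path.join(target, entry));
    }
  }
}

// Usage on a throwaway tree: one read-only file plus a symlink to it.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "writable-demo-"));
const demoFile = path.join(demoRoot, "tool");
fs.writeFileSync(demoFile, "x");
fs.chmodSync(demoFile, 0o444);
fs.symlinkSync(demoFile, path.join(demoRoot, "link"));
makeTreeWritable(demoRoot);
```

With `statSync`, the entry named `link` would have been reported as a regular file and its target chmod-ed, which is the step that fails when the target lives in an immutable store.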
TÂCHES
2a2056bcd7 refactor: extract getErrorMessage() helper to eliminate 65 inline duplicates (#1280)
Consolidate the repeated `err instanceof Error ? err.message : String(err)`
pattern into a single `getErrorMessage(err)` utility. Reduces visual noise in
catch blocks across 20 files in the GSD extension.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 19:12:44 -06:00
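For reference, a helper matching the description in 2a2056bcd7 would be about this small. The body is the exact expression quoted in the message; the `describeFailure` caller is a hypothetical usage example.

```typescript
// Wraps the pattern quoted in the commit message.
function getErrorMessage(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Hypothetical caller showing the catch-block usage.
function describeFailure(action: () => void): string {
  try {
    action();
    return "ok";
  } catch (err) {
    return `failed: ${getErrorMessage(err)}`;
  }
}
```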
TÂCHES
922826ba8a refactor: consolidate DB-fallback inline functions in auto-prompts (#1276)
* refactor: consolidate DB-fallback inline functions in auto-prompts

Extract shared inlineFromDbOrFile() helper that encapsulates the
repeated pattern of checking DB availability, dynamically importing
context-store, running a query, formatting results, and falling back
to the filesystem. The three public functions (inlineDecisionsFromDb,
inlineRequirementsFromDb, inlineProjectFromDb) become thin wrappers
that pass only the differing query/format logic as a callback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update source-level test to match refactored DB-fallback function name

The context-compression test greps auto-prompts.ts source for
`inlineGsdRootFile(base, "project.md"` which was replaced by
`inlineProjectFromDb(base)` in the consolidation refactor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 19:11:01 -06:00
Tom Boucher
8f06a14fb6 docs: update README for v2.32.0 release (#1299) 2026-03-18 18:57:29 -06:00
Tom Boucher
3ae1d54759 fix: handle Windows EPERM on .gsd migration rename with copy+delete fallback (#1296) 2026-03-18 18:57:06 -06:00
Tom Boucher
5660100c66 fix: add actionable recovery guidance to crash info messages (#1295) 2026-03-18 18:56:55 -06:00
Tom Boucher
5351e776d9 fix: resolve main repo root in worktrees for stable identity hash (#1294) 2026-03-18 18:56:43 -06:00
Tom Boucher
f3cf03c067 fix: merge quick-task branch back to original after completion (#1293) 2026-03-18 18:56:30 -06:00
github-actions[bot]
113c5e6518 release: v2.32.0 2026-03-19 00:15:12 +00:00
Jeremy McSpadden
5e7f42c686 feat: always-on health widget and visualizer health tab expansion (#1286) 2026-03-18 18:04:21 -06:00
Jeremy McSpadden
652ac385b7 feat: environment health checks, progress score, and status integration (#1263) 2026-03-18 18:04:14 -06:00
Juan Francisco Lebrero
d4d8d1a81e fix: skip crash recovery when auto.lock was written by current process (#1289) 2026-03-18 18:03:28 -06:00
Juan Francisco Lebrero
a0de87929b fix: load worktree-cli extension modules via jiti instead of static ESM imports (#1285) 2026-03-18 17:51:27 -06:00
Jeremy McSpadden
f6c980db81 fix(gsd): prevent concurrent dispatch during skip chains (#1272) (#1283) 2026-03-18 17:27:05 -06:00
TÂCHES
2f729fe051 refactor: deduplicate knownUnitTypes and STATE_REBUILD_MIN_INTERVAL_MS constants (#1281) 2026-03-18 17:26:06 -06:00
TÂCHES
0141ea21af refactor: extract prompt builder helpers for inlined context and source file lists (#1279) 2026-03-18 17:25:38 -06:00
TÂCHES
1e4ea36f1f refactor: extract createGitService() factory, remove debug logs (#1278) 2026-03-18 17:23:26 -06:00
TÂCHES
a647a2fcb6 refactor: extract dispatchUnit helper, inline dead buildDocsCommitInstruction (#1275) 2026-03-18 17:22:54 -06:00
TÂCHES
0b4510398d refactor: unify unit-type switch statements into lookup map (#1273) 2026-03-18 17:22:41 -06:00
TÂCHES
84b556908c fix: skip non-artifact UAT dispatch in auto-mode (#1277)
* fix(gsd): skip non-artifact UAT dispatch in auto-mode

Non-artifact-driven UATs (human-experience, live-runtime, mixed) were
dispatched only to write a "surfaced-for-human-review" verdict, which
then blocked the verdict gate and killed auto-mode progression. Auto
now only dispatches artifact-driven UATs it can actually execute.

- checkNeedsRunUat returns null for non-artifact-driven UAT types
- Remove pauseAfterDispatch flag (always artifact-driven now)
- Strip human-review template path from run-uat prompt
- Remove dead pause-after-UAT logic from auto.ts
- Add test for non-artifact UAT skip + stale replay guard

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update buildRunUatPrompt call in direct dispatch after signature change

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 17:22:20 -06:00
github-actions[bot]
a488de99bb release: v2.31.2 2026-03-18 22:40:20 +00:00
TÂCHES
ce550b2423 fix(gsd): stop replaying completed run-uat units (#1270) 2026-03-18 16:24:50 -06:00
github-actions[bot]
f2b637a596 release: v2.31.1 2026-03-18 22:13:47 +00:00
jpmarques19
ad9d7a1815 fix: prevent false-positive 'Session lock lost' during auto-mode (#1257)
- Add onCompromised handler to prevent uncaught throw in setTimeout
- Increase stale threshold from 5min to 30min for laptop sleep safety
- Release OS lock explicitly in SIGTERM handler
2026-03-18 15:44:14 -06:00
github-actions[bot]
b095e352a7 release: v2.31.0 2026-03-18 21:40:31 +00:00
Lex Christopherson
69efb57355 fix: remove stale git-commit assertion in worktree test after commit_docs removal
The test asserted that captureIntegrationBranch commits metadata to git,
but #1258 intentionally stopped committing .gsd/ artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:26:04 -06:00
Lex Christopherson
d4d3938e94 fix: remove commit_docs test that broke CI after type removal (#1258)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:18:01 -06:00
TÂCHES
981b2e4b17 refactor: remove dead commit_docs preference (incompatible with external .gsd/ state) (#1258)
PR #1242 moved .gsd/ state to ~/.gsd/projects/<hash>/ with a symlink.
Git refuses to track files through symlinks, making commit_docs: true
fundamentally broken. Remove the preference and all conditional logic:

- .gsd/ is always gitignored (blanket ignore, no runtime-pattern approach)
- smartStage() always excludes .gsd/ from commits
- Prompt builders always say "do not commit planning artifacts"
- writeIntegrationBranch() writes metadata to disk without committing
- Init wizard no longer asks about commit_docs or bootstrap-commits .gsd/
- Validation emits a deprecation warning if commit_docs is still set

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:12:16 -06:00
Jean-Dominique Stepek
acec5b5fda feat: add aws-auth extension for automatic Bedrock credential refresh (#1253)
Adds a new bundled extension that proactively checks and refreshes AWS
credentials for Bedrock model users.

Startup (session_start):
- Runs 'aws sts get-caller-identity' with the profile extracted from
  the configured awsAuthRefresh command
- If credentials are expired, runs the refresh command (e.g. aws sso login)
  before the user sends their first prompt
- Shows 'AWS Bedrock login confirmed ✓' when credentials are valid

Mid-session (before_provider_request):
- Re-verifies credentials every 15 minutes before Bedrock API calls
- Catches credential expiry during long sessions without needing retry logic

Zero changes to base files — the entire feature is a single extension file.
Only activates when awsAuthRefresh is set in settings.json and the current
model uses bedrock-converse-stream.
2026-03-18 15:07:10 -06:00
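The mid-session behaviour in acec5b5fda (re-verify at most every 15 minutes, refresh only when invalid) can be sketched as a small time-gated check. Everything here is hypothetical: `createCredentialGate` and its callbacks are illustrative names, and in the real extension the validity check would shell out to `aws sts get-caller-identity` and the refresh would run the configured awsAuthRefresh command.

```typescript
const RECHECK_INTERVAL_MS = 15 * 60 * 1000; // 15 minutes, per the commit message

interface CredentialGate {
  ensureFresh(): void;
}

// isValid and refresh are injected so the sketch stays free of AWS calls.
function createCredentialGate(
  isValid: () => boolean,
  refresh: () => void,
  now: () => number = Date.now,
): CredentialGate {
  let lastCheckedAt: number | null = null;
  return {
    ensureFresh(): void {
      const current = now();
      // Skip the check if the previous one is still recent enough.
      if (lastCheckedAt !== null && current - lastCheckedAt < RECHECK_INTERVAL_MS) {
        return;
      }
      lastCheckedAt = current;
      if (!isValid()) refresh();
    },
  };
}
```

Calling `ensureFresh()` before each provider request keeps the cost of the check bounded while still catching expiry during long sessions.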
Tom Boucher
a1168cba13 fix: replace blanket git clean .gsd/ with targeted runtime file removal (#1252)
The git clean -fd .gsd/ added in #1239 was too aggressive — it could
remove untracked milestone and planning files on projects where .gsd/
isn't fully gitignored (e.g., manage_gitignore: false).

Replaced with explicit removal of only runtime state files:
- STATE.md, completed-units.json, auto.lock, gsd.db
- .gsd/runtime/ directory

Milestone directories, DECISIONS.md, REQUIREMENTS.md, PROJECT.md and
all other planning artifacts are never touched.

Fixes #1250
2026-03-18 15:05:30 -06:00
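The targeted removal described in a1168cba13 could be sketched like this. The entry names come from the commit message; their location directly under `.gsd/`, the `removeRuntimeState` name, and the demo directory are assumptions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Only runtime state is listed; planning artifacts are never named here.
const RUNTIME_ENTRIES = [
  "STATE.md",
  "completed-units.json",
  "auto.lock",
  "gsd.db",
  "runtime",
];

function removeRuntimeState(gsdDir: string): void {
  for (const entry of RUNTIME_ENTRIES) {
    // force: true ignores missing paths; recursive: true covers runtime/.
    fs.rmSync(path.join(gsdDir, entry), { recursive: true, force: true });
  }
}

// Usage on a throwaway directory standing in for .gsd/:
const demoGsd = fs.mkdtempSync(path.join(os.tmpdir(), "gsd-demo-"));
fs.writeFileSync(path.join(demoGsd, "STATE.md"), "state");
fs.writeFileSync(path.join(demoGsd, "DECISIONS.md"), "decisions");
fs.mkdirSync(path.join(demoGsd, "runtime"));
fs.writeFileSync(path.join(demoGsd, "runtime", "scratch.json"), "{}");
removeRuntimeState(demoGsd);
const remaining = fs.readdirSync(demoGsd);
```

An explicit allow-list fails safe: a new planning file is kept by default, whereas `git clean -fd` removes anything untracked.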
Tom Boucher
4cb1374755 fix: invalidate caches inside discuss loop to detect newly written slice context (#1249)
After discussing a slice, the LLM writes S0x-CONTEXT.md. The discuss
loop re-evaluates but hits stale parse caches, showing the slice as
'not discussed' even though the context file exists on disk.

Added invalidateAllCaches() at the top of each loop iteration.

Fixes #1244
2026-03-18 15:03:27 -06:00
Tom Boucher
7e2ca68161 fix: robust prose slice header parsing — handle H1-H4, bold, dots, no-separator variants (#1248)
The prose fallback parser only matched H2 (## S01:) headers with
colon/dash separators. LLMs produce many other variants, each of which
silently yielded 0 slices and permanently blocked auto-mode.

Expanded the regex from:
  /^##\s+(?:Slice\s+)?(S\d+)[:\s—–-]+\s*(.+)/gm
to:
  /^#{1,4}\s+\*{0,2}(?:Slice\s+)?(S\d+)\*{0,2}[:\s.—–-]*\s*(.+)/gm

Now handles:
  - H1 through H4 headers (# ## ### ####)
  - Bold-wrapped: **S01: Title**, **S01**: Title
  - Dot separator: S01. Title
  - Space-only separator: S01 Title (no punctuation)
  - IDs with or without zero padding: S1, S01, S001
  - No-space after colon: S01:Title
  - All previous separators: colon, hyphen, em dash, en dash

Also strips trailing bold markers from titles and skips matches
with empty titles.

Fixes #1243
2026-03-18 15:02:51 -06:00
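To see the expanded pattern from 7e2ca68161 in action, the sketch below applies it to a few of the listed variants. The regex is the one quoted in the message, with the em dash and en dash written as `\u2014` and `\u2013` escapes. The `parseProseSliceHeaders` wrapper, the title clean-up expression, and the sample headers are assumptions.

```typescript
// Same pattern as in the commit message; \u2014 and \u2013 are the em and en dash.
const SLICE_HEADER =
  /^#{1,4}\s+\*{0,2}(?:Slice\s+)?(S\d+)\*{0,2}[:\s.\u2014\u2013-]*\s*(.+)/gm;

interface ProseSlice {
  id: string;
  title: string;
}

function parseProseSliceHeaders(markdown: string): ProseSlice[] {
  const slices: ProseSlice[] = [];
  for (const match of markdown.matchAll(SLICE_HEADER)) {
    // Strip trailing bold markers, then skip headers left without a title.
    const title = (match[2] ?? "").replace(/\*+\s*$/, "").trim();
    if (title.length === 0) continue;
    slices.push({ id: match[1] ?? "", title });
  }
  return slices;
}

// Usage with made-up headers covering several of the listed variants:
const sample = [
  "# S01: Setup",
  "## **S02: Auth**",
  "### S03. Storage",
  "#### S4 Deploy",
  "## S05:Title",
  "## Notes",
].join("\n");
const parsed = parseProseSliceHeaders(sample);
```

Here `parsed` holds five entries (S01, S02, S03, S4, S05); the `## Notes` line is ignored because it has no slice ID.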
Tom Boucher
de1256d352 fix: clean up stranded .gsd.lock/ directory to prevent false lock conflicts (#1251)
* fix: clean up stranded .gsd.lock/ directory to prevent false lock conflicts

Three fixes for stranded proper-lockfile lock directories:

1. releaseSessionLock: explicitly removes .gsd.lock/ after releasing
   the OS lock and deleting auto.lock

2. acquireSessionLock: when lock acquisition fails, checks if auto.lock
   is missing or the owning PID is dead. If so, removes the stale
   .gsd.lock/ dir and retries acquisition instead of failing.

3. process.on('exit') handler: registered at lock acquisition time as
   a safety net — cleans up .gsd.lock/ on normal process exit if
   releaseSessionLock wasn't called.

Fixes #1245

* fix: move gsdDir declaration before try/catch to fix TS2304 scope error

gsdDir was declared inside the try block but referenced in the catch
block's retry logic, causing 'Cannot find name gsdDir' build failures.
2026-03-18 15:02:37 -06:00
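The stale-lock decision in fix 2 of de1256d352 (auto.lock missing, or owning PID dead) can be sketched with Node's signal-0 probe. The function names are hypothetical, and reading the PID out of auto.lock is left out, since the log does not show that file's format.

```typescript
// Probe whether a process exists without actually signalling it.
function isProcessAlive(pid: number): boolean {
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0); // signal 0 only tests for existence
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

// ownerPid is null when auto.lock is missing.
function shouldClearStaleLock(ownerPid: number | null): boolean {
  return ownerPid === null || !isProcessAlive(ownerPid);
}
```

When `shouldClearStaleLock` returns true, the caller would remove the stranded `.gsd.lock/` directory and retry acquisition rather than report a lock conflict.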
TÂCHES
4d9aef5705 Revert "fix: Two-column dashboard layout with task checklist (#1195)" (#1254)
This reverts commit d3780c9bdb.
2026-03-18 15:02:01 -06:00
TÂCHES
35cee7b05f feat: add -w/--worktree CLI flag for isolated worktree sessions (#1247)
* feat: add -w/--worktree CLI flag to start in an isolated worktree

Enables `gsd -w` to auto-create a randomly-named worktree (adjective-verbing-noun
pattern) and `gsd -w my-feature` for named worktrees. Reuses existing worktree
infrastructure under .gsd/worktrees/ with worktree/<name> branches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: full worktree lifecycle — subcommands, auto-commit on exit, status banners

Major improvements to the -w/--worktree system:

- `gsd worktree list` — show worktrees with status (files changed, commits, dirty)
- `gsd worktree merge [name]` — squash-merge into main and clean up
- `gsd worktree clean` — remove all merged/empty worktrees
- `gsd worktree remove <name>` — remove with --force safety gate
- `gsd -w` (no name) resumes the only active worktree instead of creating a new one
- `gsd -w` with multiple active worktrees shows a picker
- Auto-commit dirty work on session exit (session_shutdown hook)
- Status banner on normal `gsd` launch when unmerged worktrees exist
- Full help text with lifecycle documentation (`gsd worktree --help`)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:57:25 -06:00
github-actions[bot]
558b2e1c10 release: v2.30.0 2026-03-18 20:26:29 +00:00
TÂCHES
92f0b15268 feat: add extension manifest + registry for user-managed enable/disable (#1238)
Every extension gets a declarative extension-manifest.json (id, tier,
provides, dependencies). A persistent registry at ~/.gsd/extensions/registry.json
tracks enabled/disabled state. `gsd extensions` command family (list, enable,
disable, info) lets users manage extensions without touching source code.

Registry gate filters disabled extensions in loader.ts and resource-loader.ts
before paths reach loadExtensions(). Zero breakage: extensions without manifests
default to enabled, fresh installs have an empty registry.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:12:19 -06:00
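The registry gate in 92f0b15268 comes down to a filter where only an explicit "disabled" entry removes an extension. The sketch below assumes a registry shape of `{ [id]: { enabled: boolean } }`; the actual registry.json layout is not shown in the log, and `filterEnabledExtensions` plus the extension IDs are placeholders.

```typescript
// Assumed registry shape; the real file may store more than a flag.
interface ExtensionRegistry {
  [extensionId: string]: { enabled: boolean } | undefined;
}

function filterEnabledExtensions(
  ids: string[],
  registry: ExtensionRegistry,
): string[] {
  // Missing entries default to enabled, so an empty registry changes nothing.
  return ids.filter((id) => registry[id]?.enabled !== false);
}

// Usage with placeholder extension IDs:
const freshInstall = filterEnabledExtensions(["aws-auth", "worktree-cli"], {});
const withDisabled = filterEnabledExtensions(["aws-auth", "worktree-cli"], {
  "aws-auth": { enabled: false },
});
```

Defaulting to enabled is what gives the "zero breakage" property: extensions without a registry entry load exactly as they did before.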