728 commits

Author SHA1 Message Date
Tom Boucher
c442e2d97e fix: treat auto-discovered verification failures as advisory, not blocking (#1188)
When the verification gate auto-discovers commands from package.json
(typecheck, lint, test), failures on pre-existing errors create a doom
loop: execute → fail → auto-fix → still fails → retry exhausted → pause.
The agent can't fix pre-existing lint/test errors it didn't introduce.

Now, when discoverySource is 'package-json', gate failures are logged
as warnings and the task proceeds without triggering the retry loop.
Explicitly configured checks (via preferences or task plan verify field)
still trigger the full retry cycle.

This preserves the safety of user-configured verification while
preventing auto-discovered checks from blocking on inherited tech debt.
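
The advisory-versus-blocking rule described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: `evaluateGate`, the `CheckResult` shape, and the `preference` / `task-plan` source names are assumptions; only the `package-json` value of `discoverySource` comes from the commit message.

```typescript
// Hypothetical shapes; only the "package-json" source value appears in the commit message.
type DiscoverySource = "preference" | "task-plan" | "package-json";

interface CheckResult {
  command: string;
  exitCode: number;
  discoverySource: DiscoverySource;
}

interface GateOutcome {
  passed: boolean;     // false only when a blocking check failed
  warnings: string[];  // advisory failures: logged, never retried
  failures: string[];  // blocking failures: these would trigger the retry cycle
}

function evaluateGate(results: CheckResult[]): GateOutcome {
  const warnings: string[] = [];
  const failures: string[] = [];
  for (const result of results) {
    if (result.exitCode === 0) continue;
    // Auto-discovered package.json scripts are advisory; explicit checks block.
    const blocking = result.discoverySource !== "package-json";
    (blocking ? failures : warnings).push(
      `${result.command} exited with ${result.exitCode}`,
    );
  }
  return { passed: failures.length === 0, warnings, failures };
}

// A pre-existing lint failure discovered from package.json does not block the task.
const advisoryOnly = evaluateGate([
  { command: "npm run lint", exitCode: 1, discoverySource: "package-json" },
  { command: "npm test", exitCode: 0, discoverySource: "package-json" },
]);

// The same failure from an explicitly configured check still blocks.
const explicit = evaluateGate([
  { command: "npm run lint", exitCode: 1, discoverySource: "preference" },
]);
```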

Fixes #1186
2026-03-18 10:54:16 -06:00
TÂCHES
58903093cd fix: non-blocking verification gate for auto-discovered commands (#1177)
* fix: make package-json discovered verification commands non-blocking (advisory only)

Auto-discovered commands from package.json scripts (typecheck, lint, test) are
advisory: their failures are logged as warnings but do not block the gate or
trigger retries. Only explicitly configured preference commands and task-plan
verify commands remain blocking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add missing blocking field to verification-evidence test fixtures

The previous commit added `blocking: boolean` to VerificationCheck but
only updated verification-gate.test.ts. The evidence test file had 26
VerificationCheck literals missing the new required field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 10:13:28 -06:00
Tom Boucher
8d04ec19fd fix: add defensive guards against undefined .filter() in auto-mode dispatch/recovery (#1180)
Auto-mode crashed with "Cannot read properties of undefined (reading
'filter')" during partial execute-task recovery when derived state was
structurally incomplete.

Added ?? [] fallback guards on all .filter()/.find()/.map() calls
that access state.registry, roadmap.slices, or similar derived arrays
in the dispatch and recovery paths:

- auto.ts: 3 state.registry.filter() calls
- auto-recovery.ts: 1 roadmap.slices.find() call
- auto-start.ts: 1 state.registry.filter() call

These are belt-and-suspenders guards: the parsers normally return arrays,
but crash recovery can encounter partially written or corrupt state files
for which the derived state has an unexpected shape.
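
A minimal sketch of the `?? []` guard pattern, assuming a simplified state shape. `DerivedState`, `RegistryEntry`, and `incompleteIds` are hypothetical names used only for illustration.

```typescript
// Hypothetical, simplified state shape; the real derived state has more fields.
interface RegistryEntry {
  id: string;
  status: string;
}

interface DerivedState {
  registry?: RegistryEntry[];
}

function incompleteIds(state: DerivedState): string[] {
  // `?? []` substitutes an empty array when registry is undefined or null,
  // so .filter() never runs against a missing value.
  return (state.registry ?? [])
    .filter((entry) => entry.status !== "complete")
    .map((entry) => entry.id);
}

// A partially written state file can parse to an object without `registry`.
const fromCorruptFile = JSON.parse("{}") as DerivedState;
const idsFromCorrupt = incompleteIds(fromCorruptFile);

const idsFromHealthy = incompleteIds({
  registry: [
    { id: "M001", status: "complete" },
    { id: "M002", status: "active" },
  ],
});
```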

Fixes #1176
2026-03-18 10:07:22 -06:00
Tom Boucher
8281a2ea75 fix: sync living docs (DECISIONS/REQUIREMENTS/PROJECT/KNOWLEDGE) between worktree and project root (#1173)
syncStateToProjectRoot() copied STATE.md, milestone directories,
completed-units.json, and runtime records — but not the four root-level
living documents. When agents updated these during slice execution in a
worktree, a new session would read stale copies from the project root,
losing decisions, requirement status changes, project descriptions, and
accumulated knowledge.

Added bidirectional sync for DECISIONS.md, REQUIREMENTS.md, PROJECT.md,
and KNOWLEDGE.md:
- Worktree → project root: in syncStateToProjectRoot() after runtime records
- Project root → worktree: in syncProjectRootToWorktree() before milestone sync

Fixes #1168
2026-03-18 10:07:06 -06:00
Tom Boucher
a1ef04a5f3 fix: route needs-discussion phase to interactive flow instead of stopping (#1175)
When a milestone has CONTEXT-DRAFT.md (phase: needs-discussion), the
dispatch table returned 'stop' — which made auto-mode exit. Running
/gsd again would re-enter auto → dispatch → stop → loop indefinitely.

The guided-flow already has a complete interactive handler for
needs-discussion (discuss from draft / start fresh / skip), but it was
never reached from the auto-mode entry path.

Added an early check in dispatchNextUnit: if phase is needs-discussion,
stop auto-mode gracefully and route to showSmartEntry() which handles
the discussion flow correctly.

Fixes #1170
2026-03-18 10:06:48 -06:00
Copilot
0a974c9765 Fix validate-milestone skip loop: align artifact check with state machine (#1113)
* Initial plan

* Fix validate-milestone skip loop: verify terminal verdict in artifact check

When verifyExpectedArtifact checked validate-milestone units, it only
verified the VALIDATION file existed on disk. But deriveState requires the
verdict to be terminal (pass/needs-attention/needs-remediation) before
advancing past validating-milestone. If the file existed with malformed
frontmatter or an unrecognized verdict, the artifact check passed (causing
skip) while deriveState stayed in validating-milestone, creating a hard
skip loop that hit the lifetime cap.

Now verifyExpectedArtifact reads the VALIDATION file content and calls
isValidationTerminal() to confirm the verdict matches what deriveState
expects. Non-terminal validations are treated as incomplete artifacts,
triggering re-run instead of skip.
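
The terminal-verdict check could look roughly like the sketch below. The three verdict values come from the commit message; the parsing is an assumption (it supposes the verdict sits on a `verdict:` line inside `---` frontmatter), and `isValidationTerminalSketch` is a hypothetical stand-in, not the real `isValidationTerminal()`.

```typescript
const TERMINAL_VERDICTS = new Set(["pass", "needs-attention", "needs-remediation"]);

// Hypothetical stand-in: assumes a `verdict:` line inside `---` frontmatter.
function isValidationTerminalSketch(content: string): boolean {
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  if (frontmatter === null) return false; // malformed or missing frontmatter
  const verdict = /^verdict:\s*(\S+)\s*$/m.exec(frontmatter[1] ?? "");
  // An unrecognized verdict counts as non-terminal, so the unit is re-run.
  return verdict !== null && TERMINAL_VERDICTS.has(verdict[1] ?? "");
}

const terminal = isValidationTerminalSketch("---\nverdict: pass\n---\nAll checks passed.");
const unrecognized = isValidationTerminalSketch("---\nverdict: maybe\n---\n");
const malformed = isValidationTerminalSketch("verdict: pass");
```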

Adds 3 new tests for the tightened verification.

Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>

* Address review feedback: clarify comments and add unrecognized verdict test

Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>
2026-03-18 10:05:29 -06:00
Jeremy McSpadden
880d9ced3a feat: auto-open HTML reports in default browser on manual export (#1164)
When running /gsd export --html, the generated report now automatically
opens in the user's default browser. Uses platform-specific commands
(open/xdg-open/start). Only applies to manual exports — auto-mode
milestone completion reports do not auto-open.
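
As an illustration of the platform-specific commands mentioned above, a sketch of the selection step follows. `openerFor` is a hypothetical helper; the routing of `start` through `cmd /c` reflects that `start` is a `cmd.exe` builtin, and how the real implementation spawns the command is not stated in the commit.

```typescript
interface Opener {
  command: string;
  args: string[]; // arguments that precede the file path
}

// Hypothetical helper: picks the opener for a Node-style platform string.
function openerFor(platform: string): Opener {
  switch (platform) {
    case "darwin":
      return { command: "open", args: [] };
    case "win32":
      // `start` is a cmd.exe builtin; the empty string is the window title slot.
      return { command: "cmd", args: ["/c", "start", ""] };
    default:
      return { command: "xdg-open", args: [] };
  }
}

// Builds the full argv for opening a report on a given platform.
function openArgv(platform: string, reportPath: string): string[] {
  const opener = openerFor(platform);
  return [opener.command, ...opener.args, reportPath];
}

const linuxArgv = openArgv("linux", "/tmp/report.html");
const windowsArgv = openArgv("win32", "C:\\reports\\report.html");
```

In real use the resulting argv would be handed to a child-process spawn call, typically detached so the browser outlives the CLI.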
2026-03-18 09:29:04 -06:00
Jeremy McSpadden
34c56cc284 fix: prevent concurrent GSD sessions from overlapping on same project (#1154)
Adds OS-level exclusive session locking via proper-lockfile to prevent
multiple GSD auto-mode processes from running simultaneously on the
same project. Previously, the advisory JSON lock file had a TOCTOU race
condition where two processes could both read "no lock" before either
wrote one.
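
To illustrate why a check-then-write lock races and how an exclusive create avoids it, here is a small sketch. It deliberately does not use proper-lockfile (which the commit adopts); it uses Node's `"wx"` open flag instead, and `tryAcquire` / `release` are hypothetical names. Unlike a real lock library, it does not detect stale locks.

```typescript
import { closeSync, mkdtempSync, openSync, unlinkSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Returns true if this caller created the lock file, false if it already existed.
function tryAcquire(lockPath: string, pid: number): boolean {
  try {
    // "wx" creates the file exclusively: the existence check and the creation
    // happen in one step, so two processes cannot both see "no lock".
    const fd = openSync(lockPath, "wx");
    writeSync(fd, String(pid)); // record the owner for diagnostics
    closeSync(fd);
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return false;
    throw err;
  }
}

function release(lockPath: string): void {
  unlinkSync(lockPath);
}

// Usage in a throwaway temp directory.
const lockPath = join(mkdtempSync(join(tmpdir(), "lock-sketch-")), "session.lock");
const firstAttempt = tryAcquire(lockPath, 1001);
const secondAttempt = tryAcquire(lockPath, 1002);
release(lockPath);
const afterRelease = tryAcquire(lockPath, 1002);
```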

Changes:
- New session-lock.ts module with acquireSessionLock/releaseSessionLock/
  validateSessionLock using proper-lockfile for OS-level file locking
- Lock acquired at the START of bootstrapAutoSession (before any state
  mutation), not after initialization as before
- Periodic lock validation in dispatchNextUnit detects if another
  process has taken over, triggering graceful shutdown
- Session lock released on both stop and pause
- Resume path re-acquires lock before reactivating
- DB module tracks owner PID for diagnostic purposes
- 16 new tests covering acquire/release/validate/lifecycle scenarios
2026-03-18 09:10:56 -06:00
Tom Boucher
22f2f452b9 fix: exclude completion-transition errors from health escalation at task level (#1157)
When the last task in a slice completes, the doctor detects expected
completion-transition issues (missing slice summary, unchecked roadmap)
that will be resolved by the upcoming complete-slice dispatch. These
were being counted as real errors in the proactive health tracker,
inflating consecutiveErrorUnits and potentially triggering misleading
heal escalation or verification-failure warnings.

Changes:
- Export COMPLETION_TRANSITION_CODES from doctor-types.ts (was local
  to doctor.ts)
- doctor.ts uses the shared constant instead of its local copy
- auto-post-unit.ts filters out completion-transition codes from the
  error count and health snapshot when fixLevel is 'task'

Existing doctor-fixlevel tests confirm the doctor still detects and
reports (but does not fix) these issues at task level.

Fixes #1155
2026-03-18 09:10:43 -06:00
Jeremy McSpadden
8b70fc03f6 feat: add /gsd logs command to browse activity, debug, and metrics logs (#1162)
Adds a new /gsd logs command for browsing and inspecting GSD's existing
logging infrastructure. Users can now discover and review activity logs,
debug logs, and metrics without navigating the filesystem manually.

Subcommands:
  /gsd logs           — List recent activity + debug logs with metrics summary
  /gsd logs <N>       — Show summary of activity log #N (tool calls, files, errors)
  /gsd logs debug     — List debug log files
  /gsd logs debug <N> — Show debug log summary (events, duration, errors)
  /gsd logs tail [N]  — Show last N activity log summaries (default 5)
  /gsd logs clear     — Remove old activity and debug logs (keeps recent 5)

Addresses #1161 — users needed a way to understand what happened during
auto-mode sessions for debugging.
2026-03-18 09:10:11 -06:00
Tom Boucher
974a5f361a fix: /gsd quick respects git isolation: none preference (#1156)
When git.isolation is set to 'none' in preferences, /gsd quick now
stays on the current branch instead of creating a gsd/quick/<n>-<slug>
branch. The branch creation logic is skipped entirely, matching the
behavior users expect from isolation: none.

The 'branch' and 'worktree' modes continue to create branches as before.

Fixes #1153
2026-03-18 08:40:09 -06:00
Copilot
05beb9cba7 fix: text-based fallbacks for RPC mode where TUI widgets produce empty turns (#1112)
* Initial plan

* fix: add text-based fallbacks for RPC mode where TUI widgets produce empty turns

- rpc-mode.ts: Emit placeholder widget event instead of silently dropping factory-based setWidget calls
- commands.ts: handleStatus() falls back to text-based status summary when custom() returns undefined
- commands.ts: handleVisualize() notifies that TUI is required when custom() returns undefined
- auto-dashboard.ts: updateProgressWidget() emits string-array fallback before factory widget
- queue-reorder-ui.ts: showQueueReorder() notifies with current order when custom() returns undefined
- index.ts: Dashboard shortcut handler falls back to text status in RPC mode

Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>
2026-03-18 08:34:49 -06:00
Jeremy McSpadden
45af9f7f9d feat(browser-tools): configurable screenshot resolution, format, and quality (#1152)
Add environment variable overrides for screenshot capture settings so
users can opt into full-resolution output for human review while keeping
the Anthropic vision-optimized defaults:

- SCREENSHOT_MAX_WIDTH (default 1568, set 0 to uncap)
- SCREENSHOT_MAX_HEIGHT (default 8000, set 0 to uncap)
- SCREENSHOT_FORMAT (default jpeg for viewport / png for crops)
- SCREENSHOT_QUALITY (default 80, range 1-100)
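
One way the overrides listed above might be read is sketched below. The variable names, defaults, the 0-to-uncap rule, and the 1-100 quality range come from the commit message; the helper names and the handling of invalid or out-of-range input (fall back to the default, clamp quality) are assumptions. `SCREENSHOT_FORMAT` is left out because its default depends on the capture type.

```typescript
interface ScreenshotSettings {
  maxWidth: number | null; // null means uncapped
  maxHeight: number | null;
  quality: number;
}

// Parses a pixel limit: 0 uncaps; missing or non-numeric input uses the default.
function readLimit(raw: string | undefined, fallback: number): number | null {
  if (raw === undefined || !/^\d+$/.test(raw.trim())) return fallback;
  const value = Number.parseInt(raw.trim(), 10);
  return value === 0 ? null : value;
}

// Parses JPEG quality and clamps it into the documented 1-100 range (assumed behavior).
function readQuality(raw: string | undefined, fallback: number): number {
  if (raw === undefined || !/^\d+$/.test(raw.trim())) return fallback;
  return Math.min(100, Math.max(1, Number.parseInt(raw.trim(), 10)));
}

function screenshotSettings(env: Record<string, string | undefined>): ScreenshotSettings {
  return {
    maxWidth: readLimit(env["SCREENSHOT_MAX_WIDTH"], 1568),
    maxHeight: readLimit(env["SCREENSHOT_MAX_HEIGHT"], 8000),
    quality: readQuality(env["SCREENSHOT_QUALITY"], 80),
  };
}

const defaultSettings = screenshotSettings({});
const fullResolution = screenshotSettings({
  SCREENSHOT_MAX_WIDTH: "0",
  SCREENSHOT_MAX_HEIGHT: "0",
  SCREENSHOT_QUALITY: "250",
});
```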

Also fixes:
- Integration test viewport/scale mismatch: was 1280x720 scale 1,
  now 1280x800 scale 2 to match production browser context
- Unit test height-limit assertion: test expected <= 1568 but
  MAX_SCREENSHOT_HEIGHT is 8000 — corrected test image and assertions
2026-03-18 08:33:40 -06:00
Jeremy McSpadden
d834d7be41 fix: pause auto-mode when env variables needed instead of blocking (#1147)
* fix: pause auto-mode instead of blocking when env variables needed (#1146)

When gsd auto encounters pending secrets in the SECRETS.md manifest,
it now pauses the session with a clear notification listing the missing
keys, instead of blocking the entire auto loop with an interactive TUI
prompt. On resume (/gsd auto), secrets are re-collected via the TUI —
if all are skipped, the session re-pauses to prevent broken task runs.

* feat: notify remote channels (Slack/Discord/Telegram) on secrets pause

Sends a one-way notification to the configured remote channel when
auto-mode pauses for missing env variables. The notification directs
the user back to the terminal — secrets are never collected through
remote channels for security reasons.
2026-03-18 08:32:46 -06:00
Tom Boucher
fedfbcd255 feat(mcporter): add .gsd/mcp.json per-project MCP config support (#1141) 2026-03-18 08:26:02 -06:00
Tom Boucher
b1ce681803 feat(metrics): add API request counter for copilot/subscription users (#1140) 2026-03-18 08:25:41 -06:00
Tom Boucher
65fe3c2adc fix(google-search): add 30s timeout to Gemini API call (#1139) 2026-03-18 08:25:24 -06:00
Tom Boucher
62a8be03da fix(verification-gate): sanitize preference commands with isLikelyCommand (#1138) 2026-03-18 08:25:08 -06:00
Tom Boucher
308e328c66 fix(auto-dashboard): show trigger task label for hook units (#1136) 2026-03-18 08:24:39 -06:00
Tom Boucher
ce3dc6ce7b fix(auto-worktree): detect worktree structurally when originalBase is null (#1135) 2026-03-18 08:24:14 -06:00
Tom Boucher
03caf9c958 fix(auto-worktree): auto-commit project root dirty state before milestone merge (#1130) 2026-03-18 08:23:39 -06:00
Tom Boucher
3c0125555b fix(guided-flow): support re-discuss flow for already-discussed slices (#1129) 2026-03-18 08:23:23 -06:00
deseltrus
1410aa597b fix: dispatch guard skips parked milestones — they no longer block later milestone dispatch (#1126) 2026-03-18 08:22:58 -06:00
deseltrus
9b3f1ea261 fix: worktree reassess-roadmap loop — existsSync fallback in checkNeedsReassessment (#1117) 2026-03-18 08:22:32 -06:00
deseltrus
0e4de6fff8 feat: per-milestone depth verification + queue-flow write-gate (#1116) 2026-03-18 08:22:19 -06:00
Vedant
3f9085a588 feat: add OSC 8 clickable hyperlinks for file paths in export notifications (#1114) 2026-03-18 08:21:56 -06:00
Tom Boucher
b2ea63a214 fix(gsd-db): auto-initialize database when tools are called (#1133) 2026-03-18 08:21:15 -06:00
Copilot
1020c140af Fix Codex server_error handling: extraction, retry matching, escalating backoff (#1106)
* Initial plan

* Fix OpenAI Codex error handling: proper error extraction, retry matching, and escalating backoff

- Fix mapCodexEvents to extract error details from nested event.error object
  (Codex API returns {type:"error", error:{type:"server_error", message:"..."}})
- Fix _isRetryableError regex: server.?error matches both server_error and server error
- Add escalating backoff for repeated transient auto-resumes (30s→60s→120s→240s→480s)
- Cap consecutive transient auto-resumes at 5 before pausing indefinitely
- Reset counter on successful unit completion
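
The backoff schedule and the retry pattern above can be sketched as follows. The 30s base, the doubling, the cap of 5, and the `server.?error` pattern are from the commit message; `backoffDelayMs` and the constant names are hypothetical.

```typescript
const BASE_DELAY_MS = 30000;
const MAX_TRANSIENT_RESUMES = 5;

// Delay before the nth consecutive auto-resume (1-based), or null once the
// cap is exceeded and auto-mode should pause instead of resuming.
function backoffDelayMs(attempt: number): number | null {
  if (attempt < 1 || attempt > MAX_TRANSIENT_RESUMES) return null;
  return BASE_DELAY_MS * 2 ** (attempt - 1);
}

// `.?` allows zero or one arbitrary character between the words, so the
// pattern covers "server_error" and "server error" alike.
const RETRYABLE_PATTERN = /server.?error/;

const schedule = [1, 2, 3, 4, 5].map((attempt) => backoffDelayMs(attempt));
const sixthAttempt = backoffDelayMs(6);
const matchesUnderscore = RETRYABLE_PATTERN.test("server_error");
const matchesSpace = RETRYABLE_PATTERN.test("server error");
```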

Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>

* Improve regex test to be behavioral instead of structural

Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>
2026-03-18 01:10:55 -06:00
deseltrus
51da6c4f74 feat: park/discard actions for in-progress milestones (#1107)
* feat: add park/discard actions for in-progress milestones

Users could not discard, park, or skip milestones once work had begun.
The wizard only offered "Go auto" and "View status" for milestones with
a roadmap, trapping users with stale or deprioritized milestones.

This adds:

- Park mechanism: PARKED.md marker file in milestone directory.
  deriveState() transparently skips parked milestones when finding the
  active one. Parked milestones do NOT satisfy depends_on for downstream
  milestones, preventing accidental unblocking.

- "Milestone actions" submenu in all four active-milestone wizard
  branches (roadmap-exists, planning, summarizing, executing). Offers
  Park / Discard / Skip / Back with clean navigation.

- /gsd park [id] and /gsd unpark [id] CLI subcommands for direct access.

- New module milestone-actions.ts with parkMilestone(), unparkMilestone(),
  discardMilestone(), isParked(), getParkedReason() — keeps guided-flow
  and commands thin.

- 14 tests (36 assertions) covering state derivation, dependency
  semantics, park/unpark round-trip, discard with queue-order pruning,
  and edge cases (all-parked, no-roadmap park, progress counts).
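
The park semantics described above (parked milestones are skipped, and they do not satisfy depends_on) can be sketched as below. `MilestoneSketch` and `findActiveMilestone` are hypothetical; the real `deriveState()` works from files on disk rather than an in-memory list.

```typescript
type MilestoneStatus = "complete" | "active" | "pending" | "parked";

interface MilestoneSketch {
  id: string;
  status: MilestoneStatus;
  dependsOn: string[];
}

function findActiveMilestone(milestones: MilestoneSketch[]): MilestoneSketch | undefined {
  // Only completed milestones satisfy a dependency; parked ones never do.
  const completed = new Set(
    milestones.filter((m) => m.status === "complete").map((m) => m.id),
  );
  return milestones.find(
    (m) =>
      m.status !== "complete" &&
      m.status !== "parked" &&
      m.dependsOn.every((dep) => completed.has(dep)),
  );
}

// M002 is parked, so M003 (which depends on it) stays blocked and M004 runs next.
const nextMilestone = findActiveMilestone([
  { id: "M001", status: "complete", dependsOn: [] },
  { id: "M002", status: "parked", dependsOn: ["M001"] },
  { id: "M003", status: "pending", dependsOn: ["M002"] },
  { id: "M004", status: "pending", dependsOn: ["M001"] },
]);

// With every remaining milestone parked there is nothing to dispatch.
const allParked = findActiveMilestone([
  { id: "M001", status: "parked", dependsOn: [] },
]);
```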

Files changed:
  types.ts           — Add 'parked' to MilestoneRegistryEntry.status
  milestone-actions.ts — NEW: park/unpark/discard core logic
  state.ts           — Skip parked in getActiveMilestoneId + deriveState
  guided-flow.ts     — Milestone actions submenu in 4 wizard branches
  commands.ts        — /gsd park and /gsd unpark subcommands + help
  guided-flow-queue.ts — Parked count in queue summary
  visualizer-data.ts — Add 'parked' to VisualizerMilestone.status
  park-milestone.test.ts — NEW: comprehensive test suite

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add edge case tests for park/discard milestone interactions

Covers 9 critical scenarios (31 assertions):
- Discard breaks depends_on chain → system correctly blocks
- Park blocks depends_on chain
- Queue order survives discards (QUEUE-ORDER.json pruned)
- Park all + discard all → clean pre-planning state
- Mixed states coexist (complete + parked + active + pending)
- Park then discard same milestone
- Discard milestone that has deps on others

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address critical review findings for park/discard feature

Fixes 7 issues found by adversarial code review:

1. CRITICAL: auto-mode crashed with "Unexpected: N incomplete" error when
   all milestones were parked. Filter now excludes 'parked' status, and
   pre-planning phase is recognized as a valid stop condition.

2. Merge-to-main was skipped when parked milestones existed — same
   incomplete filter now excludes parked.

3. Completed milestones could be parked, corrupting depends_on
   satisfaction. parkMilestone() now guards against SUMMARY.md existence.

4. Escape during park reason picker silently parked with literal
   "not_yet" as reason. Now properly cancels the operation.

5. Parked milestones lost their human-readable title in registry
   (showed ID instead). Phase 1 now caches roadmap for parked
   milestones too, for title extraction.

6. GSD_MILESTONE_LOCK bypassed parked check — parallel workers locked
   to a parked milestone now correctly return null.

7. Parked milestones were eligible for parallel execution, wasting
   worker slots. parallel-eligibility now skips parked milestones.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: complete parked status display across all surfaces

- Visualizer: parked milestones show pause glyph (yellow) instead of
  pending dot
- Doctor: parked milestones show pause emoji in registry report
- HTML export: add .dot-parked CSS (yellow), parked legend entry,
  collapse parked milestone details by default
- Queue reorder: exclude parked milestones from movable list

Closes all remaining cosmetic findings from adversarial review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 01:03:00 -06:00
Lex Christopherson
c6f4cd826b chore(M001-1ya5a3/S01): auto-commit after research-slice 2026-03-18 00:19:30 -06:00
Lex Christopherson
2615473dab fix: update tests for god-file decomposition
- token-profile.test.ts: read preferences-types, preferences-models, and
  preferences-validation alongside preferences.ts for structural checks
- triage-dispatch.test.ts: search auto-post-unit.ts for triage/dispatch
  markers that moved during extraction, update comment markers to match
  actual code
- none-mode-gates.test.ts: skip "no prefs default" test when global
  preferences file exists (cannot control ~/.gsd/preferences.md)
- preferences.test.ts: skip getIsolationMode default test (same reason)

Reduces test failures from 48 to 3 (all pre-existing: doctor-git,
worktree-e2e, stopAutoRemote).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 00:09:34 -06:00
Jeremy McSpadden
b20e7b065a feat: cache-ordered prompt assembly and dashboard cache hit rate (#1094)
* feat: cache-ordered prompt assembly and dashboard cache hit rate

Add prompt section reordering for better Anthropic cache hit rates.
Sections are classified as static/semi-static/dynamic and reordered
so stable content appears first in the prefix.

- prompt-ordering.ts: section extraction, classification, and
  reordering by cache stability (static -> semi-static -> dynamic)
- auto.ts: wire reorderForCaching into dispatch with logged warnings
  on failure (not silent catch)
- auto-dashboard.ts: show cache hit rate percentage in progress widget
- dashboard-overlay.ts: show aggregate cache hit rate in status overlay
- auto-prompts.ts: respect compression_strategy preference before
  compressing carry-forward sections
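
A rough sketch of the reordering idea: sort sections by cache stability so the stable prefix comes first. The three classes are from the commit message; `SectionSketch` and `reorderSectionsSketch` are hypothetical and much simpler than the real `reorderForCaching`, which also has to extract and classify sections.

```typescript
type Stability = "static" | "semi-static" | "dynamic";

interface SectionSketch {
  title: string;
  stability: Stability;
}

const STABILITY_RANK: Record<Stability, number> = {
  static: 0,
  "semi-static": 1,
  dynamic: 2,
};

function reorderSectionsSketch(sections: SectionSketch[]): SectionSketch[] {
  // Array.prototype.sort is stable (ES2019), so sections keep their relative
  // order within a class; the copy leaves the input untouched.
  return [...sections].sort(
    (a, b) => STABILITY_RANK[a.stability] - STABILITY_RANK[b.stability],
  );
}

const reordered = reorderSectionsSketch([
  { title: "Current task", stability: "dynamic" },
  { title: "System rules", stability: "static" },
  { title: "Roadmap", stability: "semi-static" },
  { title: "Tool reference", stability: "static" },
]);
const reorderedTitles = reordered.map((s) => s.title);
```

Placing stable content first matters because prompt caches match on a shared prefix: any change early in the prompt invalidates everything after it.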

Includes 12 tests for reorderForCaching and analyzeCacheEfficiency.

Split from #1083 per review feedback.

* fix: update source-reading tests for post-refactor file locations

triage-dispatch.test.ts: read auto-post-unit.ts (dispatch logic moved
from auto.ts) and update comment string matches to reflect renamed
section headers.

token-profile.test.ts: read preferences-types.ts, preferences-validation.ts,
and preferences-models.ts (GSDPreferences interface and validation logic
split from preferences.ts).
2026-03-17 23:31:20 -06:00
Jeremy McSpadden
76a834cdf6 feat: add comprehensive API key manager (/gsd keys) (#1089)
* feat: add comprehensive API key manager (/gsd keys)

Add /gsd keys command with 6 subcommands for full API key lifecycle
management: list, add, remove, test, rotate, and doctor.

- list/status: Dashboard grouped by category (LLM, search, tool, remote)
  with masked key previews, OAuth expiry, env var source detection
- add: Interactive provider picker with OAuth vs API key choice,
  prefix validation, and env var activation
- remove: Multi-key support with individual or bulk removal
- test: Lightweight API validation per provider with latency reporting
  and error classification (401/429/5xx/timeout)
- rotate: Remove-and-replace flow with optional pre-save validation
- doctor: Health checks for expired OAuth, empty keys, duplicates,
  env var conflicts, file permissions, missing LLM provider

Includes unified provider registry (22 providers), tab completions,
and redirect from /gsd setup keys. 44 unit tests.

* fix: convert key-manager tests from vitest to node:test for CI typecheck

Extension tests use node:test + node:assert/strict (not vitest) since
tsconfig.extensions.json includes test files and vitest types are not
available in the CI typecheck step.
2026-03-17 22:32:26 -06:00
TÂCHES
5ef52b8a59 refactor: decompose doctor.ts into types, format, and checks modules (#1096)
Extract three modules from the 1,348-line doctor.ts god file:
- doctor-types.ts: DoctorSeverity, DoctorIssueCode, DoctorIssue, DoctorReport, DoctorSummary
- doctor-format.ts: summarizeDoctorIssues, filterDoctorIssues, formatDoctorReport, formatDoctorIssuesForPrompt
- doctor-checks.ts: checkGitHealth, checkRuntimeHealth

All public exports are re-exported from doctor.ts so existing imports
from "./doctor.js" continue to work unchanged.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:27:38 -06:00
TÂCHES
51c259e778 refactor: extract milestone-ids and guided-flow-queue from guided-flow.ts (#1095)
- Extract milestone ID utilities (MILESTONE_ID_RE, generateMilestoneSuffix,
  nextMilestoneId, extractMilestoneSeq, parseMilestoneId, milestoneIdSort,
  maxMilestoneNum, findMilestoneIds) into milestone-ids.ts (~95 lines)
- Extract queue management (showQueue, handleQueueReorder, showQueueAdd,
  buildExistingMilestonesContext) into guided-flow-queue.ts (~445 lines)
- Add re-exports from guided-flow.ts to preserve public API
- Fix circular dependency: queue-order.ts now imports milestoneIdSort
  from milestone-ids.js instead of guided-flow.js
- guided-flow.ts reduced from 1611 to 1144 lines

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:27:35 -06:00
TÂCHES
bf3c17c8de refactor: decompose preferences.ts, populate skills and models modules (#1091)
Extract types/interfaces/constants to preferences-types.ts (~200 lines),
validation logic to preferences-validation.ts (~490 lines), move skill
resolution into preferences-skills.ts (~160 lines), and model resolution
into preferences-models.ts (~270 lines). The retained preferences.ts
(~330 lines) handles loading, merging, rendering, hooks, and re-exports
all symbols so existing imports remain unmodified.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:27:21 -06:00
TÂCHES
25d5f60836 refactor: decompose auto.ts into 6 focused modules (#1088)
Extract 6 cohesive modules from the 3,476-line auto.ts god file,
reducing it to 1,732 lines while preserving all external import paths.

New modules:
- auto-timers.ts (223 lines): Unit supervision timers — soft timeout,
  idle watchdog, hard timeout, context-pressure monitor
- auto-idempotency.ts (150 lines): Completed-key checks, skip loop
  detection, phantom loop handling, fallback persistence
- auto-stuck-detection.ts (220 lines): Dispatch count tracking,
  lifetime cap, MAX_UNIT_DISPATCHES loop detection, stub recovery.
  Uses return values instead of calling stopAuto/dispatchNextUnit.
- auto-verification.ts (195 lines): Post-unit typecheck/lint/test gate,
  runtime error capture, dependency audit, auto-fix retry logic
- auto-post-unit.ts (585 lines): Split into postUnitPreVerification
  and postUnitPostVerification — commit, doctor, state rebuild,
  worktree sync, DB dual-write, hooks, triage, quick-tasks
- auto-start.ts (472 lines): Fresh session bootstrap — git/state init,
  crash lock detection, debug init, worktree setup, DB lifecycle

All extracted functions receive AutoSession + context as parameters.
No circular dependencies — new modules import from leaf dependencies
only, never from ./auto.js. All public exports from auto.ts are
preserved so external import paths continue to work unchanged.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:26:05 -06:00
TÂCHES
05fa939c11 Merge pull request #1092 from jeremymcs/fix/export-html-blocker-filter
fix: match both milestoneId and sliceId in blocker card filter
2026-03-17 22:25:56 -06:00
Lex Christopherson
f7a03946f3 refactor: decompose commands.ts into 5 focused modules
Extract cohesive groups of functions from the 2,247-line commands.ts
god-file into focused modules:

- commands-prefs-wizard.ts (~600 lines): TUI preferences wizard,
  all configure* functions, serialization helpers
- commands-config.ts (~80 lines): TOOL_KEYS, loadToolApiKeys,
  API key management
- commands-inspect.ts (~80 lines): InspectData type,
  formatInspectOutput, DB diagnostics
- commands-maintenance.ts (~200 lines): cleanup branches/snapshots,
  skip, dry-run handlers
- commands-handlers.ts (~335 lines): doctor, steer, capture, triage,
  knowledge, run-hook, update, skill-health handlers

commands.ts retains registerGSDCommand (router), dispatchDoctorHeal,
fireStatusViaCommand, projectRoot, handleStatus, handleVisualize,
handleSetup, showHelp, and re-exports from all sub-modules to
preserve the public API surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:21:00 -06:00
Jeremy McSpadden
2d8fdcc0ab fix: match both milestoneId and sliceId when filtering duplicate blocker cards
The high-risk card filter in buildBlockersSection only compared sliceId,
causing false positives when different milestones had slices with the
same ID (e.g. M001/S01 and M002/S01). Now matches on both milestoneId
and sliceId to correctly deduplicate.
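
A minimal sketch of the composite-key comparison, assuming cards carry `milestoneId` and `sliceId` fields. `SliceRef`, `sliceKey`, and `withoutDuplicates` are hypothetical names, not the code in `buildBlockersSection`.

```typescript
interface SliceRef {
  milestoneId: string;
  sliceId: string;
}

// Composite key: slice IDs repeat across milestones, so sliceId alone is ambiguous.
function sliceKey(ref: SliceRef): string {
  return `${ref.milestoneId}/${ref.sliceId}`;
}

function withoutDuplicates<T extends SliceRef>(candidates: T[], alreadyShown: SliceRef[]): T[] {
  const shown = new Set(alreadyShown.map(sliceKey));
  return candidates.filter((candidate) => !shown.has(sliceKey(candidate)));
}

// M001/S01 is already shown; M002/S01 shares the slice ID but is a different slice.
const remaining = withoutDuplicates(
  [
    { milestoneId: "M001", sliceId: "S01" },
    { milestoneId: "M002", sliceId: "S01" },
  ],
  [{ milestoneId: "M001", sliceId: "S01" }],
);
```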
2026-03-17 23:19:42 -05:00
TÂCHES
965028e219 Merge pull request #1077 from jeremymcs/feat/token-optimization-suite
feat: token optimization suite — caching, compression, smart context selection
2026-03-17 22:07:52 -06:00
Jeremy McSpadden
288b399f88 fix: add dispatch stall guards to prevent auto-mode pause after slice completion (#1073) (#1076)
* fix: prevent summarizing phase stall by retrying dropped agent_end events (#1072)

When handleAgentEnd dispatches a sub-unit (via hooks, triage, or quick-task
early-dispatch paths) and that unit completes before handleAgentEnd returns,
the resulting agent_end event is silently dropped by the reentrancy guard.
This leaves auto-mode active but permanently stalled — no unit running, no
watchdog set, process at high CPU doing nothing.

Add a pendingAgentEndRetry flag to AutoSession that the reentrancy guard sets
when it drops an agent_end event. The finally block in handleAgentEnd checks
this flag and schedules a deferred retry via setImmediate, ensuring the
completed unit's agent_end is always processed.

* fix: add dispatch stall guards to prevent auto-mode pause after slice completion (#1073)

After a slice completes all tasks, auto-mode can stall if newSession()
hangs or dispatchNextUnit gets permanently blocked at any await point.
The existing gap watchdog only fires AFTER dispatchNextUnit returns, so
it cannot recover from hangs inside the function itself.

- Wrap newSession() with Promise.race timeout (30s) to prevent permanent
  hangs from session manager deadlocks or network issues
- Add pre-dispatch hang guard (60s) in handleAgentEnd that starts the
  gap watchdog if dispatchNextUnit hasn't completed — catches hangs at
  any await point (model selection, session creation, etc.)
- Add better diagnostics: notify user when session creation times out
  or fails, with specific unit type/ID for debugging
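
The Promise.race timeout wrapper can be sketched like this. `withTimeout` is a hypothetical helper; the commit only states that `newSession()` is raced against a 30s timeout.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared either way so
  // it cannot keep the process alive after the race is decided.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Usage: a promise that never settles stands in for a hung session manager.
async function demo(): Promise<string[]> {
  const outcomes: string[] = [];
  try {
    await withTimeout(new Promise<string>(() => {}), 10, "newSession");
  } catch (err) {
    outcomes.push((err as Error).message);
  }
  outcomes.push(await withTimeout(Promise.resolve("session-ready"), 1000, "newSession"));
  return outcomes;
}
```

Note that the race only stops waiting; it does not cancel the underlying work, so a hung call may still complete later and needs to be ignored by the caller.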
2026-03-17 22:02:10 -06:00
Jeremy McSpadden
b2befe3628 fix: prevent summarizing phase stall by retrying dropped agent_end events (#1072) (#1074)
When handleAgentEnd dispatches a sub-unit (via hooks, triage, or quick-task
early-dispatch paths) and that unit completes before handleAgentEnd returns,
the resulting agent_end event is silently dropped by the reentrancy guard.
This leaves auto-mode active but permanently stalled — no unit running, no
watchdog set, process at high CPU doing nothing.

Add a pendingAgentEndRetry flag to AutoSession that the reentrancy guard sets
when it drops an agent_end event. The finally block in handleAgentEnd checks
this flag and schedules a deferred retry via setImmediate, ensuring the
completed unit's agent_end is always processed.
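
A simplified sketch of the retry flag: the reentrancy guard records a dropped event, and the `finally` block schedules a deferred retry. `createAgentEndHandler` is hypothetical, and the scheduler is passed in as a parameter so the example runs deterministically; the commit uses `setImmediate` for that role.

```typescript
type Defer = (fn: () => void) => void;

function createAgentEndHandler(defer: Defer) {
  let handling = false;
  let pendingRetry = false;
  let processed = 0;

  function handleAgentEnd(work: () => void = () => {}): void {
    if (handling) {
      // Reentrant call: remember the dropped event instead of losing it.
      pendingRetry = true;
      return;
    }
    handling = true;
    try {
      processed += 1;
      work();
    } finally {
      handling = false;
      if (pendingRetry) {
        pendingRetry = false;
        defer(() => handleAgentEnd()); // process the dropped event later
      }
    }
  }

  return { handleAgentEnd, processedCount: () => processed };
}

// A queue stands in for setImmediate so the order of events is explicit.
const deferredCalls: Array<() => void> = [];
const handler = createAgentEndHandler((fn) => {
  deferredCalls.push(fn);
});

// The sub-unit finishes while the outer handler is still running.
handler.handleAgentEnd(() => handler.handleAgentEnd());
const processedBeforeRetry = handler.processedCount();

while (deferredCalls.length > 0) deferredCalls.shift()!();
const processedAfterRetry = handler.processedCount();
```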
2026-03-17 22:01:58 -06:00
Tom Boucher
38b79d75a7 refactor: remove redundant test file, identify consolidation targets (#1070)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* refactor: remove auto-draft-pause.test.ts — redundant with auto-dashboard.test.ts

auto-draft-pause.test.ts tested describeNextUnit() for needs-discussion,
pre-planning, and executing phases. All of these are already covered by
auto-dashboard.test.ts which has proper node:test structure.

The removed file also had fragile structural tests (string-matching source
code) that break on refactors. The behavioral coverage is complete in the
existing file.

1296 tests pass, 0 fail.
2026-03-17 22:01:20 -06:00
Jeremy McSpadden
60dfaabe03 fix: use atomic writes for completed-units.json and invalidate caches in db-writer (#1069)
Addresses state safety issues found during #1062 deep dive:

1. completed-units.json writes in auto-worktree.ts and auto-worktree-sync.ts
   used plain writeFileSync which could produce truncated/corrupt files on
   crash, losing completion keys and causing unit re-dispatch. Switched to
   atomicWriteSync (temp file + rename) for crash safety.

2. Plan file checkbox reconciliation in auto-worktree.ts also switched to
   atomicWriteSync to prevent partial PLAN.md writes on crash.

3. db-writer.ts functions (saveDecisionToDb, updateRequirementInDb,
   saveArtifactToDb) wrote markdown files via saveFile() without invalidating
   caches afterward. Added targeted cache invalidation (state + path + parse)
   so deriveState() always sees fresh data. Uses individual invalidation
   functions rather than invalidateAllCaches() to avoid clearing the artifacts
   table that was just written to.
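
A minimal sketch of the temp-file-plus-rename technique named in item 1, assuming a POSIX-like filesystem where rename within one directory replaces the target atomically. atomicWriteSyncSketch is a hypothetical stand-in, not the project's atomicWriteSync, and the unit keys are made up. It does not fsync, so it guards against torn writes on process crash rather than guaranteeing durability across power loss.

```typescript
import { mkdtempSync, readdirSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, dirname, join } from "node:path";

// Hypothetical helper: write to a sibling temp file, then rename over the target.
function atomicWriteSyncSketch(filePath: string, data: string): void {
  // The temp file lives in the same directory so the rename stays on one filesystem.
  const tmp = join(dirname(filePath), `.${basename(filePath)}.${process.pid}.${Date.now()}.tmp`);
  try {
    writeFileSync(tmp, data, "utf8");
    renameSync(tmp, filePath); // readers see either the old or the new file, never a partial one
  } catch (err) {
    rmSync(tmp, { force: true }); // best-effort cleanup of the temp file
    throw err;
  }
}

// Usage: overwrite a JSON file twice and read it back.
const dir = mkdtempSync(join(tmpdir(), "atomic-write-"));
const target = join(dir, "completed-units.json");
atomicWriteSyncSketch(target, JSON.stringify(["unit-a"]));
atomicWriteSyncSketch(target, JSON.stringify(["unit-a", "unit-b"]));
const roundTrip: string[] = JSON.parse(readFileSync(target, "utf8"));
const leftovers: string[] = readdirSync(dir);
```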
2026-03-17 22:01:08 -06:00
Jeremy McSpadden
668f12b97f fix: reject prose Verify: fields from being executed as shell commands (#1066) (#1068)
The verification gate's discoverCommands() was passing prose descriptions
from task plan Verify: fields through sanitizeCommand(), which only checked
for shell injection characters. English prose like "Document exists, contains
all 5 scale names..." passed the filter and was executed via spawnSync,
causing exit code 127 false negatives.

Added isLikelyCommand() heuristic that distinguishes executable commands
from prose descriptions by checking:
- Known command prefixes (npm, node, tsc, eslint, etc.)
- Path-like first tokens (./script.sh, /usr/bin/check)
- Flag-like tokens (-v, --check)
- Uppercase-initial words with 4+ tokens (prose pattern)
- Comma-space clause separators (prose pattern)

Prose Verify: fields now fall through to package.json scripts or "none"
instead of being executed. Valid commands continue to work as before.
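
The checks listed above could be sketched roughly as follows. This is not the project's isLikelyCommand: the prefix set contains only the four names the message spells out, and the order of the checks and the default for ambiguous input are assumptions.

```typescript
// Only the prefixes named in the message; the real list is longer ("etc.").
const KNOWN_PREFIXES = new Set(["npm", "node", "tsc", "eslint"]);

// Hypothetical sketch of a command-versus-prose heuristic.
function isLikelyCommandSketch(input: string): boolean {
  const text = input.trim();
  if (text === "") return false;
  const tokens = text.split(/\s+/);
  const first = tokens[0] ?? "";

  // Signals that the text is a command.
  if (KNOWN_PREFIXES.has(first)) return true;
  if (first.startsWith("./") || first.startsWith("/")) return true; // path-like first token
  if (tokens.some((t) => /^--?[A-Za-z]/.test(t))) return true;      // flag-like token

  // Signals that the text is prose.
  if (/^[A-Z]/.test(first) && tokens.length >= 4) return false;     // capitalized sentence
  if (text.includes(", ")) return false;                            // clause separator

  return true; // assumption: ambiguous input is treated as a command
}
```

With these rules, "npm run test" and "./script.sh" are accepted, while "Document exists, contains all 5 scale names" is rejected by the capitalized-sentence check.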

Closes #1066
2026-03-17 22:00:52 -06:00
Jeremy McSpadden
9083d86766 fix: restore session model on error instead of reading stale global prefs (#1065) (#1067)
When a model fails during auto-mode and the fallback chain is exhausted
(or absent), the error recovery path previously fell through to pause
without attempting to restore the session's original model. Meanwhile,
the fallback chain itself was read fresh from disk via
loadEffectiveGSDPreferences(), which could pick up models configured by
a different concurrent GSD session sharing the same global preferences
file.

This adds a session model recovery step between fallback exhaustion and
pause. After the existing fallback chain logic, we now check whether the
current model has diverged from the model captured at auto-mode start
(autoModeStartModel). If so, we restore the session model and retry
before giving up and pausing.

Changes:
- auto.ts: export getAutoModeStartModel() getter for the session's
  captured start model
- index.ts: add session model recovery block after fallback chain
  exhaustion, using the session-scoped model instead of re-reading
  global preferences from disk
- model-isolation.test.ts: add 4 tests covering cross-session leakage
  detection, divergence checks, and null safety
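
A rough sketch of the divergence check described above, assuming models are identified by plain strings. RecoveryAction and chooseRecoveryAfterFallback are hypothetical names; in the commit, the start model comes from getAutoModeStartModel(), and the surrounding retry logic lives in index.ts.

```typescript
type RecoveryAction = { kind: "restore"; model: string } | { kind: "pause" };

// Hypothetical decision step that runs once the fallback chain is exhausted.
function chooseRecoveryAfterFallback(
  currentModel: string | null,
  startModel: string | null, // value captured at auto-mode start
): RecoveryAction {
  // Restore only when a start model was captured and the session has drifted from it.
  if (startModel !== null && currentModel !== startModel) {
    return { kind: "restore", model: startModel };
  }
  // Nothing to restore (no captured model, or already on it): give up and pause.
  return { kind: "pause" };
}
```

The decision depends only on session-scoped values, so preferences written by another concurrent session cannot influence it.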
2026-03-17 22:00:33 -06:00
Jeremy McSpadden
306c205dfc fix: prevent run-uat re-dispatch loop when roadmap checkbox update fails (#1063) (#1064)
Two compounding bugs caused auto-mode to re-dispatch run-uat indefinitely
after UAT passed:

1. markSliceDoneInRoadmap regex required the dash at line start (^-) but the
   roadmap parser accepts optional leading whitespace (^\s*-). When LLMs
   indented checklist items, the doctor could never mark them done.

2. After run-uat completed, handleAgentEnd ran doctor with fixLevel:"task"
   which explicitly excluded slice-level completion transitions. Since
   run-uat is the terminal unit for a slice, the roadmap checkbox stayed
   unchecked, causing deriveState to return the same slice indefinitely.

Fix: Update markSliceDoneInRoadmap and markTaskDoneInPlan regexes to
accept leading whitespace (matching the parser), preserving indentation
in the replacement. Add run-uat to the set of unit types that use
fixLevel:"all" in handleAgentEnd closeout.
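
A minimal sketch of the regex change, assuming checklist lines of the form "- [ ] S01: title"; the real roadmap format and the real markSliceDoneInRoadmap may differ. markCheckboxDoneSketch and its allowIndent switch are hypothetical and exist only to contrast the old and new patterns. The commit describes ^\s*-; the sketch uses [ \t]* so a match cannot span lines.

```typescript
// Hypothetical helper: tick the checkbox for the given id, preserving indentation.
function markCheckboxDoneSketch(content: string, id: string, allowIndent = true): string {
  const escaped = id.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
  const indent = allowIndent ? "[ \\t]*" : "";               // "" mimics the old ^- pattern
  const pattern = new RegExp(`^(${indent})-[ \\t]+\\[ \\][ \\t]+(${escaped}\\b.*)$`, "m");
  // $1 is the captured indentation, so the replacement keeps the original layout.
  return content.replace(pattern, "$1- [x] $2");
}

const roadmap = "# Roadmap\n  - [ ] S01: Auth\n- [ ] S02: Billing\n";
const fixedPattern = markCheckboxDoneSketch(roadmap, "S01");      // indented item gets checked
const oldPattern = markCheckboxDoneSketch(roadmap, "S01", false); // indented item is missed
```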
2026-03-17 22:00:19 -06:00
Tom Boucher
55769392af refactor: batch 2 — consolidate preferences, convert 8 more files to node:test (#1061)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* refactor: batch 2 — consolidate preferences tests, convert 7 more files to node:test

Preferences (6 files → 1):
  preferences-{git,hooks,mode,models,schema-validation,wizard-fields}.test.ts
  → preferences.test.ts (28 tests)

Converted to node:test (custom runner → node:test):
  - discuss-prompt.test.ts (1 test)
  - auto-preflight.test.ts (1 test)
  - next-milestone-id.test.ts (4 tests)
  - plan-slice-prompt.test.ts (3 tests)
  - workspace-index.test.ts (1 test)
  - roadmap-slices.test.ts (5 tests)
  - in-flight-tool-tracking.test.ts (5 tests)

Net: -933 lines, -6 files. Full suite: 1325 pass, 0 fail.

* refactor: convert dispatch-guard.test.ts to node:test

Net: 1 more file converted. Total this branch: 14 files converted/consolidated, 6 deleted.

* fix: add null guards for parsePreferencesMarkdown in tests

Add assert.ok(prefs) after each parsePreferencesMarkdown() call to
narrow the GSDPreferences | null return type before property access.
Fixes TS18047 errors in CI typecheck.
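
A small sketch of how assert.ok narrows a nullable return type. PrefsSketch and parsePrefsSketch are hypothetical stand-ins for GSDPreferences and parsePreferencesMarkdown, and "solo" is a made-up value; the default import assumes esModuleInterop or an equivalent loader setting.

```typescript
import assert from "node:assert";

interface PrefsSketch { mode: string }

// Hypothetical parser that returns null when the expected field is missing.
function parsePrefsSketch(markdown: string): PrefsSketch | null {
  const match = /^mode:\s*(\S+)\s*$/m.exec(markdown);
  return match ? { mode: match[1] ?? "" } : null;
}

const prefs = parsePrefsSketch("mode: solo");
// Without this line, accessing prefs.mode is a TS18047 error ("'prefs' is possibly 'null'").
assert.ok(prefs);
// assert.ok is typed as an assertion function, so prefs is PrefsSketch from here on.
const mode: string = prefs.mode;
```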
2026-03-17 22:00:04 -06:00
Tom Boucher
8dfa7d058c refactor: consolidate tests by area, standardize on node:test (#1059)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* refactor: consolidate tests by area, standardize on node:test

Consolidated 10 test files into 4, standardizing on node:test.

Provider errors (3 files → 1): provider-errors.test.ts (34 tests)
Metrics (2 files → 1): metrics.test.ts (13 tests, converted from custom runner)
Activity log (2 files → 1): activity-log.test.ts (11 tests, converted from custom runner)
Complexity (2 files → 1): removed redundant structural string checks

Net: -694 lines, -6 files.
2026-03-17 21:59:50 -06:00