1476 commits

Author SHA1 Message Date
Jeremy
bbe67da02c feat(gsd): enhance /gsd codebase with preferences, --collapse-threshold, and auto-init
Add configurable codebase map options via preferences.md (exclude_patterns,
max_files, collapse_threshold), expose --collapse-threshold as a CLI flag,
and auto-generate CODEBASE.md during project init for instant agent orientation.

Closes #3509
2026-04-04 14:51:51 -05:00
Jeremy McSpadden
7a1c6213a0 Merge pull request #3507 from jeremymcs/refactor/workflow-logger-migration
refactor(gsd): migrate all catch blocks to centralized workflow-logger
2026-04-04 14:04:26 -05:00
Jeremy McSpadden
1a21915572 Merge pull request #3505 from jeremymcs/pr-3496
fix(gsd): fail-closed stop guard, harden backtrack parsing, fix prompt params
2026-04-04 13:59:04 -05:00
Jeremy
64fe364fdb fix(gsd): address adversarial review findings on workflow-logger migration
workflow-events.ts: stop logging raw event line content to audit log —
log byte length only to avoid persisting potentially sensitive payload
fragments to .gsd/audit-log.jsonl.

parallel-orchestrator.ts: revert worker NDJSON parse failure to silent
drop — non-JSON lines (progress text, tool output) are expected in
worker stdout and logging each one creates I/O pressure and audit log
bloat in the parallel execution hot path.
2026-04-04 13:53:16 -05:00
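The silent-drop behavior that 64fe364fdb restores for worker stdout might look roughly like the sketch below. `parseWorkerLines` and the `WorkerEvent` shape are hypothetical names chosen for illustration; the actual code in parallel-orchestrator.ts is not shown in the commit.

```typescript
// Hypothetical event shape; the real worker protocol is not shown in the commit.
type WorkerEvent = Record<string, unknown>;

// Parse a chunk of worker stdout as NDJSON. Lines that are not JSON objects
// (progress text, tool output) are dropped without logging, because they are
// expected in this hot path.
function parseWorkerLines(chunk: string): WorkerEvent[] {
  const events: WorkerEvent[] = [];
  for (const line of chunk.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        events.push(parsed as WorkerEvent);
      }
    } catch {
      // Intentionally silent: non-JSON output is normal here.
    }
  }
  return events;
}
```

The trade-off is that a genuinely malformed event is also dropped; the commit accepts that in exchange for lower I/O pressure.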
Jeremy
3d6d72c04d refactor(gsd): migrate all catch blocks to centralized workflow-logger
Replace raw process.stderr.write(), console.error(), and empty catch
blocks across 50 GSD files with structured logWarning/logError calls
from the centralized workflow-logger system.

Add 13 new LogComponent types to cover all subsystems: recovery,
session, prompt, dashboard, timer, worktree, command, parallel, fs,
bootstrap, guided, registry, renderer.

Every migrated catch block now automatically:
- Shows in terminal (stderr) with component tag
- Gets buffered for auto-loop stuck-detection summary
- Persists to .gsd/audit-log.jsonl for post-mortem analysis

Update regression test to verify catch blocks use workflow-logger
instead of raw stderr/console, covering auto-mode files and all
explicitly migrated infrastructure files.

Closes #3506
Supersedes the approach in #3496
2026-04-04 13:42:55 -05:00
Jeremy
abe887de10 fix(gsd): fail-closed stop guard, harden backtrack parsing, fix prompt params
- Stop/backtrack guard now calls pauseAuto before marking captures executed,
  and returns break on any exception to prevent silently dropping user halt intent
- Backtrack target parsing excludes current milestone ID and rejects ambiguous
  multi-target strings instead of guessing first match
- Fixed gsd_skip_slice parameter names in rethink prompt (milestone_id → milestoneId)
2026-04-04 13:09:16 -05:00
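A minimal sketch of the backtrack target rule described in abe887de10: exclude the current milestone ID and refuse to guess when more than one target remains. The function name and the `M<digits>` ID format are assumptions made for this example.

```typescript
// Assumption: milestone IDs look like "M" followed by digits (e.g. M032).
const MILESTONE_ID = /\bM\d+\b/g;

// Returns the single backtrack target named in the text, or null when the
// text names no other milestone or names more than one (ambiguous).
function parseBacktrackTarget(text: string, currentMilestoneId: string): string | null {
  const mentioned = text.match(MILESTONE_ID) ?? [];
  const candidates = Array.from(new Set(mentioned)).filter(
    (id) => id !== currentMilestoneId,
  );
  return candidates.length === 1 ? candidates[0] : null;
}
```

Returning null for ambiguous input leaves the decision to the caller instead of silently picking the first match.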
Tibsfox
4f896cc561 fix(gsd): add diagnostic logging to empty catch blocks in auto-mode
Auto-mode has empty catch blocks across 11 files that silently
swallow errors. When these operations fail (DB writes, git commands,
file sync, worktree operations), the error is lost and downstream
systems see stale or inconsistent state — leading to stuck loops,
phantom milestones, and silent data loss.

Replace every empty catch with a process.stderr.write() call that
logs the operation context and error message. Format:

  gsd [filename]: <operation> failed: <error.message>

For catches already annotated with /* non-fatal */ or /* best-effort */
comments, the logging is added alongside the annotation to preserve
the original intent while making failures observable.

Adds a regression test that scans all auto-mode source files and
asserts no empty catch blocks remain.

Files modified (11):
  auto-worktree.ts, auto.ts, auto-recovery.ts, auto-prompts.ts,
  auto-dashboard.ts, auto-start.ts, auto-timers.ts, auto-post-unit.ts,
  auto-dispatch.ts, auto-unit-closeout.ts, auto/phases.ts

No behavioral changes — only diagnostic output added.

Addresses #3348, addresses #3345
2026-04-04 10:38:54 -07:00
Jeremy McSpadden
d07f573799 Merge pull request #3499 from jeremymcs/test/state-machine-edge-cases
test(gsd): fill state machine E2E verification gaps
2026-04-04 11:57:17 -05:00
Jeremy
e0884375e6 test: add regression test for interview-ui notes loop (#3502)
Exercises the goNextOrSubmit → notes auto-open path to ensure:
- Enter after typing a note advances instead of looping
- Empty notes still trigger the auto-open
- Normal option selection is unaffected

Fixes #3502
2026-04-04 11:22:15 -05:00
Jeremy
f153745c4f fix: break infinite notes loop when selecting "None of the above"
goNextOrSubmit() unconditionally reopened the notes field whenever the
cursor sat on the "None of the above" slot, even after the user had
already typed a note and pressed Enter. This trapped users in an
endless loop where Enter always bounced back to notes mode.

Add a `!states[currentIdx].notes` guard so the auto-open only fires
when notes are still empty.

Fixes #3502
2026-04-04 11:12:17 -05:00
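The guard added in f153745c4f can be illustrated with a small predicate. Only the `!states[currentIdx].notes` check comes from the commit; `QuestionState`, `shouldAutoOpenNotes`, and the `onNoneOfTheAbove` flag are hypothetical stand-ins for the interview UI's internals.

```typescript
// Hypothetical per-question state; only `notes` is named in the commit.
interface QuestionState {
  notes: string;
}

// Decide whether pressing Enter should auto-open the notes field.
// Without the `!states[currentIdx].notes` guard, the field would reopen on
// every Enter while the cursor sits on "None of the above".
function shouldAutoOpenNotes(
  states: QuestionState[],
  currentIdx: number,
  onNoneOfTheAbove: boolean,
): boolean {
  return onNoneOfTheAbove && !states[currentIdx].notes;
}
```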
Jeremy
3f9fa9351f fix(test): use correct RequirementCounts type fields in edge case tests
Replace non-existent `invalidated` field with the correct type fields
(`outOfScope`, `blocked`, `total`) to pass typecheck.
2026-04-04 10:25:00 -05:00
Jeremy
62cc474002 test(gsd): fill state machine E2E verification gaps (#3498)
Add 102 integration tests across two new files covering state machine
edge cases, runtime failures, and boundary conditions not exercised
by the existing live-validation suite.

Closes #3498
2026-04-04 10:00:07 -05:00
Tom Boucher
a061e3c276 feat: stop/backtrack capture classifications for milestone regression (#3488)
* feat: add stop/backtrack capture classifications for milestone regression (#3487)

Adds 4-layer methodology for halting auto-mode and backtracking to
previous milestones when captures indicate the user wants to stop or
that a milestone missed critical features:

1. Type layer: "stop" and "backtrack" classification types in captures.ts
2. Guard layer: pre-dispatch stop check in runGuards() pauses auto-mode
   before the next unit dispatches
3. Resolution layer: executeBacktrack() writes BACKTRACK-TRIGGER.md and
   milestone regression markers for state machine detection
4. Protection layer: revertExecutorResolvedCaptures() detects and reverts
   captures silenced by non-triage agents (resolved without classification)

Also adds fast-path stop detection in auto-post-unit.ts that pattern-matches
pending capture text for stop keywords without waiting for triage.

Closes #3487

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add slice-level skip with gsd_skip_slice tool (#3477)

Adds "skipped" as a closed status alongside "complete" and "done":

- status-guards.ts: isClosedStatus() recognizes "skipped"
- state.ts: isStatusDone() recognizes "skipped"
- gsd-db.ts: getActiveSliceFromDb() skips slices with status "skipped"
- db-tools.ts: new gsd_skip_slice tool for rethink and manual use
- rethink.md: added "Skip a slice" operation to rethink prompt
- rethink.ts: buildRethinkData shows skipped slice counts

Skipped slices satisfy dependencies for downstream slices, allowing
auto-mode to advance past them. Slice data is preserved for reference.

Relates to #3477

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve 4 issues found in adversarial review of PR #3488

1. triage-ui.ts: Restore stop/backtrack entries in CLASSIFICATION_LABELS
   and ALL_CLASSIFICATIONS — the Record<Classification, ...> type requires
   all union members, and runtime lookups would crash on stop/backtrack.
   Also auto-confirm stop/backtrack in the triage confirmation flow
   (matching the triage-captures.md prompt directive).

2. triage-resolution.ts: Replace require("node:fs") in clearBacktrackTrigger
   with ESM import of unlinkSync — consistent with the rest of the codebase.

3. auto-post-unit.ts: Anchor STOP_PATTERN regex to start-of-string (^) to
   prevent false positives on captures like "add a pause button" or "stop
   the timer from re-rendering" which are feature descriptions, not halt
   directives.

4. status-guards.test.ts: Add missing test case for isClosedStatus("skipped")
   to cover the new status value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update tool-naming test count for gsd_skip_slice

The new gsd_skip_slice tool (no alias) brings the total from 29 to 30.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 01:40:33 -04:00
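To illustrate how "skipped" acting as a closed status lets auto-mode advance (a061e3c276), here is a sketch. `isClosedStatus` is named in the commit, but its body here is a guess; the `Slice` shape and `nextEligibleSlice` are hypothetical.

```typescript
// Statuses the commit lists as closed: "complete", "done", and the new "skipped".
const CLOSED_STATUSES = new Set(["complete", "done", "skipped"]);

function isClosedStatus(status: string): boolean {
  return CLOSED_STATUSES.has(status);
}

// Hypothetical slice shape for illustration.
interface Slice {
  id: string;
  status: string;
  dependsOn: string[];
}

// Returns the first open slice whose dependencies are all closed.
// A skipped dependency counts as satisfied.
function nextEligibleSlice(slices: Slice[]): Slice | undefined {
  const byId = new Map<string, Slice>();
  for (const s of slices) byId.set(s.id, s);
  return slices.find(
    (s) =>
      !isClosedStatus(s.status) &&
      s.dependsOn.every((dep) => {
        const d = byId.get(dep);
        return d !== undefined && isClosedStatus(d.status);
      }),
  );
}
```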
Tom Boucher
7d5bf63b2d feat: GSD context optimization with model routing and context masking
* docs: add context optimization design spec, implementation plan, and pi-layer research

- Spec: 6-change design for GSD extension context optimization
- Plan: 9-task TDD implementation plan with exact file paths and code
- Pi-layer doc: 10 infrastructure opportunities (research only, not planned)

Part of #3171, #3406, #3452, #3433.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(context): add observation masking for auto-mode sessions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(context): add phase handoff anchors for auto-mode

Introduces PhaseAnchor read/write utilities so downstream agents can
inherit decisions, blockers, and intent written at phase boundaries
without re-inferring from conversation history.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(context): add capability-aware model routing and context management preferences

Implement ADR-004 Phase 2 capability scoring with 7-dimension model
profiles, task requirement vectors, and weighted scoring. Add
ContextManagementConfig preferences for observation masking thresholds.
Wire capability scoring into auto-model-selection dispatch path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(context): wire observation masking, phase anchors, and tool truncation

Register observation masker in before_provider_request hook to replace
old tool results with placeholders during auto-mode. Add tool result
truncation (configurable via context_management.tool_result_max_chars).
Inject phase handoff anchors into prompt builders so downstream phases
inherit decisions from research/planning. Write anchors after successful
phase completion. Update ADR-004 status to Implemented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove internal planning artifacts from PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add capability routing, observation masking, and context management

Update dynamic-model-routing.md with capability-aware scoring section.
Update token-optimization.md with observation masking, tool truncation,
and phase handoff anchor documentation. Update configuration.md with
context_management preference block and capability_routing flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Merge branch 'main' into feat/gsd-context-optimization

* fix: add context_management to known keys and prevent tool truncation state corruption

- Add missing 'context_management' to KNOWN_PREFERENCE_KEYS set so users
  don't get spurious unknown-key warnings when configuring it.
- Replace in-place mutation of tool result content with immutable spread
  to prevent corrupting shared conversation message objects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add stop and backtrack to triage-ui classification labels

The Classification type gained stop and backtrack variants from main
but triage-ui.ts was not updated, causing a TypeScript build failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: context masker and tool truncation operate on correct pi-ai message format

The observation masker and tool result truncation in before_provider_request
were checking m.type === "toolResult" but the actual pi-ai payload uses
m.role === "toolResult" with content as TextContent[] arrays (not strings).
bashExecution messages are converted to {role:"user"} by convertToLlm before
the hook fires, so checking m.type === "bashExecution" was a no-op.

- Fix context-masker to match on role, handle array content, detect bash
  results by their "Ran `" prefix
- Fix register-hooks truncation to operate on role:"toolResult" with
  array content blocks
- Update tests to use correct pi-ai LLM payload format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 01:02:35 -04:00
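Two fixes in 7d5bf63b2d concern tool result truncation: match on `role: "toolResult"` with array content, and avoid mutating shared message objects. The sketch below shows one way to combine them. The `Message` and `TextContent` shapes are assumptions inferred from the commit text, and `truncateToolResults` is a hypothetical name.

```typescript
// Assumed message shapes, based on the commit text: tool results use
// role "toolResult" with content as an array of text blocks.
interface TextContent {
  type: "text";
  text: string;
}
interface Message {
  role: string;
  content: TextContent[] | string;
}

// Returns a new message list in which toolResult text blocks are capped at
// maxChars. Input messages and their content arrays are never mutated.
function truncateToolResults(messages: Message[], maxChars: number): Message[] {
  return messages.map((m) => {
    if (m.role !== "toolResult" || !Array.isArray(m.content)) return m;
    const content = m.content.map((block) =>
      block.text.length > maxChars
        ? { ...block, text: block.text.slice(0, maxChars) }
        : block,
    );
    return { ...m, content };
  });
}
```

Because untouched messages are returned by reference and changed ones are rebuilt with spread, the conversation history held elsewhere keeps its full tool output.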
Justin Wyer
71caa18552 fix(security): add configurable overrides for command allowlist and SSRF blocklist
PR #666 introduced hardcoded SAFE_COMMAND_PREFIXES and SSRF URL
blocklists with no override mechanism. Users with non-standard
credential tools (sops, doppler, age, infisical) or needing to fetch
from internal URLs (self-hosted docs, VPN services) were silently
blocked with no recourse.

Add two global-only settings (ignored in project-level settings.json
to preserve the security property against malicious repos):

- allowedCommandPrefixes: replaces the built-in command allowlist
- fetchAllowedUrls: exempts hostnames from SSRF blocking

Both also support env var overrides (GSD_ALLOWED_COMMAND_PREFIXES,
GSD_FETCH_ALLOWED_URLS) for CI/container environments. Env vars
take precedence over settings.json.

Security model: global-only keys are stripped from project settings
at load time via stripGlobalOnlyKeys(), applied at all three
assignment points for this.projectSettings. The merge function
stays untouched — no future caller can accidentally skip stripping.

15 new tests covering override behavior, cache invalidation,
allowlist exemptions, and global-only enforcement.
2026-04-02 13:45:05 +02:00
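The security model in 71caa18552 (strip global-only keys from project settings, let env vars win over settings.json) can be sketched as follows. `stripGlobalOnlyKeys` and the two key names come from the commit; the function bodies, `resolveAllowedPrefixes`, and the comma-separated env var format are assumptions for illustration.

```typescript
type Settings = Record<string, unknown>;

// The two global-only keys named in the commit.
const GLOBAL_ONLY_KEYS = ["allowedCommandPrefixes", "fetchAllowedUrls"];

// Returns a copy of project-level settings with global-only keys removed, so
// a repository's own settings.json cannot widen the allowlists.
function stripGlobalOnlyKeys(projectSettings: Settings): Settings {
  const stripped: Settings = { ...projectSettings };
  for (const key of GLOBAL_ONLY_KEYS) delete stripped[key];
  return stripped;
}

// Env var takes precedence over global settings; undefined means "fall back
// to the built-in list". Assumption: the env var is comma-separated.
function resolveAllowedPrefixes(
  envValue: string | undefined,
  globalSettings: Settings,
): string[] | undefined {
  if (envValue !== undefined && envValue.trim() !== "") {
    return envValue.split(",").map((p) => p.trim()).filter((p) => p !== "");
  }
  const fromSettings = globalSettings["allowedCommandPrefixes"];
  return Array.isArray(fromSettings) ? fromSettings.map(String) : undefined;
}
```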
Jeremy McSpadden
d0555857c2 Merge pull request #2976 from jeremymcs/splash-header-updates-clean
feat(splash): add remote channel indicator to tools row
2026-04-01 16:14:23 -05:00
Jeremy McSpadden
b2abff3ce5 Merge pull request #3138 from jeremymcs/claude/add-stale-commit-check-GIbgw
feat(doctor): stale commit safety check with gsd snapshot and auto-cleanup
2026-04-01 14:22:21 -05:00
Jeremy
f7cb3ec07b chore(merge): resolve conflict with upstream/main for PR #3204
Keep catch-all STREAM_RE from PR; upstream's 5-variant whack-a-mole is
superseded by the /in JSON at position \d+/ pattern. Also drop the now-
stale comment about checking stream before server/connection (no longer
needed since catch-all avoids those false-positive overlaps).
2026-04-01 14:05:28 -05:00
Jeremy
d929e9ceed chore(merge): resolve conflicts with upstream/main for PR #3138
- auto-worktree.ts: take upstream's MERGE_HEAD cleanup wording/order
- state.ts: take upstream's inline disk→DB reconciliation (#2631)
  over the simpler "always call deriveStateFromDb" approach
2026-04-01 14:04:16 -05:00
Tom Boucher
0415f41eee Merge pull request #3116 from jeremymcs/refactor/planning-tier-heavy
refactor(complexity): reclassify planning phases from standard to heavy tier
2026-04-01 11:49:11 -04:00
Claude
1edf172463 test(worktree): add regression test for SQUASH_MSG/MERGE_MSG pre-merge cleanup (#2912)
Satisfies CI require-tests gate by adding a test that verifies the
comprehensive pre-merge cleanup (step 7b) removes stale SQUASH_MSG and
MERGE_MSG files — the enhancement over the prior MERGE_HEAD-only cleanup.

https://claude.ai/code/session_01SSHD9RNwVGNxAJZEgNZpgZ
2026-04-01 15:19:18 +00:00
Claude
e5b6a6a1b9 fix(worktree): resolve merge conflict for PR #3322 — adopt comprehensive pre-merge cleanup
Main already had a simpler step 7c (removing only MERGE_HEAD). The PR's
step 7b is more thorough: it also removes SQUASH_MSG and MERGE_MSG,
matching the existing post-merge cleanup pattern. Replace 7c with 7b.

https://claude.ai/code/session_01SSHD9RNwVGNxAJZEgNZpgZ
2026-04-01 15:11:12 +00:00
Tom Boucher
77220d1dde Merge pull request #2283 from jeremymcs/feat/codebase-map
feat(gsd): codebase map — structural orientation for fresh agent contexts
2026-04-01 10:49:01 -04:00
Jeremy McSpadden
04ebe3f0a0 feat(extensions): add Ollama extension for first-class local LLM support (#3371)
Self-contained extension at src/resources/extensions/ollama/ that
auto-detects a running Ollama instance, discovers locally pulled models,
and registers them as a first-class provider with zero configuration.

Features:
- Auto-discovery of local models via /api/tags on session_start
- Capability detection (vision, reasoning, context window) for 40+ model families
- /ollama slash command with status, list, pull, remove, ps subcommands
- ollama_manage LLM-callable tool for agent-driven model operations
- Onboarding flow with auto-detect (no API key required)
- Non-blocking async probe — doesn't delay TUI paint
- Respects OLLAMA_HOST env var for non-default endpoints

Core changes (minimal):
- Add "ollama" to KnownProvider in pi-ai types
- Add "ollama" key resolution in env-api-keys.ts
- Add "ollama" default model in model-resolver.ts
- Add "Ollama (Local)" to onboarding wizard with probe flow
2026-04-01 08:37:31 -06:00
Jeremy
2cc01c11ee fix(merge): clean stale MERGE_HEAD before squash merge (#2912)
A pre-existing MERGE_HEAD (from failed prior merge, libgit2 native path,
or external tooling) blocks git merge --squash. Remove stale merge state
files before starting the squash merge, not just after.
2026-03-31 17:48:45 -05:00
Jeremy
0e978d4565 fix(state): always run disk→DB reconciliation when DB is available (#2631)
When DB was available but empty, deriveState skipped deriveStateFromDb
entirely, bypassing the disk→DB sync logic. Milestones created outside
the DB write path were never discovered.
2026-03-31 17:34:05 -05:00
Jeremy
36b03890da fix(git-service): fix merge-base ancestry check and .gsd/ leakage in snapshot absorption
- Check HEAD~1 (newest snapshot) instead of resetTarget (pre-snapshot
  base) for remote ancestry. The old check false-positived when the
  remote was at the pre-snapshot base but snapshots were local-only.
- Re-run smartStage() after soft reset so RUNTIME_EXCLUSION_PATHS
  apply to the absorbed commit. Without this, .gsd/ state files from
  snapshot commits leaked into the real commit.
2026-03-31 17:25:29 -05:00
Jeremy
fa0651bfd6 feat(doctor): stale commit safety check with gsd snapshot and auto-cleanup
Adds a safety mechanism that detects uncommitted changes idle past a
configurable threshold (default: 30 min), auto-snapshots tracked files
using `git add -u`, and cleans up snapshot commits when real work lands.

- New `stale_uncommitted_changes` doctor issue with auto-snapshot fix
- Detection in health widget (60s), pre-dispatch gate, and /gsd doctor
- `nativeAddTracked()` stages only tracked files (no secrets/binaries)
- `absorbSnapshotCommits()` squashes `gsd snapshot:` commits into next
  real autoCommit via soft reset + re-commit
- Configurable via `stale_commit_threshold_minutes` preference (0=off)
2026-03-31 17:25:29 -05:00
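The threshold check behind the stale commit detection in fa0651bfd6 might reduce to something like this sketch. The 30-minute default and the "0 = off" rule come from the commit; the function name and the use of a last-change timestamp to measure idleness are assumptions.

```typescript
// True when uncommitted changes have been idle longer than the threshold.
// A threshold of 0 disables the check, matching "(0=off)" in the commit.
// Assumption: idleness is measured from a last-change timestamp in ms.
function isStaleUncommitted(
  lastChangeMs: number,
  nowMs: number,
  thresholdMinutes: number = 30,
): boolean {
  if (thresholdMinutes <= 0) return false;
  return nowMs - lastChangeMs > thresholdMinutes * 60_000;
}
```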
Jeremy McSpadden
f0059a5498 fix(extensions): update provides.hooks in 7 extension manifests to match actual registrations (#3157)
Audit found that 7 bundled extensions had incomplete provides.hooks
arrays in their manifests. Updated each to match actual pi.on() calls:

- async-jobs: +session_before_switch, session_shutdown
- bg-shell: +8 hooks (session_compact, session_tree, etc.)
- browser-tools: +session_start
- context7: +session_shutdown
- google-search: +session_shutdown
- gsd: +12 hooks (bash_transform, tool_call, tool_result, etc.)
- search-the-web: +session_start

Closes #3156
2026-03-31 11:54:41 -06:00
Jeremy McSpadden
1e89090136 test(state-machine): add regression suite — 86 tests across 6 files (#3161) (#3162)
Comprehensive validation of the GSD state machine identified 7 HIGH, 14 MEDIUM,
and 16 LOW findings. This adds regression and integration tests covering:

Unit tests (49):
- Event replay idempotency (M4 lossy blocker replay, M5 duplicate evidence)
- Reconciliation edge cases (fork detection, entity keys, conflict detection)
- Completion hierarchy guards (vacuous truth, phantom parents, rollback fidelity)
- State derivation parity (ghost milestones, phase transitions, DB/FS consistency)
- Stuck detection coverage (all 3 rules + documented gap for 3-unit cycles)

Integration tests (37):
- Full happy-path lifecycle (pre-planning → complete)
- 12 completion guard edge cases with real handlers
- 7 reopen operations including H5 (no reopen-milestone exists)
- Phantom parent auto-creation (H6)
- State derivation consistency with live DB
- Event log integrity across operations
- M12: stale SUMMARY.md causes reconciler to override reopen

Closes #3161
2026-03-31 11:54:30 -06:00
Jeremy McSpadden
fbb67f15f8 feat(widget): add last commit display and dashboard layout improvements (#3226)
- Health widget: always-on last commit with relative time + message
- Dashboard: move worktree/branch info to right-aligned line under header
- Dashboard: move last commit to bottom-left with hints on right
- Dashboard: cap task titles at 45 chars, commit messages at 65 chars
- Dashboard: use … instead of ... for all truncation
2026-03-31 11:49:35 -06:00
Jeremy McSpadden
eaccf3e690 test(state): comprehensive state machine phase walkthrough (#3276) (#3277)
70 tests covering all 16 phases of the GSD state machine with both
happy-path and failure-mode verification. Exercises DB and filesystem
derivation paths, reconciliation logic, and edge cases.

Findings documented in #3276: 0-byte SUMMARY triggers false completion,
DB task rows missing causes wrong phase, stale path cache across
derivations, non-standard status strings silently accepted.
2026-03-31 11:49:28 -06:00
Jeremy McSpadden
706a2f8e9f refactor(state): centralize pipeline logging through workflow logger (#3282)
* refactor(state): centralize pipeline logging through workflow logger

Route 15 raw process.stderr.write calls through the structured
workflow logger (logWarning/logError). Adds "db" and "dispatch"
as new LogComponent values. Enables auto-loop drain/summarize,
audit-log persistence, and doctor integration for reconciliation
and DB events that previously bypassed structured logging.

Files changed:
- workflow-logger.ts: add "db" and "dispatch" components
- state.ts: 3 reconciliation calls → logWarning/logError
- gsd-db.ts: 4 DB operation calls → logError
- workflow-reconcile.ts: 3 event merge calls → logWarning/logError
- auto-dispatch.ts: 1 reactive dispatch call → logError
- auto-post-unit.ts: 3 triage/rogue calls → logWarning/logError

* test(workflow-logger): add tests for db and dispatch log components

Cover the new LogComponent values added in this refactor to satisfy
the CI require-tests gate.
2026-03-31 11:49:19 -06:00
Jeremy McSpadden
17471ea280 feat(model-routing): enable dynamic routing by default (#3120)
* feat(model-routing): enable dynamic routing by default

Change defaultRoutingConfig().enabled from false to true so that
dynamic model routing (tier-based downgrading for light/standard
tasks) is active out of the box. Users can still disable it via
dynamic_routing.enabled: false in PREFERENCES.md.

This is a behavioral change: sessions that previously used the
configured model for all tasks will now automatically downgrade
to cheaper models for light and standard complexity tasks.

* test(model-routing): verify dynamic routing enabled by default

Tests that defaultRoutingConfig returns enabled: true and all
routing features are active.
2026-03-31 11:47:38 -06:00
Tom Boucher
081c5dc52f fix: surface nativeCommit errors in reconcileMergeState instead of silently swallowing (#3052)
The catch block in reconcileMergeState silently swallowed all nativeCommit
exceptions, including real failures (permissions, corrupt git state, hook
rejections). This caused auto-mode to report success and return true (dirty,
re-derive) even when the merge commit actually failed, leading to an infinite
loop where auto-mode repeatedly attempted worktree finalization.

Now the catch block logs the error via ctx.ui.notify at "error" level and
returns false to signal that reconciliation failed, allowing upstream logic
to react appropriately. The nativeCommit return value is also checked —
a null return (nothing to commit) gets its own info notification distinct
from a successful commit SHA.

Closes #2542

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:50:28 -06:00
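The error-surfacing pattern described in 081c5dc52f can be sketched with the commit and notification functions injected as parameters. `commitMergeState`, the `Notify` type, and the message wording are hypothetical; only the behavior (error-level notify and `false` on failure, a distinct info message for a null result) follows the commit text.

```typescript
type Notify = (message: string, level: "info" | "error") => void;

// Runs the commit and reports the outcome. Returns false when the commit
// throws, so callers can stop retrying instead of looping.
function commitMergeState(nativeCommit: () => string | null, notify: Notify): boolean {
  try {
    const sha = nativeCommit();
    if (sha === null) {
      notify("Nothing to commit", "info");
    } else {
      notify(`Committed merge state as ${sha}`, "info");
    }
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    notify(`Merge commit failed: ${message}`, "error");
    return false;
  }
}
```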
Tom Boucher
46d798a1bf fix(parallel): scope commits to milestone boundaries in parallel mode (#3047)
When GSD_MILESTONE_LOCK is set (parallel worker mode), smartStage() now
excludes .gsd/milestones/<M>/ directories for all milestones other than the
locked one. This prevents a parallel worker (e.g., M033) from staging and
committing fabricated artifacts for a milestone it does not own (e.g., M032).

Previously, smartStage() ran `git add -A` with only runtime path exclusions,
allowing cross-milestone pollution when workers share the same .gsd/ directory
(git.isolation: "none"). The GSD_MILESTONE_LOCK env var only filtered what
deriveState() sees but did not prevent file staging.

Closes #1991

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:50:21 -06:00
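As a sketch of the scoping rule in 46d798a1bf, the set of paths to exclude from staging can be derived from the locked milestone ID. `milestoneExclusions` is a hypothetical helper; how smartStage() actually passes exclusions to git is not shown in the commit.

```typescript
// Returns the milestone directories a parallel worker must not stage.
// With no lock set (normal mode), nothing extra is excluded.
function milestoneExclusions(
  allMilestoneIds: string[],
  lockedId: string | undefined,
): string[] {
  if (!lockedId) return [];
  return allMilestoneIds
    .filter((id) => id !== lockedId)
    .map((id) => `.gsd/milestones/${id}/`);
}
```

In this sketch `lockedId` would come from the GSD_MILESTONE_LOCK environment variable mentioned in the commit.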
Tom Boucher
466c7dea18 fix: skip auto-mode pause on empty-content aborted messages (#2695) (#3045)
When the LLM sends an assistant message with empty content[] and
stopReason "aborted", this is a non-fatal agent stop — not a crash.
The abort handler now checks for empty content and missing errorMessage
before deciding to pause. Empty-content aborts are routed to
resolveAgentEnd instead, breaking the stuck re-dispatch loop.

Closes #2695

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:50:05 -06:00
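The routing decision from 466c7dea18 can be expressed as a small predicate. The field names (`content`, `stopReason`, `errorMessage`) and `resolveAgentEnd` appear in the commit; the `AssistantMessage` interface and `routeAbort` are hypothetical, and the function assumes it is only called for messages whose stopReason is "aborted".

```typescript
// Assumed shape of the assistant message fields mentioned in the commit.
interface AssistantMessage {
  content: unknown[];
  stopReason?: string;
  errorMessage?: string;
}

type AbortRoute = "pause" | "resolveAgentEnd";

// For an aborted message: empty content with no errorMessage is treated as a
// non-fatal agent stop; any other abort still pauses auto-mode.
function routeAbort(msg: AssistantMessage): AbortRoute {
  const benign = msg.content.length === 0 && !msg.errorMessage;
  return benign ? "resolveAgentEnd" : "pause";
}
```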
Tom Boucher
0b36977804 fix: detect and remove nested .git dirs in worktree cleanup to prevent data loss (#3044)
Scaffolding tools (create-next-app, cargo init, etc.) create nested .git
directories inside worktrees. Git records these as gitlinks (mode 160000)
without .gitmodules, so worktree cleanup destroys the only copy of the
nested object database — causing permanent silent data loss.

Added findNestedGitDirs() helper that recursively scans worktree for nested
.git directories (skipping node_modules and other non-project dirs). The
removeWorktree() function now calls this before cleanup and removes any
nested .git dirs so files are tracked as regular content.

Closes #2616

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:49:54 -06:00
Tom Boucher
9384641b25 fix: prevent data loss when git isolation default changes (#2625) (#3043)
When the default isolation mode flipped from "worktree" to "none" between
versions, mergeAndExit() returned early for mode "none" without checking
whether the session was physically inside an active worktree. This silently
skipped the merge, orphaning committed work on the milestone branch.

The fix moves the worktree-presence check (isInAutoWorktree + originalBasePath)
before the mode-none early return. If we are inside a worktree, mergeAndExit
proceeds with the worktree merge path regardless of the configured mode.

Also fixes the misleading JSDoc on GitPreferences.isolation that claimed
"worktree" was the default when the runtime default is actually "none".

Closes #2625

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:49:03 -06:00
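The ordering change in 9384641b25 (check for a physical worktree before the mode-none early return) is sketched below. `chooseExitAction` and the action labels are hypothetical; the real mergeAndExit() performs the merge rather than returning a label.

```typescript
type ExitAction = "merge-worktree" | "skip-merge" | "mode-specific-merge";

// Decide what to do on exit. Physical worktree presence is checked first, so
// work committed inside a worktree is merged even if the configured mode has
// since changed to "none".
function chooseExitAction(mode: string, insideAutoWorktree: boolean): ExitAction {
  if (insideAutoWorktree) return "merge-worktree";
  if (mode === "none") return "skip-merge";
  return "mode-specific-merge";
}
```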
Tom Boucher
70e3d9d6c2 fix(gsd): preserve queued milestones with worktrees in ghost detection (#3041)
isGhostMilestone() now checks for DB rows and worktree directories before
falling back to content-file detection. A milestone with a DB row or a
worktree is a legitimate milestone that hasn't been populated yet, not a
ghost from a killed session.

Fixes #2921

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:47:53 -06:00
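The check order described in 70e3d9d6c2 could be sketched as below. The `MilestoneEvidence` shape is hypothetical, and treating "no content files" as the ghost signal is an assumption about what the content-file fallback does.

```typescript
// Hypothetical summary of what is known about a milestone directory.
interface MilestoneEvidence {
  hasDbRow: boolean;
  hasWorktree: boolean;
  hasContentFiles: boolean;
}

// A DB row or a worktree marks the milestone as legitimate (queued but not
// yet populated). Only then does the content-file fallback apply.
function isGhostMilestone(e: MilestoneEvidence): boolean {
  if (e.hasDbRow || e.hasWorktree) return false;
  return !e.hasContentFiles;
}
```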
Tom Boucher
fb10141e9b fix: call cleanupQuickBranch on turn_end to squash-merge quick branch back (#3054)
cleanupQuickBranch() was exported from quick.ts but never called anywhere.
After a /gsd quick task completed, the user was left on the quick branch
with orphaned state in quick-return.json.

Register a turn_end hook in register-hooks.ts that calls cleanupQuickBranch()
after each agent turn. The function is already idempotent (no-op when no
quick-return state is pending), so it is safe to call on every turn.

Closes #2668

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:46:03 -06:00
Tom Boucher
dfb18c6e62 fix: align run-uat artifact path to ASSESSMENT, preventing false stuck retries (#3053)
The run-uat prompt instructs the agent to save results via gsd_summary_save
with artifact_type: "ASSESSMENT", which writes S##-ASSESSMENT.md. But
resolveExpectedArtifactPath and diagnoseExpectedArtifact expected S##-UAT.md,
causing artifact verification to fail and auto-mode to retry indefinitely.

Align all three contract points (prompt uatResultPath, artifact resolution,
and diagnostic message) to use ASSESSMENT as the canonical artifact type.

Closes #2873

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:45:57 -06:00
Tom Boucher
fb0fb5582e fix: replace invalid Discord invite links with canonical URL (#3056)
Closes #2699

The Discord badge in README.md pointed to https://discord.gg/gsd (expired
vanity URL) and the Pi ecosystem doc used an old invite code. Both now use
the canonical invite https://discord.com/invite/nKXTsAcmbT that was
established in commit 0a1dad9a.

Adds a regression test that validates all Discord invite links in
user-facing files match the canonical URL.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:45:32 -06:00
Tom Boucher
df9e06cfa5 fix: respect .gitignore for .gsd/ in rethink prompt (#3059)
* fix: respect .gitignore for .gsd/ in rethink prompt (#2570)

The rethink.md prompt template hardcoded `git add .gsd/` which caused
the executing agent to force-add .gsd/ files (via `git add -f`) when
.gsd was listed in .gitignore. This silently overrode the user's
gitignore configuration, tracking planning artifacts they explicitly
excluded.

- Add `isGsdGitignored()` utility that uses `git check-ignore` to
  detect when .gsd is covered by .gitignore rules
- Replace hardcoded `git add .gsd/` in rethink.md with the
  `{{commitInstruction}}` template variable (consistent with all
  other prompt templates)
- Pass gitignore-aware commit instruction from rethink.ts: skip
  commit when .gsd is gitignored, include git add only when it is not

Closes #2570

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: re-trigger checks

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:43:56 -06:00
Tom Boucher
e71de432ab fix: migrate unit ownership from JSON to SQLite to eliminate read-modify-write race (#3061)
The JSON-based unit-claims storage had a lost-update race under concurrent
multi-agent use: two agents could both read the file as unclaimed, then both
write their claim, with the second silently overwriting the first.

Replace with a SQLite-backed store using INSERT OR IGNORE on a PRIMARY KEY
constraint for atomic first-writer-wins claim semantics. claimUnit() now
returns boolean (true = claimed, false = already claimed by another agent).

Closes #2728

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:43:44 -06:00
Tom Boucher
e78dca41d4 fix(roadmap): handle numbered, bracketed, and indented prose H3 headers in slice parser (#3063)
The prose slice header fallback parser failed to extract slices when
LLMs generated common formatting variants: numbered prefixes (### 1. S01),
parenthetical numbering (### (1) S01), bracketed IDs (### [S01]), or
indented headings (  ### S01). This caused auto-mode to permanently block
with "No slice eligible" when the plan-milestone prompt produced these
formats inside a ## Slices section.

Broadened the parseProseSliceHeaders regex to accept optional leading
whitespace, numeric prefixes, parenthetical numbering, and square brackets
around slice IDs.

Closes #2567

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:43:33 -06:00
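A regex covering the header variants listed in e78dca41d4 might look like the following. This is an illustrative pattern, not the one in parseProseSliceHeaders; `PROSE_SLICE_HEADER` and `parseProseSliceIds` are hypothetical names, and slice IDs are assumed to be "S" followed by digits.

```typescript
// Accepts: "### S01", "### 1. S01", "### (1) S01", "### [S01]", "  ### S01".
// Optional leading whitespace, then exactly three hashes, an optional
// numeric or parenthesized prefix, and an optionally bracketed slice ID.
const PROSE_SLICE_HEADER = /^\s*###\s+(?:\d+\.\s+|\(\d+\)\s+)?\[?(S\d+)\]?/;

// Collects slice IDs from every line that looks like a prose H3 slice header.
function parseProseSliceIds(markdown: string): string[] {
  const ids: string[] = [];
  for (const line of markdown.split("\n")) {
    const match = PROSE_SLICE_HEADER.exec(line);
    if (match) ids.push(match[1]);
  }
  return ids;
}
```

Requiring whitespace directly after the three hashes keeps H2 and H4 headings from matching.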
Tom Boucher
8b680179e2 fix: add worktree-merge to resolveModelWithFallbacksForUnit switch and update KNOWN_UNIT_TYPES (#3066)
Closes #2900

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:43:22 -06:00
Tom Boucher
571b382075 fix: clean up MERGE_HEAD on all error paths in mergeMilestoneToMain (#2912) (#3068)
libgit2's merge implementation creates MERGE_HEAD even for squash merges,
unlike CLI git. When the merge fails with conflicts, the error paths in
mergeMilestoneToMain cleaned SQUASH_MSG and MERGE_MSG but left MERGE_HEAD
on disk. This blocked all subsequent merge attempts and caused doctor to
report corrupt merge state.

Add MERGE_HEAD cleanup (via nativeMergeAbort + explicit unlink) to:
- The code-conflict error path (before MergeConflictError throw)
- The dirty-working-tree error path (defensive)
- The success path (alongside existing SQUASH_MSG cleanup)

Closes #2912

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:43:02 -06:00
Tom Boucher
43ece11be7 fix: add openai-codex provider and modern OpenAI models to MODEL_CAPABILITY_TIER and cost tables (#3070)
Closes #2885

The MODEL_CAPABILITY_TIER map in model-router.ts and the BUNDLED_COST_TABLE
in model-cost-table.ts were missing all openai-codex provider models
(gpt-5.1, gpt-5.2, gpt-5.3-codex, gpt-5.4, etc.) and modern OpenAI models
(o4-mini, gpt-4.1, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-pro). This caused
dynamic routing to treat these models as unknown (falling back to the
isKnownModel guard) and cost comparisons to assign them 999 (the "unknown,
assume expensive" fallback).

Added 17 new model entries to MODEL_CAPABILITY_TIER across all three tiers,
matching the tier assignments from the issue. Added corresponding entries to
both MODEL_COST_PER_1K_INPUT (model-router.ts) and BUNDLED_COST_TABLE
(model-cost-table.ts). Updated the #2192 test fixture that used gpt-5.4 as
an "unknown" model since it is now known.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:41:13 -06:00
Tom Boucher
501fb83606 fix: include project name in desktop notifications (#3072)
Desktop notifications now display "GSD — projectName" instead of just
"GSD", making it clear which project a notification belongs to when
multiple projects are active.

Closes #2708

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:40:58 -06:00