When /gsd auto is called with no milestone, it delegates to the
discussion flow (showSmartEntry). Previously, if the LLM didn't follow
the discussion protocol — e.g. for simple tasks where it judged the
ceremony overkill and started editing directly — auto mode never
activated. The function returned after showSmartEntry with no retry
or notification, leaving the user in a loop.
Fix: After showSmartEntry returns in both the no-milestone and
pre-planning paths, re-derive state from disk. If the LLM produced
enough artifacts (CONTEXT.md, ROADMAP.md, or advanced the phase),
auto mode proceeds instead of returning. If not, a clear warning
tells the user what happened and what to do next.
This handles the case where the LLM writes files but doesn't follow
the exact discussion → CONTEXT.md → checkAutoStartAfterDiscuss flow.
Adds /gsd help (aliases: h, ?) that displays a grouped reference of
every available subcommand with usage, flags, and shortcuts.
Commands are organized by category: Workflow, Visibility, Course
Correction, Project Knowledge, Configuration, and Maintenance.
Also simplifies the "Unknown command" error to point users to /gsd help
instead of listing all commands inline.
In worktree contexts, the LLM received relative output paths like
`.gsd/milestones/M002/slices/S01/S01-RESEARCH.md` combined with a
working directory containing `.gsd/worktrees/M002`. The double .gsd
in the resulting path confused the LLM, which resolved the relative
path against the project root instead of the worktree — writing
artifacts to the wrong location and triggering loop detection.
All write-target path variables (outputPath, taskSummaryPath,
sliceSummaryPath, milestoneSummaryPath, replanPath, planPath,
uatResultPath, assessmentPath, secretsOutputPath) are now passed
as absolute paths via join(base, relPath), eliminating the need
for the LLM to do path arithmetic in confusing worktree layouts.
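
A minimal sketch of the absolute-path conversion described above. The
helper name toAbsoluteTarget is a hypothetical stand-in, not the actual
code; only the join(base, relPath) call comes from this change.

```typescript
import { isAbsolute, join } from "node:path";

// Hypothetical helper: resolve a write-target path against the working
// base before it is handed to the LLM, so no path arithmetic is needed.
function toAbsoluteTarget(base: string, relPath: string): string {
  // Already-absolute paths are left alone; relative ones are joined onto base.
  return isAbsolute(relPath) ? relPath : join(base, relPath);
}

// The worktree case from the description: the base contains
// .gsd/worktrees/M002 and the relative path starts with .gsd/ as well.
const outputPath = toAbsoluteTarget(
  "/repo/.gsd/worktrees/M002",
  ".gsd/milestones/M002/slices/S01/S01-RESEARCH.md",
);
```

The double .gsd still appears in the joined path, but the LLM receives
it fully resolved and no longer has to pick a root to resolve against.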
Solo developers can fire-and-forget thoughts during auto-mode execution
via /gsd capture. The system triages accumulated captures at natural seams
between tasks, classifies their impact into five types (quick-task, inject,
defer, replan, note), and proposes an appropriate action, with user confirmation
for plan-modifying resolutions.
Pipeline: capture → triage → confirm → resolve → resume
- /gsd capture appends to .gsd/CAPTURES.md (worktree-aware)
- Triage fires automatically between tasks in handleAgentEnd
- Five resolution types: inline quick task, inject task into plan,
defer for reassess, trigger replan with context, acknowledge as note
- Dashboard overlay shows pending capture count badge
- Capture context injected into replan-slice and reassess-roadmap prompts
- Parse failure falls back to note — pipeline never blocks
New modules: captures.ts, triage-ui.ts, triage-resolution.ts
New prompt: triage-captures.md
52 tests across 3 test files, all passing
Requirements R045-R051 validated
Closes #505
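
A rough sketch of the "parse failure falls back to note" rule. The
triage output format is not shown in this log, so the
"classification: <type>" line assumed here, and the function name
parseClassification, are illustrative only.

```typescript
// The five resolution types named in the commit message.
type CaptureKind = "quick-task" | "inject" | "defer" | "replan" | "note";

const KINDS: CaptureKind[] = ["quick-task", "inject", "defer", "replan", "note"];

// Hypothetical parser. Assumes the triage reply contains a line such as
// "classification: replan"; the real format may differ.
function parseClassification(raw: string): CaptureKind {
  const match = /classification:\s*([a-z-]+)/i.exec(raw);
  if (!match) return "note"; // nothing parseable: fall back, never block
  const value = (match[1] || "").toLowerCase();
  for (const kind of KINDS) {
    if (kind === value) return kind;
  }
  return "note"; // unknown label: same fallback
}
```

Because every path returns a valid kind, a malformed triage reply
degrades to an acknowledged note instead of stalling the pipeline.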
chore: pre-merge cleanup — remove dead code, single-read dashboard optimization
- Remove processTriageResults() and associated types (dead code, superseded by
inline resolution in auto.ts dispatch loop)
- Add countPendingCaptures() for single-read regex count on dashboard hot path
(replaces two-phase hasPendingCaptures + loadPendingCaptures)
- Update triage-dispatch tests to match new implementation
* feat: dynamic model routing for token consumption optimization (#575)
Add complexity-based model routing that classifies units into light/standard/heavy
tiers and routes to cheaper models when appropriate. Reduces token consumption
by 20-50% for users on capped plans.
- Complexity classifier with heuristic-based tier assignment (no LLM call)
- Model router with downgrade-only semantics (user's config is ceiling)
- Budget-pressure-aware routing (more aggressive as budget fills)
- Cross-provider cost comparison via bundled cost table
- Hook classification support
- Escalation on failure (light → standard → heavy)
- Full preference validation and merge support
- Metrics tracking with tier and downgrade fields
- 40 new tests (classifier, router, cost table)
Closes #575
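
A small sketch of downgrade-only routing with escalation on failure.
The function names and the choice to stop escalation at the user's
ceiling are assumptions made for illustration, not the shipped router.

```typescript
type Tier = "light" | "standard" | "heavy";

const TIER_ORDER: Tier[] = ["light", "standard", "heavy"];

function tierRank(tier: Tier): number {
  return TIER_ORDER.indexOf(tier);
}

// Downgrade-only: the classified tier is used only when it is cheaper
// than the tier implied by the user's configuration (the ceiling).
function routeTier(classified: Tier, ceiling: Tier): Tier {
  return tierRank(classified) < tierRank(ceiling) ? classified : ceiling;
}

// Escalation on failure: light -> standard -> heavy. Returns null when
// there is nowhere left to go. Capping at the ceiling is an assumption.
function nextTierOnFailure(current: Tier, ceiling: Tier): Tier | null {
  const next = tierRank(current) + 1;
  if (next >= TIER_ORDER.length || next > tierRank(ceiling)) return null;
  return TIER_ORDER[next] || null;
}
```

For example, a unit classified heavy under a standard ceiling routes to
standard, while a light unit that fails is retried at standard.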
* feat: phases 2-4 — dashboard, adaptive learning, task introspection
Phase 2 — Observability & Dashboard:
- Tier badge [L]/[S]/[H] displayed in progress widget next to phase label
- Dynamic routing savings summary shown in footer when units have been downgraded
- Tier and modelDowngraded fields passed through snapshotUnitMetrics
Phase 3 — Adaptive Learning:
- New routing-history.ts: tracks success/failure per tier per unit-type pattern
- Rolling window of 50 entries per pattern to prevent stale data
- User feedback support (over/under/ok) with 2x weight vs automatic
- Failure rate >20% auto-bumps tier for that pattern
- Tag-specific patterns (e.g. execute-task:docs) for granular learning
- History persists to .gsd/routing-history.json
- Classifier consults adaptive history before finalizing tier
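
The Phase 3 rules can be sketched as follows. This is an illustration
under stated assumptions: outcomes are reduced to success or failure,
and how over/under/ok feedback maps onto them is not specified here, so
user feedback is modelled only as a weight of 2.

```typescript
interface Outcome {
  ok: boolean;
  weight: number; // 1 for automatic detection, 2 for user feedback
}

const WINDOW_SIZE = 50; // rolling window per pattern
const FAILURE_THRESHOLD = 0.2; // bump the tier above 20% failures

// Append an outcome and keep only the most recent WINDOW_SIZE entries.
function recordOutcome(history: Outcome[], ok: boolean, fromUser: boolean): Outcome[] {
  const next = history.concat([{ ok, weight: fromUser ? 2 : 1 }]);
  return next.slice(Math.max(0, next.length - WINDOW_SIZE));
}

// Weighted failure rate over the current window.
function failureRate(history: Outcome[]): number {
  let total = 0;
  let failed = 0;
  for (const outcome of history) {
    total += outcome.weight;
    if (!outcome.ok) failed += outcome.weight;
  }
  return total === 0 ? 0 : failed / total;
}

function shouldBumpTier(history: Outcome[]): boolean {
  return failureRate(history) > FAILURE_THRESHOLD;
}
```

With 40 successes and 10 automatic failures the rate is exactly 20% and
no bump occurs; one further user-reported failure pushes it over.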
Phase 4 — Task Plan Introspection:
- Code block counting in task plans (5+ blocks → heavy)
- Complexity keyword detection: migration, architecture, security,
performance, concurrency, compatibility
- Multiple complexity keywords (2+) → heavy, single → standard
- New codeBlockCount and complexityKeywords fields in TaskMetadata
Tests: 16 new tests (routing history + introspection), 419 total passing
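
A sketch of the Phase 4 introspection rules. The codeBlockCount and
complexityKeywords field names come from this change; the function
names, the fence-counting approach, and the substring keyword match are
assumptions for illustration.

```typescript
type Tier = "light" | "standard" | "heavy";

const COMPLEXITY_KEYWORDS = [
  "migration", "architecture", "security",
  "performance", "concurrency", "compatibility",
];

// Built by concatenation so the marker can appear inside this example.
const FENCE = "`" + "`" + "`";

interface PlanSignals {
  codeBlockCount: number;
  complexityKeywords: string[];
}

// Hypothetical scan: count fence lines in pairs and collect keywords.
function inspectTaskPlan(plan: string): PlanSignals {
  const fenceLines = plan
    .split("\n")
    .filter((line) => line.trim().indexOf(FENCE) === 0).length;
  const lower = plan.toLowerCase();
  return {
    codeBlockCount: Math.floor(fenceLines / 2),
    complexityKeywords: COMPLEXITY_KEYWORDS.filter((word) => lower.indexOf(word) !== -1),
  };
}

// 5+ code blocks or 2+ keywords -> heavy; one keyword -> standard;
// otherwise no hint (null), leaving the tier to other heuristics.
function tierHint(signals: PlanSignals): Tier | null {
  if (signals.codeBlockCount >= 5 || signals.complexityKeywords.length >= 2) return "heavy";
  if (signals.complexityKeywords.length === 1) return "standard";
  return null;
}
```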
* test: add feature-branch lifecycle integration test
Proves the core invariant: milestone worktrees branch from and merge
back to the feature branch, never touching main. Covers:
- Full lifecycle with unique milestone IDs (M001-xxxxxx format)
- Untracked .gsd/ planning files copied into worktree
- Multiple successive milestones on the same feature branch
- Main branch completely untouched throughout
* fix: commitCount return type (parseInt)
Update both docs/configuration.md (user-facing) and
src/resources/extensions/gsd/docs/preferences-reference.md (internal)
with complete coverage of all GSD preferences:
- Add /gsd prefs subcommands table (global, project, status, wizard, setup)
- Document token_profile (budget/balanced/quality) and phases settings
- Document context_pause_threshold field
- Document remote_questions configuration (Slack/Discord)
- Document git.merge_strategy (squash/merge) and git.isolation (worktree/branch)
- Expand post_unit_hooks with missing agent field
- Expand pre_dispatch_hooks with skip_if, unit_type, model fields
and action validation rules
- Add known unit types list for hook before/after arrays
- Add examples for pre-dispatch hooks (modify/skip/replace)
- Add examples for token profile, phases, and remote questions
- Update models to show all 6 phases (research, planning, execution,
execution_simple, completion, subagent)
- Add full example combining all major settings
In RPC mode, ctx.ui.custom() returns undefined without emitting any event.
This caused showNextAction() — and all 13+ call sites in guided-flow.ts —
to silently complete without taking action. No error thrown, no event
emitted, command handler returns normally.
Fix: After custom() returns, check for undefined/null and fall back to
ctx.ui.select() which IS implemented in RPC mode. Maps the action list
to select labels and resolves the chosen action id.
Four bugfixes for open issues:
1. Worktree created from integration branch, not main (#606)
- createAutoWorktree reads integration branch from META.json
- mergeMilestoneToMain merges to integration branch, not hardcoded main
- createWorktree accepts optional startPoint parameter
2. Resolve project root from worktree paths in all commands (#608, #602)
- Add resolveProjectRoot() to detect .gsd/worktrees/ in cwd
- All GSD commands use projectRoot() instead of raw process.cwd()
- Fixes stale cwd after milestone completion (#608)
- Fixes discuss/status basepath disagreement (#602)
3. Milestone merge skipped in branch isolation mode (#603)
- Add branch-mode fallback when isInAutoWorktree() is false
- Detects milestone/* branch and performs squash-merge
- Uses same mergeMilestoneToMain flow as worktree mode
4. Remote questions onboarding missing .js module (#592)
- Extract saveRemoteQuestionsConfig into compiled src/ helper
- Avoids cross-boundary import from compiled JS to raw .ts
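
For item 2 above, project-root detection from a worktree cwd might look
roughly like this. The function name and the string-based approach are
assumptions; the real resolveProjectRoot() may work differently.

```typescript
const WORKTREE_MARKER = "/.gsd/worktrees/";

// Hypothetical sketch: if cwd sits inside .gsd/worktrees/<MID>/...,
// return the path in front of the marker; otherwise return cwd as-is.
function projectRootFromCwd(cwd: string): string {
  // Normalise Windows separators so one marker covers both platforms.
  const normalized = cwd.replace(/\\/g, "/");
  const index = normalized.indexOf(WORKTREE_MARKER);
  return index === -1 ? cwd : normalized.slice(0, index);
}
```

Note that in this sketch a matched Windows path comes back with forward
slashes, which a real implementation may want to convert again.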
After milestone completion and merge, the process cwd could remain
inside .gsd/worktrees/<MID>/, causing new milestone writes to land
in the wrong directory.
Three-layer fix:
1. escapeStaleWorktree() at startAuto entry: detects if the base path
is inside .gsd/worktrees/ and chdirs back to the project root
2. stopAuto() unconditionally restores cwd to originalBasePath,
not just when isInAutoWorktree returns true (module state may
have been cleared by mergeMilestoneToMain already)
3. Milestone merge error handler restores cwd on partial failure
where mergeMilestoneToMain chdir'd but then threw
Closes #608
Multiple sources of unbounded memory growth caused V8 to OOM after
~50 minutes of auto-mode operation:
1. activity-log.ts: saveActivityLog serialized ALL session entries
into a single string for SHA1 dedup, allocating hundreds of MB
per unit cycle. Now uses streaming writes (writeSync per entry)
and a lightweight fingerprint (entry count + last 3 entries hash)
instead of full-content hashing.
2. activity-log.ts: activityLogState Map was never cleared between
sessions, accumulating lastSnapshotKeyByUnit entries indefinitely.
Added clearActivityLogState() export, called from stopAuto().
3. auto.ts: completedUnits array grew unbounded for dashboard
display. Now capped at 200 entries and cleared on stopAuto().
4. paths.ts: dirEntryCache and dirListCache Maps grew without bounds
between clearPathCache() calls. Added DIR_CACHE_MAX (200) eviction
— when cache exceeds limit, it's cleared before adding new entries.
Closes #611
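
The two bounding strategies from items 3 and 4 can be sketched as
below. The helper names are hypothetical, and dropping the oldest
entries when the array is capped is an assumption; the log only states
that the array is capped at 200.

```typescript
const DIR_CACHE_MAX = 200;

// Clear-on-overflow eviction: when a new key would push the cache past
// the limit, drop everything first. Crude, but it bounds memory with no
// bookkeeping. Overwriting an existing key never triggers a clear.
function setBounded<V>(
  cache: Map<string, V>,
  key: string,
  value: V,
  max: number = DIR_CACHE_MAX,
): void {
  if (!cache.has(key) && cache.size >= max) cache.clear();
  cache.set(key, value);
}

// Capped array: keep only the newest `max` items (assumed policy).
function pushCapped<T>(items: T[], item: T, max: number): void {
  items.push(item);
  if (items.length > max) items.splice(0, items.length - max);
}
```

Clearing the whole cache costs some re-reads after each overflow, but
the worst-case size stays fixed between clearPathCache() calls.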
* fix: avoid native hangs in gsd auto paths
* fix: use .js extension in edit-diff.test.ts import for tsc compatibility
* fix: prevent OOM on large file diffs and implement context-line windowing
- Add size guard (MAX_DP_CELLS=4M) to buildLineDiff that falls back to a
linear-time prefix/suffix matching algorithm for large files, preventing
the O(n*m) DP table from causing OOM crashes
- Implement contextLines parameter in generateDiffString so only lines
within N lines of a change are rendered (with "..." separators), matching
unified diff behavior — the parameter was previously accepted but ignored
- Add tests for both context windowing and large-file fallback
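
A sketch of both ideas above: a cell-count guard with a linear-time
prefix/suffix fallback, and context-line windowing. Names other than
MAX_DP_CELLS and contextLines are hypothetical, and reading "4M" as
4,000,000 cells is an assumption.

```typescript
type DiffOp = { kind: "same" | "removed" | "added"; line: string };

const MAX_DP_CELLS = 4 * 1000 * 1000; // assumed meaning of "4M"

// True when an n*m DP table would exceed the budget.
function exceedsDpBudget(oldLines: string[], newLines: string[]): boolean {
  return oldLines.length * newLines.length > MAX_DP_CELLS;
}

// Linear-time fallback: keep the common prefix and suffix, and report
// everything in between as removed-then-added.
function prefixSuffixDiff(oldLines: string[], newLines: string[]): DiffOp[] {
  let start = 0;
  while (start < oldLines.length && start < newLines.length &&
         oldLines[start] === newLines[start]) start++;
  let endOld = oldLines.length;
  let endNew = newLines.length;
  while (endOld > start && endNew > start &&
         oldLines[endOld - 1] === newLines[endNew - 1]) { endOld--; endNew--; }
  const tag = (kind: DiffOp["kind"]) => (line: string): DiffOp => ({ kind, line });
  return oldLines.slice(0, start).map(tag("same"))
    .concat(oldLines.slice(start, endOld).map(tag("removed")))
    .concat(newLines.slice(start, endNew).map(tag("added")))
    .concat(oldLines.slice(endOld).map(tag("same")));
}

// Render only lines within contextLines of a change, with "..." where
// a run of unchanged lines was skipped.
function renderWithContext(ops: DiffOp[], contextLines: number): string[] {
  const keep: boolean[] = ops.map(() => false);
  ops.forEach((op, i) => {
    if (op.kind === "same") return;
    const from = Math.max(0, i - contextLines);
    const to = Math.min(ops.length - 1, i + contextLines);
    for (let j = from; j <= to; j++) keep[j] = true;
  });
  const out: string[] = [];
  let skipping = false;
  ops.forEach((op, i) => {
    if (!keep[i]) {
      if (!skipping) out.push("...");
      skipping = true;
      return;
    }
    skipping = false;
    const prefix = op.kind === "added" ? "+" : op.kind === "removed" ? "-" : " ";
    out.push(prefix + op.line);
  });
  return out;
}
```

The fallback is coarser than an LCS diff (interior matches are not
found), which is the trade-off for avoiding the quadratic table.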
---------
Co-authored-by: TÂCHES <afromanguy@me.com>
- Restore exhaustive never check in mapStopReason (throw on unhandled FinishReason)
- Add 12 unit tests for sanitizeSchemaForGoogle covering patternProperties removal,
const→enum conversion at various depths, arrays, deeply nested objects, pass-through
- Simplify redundant recursion branches into single typeof object catch-all
- Fix misleading comment ("only in anyOf/oneOf") — conversion happens everywhere
- Drop unnecessary (p: Part) annotation; TypeScript infers it from @google/genai types
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Reduces auto-mode token consumption by 40-60% through coordinated
optimizations driven by a single token_profile preference.
Profile presets (budget/balanced/quality):
- One preference key coordinates model selection, phase skipping,
context compression, and subagent routing
- Balanced is the default for new projects (D046)
- Explicit user preferences always override profile defaults
Phase skipping:
- Guard clauses on research-milestone, research-slice, and
reassess-roadmap dispatch rules
- Skipped phases return null (fall-through), preserving state machine
- Budget profile skips all research + reassess; balanced skips slice
research only
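
A sketch of how the profile presets could drive the phase-skip guards.
The phase names and per-profile skips follow the text above; the
quality row (skips nothing), the override shape, and the function name
are assumptions for illustration.

```typescript
type TokenProfile = "budget" | "balanced" | "quality";
type SkippablePhase = "research-milestone" | "research-slice" | "reassess-roadmap";

// Budget skips all research plus reassess; balanced skips slice
// research only. Quality skipping nothing is assumed.
const PROFILE_SKIPS: Record<TokenProfile, SkippablePhase[]> = {
  budget: ["research-milestone", "research-slice", "reassess-roadmap"],
  balanced: ["research-slice"],
  quality: [],
};

// Hypothetical shape for explicit user preferences.
type PhaseOverrides = Partial<Record<SkippablePhase, boolean>>;

// Explicit preferences win; otherwise the profile preset decides.
function shouldSkipPhase(
  profile: TokenProfile,
  phase: SkippablePhase,
  overrides: PhaseOverrides = {},
): boolean {
  const explicit = overrides[phase];
  if (explicit !== undefined) return explicit;
  return PROFILE_SKIPS[profile].indexOf(phase) !== -1;
}
```

A dispatch rule guarded this way would return null when the phase is
skipped, letting evaluation fall through to the next rule.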
Context compression:
- inlineLevel parameter (full/standard/minimal) on 6 prompt builders
- Minimal: only output template + essential context (≥30% reduction)
- Standard: skip redundant templates
- Full: current behavior unchanged
Complexity routing:
- classifyTaskComplexity() for task plans (step/file/signal heuristics)
- classifyUnitComplexity() for unit types with budget pressure
thresholds at 50/75/90% (from #579)
- execution_simple model config for cheap simple-task routing
- escalateTier() for failure recovery (light→standard→heavy)
Adaptive learning (from #579):
- routing-history.ts tracks success/failure per tier per pattern
- Rolling 50-entry window, 20% failure threshold auto-bumps tier
- User feedback weighted 2x vs automatic detection
- Persists to .gsd/routing-history.json
Budget prediction:
- getAverageCostPerUnitType() + predictRemainingCost() in metrics
- projectedRemainingCost + profileDowngraded in AutoDashboardData
- One-way auto-downgrade within a milestone (D048)
Addresses #575
95 tests across 5 test files, all passing.
* fix: allow stopping auto-mode from a different terminal (#584)
Auto-mode lock file was written to the worktree path instead of the
project root, making it invisible to other processes. Additionally,
/gsd stop only checked in-memory state, which is process-local.
- Add lockBase() helper to always write auto.lock at project root
- Add stopAutoRemote() for cross-process stop via SIGTERM
- Update /gsd stop to fall back to lock-file-based remote stop
* fix: handle Windows SIGTERM behavior in stop-auto-remote test
On Windows, SIGTERM is not interceptable by Node.js processes — the
process exits with code 1 rather than running the SIGTERM handler.
Accept either exit code on Windows while still asserting clean exit (0)
on Unix platforms.
cacheKey() used length + first/last 100 chars, which collides when a
checkbox changes [ ] → [x] mid-file (same length, same endpoints).
verifyExpectedArtifact() only cleared the path cache, not the parse
cache, so parseRoadmap() returned stale data with done=false.
- Add clearParseCache() to verifyExpectedArtifact alongside clearPathCache
- Include middle 100-char sample in cacheKey to prevent interior collisions
- Add regression test for the cache collision scenario
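
The collision can be reproduced with a short sketch. Both key functions
are reconstructions from the description above, not the actual
cacheKey(); the exact position of the middle sample is an assumption.

```typescript
const SAMPLE = 100;

// Old scheme as described: length + first/last 100 chars.
function legacyKey(content: string): string {
  return [content.length, content.slice(0, SAMPLE), content.slice(-SAMPLE)].join("|");
}

// New scheme: also sample 100 chars around the midpoint (assumed
// placement), so an interior edit of equal length changes the key.
function sampledKey(content: string): string {
  const midStart = Math.max(0, Math.floor(content.length / 2) - SAMPLE / 2);
  return [
    content.length,
    content.slice(0, SAMPLE),
    content.slice(midStart, midStart + SAMPLE),
    content.slice(-SAMPLE),
  ].join("|");
}

// A roadmap-like string where only a mid-file checkbox flips.
const filler = (ch: string, n: number): string => new Array(n + 1).join(ch);
const before = filler("a", 150) + "- [ ] S01" + filler("b", 150);
const after = filler("a", 150) + "- [x] S01" + filler("b", 150);

const legacyCollides = legacyKey(before) === legacyKey(after);
const sampledCollides = sampledKey(before) === sampledKey(after);
```

Here legacyCollides is true and sampledCollides is false. A sampled key
can still miss an edit that falls outside all three samples in a long
file, so clearing the parse cache in verifyExpectedArtifact remains the
more dependable half of the fix.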
* feat: add git.commit_docs setting to keep .gsd/ local-only (#501)
Adds a new `commit_docs` boolean to git preferences. When set to `false`:
- The entire `.gsd/` directory is added to `.gitignore`
- `smartStage()` excludes all `.gsd/` files from commits
- Bootstrap init skips the "chore: init gsd" commit
- `writeIntegrationBranch()` skips committing metadata
- The self-heal that removes blanket `.gsd/` patterns is bypassed
This allows users in corporate environments or mixed teams to use GSD
without polluting the shared git repository with planning artifacts.
Closes #501
* feat: add commit_docs toggle to preferences wizard
Adds "Track .gsd/ planning docs in git" to the /gsd prefs wizard,
allowing users to toggle commit_docs interactively alongside other
git settings like main_branch.
The guided flow's "Create roadmap" path never set pendingAutoStart,
so checkAutoStartAfterDiscuss() always returned false after planning
completed. Auto-mode stalled at "Milestone planned" instead of
proceeding to slice research/execution.
Three fixes:
- Set pendingAutoStart when choice === "plan" in showSmartEntry
- Relax Gate 1 to accept ROADMAP.md (plan path) or CONTEXT.md (discuss path)
- Add STATE.md write instruction to guided-plan-milestone prompt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix two editor input bugs:
1. Arrow key cursor movement not visually updating (fixes #464)
The layout cache key only included {width, textVersion}. Cursor-only
moves don't change textVersion, so stale cached layout was returned
and the diff renderer skipped repaint. Added cursorLine and cursorCol
to the cache key so cursor movements invalidate the cache.
2. Shift+Enter not inserting newlines in non-kitty terminals (Zed, VS Code, etc.)
The /terminal-setup command configures terminals to send ESC+CR (\x1b\r)
for Shift+Enter. But the followUp app action (bound to alt+enter) was
intercepting \x1b\r in CustomEditor.handleInput before the editor's
newLine handler could see it — because in non-kitty terminals, \x1b\r
matches alt+enter. Now when kitty protocol is not active and \x1b\r is
received, the followUp match is skipped so it falls through to newLine.
Alt+Enter followUp still works in kitty-protocol terminals (iTerm2,
Ghostty, Kitty, WezTerm) where the key combos are distinguishable.
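
The Shift+Enter rule in item 2 reduces to a small decision, sketched
here. The function and action names are hypothetical, and the real
CustomEditor.handleInput handles many more keys than this.

```typescript
type EnterAction = "newLine" | "followUp" | "passThrough";

// ESC followed by CR: what /terminal-setup makes Shift+Enter send, and
// also what alt+enter looks like in non-kitty terminals.
const ESC_CR = "\x1b\r";

// Hypothetical dispatch: without the kitty protocol the two key combos
// are indistinguishable, so ESC+CR is treated as a newline request.
function resolveEnterAction(data: string, kittyProtocolActive: boolean): EnterAction {
  if (data !== ESC_CR) return "passThrough";
  return kittyProtocolActive ? "followUp" : "newLine";
}
```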
Co-authored-by: TÂCHES <afromanguy@me.com>
* fix(undo): use invalidateAllCaches to prevent stale state after undo
After deleting summary files and modifying PLAN files, only
invalidateStateCache() was called. Path and parse caches remained
stale, causing deriveState() to return incorrect results — showing
undone tasks as still complete.
* perf: optimize hot-path lookups, cache clearing, and error resilience
- Replace O(n) Array.includes() with Set-based O(1) lookups in
persistCompletedKey, findCommitsForUnit, and extractCommitShas
- Skip unnecessary cache invalidation for hook units in
verifyExpectedArtifact (moved clearPathCache after hook early-return)
- Avoid redundant disk writes in removePersistedKey when key not present
- Single-pass partition for conflicted files in reconcileMergeState
instead of two separate filter passes
- Wrap undo git operations in try/finally to guarantee cache
invalidation even on partial failure
- Surface auto-start errors to user via ui.notify instead of
swallowing silently (was debug-only logging)
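
The try/finally guarantee for undo can be sketched as a wrapper. The
wrapper name and callback shapes are illustrative; only the idea that
invalidation runs even on partial failure comes from this change.

```typescript
// Hypothetical wrapper: run an operation and always invalidate caches
// afterwards, whether the operation returns or throws.
function withCacheInvalidation<T>(
  operation: () => T,
  invalidateAllCaches: () => void,
): T {
  try {
    return operation();
  } finally {
    invalidateAllCaches();
  }
}

// Usage: count invalidations across a success and a failure.
let invalidations = 0;
const bump = (): void => { invalidations++; };

const result = withCacheInvalidation(() => "undone", bump);

let failureMessage = "";
try {
  withCacheInvalidation(() => { throw new Error("git reset failed"); }, bump);
} catch (err) {
  failureMessage = err instanceof Error ? err.message : String(err);
}
```

The error still propagates to the caller; the finally block only makes
sure stale path, parse, and state caches do not outlive a failed undo.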
* chore: add PR template and bug report issue template
Standardize PR descriptions and bug reports with structured templates
to improve consistency across contributors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: simplify PR template — replace milestone/slice with target branch
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: rename section to 'Release context'
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>