LLMs frequently write depends:[S01-S04] as natural shorthand.
The parser split only on commas, so this produced a single literal
element "S01-S04" that never matched any real slice ID —
permanently blocking the slice with "No slice eligible".
Changes:
roadmap-slices.ts:
- Add expandDependencies() helper — after comma-split, detect dep
tokens matching /^PrefixN(-|..)PrefixM$/ and expand to individual
IDs. Handles S01-S04 (dash range) and S01..S04 (dot range).
Zero-padding preserved. Mismatched prefixes and reversed ranges
pass through unchanged.
- Wire into parseRoadmapSlices() after the comma-split step.
- Export for direct testing.
doctor.ts:
- Add "unresolvable_dependency" warning code.
- In the slice audit loop, check each dep against the set of known
slice IDs in the roadmap and fire a warning with the bad dep name
and the correct format hint. This catches leftover range IDs in
roadmaps written before this fix, as well as typos.
plan-milestone.md prompt:
- Add explicit rule: use comma-separated depends:[S01,S02,S03], never
range syntax. Defense-in-depth so LLMs don't generate the problem.
Tests:
- roadmap-slices.test.ts: 10 new expandDependencies cases + 2
parseRoadmapSlices integration cases (range + comma round-trip).
- doctor.test.ts: unresolvable_dependency fires for unknown dep S99,
does not fire for valid S01 dep.
952/952 unit tests pass.
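The range expansion described above can be sketched as follows. This is a minimal illustration and not the code from roadmap-slices.ts: the regular expression and the function signature are assumptions, and only the behavior (dash and dot ranges, preserved zero-padding, pass-through for mismatched prefixes and reversed ranges) comes from the description.

```typescript
// Hypothetical sketch of expandDependencies(); the real helper may differ.
// Matches tokens such as "S01-S04" or "S01..S04".
const DEP_RANGE = /^([A-Za-z]+)(\d+)(?:-|\.\.)([A-Za-z]+)(\d+)$/;

function expandDependencies(tokens: string[]): string[] {
  const expanded: string[] = [];
  for (const raw of tokens) {
    const token = raw.trim();
    const match = DEP_RANGE.exec(token);
    if (!match) {
      expanded.push(token); // plain ID, keep as-is
      continue;
    }
    const startPrefix = match[1];
    const startDigits = match[2];
    const endPrefix = match[3];
    const endDigits = match[4];
    const start = Number(startDigits);
    const end = Number(endDigits);
    // Mismatched prefixes and reversed ranges pass through unchanged.
    if (startPrefix !== endPrefix || start > end) {
      expanded.push(token);
      continue;
    }
    // Preserve zero-padding by reusing the width of the first number.
    const width = startDigits.length;
    for (let n = start; n <= end; n++) {
      expanded.push(startPrefix + String(n).padStart(width, "0"));
    }
  }
  return expanded;
}
```

With this sketch, `expandDependencies(["S01-S04", "S07"])` yields `["S01", "S02", "S03", "S04", "S07"]`, while `["S04-S01"]` and `["S01-T03"]` come back unchanged and would then be reported by the doctor check.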
Closes #737
verifyExpectedArtifact() for plan-slice units only checked whether the
plan file existed on disk, not whether it contained actual task entries.
When a plan file was created as an empty scaffold during discussion/context
(headings but no tasks), the artifact check considered it 'complete' and
skipped the dispatch. Since deriveState still returned phase:'planning'
(no tasks found), this created an infinite skip loop until auto-mode
exhausted its retry budget and stopped silently.
Added a content check that requires at least one task entry matching
the pattern '- [ ] **T##:' or '- [x] **T##:' before considering a
plan-slice artifact valid. This mirrors the existing content-aware
check used for execute-task (which verifies checkbox state).
Added 3 regression tests covering empty scaffold, valid tasks, and
completed tasks.
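A content check of this kind could look like the sketch below. The function name and the exact regular expression are assumptions; the source only states that a line matching '- [ ] **T##:' or '- [x] **T##:' must be present.

```typescript
// Hypothetical sketch: a plan-slice artifact counts as valid only if it
// contains at least one task entry such as "- [ ] **T01: ..." or
// "- [x] **T01: ...". Headings alone are not enough.
const TASK_ENTRY = /^\s*- \[[ x]\] \*\*T\d+:/m;

function planHasTaskEntries(planContent: string): boolean {
  return TASK_ENTRY.test(planContent);
}
```

An empty scaffold such as `"# S01 Plan\n\n## Tasks\n"` returns false, so the unit would still be dispatched instead of skipped.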
Two additional layers to address #733 (background command hang), plus
matching test updates:
1. Stalled-tool detection in idle watchdog (auto.ts)
- Change inFlightTools from Set<string> to Map<string, number> to
track per-tool start timestamps
- Idle watchdog now compares the oldest in-flight tool's age to the
idle timeout. Tools in-flight for < idleTimeoutMs continue to
suppress recovery as before. Tools running >= idleTimeoutMs are
treated as stuck and recovery proceeds — preventing infinite hang
when the bash rewrite is bypassed or a tool hangs for other reasons.
- Export getOldestInFlightToolAgeMs() for testability
2. Prompt guidance in execute-task.md
- Add explicit "Background process rule" to step 5 explaining why
bare `command &` hangs the Bash tool and showing the correct
`command > /dev/null 2>&1 &` pattern
- Recommend the bg_shell tool as the preferred approach
3. Test updates (in-flight-tool-tracking.test.ts)
- Import and verify getOldestInFlightToolAgeMs export
- Update header comment to reflect Map-with-timestamps design
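The stalled-tool check in item 1 can be sketched as below. Only the names inFlightTools, getOldestInFlightToolAgeMs and idleTimeoutMs come from the description; the helper functions, the injectable `now` parameter, and returning 0 when nothing is in flight are assumptions made for illustration.

```typescript
// Hypothetical sketch of per-tool start timestamps (tool call ID -> start ms).
const inFlightTools = new Map<string, number>();

function markToolStart(toolCallId: string, now: number = Date.now()): void {
  inFlightTools.set(toolCallId, now);
}

function markToolEnd(toolCallId: string): void {
  inFlightTools.delete(toolCallId);
}

// Age of the longest-running in-flight tool; 0 when none are in flight.
function getOldestInFlightToolAgeMs(now: number = Date.now()): number {
  let oldestStart = Infinity;
  inFlightTools.forEach((startedAt) => {
    if (startedAt < oldestStart) oldestStart = startedAt;
  });
  return oldestStart === Infinity ? 0 : now - oldestStart;
}

// In-flight tools suppress idle recovery only while the oldest one is
// younger than the idle timeout; at or beyond it, the tool is treated as stuck.
function shouldSuppressRecovery(idleTimeoutMs: number, now: number = Date.now()): boolean {
  return inFlightTools.size > 0 && getOldestInFlightToolAgeMs(now) < idleTimeoutMs;
}
```

Keying the decision on the oldest tool means one hung call cannot be masked by newer, healthy calls that started after it.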
When deriveState() keeps returning the same already-completed unit,
the idempotency skip paths in dispatchNextUnit recursively call
themselves forever. The existing MAX_SKIP_DEPTH (20) breaker yields
to the UI but then re-enters the same loop; the hard lifetime counter
(unitLifetimeDispatches) is never reached because skip paths return
before touching it.
Root cause: no per-unit counter on the skip-only path.
Fix:
- Add unitConsecutiveSkips map + MAX_CONSECUTIVE_SKIPS = 3
- Both skip paths (completedKeySet hit, and fallback artifact-exists)
increment the counter on each skip of the same idempotencyKey
- When the counter exceeds MAX_CONSECUTIVE_SKIPS, evict the key from
completedKeySet and persisted storage, invalidate state, and let
deriveState reconcile on the next real dispatch
- Counter resets to 0 for a given key whenever a real dispatch
proceeds (i.e., past both skip paths)
- Counter fully cleared at all 4 existing clear sites (stopAuto,
startAuto, crash recovery, pause/resume)
Export _getUnitConsecutiveSkips / _resetUnitConsecutiveSkips /
MAX_CONSECUTIVE_SKIPS for testability (same pattern as
doctor-proactive.ts resetProactiveHealing).
Tests: auto-skip-loop.test.ts — counter mechanics, threshold bounds,
eviction round-trip, per-key isolation (10 assertions).
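The counter mechanics can be sketched as follows. unitConsecutiveSkips and MAX_CONSECUTIVE_SKIPS are named in the fix; recordSkip and recordRealDispatch are hypothetical helpers, and the eviction from completedKeySet and persisted storage is left to the caller.

```typescript
// Hypothetical sketch of the per-unit consecutive-skip breaker.
const MAX_CONSECUTIVE_SKIPS = 3;
const unitConsecutiveSkips = new Map<string, number>();

// Called from either skip path. Returns true when the caller should evict
// the key from the completed set and let state be re-derived.
function recordSkip(idempotencyKey: string): boolean {
  const count = (unitConsecutiveSkips.get(idempotencyKey) ?? 0) + 1;
  if (count > MAX_CONSECUTIVE_SKIPS) {
    unitConsecutiveSkips.delete(idempotencyKey);
    return true;
  }
  unitConsecutiveSkips.set(idempotencyKey, count);
  return false;
}

// Called once a dispatch gets past both skip paths: the streak is over.
function recordRealDispatch(idempotencyKey: string): void {
  unitConsecutiveSkips.delete(idempotencyKey);
}
```

In this sketch the fourth consecutive skip of the same key triggers eviction, and counts for different keys never affect each other.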
Closes #728
- Remove --verbose flag from headless (use --json for detailed output)
- Remove redundant sawToolExecution state variable
- Remove unused rejectCompletion
- Add missing build*Prompt imports in auto.ts (fixes CI typecheck:extensions)
- Document headless mode in README.md and docs/commands.md
- Simplify help text with examples instead of exhaustive command catalog
Replace --step flag with positional command routing so any /gsd
subcommand can run headlessly. Add /gsd dispatch <phase> for direct
unit-type dispatch (research, plan, execute, complete, reassess, uat,
replan) with state-aware resolution.
Quick commands (status, queue, doctor, etc.) resolve on first agent_end.
Long-running commands (auto, next, dispatch) use idle timer + terminal
notification detection.
End-to-end test that validates the headless CLI subcommand by:
- Creating a temp dir with a complete .gsd/ project fixture
- Spawning `node dist/loader.js headless --step --json`
- Validating exit code, JSONL stdout, stderr progress, and artifact
Supports --dry-run for fixture validation without running the agent.
Adds a first-class `gsd headless` command that runs auto-mode without a
TUI by spawning a child process in RPC mode via RpcClient. Useful for
CI/CD pipelines, scripts, and unattended execution.
CLI interface:
gsd headless - Run auto-mode until complete
gsd headless --step - Run one unit only (sends /gsd next)
gsd headless --timeout 300000 - Custom timeout (default 5 min)
gsd headless --json - Forward RPC events as JSONL to stdout
gsd headless --verbose - Show full agent text and tool results
gsd headless --model <id> - Override model
Exit codes: 0 = complete, 1 = error/timeout, 2 = blocked
Features:
- Extension UI auto-responder (handles select, confirm, input, editor,
notify, setStatus, setWidget, setTitle, set_editor_text)
- Completion detection via terminal notification keywords + idle timeout
- Human-readable progress output to stderr
- SIGINT/SIGTERM forwarding for clean shutdown
- Child process crash detection
- Completion summary with diagnostics on failure
PID 1 (init) exists on Unix but not on Windows, causing the
cross-process detection test to fail in CI. Use process.ppid
(parent process) which is guaranteed alive on all platforms.
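For context, a liveness probe of the kind such a test exercises is commonly written with signal 0 in Node.js. The helper below is a hypothetical sketch, not the project's implementation.

```typescript
// Hypothetical sketch: probe whether a PID refers to a live process.
function isProcessAlive(pid: number): boolean {
  try {
    // Signal 0 performs the existence and permission checks without
    // delivering a signal to the target process.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we are not allowed to signal it.
    return (err as { code?: string }).code === "EPERM";
  }
}
```

A test can then assert `isProcessAlive(process.ppid)` rather than hard-coding PID 1, which does not exist on Windows.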
When auto-mode runs in an auto-worktree, activity logs are written to
`.gsd/worktrees/<MID>/.gsd/activity/` while forensics only scanned
`.gsd/activity/` at the project root. This caused forensics to report
stale failures from the root while the worktree had already produced
the correct artifacts and advanced to execution.
Changes:
forensics.ts:
- scanActivityLogs() now accepts activeMilestone and scans both the
worktree activity dir (if an auto-worktree exists) and the root dir
- Results are merged and sorted by mtime so the most recent traces
from either source appear first
- detectMissingArtifacts() checks both root and worktree paths before
reporting a missing artifact, preventing false positives
- ForensicReport now includes activeWorktree field for visibility
- Saved report and prompt output include worktree context
session-forensics.ts:
- getDeepDiagnostic() now checks the worktree activity dir first by
reading the active milestone ID from STATE.md (synchronous, no
async deriveState dependency)
- Falls back to root activity dir when no worktree is found
- Added readActiveMilestoneId() helper for sync milestone detection
Closes #724
Three bugs caused /gsd status to show "No unit running" while auto mode
was actively executing in another terminal:
1. auto.lock was only written during unit dispatch (after newSession()),
not at auto-mode startup or resume. Any cross-process check between
startup and first dispatch would find no lock file.
2. The dashboard read only the in-memory `active` flag, which is always
false in a different process. It never checked auto.lock for
cross-process detection.
3. The triage dispatch path wrote the lock to `basePath` (worktree)
instead of `lockBase()` (project root), making it invisible to
other terminals checking the project root.
Changes:
- Write initial auto.lock immediately in startAuto() and on resume
- Add cross-process detection in getAutoDashboardData() via auto.lock
- Add remoteSession field to AutoDashboardData for cross-process info
- Update dashboard overlay to show remote session status and unit info
- Fix triage dispatch to use lockBase() instead of basePath
- Add 11 tests covering lock creation, cross-process detection, and
stale lock handling
Full-page screenshots were being squished into a 1568x1568 square,
making tall pages unreadable. Now caps width at 1568px and height
at 8000px independently, preserving readability for long pages.
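One way to read the independent caps is as an aspect-preserving fit into a 1568x8000 box. The sketch below assumes that reading; the function name and the rounding are illustrative only.

```typescript
// Hypothetical sketch: limit width and height separately instead of
// forcing the image into a square.
const MAX_WIDTH = 1568;
const MAX_HEIGHT = 8000;

function fitScreenshot(width: number, height: number): { width: number; height: number } {
  // Never upscale; shrink only as far as the tighter of the two caps requires.
  const scale = Math.min(1, MAX_WIDTH / width, MAX_HEIGHT / height);
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}
```

Under this assumption a 3136x6000 capture becomes 1568x3000, and a 1568x6000 capture is left untouched.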
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix Windows MCP test failures: use pathToFileURL() instead of bare
join() paths for dynamic imports, fixing ERR_UNSUPPORTED_ESM_URL_SCHEME
on Windows where D:\ paths are not valid ESM import specifiers
- Remove parallel orchestration code that was WIP from another feature
branch and not part of the VS Code extension scope (commands.ts,
preferences.ts, types.ts changes reverted to main)
- Rebase cleanly onto main, resolving mcp-server.ts merge conflict by
keeping main's dynamic import approach with PR's exported interface
and JSDoc documentation
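The Windows import fix in the first item follows a standard Node.js pattern, sketched below. The helper names are hypothetical; pathToFileURL() from node:url is the call named in the fix.

```typescript
import { join } from "node:path";
import { pathToFileURL } from "node:url";

// Hypothetical sketch: turn filesystem path segments into a file:// URL
// string that is a valid ESM import specifier on every platform.
function toImportSpecifier(...segments: string[]): string {
  return pathToFileURL(join(...segments)).href;
}

// A bare "D:\\..." path passed to import() is rejected on Windows with
// ERR_UNSUPPORTED_ESM_URL_SCHEME; the file:// form is accepted.
async function importFromPath(...segments: string[]): Promise<unknown> {
  return import(toImportSpecifier(...segments));
}
```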
Captures classified as inject, replan, or quick-task were marked
"resolved" in CAPTURES.md but their resolution actions were never
executed — tasks were never injected into plans, replan triggers
were never written, and quick-tasks were never dispatched.
This wires up the existing resolution executor functions that were
defined but never called:
- After triage-captures unit completes, executeTriageResolutions()
reads actionable captures and executes their resolutions:
- inject: calls executeInject() to add tasks to the slice plan
- replan: calls executeReplan() to write REPLAN-TRIGGER.md
- quick-task: queues for dispatch as a new unit type
- Quick-task dispatch block dispatches queued captures one at a time
using buildQuickTaskPrompt(), with proper session/timeout handling
- New markCaptureExecuted() and loadActionableCaptures() functions
track execution state, preventing double-execution on retries
- Quick-task unit type excluded from post-unit hooks (lightweight
one-offs don't need hook chains)
Closes #701
Add a new `gsd sessions` subcommand that lists all saved sessions for
the current directory and lets the user interactively pick one to resume.
Currently `gsd --continue` only resumes the most recent session, with no
way to access older conversations. This change adds:
- `gsd sessions` subcommand that calls SessionManager.list() to enumerate
all sessions for the current working directory
- Interactive numbered list showing date, message count, session name (if
set), and a preview of the first message
- Selection by number to resume any past session via SessionManager.open()
- Subcommand help text (`gsd sessions --help`)
- Help text entry in the main `gsd --help` output
The implementation uses only existing SessionManager APIs (list, open) -
no SDK changes required.
buildExecuteTaskPrompt() was missing the verificationBudget variable
that the execute-task.md template expects. The prompt-loader's strict
placeholder validator threw on every auto-mode task dispatch, blocking
all execution entirely.
Compute the budget from the executor's context window using the existing
computeBudgets() engine and pass it as a "~NNK chars" formatted string.
Fixes #707
Previously, running `/gsd cleanup` without a subcommand (branches or
snapshots) fell through to the unknown command handler, producing a
warning. Now bare `/gsd cleanup` runs both branch and snapshot cleanup.
The execute-task, plan-slice, and research-slice prompts all include a
passive instruction to 'use GSD Skill Preferences to decide which skills
to load.' In practice, auto-mode agents never act on this — across 30+
execution units in a real milestone, zero skill files were read.
The root cause is that the passive wording ('use it to decide') gets
overridden by the stronger 'don't re-research, just build what the plan
says' directive in execute-task. The agent treats skill loading as
optional and skips it 100% of the time.
This change rewrites the skill instruction in all three prompts from
passive guidance to an explicit action:
- execute-task: 'read its SKILL.md file now — before writing any code'
- plan-slice: 'read any skill files relevant to this slice's technology
stack before decomposing'
- research-slice: 'read any skill files relevant to this slice's
technology stack before exploring code'
The execute-task change also points agents to both the GSD Skill
Preferences block AND the <available_skills> catalog, since both are
present in the system prompt but the old instruction only referenced
the preferences block.
The plan-slice change adds guidance to note relevant skills in task
plans, so executors know which skills to load without rediscovering
them.
Scans activity logs, metrics, crash locks, and doctor diagnostics for
anomalies, generates a structured forensic report, saves it locally,
and hands it to the LLM for interactive root-cause analysis with
optional GitHub issue creation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
getAutoWorktreePath() only checked existsSync() on the worktree
directory, treating any directory under .gsd/worktrees/<MID>/ as a
valid auto-worktree. A stray (non-git) directory would be accepted,
causing auto-mode to derive state from an empty/invalid path and
conclude no milestones exist.
Add git worktree validation to both getAutoWorktreePath() and
enterAutoWorktree(): check that the directory contains a .git file
(not directory) with a 'gitdir:' pointer, which is the hallmark of
a real git worktree checkout. Return null / throw if validation fails.
This ensures stray directories are ignored and auto-mode falls through
to normal worktree creation or root-state derivation.
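The validation described above can be sketched like this. The function name is hypothetical; the rule itself (a .git file, not a directory, whose content starts with 'gitdir:') is the one stated in the description.

```typescript
import { existsSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

// Hypothetical sketch: decide whether a directory is a linked git worktree.
function isGitWorktreeDir(dir: string): boolean {
  const gitPath = join(dir, ".git");
  if (!existsSync(gitPath)) return false;
  // A linked worktree has a .git file; a regular clone has a .git directory.
  if (!statSync(gitPath).isFile()) return false;
  // The file holds a pointer line such as "gitdir: <path>".
  return readFileSync(gitPath, "utf8").trim().startsWith("gitdir:");
}
```

A stray directory with no .git entry fails the first check, so callers can return null or throw instead of deriving state from it.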
Closes #695
The mcporter extension only discovered servers that the mcporter CLI
itself knew about (via .vscode/mcp.json, Claude Desktop config, etc.).
Servers configured in the standard .mcp.json at the project root —
used by Claude Code, Cursor, and other AI coding tools — were invisible.
Changes:
1. mcporter extension (index.ts):
- Add readProjectMcpJson() that reads .mcp.json from cwd and returns
servers not already discovered by mcporter
- Merge .mcp.json servers into getServerList() results
- Add getMcpJsonServerUrl() to resolve HTTP URLs for .mcp.json servers
- Update getServerDetail() to pass HTTP URLs directly to mcporter
for servers only known via .mcp.json
- Update mcp_call to use HTTP URL as server reference for .mcp.json
servers
2. discover_configs scanner (scanners.ts):
- Add .mcp.json to the project-level MCP config scan path alongside
.claude/.mcp.json and .claude/mcp.json
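The merge step in readProjectMcpJson() can be illustrated with a pure function. This is a sketch under assumptions: it takes the file contents as a string, assumes servers live under a top-level "mcpServers" object, and uses a hypothetical function name.

```typescript
// Hypothetical sketch: list servers declared in .mcp.json that the
// mcporter CLI has not already discovered.
function serversOnlyInMcpJson(mcpJsonText: string, alreadyDiscovered: string[]): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(mcpJsonText);
  } catch (err) {
    return []; // unreadable config: contribute nothing
  }
  if (typeof parsed !== "object" || parsed === null) return [];
  // Assumption: servers are keyed by name under "mcpServers".
  const servers = (parsed as { mcpServers?: unknown }).mcpServers;
  if (typeof servers !== "object" || servers === null) return [];
  const known = new Set(alreadyDiscovered);
  return Object.keys(servers).filter((name) => !known.has(name));
}
```

Filtering against the already-discovered names keeps a server from appearing twice when it is configured in both places.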
Closes #692
Use execFileSync with argument arrays instead of execSync with string
interpolation to prevent shell injection via sinceDays parameter.
Validate sinceDays as a positive integer. Replace string-based path
resolution in file-watcher with path.relative() to prevent traversal
via symlinks or .. segments.
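The injection fix can be sketched as below. The git log invocation and both function names are hypothetical, since the message does not say which command consumes sinceDays; the technique shown is the one described: validate the value, then pass arguments as an array so no shell parses them.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical sketch: accept only positive integers for sinceDays.
function parseSinceDays(value: unknown): number {
  const n = typeof value === "string" ? Number(value) : value;
  if (typeof n !== "number" || !Number.isInteger(n) || n <= 0) {
    throw new Error("sinceDays must be a positive integer");
  }
  return n;
}

// Each array element reaches the child process as a single argv entry,
// so shell metacharacters in the input are never interpreted.
function recentCommitSubjects(cwd: string, sinceDays: unknown): string[] {
  const days = parseSinceDays(sinceDays);
  const output = execFileSync(
    "git",
    ["log", `--since=${days} days ago`, "--pretty=format:%s"],
    { cwd, encoding: "utf8" },
  );
  return output.split("\n").filter((line) => line.length > 0);
}
```

An input such as `"7; rm -rf /"` is rejected by parseSinceDays() before any process is spawned.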
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
YAML frontmatter parsers can return Date objects for ISO date strings
instead of plain strings. This caused a TypeError when calling
.localeCompare() on a Date object in the changelog sort.
Wrap completedAt with String() at both assignment and sort to handle
both native and JS parser paths safely.
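The failure mode and a defensive sort can be sketched as follows. The entry shape and the newest-first order are assumptions. The sketch also swaps in toISOString() for Date values, a variation on the plain String() wrap, because String() on a Date does not produce an ISO-formatted string and would not sort consistently against ISO strings.

```typescript
// Hypothetical entry shape: completedAt may arrive as a string or a Date,
// depending on which YAML parser produced the frontmatter.
interface ChangelogEntry {
  title: string;
  completedAt: unknown;
}

// Coerce to a string so .localeCompare() is always available.
function toSortableString(value: unknown): string {
  return value instanceof Date ? value.toISOString() : String(value);
}

// Newest first (assumed order); returns a new array.
function sortByCompletedAt(entries: ChangelogEntry[]): ChangelogEntry[] {
  return entries
    .slice()
    .sort((a, b) => toSortableString(b.completedAt).localeCompare(toSortableString(a.completedAt)));
}
```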
- Skip E2E --print test when no API key is configured (process hangs
waiting for onboarding wizard input in non-TTY CI environments)
- Skip file-watcher extensions subdirectory test on Windows (chokidar
subdirectory event delivery is unreliable in Windows CI runners)
Warp terminal (both macOS and Windows) does not emit recognized escape
sequences for Ctrl+Alt key combos. This adds Warp to the unsupported
terminals list so users see the /gsd status fallback hint.
Closes #643
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Increase file-watcher extension directory test delay to 1500ms with
500ms settle time (Windows filesystem events are slower)
- Make E2E --print test more permissive on exit code 1: check for
unhandled crash indicators instead of specific error messages
(error text varies by CI environment)