`remote-questions-config.ts` was extracted in #592 to avoid crossing
the compiled/uncompiled boundary. However, it still imported
`getGlobalGSDPreferencesPath` from `preferences.ts` via a `.js`
extension — which fails at runtime because `preferences.ts` is
loaded via jiti and never compiled to `.js` in dist/.
This caused remote questions setup (Telegram/Slack/Discord) to fail
during `gsd config` with:
Cannot find module '.../preferences.js' imported from
.../remote-questions-config.js
Fix: inline the path constant directly. It's a single `join()` call
with no logic, so duplicating it is cleaner than adding a build step
or creating a separate compiled module just for this one export.
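For illustration, the inlined constant might look roughly like this. It is a minimal sketch: the constant name is hypothetical, and the `~/.gsd/preferences.md` location is assumed from the test notes elsewhere in this log rather than copied from the actual change.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical name. Duplicates the single join() call locally so this
// compiled module no longer imports from preferences.ts, which is loaded
// via jiti and has no .js counterpart in dist/.
// Assumption: the global preferences file lives at ~/.gsd/preferences.md.
const GLOBAL_PREFERENCES_PATH = join(homedir(), ".gsd", "preferences.md");
```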
* feat(S01/T01): Scaffolded the `studio` Electron workspace with a workin…
- package.json
- studio/package.json
- studio/electron.vite.config.ts
- studio/src/main/index.ts
- studio/src/preload/index.ts
- studio/src/renderer/src/styles/index.css
- studio/src/renderer/src/App.tsx
* chore: init gsd
* fix(ci): add safe.directory for containerized pipeline job
The Dev Publish job runs inside a Docker container where the checkout
user differs from the container user (root), causing git's dubious
ownership check to reject git operations in version-stamp.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): remove .gsd/.gitignore from tracking
The no-gsd-dir CI check fails when .gsd/ exists as a directory, even
if only .gitignore is tracked inside it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add park/discard actions for in-progress milestones
Users could not discard, park, or skip milestones once work had begun.
The wizard only offered "Go auto" and "View status" for milestones with
a roadmap, trapping users with stale or deprioritized milestones.
This adds:
- Park mechanism: PARKED.md marker file in milestone directory.
deriveState() transparently skips parked milestones when finding the
active one. Parked milestones do NOT satisfy depends_on for downstream
milestones, preventing accidental unblocking.
- "Milestone actions" submenu in all four active-milestone wizard
branches (roadmap-exists, planning, summarizing, executing). Offers
Park / Discard / Skip / Back with clean navigation.
- /gsd park [id] and /gsd unpark [id] CLI subcommands for direct access.
- New module milestone-actions.ts with parkMilestone(), unparkMilestone(),
discardMilestone(), isParked(), getParkedReason() — keeps guided-flow
and commands thin.
- 14 tests (36 assertions) covering state derivation, dependency
semantics, park/unpark round-trip, discard with queue-order pruning,
and edge cases (all-parked, no-roadmap park, progress counts).
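The marker-file mechanism can be sketched as below. This is a simplified illustration, not the code in milestone-actions.ts: the function names come from the list above, but the signatures (taking a milestone directory directly) and the plain-text format of PARKED.md are assumptions.

```typescript
import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const MARKER = "PARKED.md";

// A milestone counts as parked when the marker file exists in its directory.
function isParked(milestoneDir: string): boolean {
  return existsSync(join(milestoneDir, MARKER));
}

// Assumption: the reason is stored as the plain-text body of the marker.
function parkMilestone(milestoneDir: string, reason: string): void {
  writeFileSync(join(milestoneDir, MARKER), `${reason}\n`, "utf-8");
}

function unparkMilestone(milestoneDir: string): void {
  rmSync(join(milestoneDir, MARKER), { force: true });
}

function getParkedReason(milestoneDir: string): string | null {
  if (!isParked(milestoneDir)) return null;
  return readFileSync(join(milestoneDir, MARKER), "utf-8").trim();
}

// Usage: park/unpark round-trip in a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "milestone-"));
parkMilestone(demoDir, "deprioritized");
const parkedAfterPark = isParked(demoDir);
const storedReason = getParkedReason(demoDir);
unparkMilestone(demoDir);
const parkedAfterUnpark = isParked(demoDir);
rmSync(demoDir, { recursive: true, force: true });
```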
Files changed:
types.ts — Add 'parked' to MilestoneRegistryEntry.status
milestone-actions.ts — NEW: park/unpark/discard core logic
state.ts — Skip parked in getActiveMilestoneId + deriveState
guided-flow.ts — Milestone actions submenu in 4 wizard branches
commands.ts — /gsd park and /gsd unpark subcommands + help
guided-flow-queue.ts — Parked count in queue summary
visualizer-data.ts — Add 'parked' to VisualizerMilestone.status
park-milestone.test.ts — NEW: comprehensive test suite
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add edge case tests for park/discard milestone interactions
Covers 9 critical scenarios (31 assertions):
- Discard breaks depends_on chain → system correctly blocks
- Park blocks depends_on chain
- Queue order survives discards (QUEUE-ORDER.json pruned)
- Park all + discard all → clean pre-planning state
- Mixed states coexist (complete + parked + active + pending)
- Park then discard same milestone
- Discard milestone that has deps on others
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address critical review findings for park/discard feature
Fixes 7 issues found by adversarial code review:
1. CRITICAL: auto-mode crashed with "Unexpected: N incomplete" error when
all milestones were parked. Filter now excludes 'parked' status, and
pre-planning phase is recognized as a valid stop condition.
2. Merge-to-main was skipped when parked milestones existed — same
incomplete filter now excludes parked.
3. Completed milestones could be parked, corrupting depends_on
satisfaction. parkMilestone() now guards against SUMMARY.md existence.
4. Escape during park reason picker silently parked with literal
"not_yet" as reason. Now properly cancels the operation.
5. Parked milestones lost their human-readable title in registry
(showed ID instead). Phase 1 now caches roadmap for parked
milestones too, for title extraction.
6. GSD_MILESTONE_LOCK bypassed parked check — parallel workers locked
to a parked milestone now correctly return null.
7. Parked milestones were eligible for parallel execution, wasting
worker slots. parallel-eligibility now skips parked milestones.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: complete parked status display across all surfaces
- Visualizer: parked milestones show pause glyph (yellow) instead of
pending dot
- Doctor: parked milestones show pause emoji in registry report
- HTML export: add .dot-parked CSS (yellow), parked legend entry,
collapse parked milestone details by default
- Queue reorder: exclude parked milestones from movable list
Closes all remaining cosmetic findings from adversarial review.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add version stamp script for dev publishes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add CLI smoke tests for pipeline test stage
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add FixtureProvider for LLM conversation recording and replay
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add fixture test runner and sample recordings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add live test stubs and pipeline npm scripts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add three-stage promotion pipeline workflow
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add weekly cleanup workflow for stale dev versions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(ci): add fixture recording helper stub
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.gsd/ contains per-worktree project state (milestones, db, decisions)
that should never be committed. Auto-commits were leaking these files
into the repo, causing the no-gsd-dir CI check to fail.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- token-profile.test.ts: read preferences-types, preferences-models, and
preferences-validation alongside preferences.ts for structural checks
- triage-dispatch.test.ts: search auto-post-unit.ts for triage/dispatch
markers that moved during extraction, update comment markers to match
actual code
- none-mode-gates.test.ts: skip "no prefs default" test when global
preferences file exists (cannot control ~/.gsd/preferences.md)
- preferences.test.ts: skip getIsolationMode default test (same reason)
Reduces test failures from 48 to 3 (all pre-existing: doctor-git,
worktree-e2e, stopAutoRemote).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: strip model variant suffix for all auth methods, not just OAuth (#1097)
The model ID variant suffix (e.g., `[1m]` in `claude-opus-4-6[1m]`) was
only stripped for OAuth token auth. When using an API key, the suffix was
sent to the Anthropic API as-is, causing a 400 "upstream_error" because
`claude-opus-4-6[1m]` is not a valid API model ID.
The default Anthropic model is `claude-opus-4-6[1m]` (1M context variant),
so every API key user hits this on every request.
Fix: strip `[...]` suffix unconditionally for all auth methods.
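A minimal sketch of the unconditional strip, assuming the variant suffix is always a trailing bracketed group. The helper name is hypothetical.

```typescript
// Remove a trailing "[...]" variant suffix from a model ID before it is
// sent to the API, regardless of which auth method is in use.
function stripVariantSuffix(modelId: string): string {
  return modelId.replace(/\[[^\]]*\]$/, "");
}
```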
* fix: update source-reading tests for post-refactor file locations
triage-dispatch.test.ts: read auto-post-unit.ts (dispatch logic moved
from auto.ts) and update comment string matches to reflect renamed
section headers.
token-profile.test.ts: read preferences-types.ts, preferences-validation.ts,
and preferences-models.ts (GSDPreferences interface and validation logic
split from preferences.ts).
* feat: cache-ordered prompt assembly and dashboard cache hit rate
Add prompt section reordering for better Anthropic cache hit rates.
Sections are classified as static/semi-static/dynamic and reordered
so stable content appears first in the prefix.
- prompt-ordering.ts: section extraction, classification, and
reordering by cache stability (static -> semi-static -> dynamic)
- auto.ts: wire reorderForCaching into dispatch with logged warnings
on failure (not silent catch)
- auto-dashboard.ts: show cache hit rate percentage in progress widget
- dashboard-overlay.ts: show aggregate cache hit rate in status overlay
- auto-prompts.ts: respect compression_strategy preference before
compressing carry-forward sections
Includes 12 tests for reorderForCaching and analyzeCacheEfficiency.
Split from #1083 per review feedback.
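The reordering idea can be sketched as a stable sort by cache stability. This is an illustration only: the `PromptSection` shape and `reorderSections` name are hypothetical, and the real reorderForCaching presumably also handles section extraction and classification.

```typescript
type Stability = "static" | "semi-static" | "dynamic";

interface PromptSection {
  title: string;
  stability: Stability;
}

// Lower rank means more stable, so it should sit earlier in the prefix.
const STABILITY_RANK: Record<Stability, number> = {
  static: 0,
  "semi-static": 1,
  dynamic: 2,
};

// Sort by stability; ties keep their original relative order.
function reorderSections(sections: PromptSection[]): PromptSection[] {
  return sections
    .map((section, index) => ({ section, index }))
    .sort(
      (a, b) =>
        STABILITY_RANK[a.section.stability] - STABILITY_RANK[b.section.stability] ||
        a.index - b.index,
    )
    .map(({ section }) => section);
}
```

Keeping the stable content first means the leading part of the prompt changes less often between requests, which is what prefix-based caching rewards.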
* feat: add comprehensive API key manager (/gsd keys)
Add /gsd keys command with 6 subcommands for full API key lifecycle
management: list, add, remove, test, rotate, and doctor.
- list/status: Dashboard grouped by category (LLM, search, tool, remote)
with masked key previews, OAuth expiry, env var source detection
- add: Interactive provider picker with OAuth vs API key choice,
prefix validation, and env var activation
- remove: Multi-key support with individual or bulk removal
- test: Lightweight API validation per provider with latency reporting
and error classification (401/429/5xx/timeout)
- rotate: Remove-and-replace flow with optional pre-save validation
- doctor: Health checks for expired OAuth, empty keys, duplicates,
env var conflicts, file permissions, missing LLM provider
Includes unified provider registry (22 providers), tab completions,
and redirect from /gsd setup keys. 44 unit tests.
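The error classification used by the test subcommand might be shaped like this. It is a sketch under assumptions: the outcome labels and function name are hypothetical, and a missing response is treated as a timeout.

```typescript
type KeyTestOutcome =
  | "ok"
  | "unauthorized"
  | "rate_limited"
  | "server_error"
  | "timeout"
  | "unknown";

// status is the HTTP status of the validation request,
// or null when no response arrived in time.
function classifyKeyTest(status: number | null): KeyTestOutcome {
  if (status === null) return "timeout";
  if (status >= 200 && status < 300) return "ok";
  if (status === 401) return "unauthorized";
  if (status === 429) return "rate_limited";
  if (status >= 500 && status <= 599) return "server_error";
  return "unknown";
}
```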
* fix: convert key-manager tests from vitest to node:test for CI typecheck
Extension tests use node:test + node:assert/strict (not vitest) since
tsconfig.extensions.json includes test files and vitest types are not
available in the CI typecheck step.
Extract three modules from the 1,348-line doctor.ts god file:
- doctor-types.ts: DoctorSeverity, DoctorIssueCode, DoctorIssue, DoctorReport, DoctorSummary
- doctor-format.ts: summarizeDoctorIssues, filterDoctorIssues, formatDoctorReport, formatDoctorIssuesForPrompt
- doctor-checks.ts: checkGitHealth, checkRuntimeHealth
All public exports are re-exported from doctor.ts so existing imports
from "./doctor.js" continue to work unchanged.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract milestone ID utilities (MILESTONE_ID_RE, generateMilestoneSuffix,
nextMilestoneId, extractMilestoneSeq, parseMilestoneId, milestoneIdSort,
maxMilestoneNum, findMilestoneIds) into milestone-ids.ts (~95 lines)
- Extract queue management (showQueue, handleQueueReorder, showQueueAdd,
buildExistingMilestonesContext) into guided-flow-queue.ts (~445 lines)
- Add re-exports from guided-flow.ts to preserve public API
- Fix circular dependency: queue-order.ts now imports milestoneIdSort
from milestone-ids.js instead of guided-flow.js
- guided-flow.ts reduced from 1611 to 1144 lines
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract types/interfaces/constants to preferences-types.ts (~200 lines),
validation logic to preferences-validation.ts (~490 lines), move skill
resolution into preferences-skills.ts (~160 lines), and model resolution
into preferences-models.ts (~270 lines). The retained preferences.ts
(~330 lines) handles loading, merging, rendering, hooks, and re-exports
all symbols so existing imports remain unmodified.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The high-risk card filter in buildBlockersSection only compared sliceId,
causing false positives when different milestones had slices with the
same ID (e.g. M001/S01 and M002/S01). Now matches on both milestoneId
and sliceId to correctly deduplicate.
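The corrected comparison can be sketched as follows. The `SliceRef` shape and helper names are hypothetical; the point is only that the match key includes the milestone as well as the slice.

```typescript
interface SliceRef {
  milestoneId: string;
  sliceId: string;
}

// Composite key: M001/S01 and M002/S01 must not collide.
const sliceKey = (ref: SliceRef): string => `${ref.milestoneId}/${ref.sliceId}`;

// Drop cards whose (milestoneId, sliceId) pair is already listed.
function excludeAlreadyListed(cards: SliceRef[], listed: SliceRef[]): SliceRef[] {
  const seen = new Set(listed.map(sliceKey));
  return cards.filter((card) => !seen.has(sliceKey(card)));
}
```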
* docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.
Added notice banner in README linking to the guide.
* docs: document /gsd config for global API keys
Added Global API Keys section to configuration.md explaining:
- /gsd config saves keys to ~/.gsd/agent/auth.json
- Keys apply to all projects automatically
- Three supported keys: Tavily, Brave, Context7
- How precedence works (env vars > saved keys)
- Anthropic models don't need search keys
Updated commands.md gsd config entry to link to the new section.
Added Set up API keys section to getting-started.md for first-run.
* fix: prevent summarizing phase stall by retrying dropped agent_end events (#1072)
When handleAgentEnd dispatches a sub-unit (via hooks, triage, or quick-task
early-dispatch paths) and that unit completes before handleAgentEnd returns,
the resulting agent_end event is silently dropped by the reentrancy guard.
This leaves auto-mode active but permanently stalled — no unit running, no
watchdog set, process at high CPU doing nothing.
Add a pendingAgentEndRetry flag to AutoSession that the reentrancy guard sets
when it drops an agent_end event. The finally block in handleAgentEnd checks
this flag and schedules a deferred retry via setImmediate, ensuring the
completed unit's agent_end is always processed.
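The retry flag can be modelled in a simplified, synchronous form as below. This is a sketch, not the AutoSession code: the factory name and the injectable scheduler are assumptions made so the behaviour is easy to exercise, and the real handler is asynchronous.

```typescript
type Scheduler = (fn: () => void) => void;

// Returns a handler that never silently drops a reentrant agent_end:
// the guard records the drop, and the finally block schedules a retry.
function createAgentEndHandler(
  processUnit: () => void,
  schedule: Scheduler = (fn) => {
    setImmediate(fn);
  },
): () => void {
  let handling = false;
  let pendingAgentEndRetry = false;

  function handleAgentEnd(): void {
    if (handling) {
      // Reentrancy guard: remember the event instead of losing it.
      pendingAgentEndRetry = true;
      return;
    }
    handling = true;
    try {
      processUnit();
    } finally {
      handling = false;
      if (pendingAgentEndRetry) {
        pendingAgentEndRetry = false;
        // Deferred retry, so the dropped event is processed on a fresh stack.
        schedule(handleAgentEnd);
      }
    }
  }

  return handleAgentEnd;
}
```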
* fix: add dispatch stall guards to prevent auto-mode pause after slice completion (#1073)
After a slice completes all tasks, auto-mode can stall if newSession()
hangs or dispatchNextUnit gets permanently blocked at any await point.
The existing gap watchdog only fires AFTER dispatchNextUnit returns, so
it cannot recover from hangs inside the function itself.
- Wrap newSession() with Promise.race timeout (30s) to prevent permanent
hangs from session manager deadlocks or network issues
- Add pre-dispatch hang guard (60s) in handleAgentEnd that starts the
gap watchdog if dispatchNextUnit hasn't completed — catches hangs at
any await point (model selection, session creation, etc.)
- Add better diagnostics: notify user when session creation times out
or fails, with specific unit type/ID for debugging
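The Promise.race timeout wrapper could look like this. It is a generic sketch: the `withTimeout` name and its parameters are hypothetical, and the 30s value would be passed in by the caller.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
// The timer is cleared once the race settles so it cannot keep the process alive.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Note that losing the race does not cancel the underlying work; it only lets the caller stop waiting and report the failure.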
* refactor: remove auto-draft-pause.test.ts — redundant with auto-dashboard.test.ts
auto-draft-pause.test.ts tested describeNextUnit() for needs-discussion,
pre-planning, and executing phases. All of these are already covered by
auto-dashboard.test.ts which has proper node:test structure.
The removed file also had fragile structural tests (string-matching source
code) that break on refactors. The behavioral coverage is complete in the
existing file.
1296 tests pass, 0 fail.
Addresses state safety issues found during #1062 deep dive:
1. completed-units.json writes in auto-worktree.ts and auto-worktree-sync.ts
used plain writeFileSync which could produce truncated/corrupt files on
crash, losing completion keys and causing unit re-dispatch. Switched to
atomicWriteSync (temp file + rename) for crash safety.
2. Plan file checkbox reconciliation in auto-worktree.ts also switched to
atomicWriteSync to prevent partial PLAN.md writes on crash.
3. db-writer.ts functions (saveDecisionToDb, updateRequirementInDb,
saveArtifactToDb) wrote markdown files via saveFile() without invalidating
caches afterward. Added targeted cache invalidation (state + path + parse)
so deriveState() always sees fresh data. Uses individual invalidation
functions rather than invalidateAllCaches() to avoid clearing the artifacts
table that was just written to.
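The temp-file-plus-rename pattern behind atomicWriteSync can be sketched as below. The project's own helper may differ in detail; the temp-file naming here is an assumption.

```typescript
import { mkdtempSync, readFileSync, readdirSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a sibling temp file, then rename over the target. Readers see
// either the old file or the complete new one, never a truncated write.
function atomicWriteSync(filePath: string, content: string): void {
  const tempPath = `${filePath}.${process.pid}.${Date.now()}.tmp`;
  writeFileSync(tempPath, content, "utf-8");
  renameSync(tempPath, filePath);
}

// Usage: overwrite a file twice in a throwaway directory.
const workDir = mkdtempSync(join(tmpdir(), "atomic-"));
const target = join(workDir, "completed-units.json");
atomicWriteSync(target, JSON.stringify(["unit-1"]));
atomicWriteSync(target, JSON.stringify(["unit-1", "unit-2"]));
const finalContent = readFileSync(target, "utf-8");
const leftoverFiles = readdirSync(workDir);
rmSync(workDir, { recursive: true, force: true });
```

The temp file is created next to the target so the rename stays on one filesystem.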
The verification gate's discoverCommands() was passing prose descriptions
from task plan Verify: fields through sanitizeCommand(), which only checked
for shell injection characters. English prose like "Document exists, contains
all 5 scale names..." passed the filter and was executed via spawnSync,
causing exit code 127 false negatives.
Added isLikelyCommand() heuristic that distinguishes executable commands
from prose descriptions by checking:
- Known command prefixes (npm, node, tsc, eslint, etc.)
- Path-like first tokens (./script.sh, /usr/bin/check)
- Flag-like tokens (-v, --check)
- Uppercase-initial words with 4+ tokens (prose pattern)
- Comma-space clause separators (prose pattern)
Prose Verify: fields now fall through to package.json scripts or "none"
instead of being executed. Valid commands continue to work as before.
Closes #1066
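A rough sketch of such a heuristic follows. It is not the shipped isLikelyCommand(): the prefix list is limited to the four names given above, and the order of checks and the conservative default for unrecognized input are assumptions.

```typescript
// Assumption: the real prefix list is longer than these four.
const KNOWN_COMMAND_PREFIXES = new Set(["npm", "node", "tsc", "eslint"]);

function isLikelyCommand(text: string): boolean {
  const trimmed = text.trim();
  if (trimmed === "") return false;
  const tokens = trimmed.split(/\s+/);
  const first = tokens[0];

  // Command signals: known prefix or path-like first token.
  if (KNOWN_COMMAND_PREFIXES.has(first)) return true;
  if (/^(\.{1,2}\/|\/)/.test(first)) return true;

  // Prose signals: comma-space clauses, or a capitalized opener with 4+ tokens.
  if (trimmed.includes(", ")) return false;
  if (/^[A-Z][a-z]/.test(first) && tokens.length >= 4) return false;

  // Command signal: any flag-like token such as -v or --check.
  if (tokens.some((token) => /^--?[A-Za-z]/.test(token))) return true;

  // Assumed default: treat anything unrecognized as prose.
  return false;
}
```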
When a model fails during auto-mode and the fallback chain is exhausted
(or absent), the error recovery path previously fell through to pause
without attempting to restore the session's original model. Meanwhile,
the fallback chain itself was read fresh from disk via
loadEffectiveGSDPreferences(), which could pick up models configured by
a different concurrent GSD session sharing the same global preferences
file.
This adds a session model recovery step between fallback exhaustion and
pause. After the existing fallback chain logic, we now check whether the
current model has diverged from the model captured at auto-mode start
(autoModeStartModel). If so, we restore the session model and retry
before giving up and pausing.
Changes:
- auto.ts: export getAutoModeStartModel() getter for the session's
captured start model
- index.ts: add session model recovery block after fallback chain
exhaustion, using the session-scoped model instead of re-reading
global preferences from disk
- model-isolation.test.ts: add 4 tests covering cross-session leakage
detection, divergence checks, and null safety
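The recovery decision can be sketched as a small pure function. This is an illustration only: models are simplified to string IDs, and the function and type names are hypothetical rather than taken from index.ts.

```typescript
type RecoveryAction = { kind: "restore"; model: string } | { kind: "pause" };

// Called once the fallback chain is exhausted (or absent).
// Restore only if the session captured a start model and the current
// model has diverged from it; otherwise fall through to pause.
function afterFallbackExhausted(
  currentModel: string | null,
  autoModeStartModel: string | null,
): RecoveryAction {
  if (autoModeStartModel !== null && currentModel !== autoModeStartModel) {
    return { kind: "restore", model: autoModeStartModel };
  }
  return { kind: "pause" };
}
```

Using the session-scoped start model, rather than re-reading preferences from disk, keeps a concurrent session's configuration from leaking into this one.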