Add 12 commands that exist in GSDv1 but had no v2 equivalent:
High priority:
- ship: Create PR from milestone artifacts (title, body, metrics)
- add-slice: Append slice to roadmap via engine updateRoadmap()
- insert-slice: Insert slice at position with reordering
- remove-slice: Remove pending slice (--force for planned slices)
- do: Natural language routing via keyword matching (30 routes)
- session-report: Session cost/tokens/work summary (--json, --save)
Medium priority:
- backlog: Structured backlog with 999.x numbering (add/promote/remove)
- pr-branch: Clean PR branch filtering .gsd/ commits via cherry-pick
- add-tests: LLM-dispatched test generation for completed slices
- map-codebase: Codebase analysis (tech/arch/quality/concerns)
All slice mutations go through the engine's updateRoadmap() command,
preserving the single-writer architecture. No direct markdown edits.
Includes 46 unit tests across 6 test files, 2 prompt templates, and
catalog entries with nested completions for all commands.
The workflow-logger per-unit buffer API (_resetLogs / drainAndSummarize /
formatForNotification) had zero callers outside tests, so accumulated
warnings never reached users as a consolidated post-unit alert and the
buffer leaked across units in the same Node process. Several state-layer
sites also silently swallowed errors that should have surfaced.
- auto/phases.ts: reset logger in runUnitPhase, drain + ctx.ui.notify in
runFinalize success path, drain in both finalize timeout branches so
timed-out unit logs don't bleed into the next iteration
- auto/detect-stuck.ts: enrich stuck reasons with summarizeLogs() so
recovery has the diagnostic context (read-only peek, no drain)
- auto.ts: call setLogBasePath(base) in startAuto to pin the audit log
on /clear resume and hot-reload paths that bypass dynamic-tools bootstrap
- workflow-manifest.ts: log snapshotState ROLLBACK failures (split-brain
signal) instead of silently swallowing them
- state.ts: log reconcileDiskToDb roadmap read failures instead of silent
continue
- workflow-projections.ts: log renderStateProjection DB handle probe
failures instead of silent return
New regression tests cover the phases.ts wiring (source-scan), setLogBasePath
in startAuto, detect-stuck enrichment runtime behavior (including the
read-only peek invariant), and the three silent-catch fixes.
Route every INSERT/UPDATE/DELETE/REPLACE against .gsd/gsd.db through typed
wrappers in gsd-db.ts and add a structural test that fails CI if a new
bypass appears. Previously 13 call sites across 10 modules reached into
_getAdapter() and issued raw write SQL, making the "single writer"
architecture unenforceable in-process.
New wrappers in gsd-db.ts: deleteDecisionById, deleteRequirementById,
deleteArtifactByPath, clearEngineHierarchy, insertOrIgnoreSlice,
insertOrIgnoreTask, setSliceReplanTriggeredAt, upsertQualityGate,
restoreManifest, bulkInsertLegacyHierarchy, readTransaction, and eight
memory-store helpers (insertMemoryRow, rewriteMemoryId, etc).
workflow-manifest.restore() is lifted verbatim into gsd-db.restoreManifest
with a type-only import of StateManifest to avoid circular runtime deps.
tools/workflow-tool-executors and workflow-manifest.snapshotState swap
their manual BEGIN DEFERRED/COMMIT/ROLLBACK dance for readTransaction().
unit-ownership.ts stays outside the invariant: it writes to a separate
.gsd/unit-claims.db by design.
tests/single-writer-invariant.test.ts walks every .ts file under gsd/
(excluding tests/ and the allowlist) and fails with a grouped violations
list on any regex match for .prepare/.exec raw writes, plus a positive
assertion that gsd-db.ts still exports each expected wrapper so the
structural test can't silently become a no-op.
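
A minimal sketch of how a structural scan like this could flag raw writes. The regex and the `findRawWrites` helper are illustrative assumptions, not the actual contents of single-writer-invariant.test.ts, and the allowlist handling is omitted:

```typescript
// Assumed pattern: a .prepare()/.exec() call whose SQL literal starts with a write verb.
const RAW_WRITE = /\.(prepare|exec)\(\s*[`'"]\s*(INSERT|UPDATE|DELETE|REPLACE)\b/i;

interface Violation {
  file: string;
  line: number;
  text: string;
}

// Hypothetical helper: scans one file's source text and reports each offending line.
function findRawWrites(file: string, source: string): Violation[] {
  const violations: Violation[] = [];
  source.split("\n").forEach((text, index) => {
    if (RAW_WRITE.test(text)) {
      violations.push({ file, line: index + 1, text: text.trim() });
    }
  });
  return violations;
}
```

A test built on this would collect the violations per file and fail with the grouped list when any remain; reads such as SELECT pass through untouched.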
https://claude.ai/code/session_01FZgXD3bjcddoFYsTEY6JhC
On Windows, execSync spawns cmd.exe which cannot resolve git when Git for
Windows is installed via MSYS2/bash but not in cmd.exe's PATH. This caused
every auto-commit to fail silently, leaving all milestone work uncommitted.
All other fallback paths in native-git-bridge already use execFileSync — the
three affected functions were the outliers. execFileSync resolves the binary
directly without a shell intermediary and supports identical stdio/input options.
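
A minimal sketch of the shell-free call shape, assuming a hypothetical `runBinary` helper (the three affected functions in native-git-bridge are not shown in this message):

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical helper: spawns the binary directly, so no cmd.exe (or any shell)
// sits between Node and the executable. Arguments are passed as an array and
// are never re-parsed by a shell.
function runBinary(bin: string, args: string[], input?: string): string {
  return execFileSync(bin, args, {
    encoding: "utf8",
    input, // optional stdin payload, same option execSync accepts
    stdio: ["pipe", "pipe", "pipe"],
  });
}

// Illustrative usage: runBinary("git", ["commit", "-F", "-"], commitMessage)
```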
Closes #4180
- Include dist/ and packages/*/dist/ in the TypeScript incremental cache
so that when tsbuildinfo indicates no changes, the compiled output files
are still present. Without this, tsc with incremental:true skips emission
when tsbuildinfo exists but dist/ is absent (fresh checkout + cache restore),
causing downstream packages like @gsd/pi-tui to fail resolving @gsd/native
subpath exports.
- Also hash source files in the cache key so dist is invalidated on code changes.
- Replace process.stderr.write with logWarning("bootstrap", ...) in catch blocks
to satisfy the workflow-logger coverage test (#3348).
- Update extension-bootstrap-isolation tests to match the new logWarning pattern.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On Windows, static imports in register-shortcuts.ts (added in v2.72) can
fail at module load time, causing the entire GSD extension to silently fail
to register. This makes /gsd unavailable despite the welcome screen
suggesting it.
Three changes:
- index.ts: register /gsd command before importing register-extension.js,
wrapped in try-catch so bootstrap failures don't prevent core command
registration
- register-extension.ts: remove duplicate registerGSDCommand call, wrap
non-critical registrations (tools, shortcuts, hooks) in individual
try-catch blocks so one failure doesn't prevent others
- Add structural contract tests verifying isolation properties
Closes #4168, closes #4172
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Read-side twin of #4175. `deriveStateFromDb` had a SUMMARY-file fallback
that could mark a milestone complete even when the DB row said otherwise,
allowing an orphan SUMMARY.md (crashed complete-milestone turn, partial
merge, manual edit) to cascade into a false auto-merge.
- buildCompletenessSet: drop SUMMARY fallback; only DB status decides.
- buildRegistryAndFindActive: remove SUMMARY from the completeness check;
still consult SUMMARY as a title fallback for DB-certified milestones.
- allSlicesDone branch: drop `!summaryFile` clause so a terminal-validation
+ orphan-SUMMARY path flows through to `completing-milestone` instead of
short-circuiting, letting complete-milestone re-run idempotently.
Regression tests: orphan SUMMARY with in-flight slice stays active; orphan
SUMMARY with all-slices-done + validation-terminal lands at
completing-milestone (does not report as already complete).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guards the three cooperating fixes shipped in #4178 via source inspection
so a future refactor cannot silently reintroduce the false-merge path:
- stopAuto now uses the DB getMilestone() status as the authoritative
milestone-complete signal (falls back to SUMMARY presence only when
the project DB is unavailable).
- postUnitPreVerification pauses auto-mode for complete-milestone after
retries are exhausted instead of writing a stub blocker placeholder.
- recoverTimedOutUnit pauses complete-milestone instead of writing a
stub blocker placeholder.
Unblocks the CI lint / require-tests.sh gate on PR #4178.
When complete-milestone failed verification, auto-mode could end up merging
the worktree to main anyway and emit a metadata-only merge warning, creating
a misleading near-complete signal while the SUMMARY was never actually written.
The blocker-placeholder path for complete-milestone wrote a stub SUMMARY
without updating DB status, and stopAuto's SUMMARY-presence check treated
the stub as a legitimate completion signal.
- auto-post-unit.ts: skip blocker placeholder and pause auto-mode on
complete-milestone verification retry exhaustion.
- auto-timeout-recovery.ts: same guard for the idle/hard timeout path.
- auto.ts: make stopAuto Step 4 DB-authoritative (getMilestone.status ===
"complete") with SUMMARY-presence fallback only for DB-unavailable
legacy projects.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When validate-milestone wrote VALIDATION.md with verdict=needs-remediation
but the agent failed to call gsd_reassess_roadmap to add remediation
slices, state.ts re-derived phase: validating-milestone indefinitely
because the existing #3596/#3670 guard treats needs-remediation as
non-terminal regardless of whether new work was queued. The stuck
detector only fired after 3 consecutive dispatches (~$3 + ~12 min wasted
per incident). Reproduced on M022 and M024.
Add a post-unit guard in runPostUnitVerification for validate-milestone
units: if VALIDATION.md verdict is needs-remediation and no incomplete
slices exist for the milestone (DB-authoritative via getMilestoneSlices,
filesystem fallback via parseRoadmap), pause auto-mode immediately with
a clear blocker. The legitimate re-validation flow is preserved — when
remediation slices have been queued (any non-closed status), the guard
returns continue and the existing state machine handles the work.
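
The decision itself can be sketched as a pure function. This is an illustration only: the `remediationGuard` name, the `SliceRow` shape, and the exact set of closed statuses ("complete", "skipped") are assumptions, not the code in runPostUnitVerification:

```typescript
interface SliceRow {
  id: string;
  status: string;
}

// Assumption: these are the statuses treated as closed.
const CLOSED_STATUSES = new Set(["complete", "skipped"]);

// Returns "pause" only when validation asked for remediation but nothing is queued.
function remediationGuard(
  verdict: string | undefined,
  slices: SliceRow[],
): "pause" | "continue" {
  if (verdict !== "needs-remediation") return "continue";
  const hasOpenWork = slices.some((s) => !CLOSED_STATUSES.has(s.status));
  return hasOpenWork ? "continue" : "pause";
}
```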
Tests cover: pause on all-closed scenario, skipped-status handled as
closed, continue when a queued remediation slice exists, continue on
verdict=pass, and continue when no VALIDATION file is present.
Extracts the step-complete notification text into buildStepCompleteMessage
and STEP_COMPLETE_FALLBACK_MESSAGE so the copy can be unit-tested
directly (milestone complete, mid-flight with next unit, unknown phase,
and deriveState-failure fallback). Resolves require-tests CI failure
on PR #4173.
In step mode, /gsd would run one unit and then silently exit the auto
loop, leaving users with no hint that they should /clear and /gsd again
to run the next step. Emit an info notify before returning "step-wizard"
from postUnitPostVerification so the TUI surfaces the next unit label
and the /clear + /gsd guidance (or /gsd auto to switch to auto mode).
Falls back to a generic message if deriveState throws, and handles the
milestone-complete case with a dedicated review message.
https://claude.ai/code/session_015yrPQbZTyJPqTsM654Ym3s
The #4162 refactor removed parseCliArgs' inline --help handler assuming
loader.ts's fast-path covered it, but loader.ts only intercepts --help/-h
as argv[1]. That broke:
- gsd update --help — fell through to runUpdate() (subcommand help
check sat dead-code below the update handler)
- gsd --unknown --help in non-TTY — tripped the TTY gate and exited 1
Move the subcommand-help check ahead of every subcommand handler and
fall back to general help when no subcommand matches, so --help wins
whenever it appears anywhere in argv.
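
A minimal sketch of the resulting precedence, assuming a hypothetical `resolveHelp` function and placeholder help strings (the real parseCliArgs handlers are not reproduced here):

```typescript
// Placeholder help texts; the real CLI has its own per-subcommand output.
const SUBCOMMAND_HELP = new Map<string, string>([
  ["update", "Usage: gsd update"],
]);
const GENERAL_HELP = "Usage: gsd [command] [options]";

// Runs before any subcommand handler: if a help flag appears anywhere in argv,
// return subcommand help when a known subcommand is present, else general help.
function resolveHelp(argv: string[]): string | undefined {
  if (!argv.includes("--help") && !argv.includes("-h")) return undefined;
  const subcommand = argv.find((arg) => SUBCOMMAND_HELP.has(arg));
  return (subcommand && SUBCOMMAND_HELP.get(subcommand)) ?? GENERAL_HELP;
}
```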
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests now expect:
- pauseAuto instead of stopAuto for blocked state (resumable)
- phase:"planning" instead of "blocked" when partial-dep fallback
picks a slice (slice-level only; milestone-level blocked unchanged)
- activeSlice set via fallback instead of null
rebuildChatFromMessages() called populatePinnedFromMessages() which
re-populated the pinned zone with text already present in the chat
history, causing visible duplication during session state changes.
Additionally, the spinner interval at 80ms generated ~12.5 renders/s
for a purely cosmetic animation, and clearOnShrink triggered
unnecessary full redraws during pinned-zone transitions.
- Remove populatePinnedFromMessages() from rebuildChatFromMessages()
and add pinnedMessageContainer.clear() instead — the streaming
lifecycle in chat-controller manages pinned content during active work
- Reduce spinner interval 80ms→200ms with render-batching that skips
redundant renders when streaming already triggers requestRender()
- Debounce clearOnShrink: defer full redraw by one render tick so
pinned-clear→new-streaming transitions avoid a wasted full redraw
- Increase notification widget safety-net timer 5s→30s since the
store subscription already handles push-based updates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Change phase:"blocked" from stopAuto to pauseAuto — sessions are now
resumable instead of requiring manual /gsd auto restart
- Default reassess_after_slice to true — reassessment fires after every
slice completion unless explicitly disabled (was opt-in, causing missed
reassessments in multi-slice milestones)
- Change dispatch no-match fallthrough from level:"info" (hard stop) to
level:"warning" (pause) — unhandled phases are now recoverable
- Add dependency-resolution fallback in resolveSliceDependencies — when
no slice has ALL deps satisfied, picks the one with the most deps met
instead of immediately returning blocked (both DB and file-based paths)
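
The dependency fallback can be sketched as follows. The `Slice` shape and `pickSlice` name are hypothetical, and breaking ties by list order is an assumption:

```typescript
interface Slice {
  id: string;
  depends: string[];
}

// Prefers a slice with all deps satisfied; otherwise falls back to the slice
// with the most deps met instead of reporting blocked.
function pickSlice(pending: Slice[], done: Set<string>): Slice | null {
  let best: Slice | null = null;
  let bestMet = -1;
  for (const slice of pending) {
    const met = slice.depends.filter((dep) => done.has(dep)).length;
    if (met === slice.depends.length) return slice; // fully satisfied wins outright
    if (met > bestMet) {
      best = slice;
      bestMet = met;
    }
  }
  return best; // null only when there are no pending slices
}
```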
Tracked output from 2022 (commit d93956ba4) that's missing the modern
GSD_HOME env support and webPreferencesPath export present in the .ts
source. No runtime path consumes it, but the test compile script's
copyAssets step overlays src/* onto esbuild output in dist-test, so the
stale .js was shadowing the compiled app-paths and breaking any unit
test transitively importing webPreferencesPath.
Cover the canonical parseCliArgs export in cli-web-branch.ts including
the new mcp mode, worktree flag (boolean and named forms), and existing
short flags, web mode flags, list flags, and positional message handling.
Also remove src/app-paths.js — a stale tracked output (last touched in
2022, missing GSD_HOME and webPreferencesPath exports). The test compile
script copies all of src/ over esbuild's output, so this stale .js was
shadowing the compiled app-paths in dist-test and breaking any test that
transitively imported it. No runtime path uses it (production loads from
dist/app-paths.js; jiti/tsx prefer the .ts source).
Satisfies require-tests.sh on PR #4162.
Pure deletion/deduplication pass on top-level src/*.ts. External behavior
unchanged; all targeted unit tests still pass.
cli.ts (−170 net lines)
- Adopt canonical validateConfiguredModel from startup-model-validation.ts;
delete the drifted local copy with hardcoded model fallbacks.
- Import CliFlags + parseCliArgs from cli-web-branch.ts instead of keeping
a second, 90%-identical parser; pass cliFlags directly into
runWebCliBranch instead of re-parsing process.argv.
- Extract 3 helpers for verbatim duplicates:
* printNonTtyErrorAndExit (TTY gate, 2 call sites)
* printExtensionErrors (extension load errors, 2 call sites)
* reapplyValidatedModelOnFallback (post-createAgentSession fix, 2 sites)
- Factor runHeadlessFromAuto helper shared by the `gsd auto` shorthand
and the auto-piped-stdout redirect.
- Collapse ensureRtkBootstrap from hand-rolled _done flag to a
promise-memoized doRtkBootstrap.
- Drop redundant validateConfiguredModel pre-createAgentSession calls
(the post-createAgentSession call is the correct one per #2626).
- Delete dead --version/-v and --help/-h fast paths (loader.ts already
handles these before cli.ts is imported).
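
For reference, the promise-memoization pattern mentioned above looks roughly like this. It is a generic sketch with a hypothetical `ensureBootstrap` signature, not the actual ensureRtkBootstrap code:

```typescript
let bootstrapPromise: Promise<void> | undefined;

// The first caller starts the work; every later caller (including concurrent
// ones) receives the same promise, so the bootstrap runs at most once.
function ensureBootstrap(doBootstrap: () => Promise<void>): Promise<void> {
  bootstrapPromise ??= doBootstrap();
  return bootstrapPromise;
}
```

Unlike a boolean done flag, the memoized promise also covers callers that arrive while the first run is still in flight.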
cli-web-branch.ts
- Unify CliFlags with worktree, 'mcp' mode, and _selectedSessionPath.
- Drop unused help?/version? flags (loader.ts intercepts them).
onboarding.ts
- Add runStep<T>() helper with shared cancel/warn handling; collapse 4
near-identical try/catch blocks around runLlmStep, runWebSearchStep,
runRemoteQuestionsStep, runToolKeysStep.
- Delete trivial isCancelError helper (inlined as p.isCancel).
- Rewrite loadPico() adapter to build PicoModule from chalk so we can
drop the redundant picocolors dependency.
package.json / package-lock.json
- Remove picocolors direct dep (chalk remains the single color library).
reconcileSliceTasks called updateTaskStatus without a completedAt
timestamp, leaving tasks.completed_at NULL for all tasks completed
via the file-existence reconcile path.
Closes #4129
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
InteractiveMode.renderWidgets() called Container.clear() on the
widgetContainerAbove/Below render mounts, which disposed every mounted
extension widget and then re-added the now-dead components. In AUTO mode
updateProgressWidget re-registers gsd-progress on every unit dispatch,
so gsd-notifications and gsd-health had their refresh timers and store
subscriptions killed after the first dispatch. Renders kept returning
the widgets' frozen cachedLines, making them look alive but never update
(/gsd notifications clear appeared to do nothing, belowEditor last-commit
went stale while the top-of-screen dashboard stayed correct).
Split detach from dispose: add Container.detachChildren() and use it from
the two widget-mount call sites. clear() still disposes for every other
caller (chat, editor, status, pinned-message containers). The
extensionWidgets* maps remain the single owner of widget disposal via
removeExisting() and clearExtensionWidgets().
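
The detach/dispose split can be illustrated with a simplified stand-in for the pi-tui Container. The class body, the `Component` interface, and the return value of detachChildren() are assumptions for the sketch:

```typescript
interface Component {
  dispose?(): void;
}

class Container {
  private children: Component[] = [];

  add(child: Component): void {
    this.children.push(child);
  }

  // Disposes every child, then empties the container.
  clear(): void {
    for (const child of this.children) child.dispose?.();
    this.children = [];
  }

  // Empties the container but leaves the children alive, so a render mount
  // can be rebuilt without killing timers or subscriptions owned elsewhere.
  detachChildren(): Component[] {
    const detached = this.children;
    this.children = [];
    return detached;
  }
}
```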
While in AUTO, gsd-progress duplicates gsd-health on last commit, cost/
budget, and the health signal. Make gsd-progress the single source of
truth: hide gsd-health from auto-start and re-register it from every
exit point in auto.ts (lock-lost stop, cleanupAfterLoopExit !paused
guard, stopAuto, pauseAuto). gsd-notifications stays visible — it is
independent state and, with the detach fix, its subscription + 5s
refresh actually work again.
Tests: Container.detachChildren()/clear() contract guards added to
packages/pi-tui/src/__tests__/tui.test.ts. health-widget,
notification-{store,widget,overlay}, notifications-handler, notifications,
and auto-paused-ui-cleanup suites all pass.
When GSD is installed with `bun add -g`, running `gsd update` or
`/gsd update` previously shelled out to `npm install -g`, which fails
with EACCES on systems where npm has no write access to the global
node_modules directory.
Adds `resolveInstallCommand(pkg)` to `update-check.ts` that returns
`bun add -g <pkg>` when `process.versions.bun` is defined (i.e. the
current runtime is Bun), and `npm install -g <pkg>` otherwise. All
three update paths — `update-cmd.ts`, `commands-handlers.ts`, and the
interactive startup prompt in `update-check.ts` — now use this helper,
including the fallback error message shown to the user.
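
A minimal sketch of the helper as described. The optional `versions` parameter is an assumption added here so the branch can be exercised without running under Bun; the described signature takes only `pkg`:

```typescript
// Picks the global-install command based on the current runtime.
function resolveInstallCommand(
  pkg: string,
  versions: Record<string, string | undefined> = process.versions,
): string {
  return versions.bun !== undefined
    ? `bun add -g ${pkg}`
    : `npm install -g ${pkg}`;
}
```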
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CI on #4141 failed because threading an explicit flatRateCtx parameter
through resolvePreferredModelConfig broke two contracts the test suite
locks in:
1. interactive-routing-bypass (#3962) asserts that
resolvePreferredModelConfig is invoked with exactly three positional
arguments and that its `if (!isAutoMode) return undefined` guard
lives within the first 600 chars of the function body. The new
flatRateCtx param + JSDoc pushed the guard past that window and
lengthened the call site.
2. silent-catch-diagnostics (#3348) requires migrated files to route
through workflow-logger instead of leaving empty catch blocks. The
new buildFlatRateContext() swallowed registry lookup errors with a
comment-only catch.
Fix both without regressing flat-rate detection:
- Hang the flat-rate context off autoModeStartModel itself via an
optional `flatRateCtx` field. selectAndApplyModel now enriches
autoModeStartModel up front (preserving the variable name) and
resolvePreferredModelConfig reads autoModeStartModel.flatRateCtx —
signature shrinks back to three params, call site returns to the
3-arg form the test anchors on.
- Replace the empty catch in buildFlatRateContext() with a
logWarning("dispatch", ...) that surfaces the lookup failure while
still falling through with authMode undefined, matching the
fail-closed policy everywhere else in the file.
The 3-entry hard-coded FLAT_RATE_PROVIDERS set in auto-model-selection.ts
treated only github-copilot/copilot/claude-code as flat-rate, so dynamic
routing would happily downgrade units on user-registered subscription
proxies and any externalCli CLI wrapper — quality loss with no cost
benefit for users whose provider charges a flat rate per request.
Make isFlatRateProvider extensible by composing three signals:
1. Built-in list (unchanged, wins first for regression safety).
2. externalCli auto-detection via ctx.modelRegistry.getProviderAuthMode()
— any CLI wrapper around the user's subscription is inherently
flat-rate.
3. User-declared `flat_rate_providers` preference for private
subscription-backed proxies, enterprise-gated deployments, and custom
CLI wrappers the built-in list doesn't know about.
Add a buildFlatRateContext() helper so every call site constructs the
context the same way and degrades gracefully when ctx/prefs/registry are
unavailable (never breaks flat-rate detection).
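
The three-signal composition can be sketched like this. The `FlatRateContext` shape is inferred from the names used in this message, and the lowercasing of the built-in check is an assumption:

```typescript
const BUILT_IN_FLAT_RATE = new Set(["github-copilot", "copilot", "claude-code"]);

interface FlatRateContext {
  authMode?: string; // e.g. "externalCli", as reported by the model registry
  userFlatRate?: string[]; // from the flat_rate_providers preference
}

function isFlatRateProvider(provider: string, ctx: FlatRateContext = {}): boolean {
  const name = provider.toLowerCase();
  if (BUILT_IN_FLAT_RATE.has(name)) return true; // 1. built-in list wins first
  if (ctx.authMode === "externalCli") return true; // 2. CLI wrappers are flat-rate
  return (ctx.userFlatRate ?? []).some((p) => p.toLowerCase() === name); // 3. user-declared
}
```

An empty context degrades to the built-in list only, which matches the stated goal of never breaking existing flat-rate detection.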
Thread the context through:
- resolvePreferredModelConfig (routing synthesis guard)
- selectAndApplyModel primary-model and fallback provider checks
- auto-start.ts dynamic-routing banner so the startup message matches
dispatch-time reality
Preferences:
- Add `flat_rate_providers?: string[]` to GSDPreferences and
KNOWN_PREFERENCE_KEYS in preferences-types.ts.
- Add a string-array validator in preferences-validation.ts that trims
whitespace and drops empty entries.
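
A minimal sketch of a string-array validator with that behavior. The `validateStringArray` name, result shape, and error wording are hypothetical and do not mirror preferences-validation.ts:

```typescript
interface StringArrayResult {
  value?: string[];
  error?: string;
}

function validateStringArray(input: unknown): StringArrayResult {
  if (!Array.isArray(input)) return { error: "expected an array of strings" };
  if (!input.every((item): item is string => typeof item === "string")) {
    return { error: "all entries must be strings" };
  }
  // Trim whitespace, then drop entries that end up empty.
  const cleaned = input.map((item: string) => item.trim()).filter((item) => item.length > 0);
  return { value: cleaned };
}
```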
Tests:
- Extend flat-rate-routing-guard.test.ts with 13 new cases covering
externalCli auto-detection, userFlatRate preference matching
(case-insensitive), combined signals, buildFlatRateContext() behavior
(including registry-lookup-throws and non-canonical auth-mode
responses), plus regression cases for the built-in list.
- Add 5 validator cases in preferences.test.ts for the new
flat_rate_providers field (string-array accepted, whitespace trimmed,
non-array rejected, non-string elements rejected, known-key warning
check).
When a user picks a custom-provider model via /gsd model (Ollama, vLLM,
LM Studio, OpenAI-compatible proxies — anything defined in
~/.gsd/agent/models.json) and then runs /gsd auto, the bootstrap silently
swaps it out for whichever model PREFERENCES.md happens to list. That
model is invariably a built-in provider (claude-code, anthropic) the user
isn't logged into, so auto-mode immediately fails with
"Not logged in · Please run /login", pauses, and resets the session to
claude-code/claude-sonnet-4-6.
Root cause: #3517 made resolveDefaultSessionModel() (PREFERENCES.md) take
priority over ctx.model (settings.json) in auto-start.ts. That fix was
correct for the scenario where settings.json had a stale built-in default
but PREFERENCES.md was freshly configured, but it has no awareness of
custom providers — PREFERENCES.md cannot reference them, so honoring it
when the session provider is custom always discards the user's explicit
choice.
Add isCustomProvider() to preferences-models.ts which checks whether a
provider is declared in ~/.gsd/agent/models.json (with ~/.pi/agent
fallback). Read the file directly with JSON.parse to avoid pulling in
the model-registry at this call site, and treat any read or parse error
as not-custom so a malformed models.json never breaks bootstrap.
In bootstrapAutoSession(), when the session provider is custom, use
ctx.model directly. Otherwise fall through to the existing #3517
behavior (preferredModel ?? ctx.model).
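
A minimal sketch of the fail-safe lookup. The top-level `providers` key in models.json, the explicit path list, and the injectable `readFile` parameter are assumptions made for illustration; here a file that cannot be read or parsed is simply skipped:

```typescript
import { readFileSync } from "node:fs";

function isCustomProvider(
  provider: string,
  modelsJsonPaths: string[],
  readFile: (path: string) => string = (path) => readFileSync(path, "utf8"),
): boolean {
  for (const path of modelsJsonPaths) {
    try {
      const parsed = JSON.parse(readFile(path));
      const providers = parsed?.providers; // assumed shape: { providers: { "<name>": {...} } }
      if (
        providers !== null &&
        typeof providers === "object" &&
        Object.prototype.hasOwnProperty.call(providers, provider)
      ) {
        return true;
      }
    } catch {
      // Missing or malformed file: treat as not-custom and keep going.
    }
  }
  return false;
}
```

The bootstrap decision then reduces to using ctx.model when this returns true for the session provider, and preferredModel ?? ctx.model otherwise.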
Tests:
- New behavioral regression in model-isolation.test.ts that mirrors
the auto-start.ts logic and verifies the four interesting cases:
custom session beats PREFERENCES.md, built-in session still defers
to PREFERENCES.md (#3517 preserved), custom session with no
PREFERENCES.md uses ctx.model, and null ctx.model falls through.
- New string-grep guard in auto-start-model-capture.test.ts that the
isCustomProvider() call is wired into the snapshot path.
- Updated #3517 grep to allow the new branching shape while still
asserting preferredModel remains a snapshot source for built-ins.
https://claude.ai/code/session_01QLYCeiXWjSFPEXFxjkSLni
* fix(ci): address 5 pipeline integrity issues from release audit
- version-stamp.mjs: regenerate package-lock.json after dev version stamp
(mirrors the same fix applied to bump-version.mjs in #4116)
- bump-version.mjs: regenerate root and web/package-lock.json after version
bump so both lockfiles are always in sync at release time
- pipeline.yml: add post-bump validation step that verifies all package.json
files parse as valid JSON before the release commit is made
- pipeline.yml: split "Commit, tag, and push" — commit+tag+rebase happen
before build, but git push is deferred until after build and npm publish
both succeed, preventing a broken tag from landing on main
- pipeline.yml: emit a ::warning:: annotation when live LLM tests fail so
failures are visible in the Actions UI instead of silently swallowed
Closes #4118
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gsd): address 3 silent-crash secondary issues from #3348 post-#3696
Three gaps that remained after the double-fault fix in #3696:
1. unhandledRejection not wired — installEpipeGuard only registered
uncaughtException; promise rejections that escaped without a catch
were not handled by the GSD error path. Added _gsdRejectionGuard
alongside _gsdEpipeGuard.
2. Non-fatal overcorrection — the #3696 fix replaced re-throwing with
log-and-continue, leaving the process running in an indeterminate
state after any non-EPIPE/non-ENOENT exception. Replaced with
writeCrashLog + process.exit(1). writeCrashLog is extracted into
bootstrap/crash-log.ts (zero deps) so tests can import it without
pulling in the full extension graph.
3. unit-end not emitted after crash-with-side-effects — hameltomor
observed that complete-milestone M001 wrote SUMMARY.md and updated
the DB but never emitted unit-end (#3348 comment-4237533440). Added
emitCrashRecoveredUnitEnd() in crash-recovery.ts: on the next
auto-mode startup, if a stale lock references a unit whose
unit-start has no matching unit-end in the journal, a synthetic
unit-end with status "crash-recovered" is emitted before the lock
is cleared. This closes the causal chain for downstream tooling
and forensics without requiring changes to the lock file schema.
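
The journal check in item 3 can be sketched as follows. The `JournalEvent` shape and `synthesizeUnitEnd` name are hypothetical, and the sketch ignores the case where the same unit ran more than once:

```typescript
interface JournalEvent {
  type: "unit-start" | "unit-end";
  unitId: string;
  status?: string;
}

// Given the unit referenced by a stale lock, returns a synthetic unit-end when
// the journal has a unit-start for it but no matching unit-end.
function synthesizeUnitEnd(journal: JournalEvent[], lockedUnitId: string): JournalEvent | null {
  const started = journal.some((e) => e.type === "unit-start" && e.unitId === lockedUnitId);
  const ended = journal.some((e) => e.type === "unit-end" && e.unitId === lockedUnitId);
  if (!started || ended) return null;
  return { type: "unit-end", unitId: lockedUnitId, status: "crash-recovered" };
}
```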
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>