Route every INSERT/UPDATE/DELETE/REPLACE against .gsd/gsd.db through typed
wrappers in gsd-db.ts and add a structural test that fails CI if a new
bypass appears. Previously 13 call sites across 10 modules reached into
_getAdapter() and issued raw write SQL, making the "single writer"
architecture unenforceable in-process.
New wrappers in gsd-db.ts: deleteDecisionById, deleteRequirementById,
deleteArtifactByPath, clearEngineHierarchy, insertOrIgnoreSlice,
insertOrIgnoreTask, setSliceReplanTriggeredAt, upsertQualityGate,
restoreManifest, bulkInsertLegacyHierarchy, readTransaction, and eight
memory-store helpers (insertMemoryRow, rewriteMemoryId, etc.).
workflow-manifest.restore() is lifted verbatim into gsd-db.restoreManifest
with a type-only import of StateManifest to avoid circular runtime deps.
tools/workflow-tool-executors and workflow-manifest.snapshotState swap
their manual BEGIN DEFERRED/COMMIT/ROLLBACK dance for readTransaction().
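
A minimal sketch of what a readTransaction() helper of this shape could look like. The SqlAdapter interface and the function body are assumptions for illustration; the actual signature in gsd-db.ts may differ.

```typescript
// Hypothetical adapter surface: only exec() is needed for this sketch.
interface SqlAdapter {
  exec(sql: string): void;
}

// Runs fn inside a deferred transaction, committing on success and
// rolling back if fn (or the commit itself) throws.
function readTransaction<T>(db: SqlAdapter, fn: () => T): T {
  db.exec("BEGIN DEFERRED");
  try {
    const result = fn();
    db.exec("COMMIT");
    return result;
  } catch (err) {
    db.exec("ROLLBACK");
    throw err;
  }
}

// Demo with a fake adapter that records the statements it receives.
const txLog: string[] = [];
const fakeDb: SqlAdapter = { exec: (sql) => { txLog.push(sql); } };

const txValue = readTransaction(fakeDb, () => 42);

let txThrew = false;
try {
  readTransaction(fakeDb, () => { throw new Error("boom"); });
} catch {
  txThrew = true;
}
```

Callers then no longer need to pair BEGIN with COMMIT/ROLLBACK by hand, which is where the manual version tends to leak an open transaction on an early return or throw.
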
unit-ownership.ts stays outside the invariant: it writes to a separate
.gsd/unit-claims.db by design.
tests/single-writer-invariant.test.ts walks every .ts file under gsd/
(excluding tests/ and the allowlist) and fails with a grouped violations
list on any regex match for .prepare/.exec raw writes, plus a positive
assertion that gsd-db.ts still exports each expected wrapper so the
structural test can't silently become a no-op.
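
The scan described above can be sketched roughly as follows. This is an illustration under assumptions, not the real test: the regex, the findRawWrites name, the in-memory file map, and the table names in the sample are all hypothetical.

```typescript
// Hypothetical pattern: a .prepare( or .exec( call whose SQL literal
// starts with a write verb.
const RAW_WRITE = /\.(?:prepare|exec)\(\s*[`'"]\s*(?:INSERT|UPDATE|DELETE|REPLACE)\b/i;

interface Violation {
  file: string;
  line: number;
  text: string;
}

// Scans file contents (path -> source) and reports raw write SQL found
// outside the allowlist and outside tests/.
function findRawWrites(files: Record<string, string>, allowlist: Set<string>): Violation[] {
  const violations: Violation[] = [];
  for (const file of Object.keys(files)) {
    if (allowlist.has(file) || file.indexOf("/tests/") !== -1) continue;
    files[file].split("\n").forEach((text, i) => {
      if (RAW_WRITE.test(text)) {
        violations.push({ file, line: i + 1, text: text.trim() });
      }
    });
  }
  return violations;
}

// Sample inputs: table names are made up for the demo.
const sampleFiles: Record<string, string> = {
  "gsd/gsd-db.ts": "adapter.prepare(`DELETE FROM decisions WHERE id = ?`).run(id);",
  "gsd/state.ts": "const rows = adapter.prepare('SELECT * FROM slices').all();",
  "gsd/memory-store.ts": "adapter.exec(\"INSERT INTO memory (id) VALUES (1)\");",
};
const rawWriteViolations = findRawWrites(sampleFiles, new Set(["gsd/gsd-db.ts"]));
```

In this sketch only gsd/memory-store.ts is reported: gsd-db.ts is allowlisted and the SELECT in state.ts is not a write.
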
- Include dist/ and packages/*/dist/ in the TypeScript incremental cache
so that when tsbuildinfo indicates no changes, the compiled output files
are still present. Without this, tsc with incremental:true skips emission
when tsbuildinfo exists but dist/ is absent (fresh checkout + cache restore),
causing downstream packages like @gsd/pi-tui to fail resolving @gsd/native
subpath exports.
- Also hash source files in the cache key so dist is invalidated on code changes.
- Replace process.stderr.write with logWarning("bootstrap", ...) in catch blocks
to satisfy the workflow-logger coverage test (#3348).
- Update extension-bootstrap-isolation tests to match the new logWarning pattern.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On Windows, static imports in register-shortcuts.ts (added in v2.72) can
fail at module load time, causing the entire GSD extension to silently fail
to register. This makes /gsd unavailable despite the welcome screen
suggesting it.
Three changes:
- index.ts: register /gsd command before importing register-extension.js,
wrapped in try-catch so bootstrap failures don't prevent core command
registration
- register-extension.ts: remove duplicate registerGSDCommand call, wrap
non-critical registrations (tools, shortcuts, hooks) in individual
try-catch blocks so one failure doesn't prevent others
- Add structural contract tests verifying isolation properties
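
The per-registration isolation described above might look something like this. It is a sketch only: registerIsolated, the Registration type, and the warn callback are hypothetical stand-ins, not the code in register-extension.ts.

```typescript
// Hypothetical shape for one non-critical registration step.
type Registration = { name: string; run: () => void };

// Runs every step in its own try-catch so one failure cannot prevent
// the others; returns the names of the steps that failed.
function registerIsolated(steps: Registration[], warn: (msg: string) => void): string[] {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      step.run();
    } catch (err) {
      failed.push(step.name);
      const detail = err instanceof Error ? err.message : String(err);
      warn(step.name + " registration failed: " + detail);
    }
  }
  return failed;
}

// Demo: the shortcuts step throws, tools and hooks still register.
const registered: string[] = [];
const warnings: string[] = [];
const failedSteps = registerIsolated(
  [
    { name: "tools", run: () => { registered.push("tools"); } },
    { name: "shortcuts", run: () => { throw new Error("module load failed"); } },
    { name: "hooks", run: () => { registered.push("hooks"); } },
  ],
  (msg) => { warnings.push(msg); },
);
```
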
Closes #4168, closes #4172
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Read-side twin of #4175. `deriveStateFromDb` had a SUMMARY-file fallback
that could mark a milestone complete even when the DB row said otherwise,
allowing an orphan SUMMARY.md (crashed complete-milestone turn, partial
merge, manual edit) to cascade into a false auto-merge.
- buildCompletenessSet: drop SUMMARY fallback; only DB status decides.
- buildRegistryAndFindActive: remove SUMMARY from the completeness check;
still consult SUMMARY as a title fallback for DB-certified milestones.
- allSlicesDone branch: drop `!summaryFile` clause so a terminal-validation
+ orphan-SUMMARY path flows through to `completing-milestone` instead of
short-circuiting, letting complete-milestone re-run idempotently.
Regression tests: orphan SUMMARY with in-flight slice stays active; orphan
SUMMARY with all-slices-done + validation-terminal lands at
completing-milestone (does not report as already complete).
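
As a rough illustration of "only DB status decides", the following sketch builds a completeness set from DB rows alone. The MilestoneRow shape, the completeMilestoneIds name, and the "active" status value are assumptions; only the "complete" status value comes from these commit messages.

```typescript
// Hypothetical DB row shape.
interface MilestoneRow {
  id: string;
  status: string;
}

// Only the DB status decides completeness. SUMMARY files found on disk
// are deliberately not an input to this function.
function completeMilestoneIds(rows: MilestoneRow[]): Set<string> {
  const complete = new Set<string>();
  for (const row of rows) {
    if (row.status === "complete") complete.add(row.id);
  }
  return complete;
}

// Demo: M002 has an orphan SUMMARY.md on disk, but its DB row is not
// complete, so it must stay out of the set.
const orphanSummariesOnDisk = new Set(["M002"]);
const completeSet = completeMilestoneIds([
  { id: "M001", status: "complete" },
  { id: "M002", status: "active" },
]);
```
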
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guards the three cooperating fixes shipped in #4178 via source inspection
so a future refactor cannot silently reintroduce the false-merge path:
- stopAuto now uses the DB getMilestone() status as the authoritative
milestone-complete signal (falls back to SUMMARY presence only when
the project DB is unavailable).
- postUnitPreVerification pauses auto-mode for complete-milestone after
retries are exhausted instead of writing a stub blocker placeholder.
- recoverTimedOutUnit pauses complete-milestone instead of writing a
stub blocker placeholder.
Unblocks the CI lint / require-tests.sh gate on PR #4178.
When complete-milestone failed verification, auto-mode could end up merging
the worktree to main anyway and emit a metadata-only merge warning, creating
a misleading near-complete signal while the SUMMARY was never actually written.
The blocker-placeholder path for complete-milestone wrote a stub SUMMARY
without updating DB status, and stopAuto's SUMMARY-presence check treated
the stub as a legitimate completion signal.
- auto-post-unit.ts: skip blocker placeholder and pause auto-mode on
complete-milestone verification retry exhaustion.
- auto-timeout-recovery.ts: same guard for the idle/hard timeout path.
- auto.ts: make stopAuto Step 4 DB-authoritative (getMilestone.status ===
"complete") with SUMMARY-presence fallback only for DB-unavailable
legacy projects.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When validate-milestone wrote VALIDATION.md with verdict=needs-remediation
but the agent failed to call gsd_reassess_roadmap to add remediation
slices, state.ts re-derived phase: validating-milestone indefinitely
because the existing #3596/#3670 guard treats needs-remediation as
non-terminal regardless of whether new work was queued. The stuck
detector only fired after 3 consecutive dispatches (~$3 + ~12 min wasted
per incident). Reproduced on M022 and M024.
Add a post-unit guard in runPostUnitVerification for validate-milestone
units: if VALIDATION.md verdict is needs-remediation and no incomplete
slices exist for the milestone (DB-authoritative via getMilestoneSlices,
filesystem fallback via parseRoadmap), pause auto-mode immediately with
a clear blocker. The legitimate re-validation flow is preserved — when
remediation slices have been queued (any non-closed status), the guard
returns continue and the existing state machine handles the work.
Tests cover: pause on all-closed scenario, skipped-status handled as
closed, continue when a queued remediation slice exists, continue on
verdict=pass, and continue when no VALIDATION file is present.
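
The decision the guard makes can be sketched as a pure function. This is illustrative only: remediationGuard and the CLOSED_STATUSES set are hypothetical, and of the status names used here only "skipped" is taken from the description above.

```typescript
// Statuses treated as closed in this sketch. "skipped" comes from the
// commit text; "complete" is an assumed name for a finished slice.
const CLOSED_STATUSES = new Set(["complete", "skipped"]);

// verdict is null when no VALIDATION file is present.
function remediationGuard(verdict: string | null, sliceStatuses: string[]): "pause" | "continue" {
  if (verdict !== "needs-remediation") return "continue";
  // Any non-closed slice means remediation work has been queued.
  const hasOpenWork = sliceStatuses.some((s) => !CLOSED_STATUSES.has(s));
  return hasOpenWork ? "continue" : "pause";
}

const guardAllClosed = remediationGuard("needs-remediation", ["complete", "complete"]);
const guardSkipped = remediationGuard("needs-remediation", ["complete", "skipped"]);
const guardQueued = remediationGuard("needs-remediation", ["complete", "pending"]);
const guardPass = remediationGuard("pass", ["complete"]);
const guardNoFile = remediationGuard(null, ["complete"]);
```

The five calls mirror the five test scenarios listed above.
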
Extracts the step-complete notification text into buildStepCompleteMessage
and STEP_COMPLETE_FALLBACK_MESSAGE so the copy can be unit-tested
directly (milestone complete, mid-flight with next unit, unknown phase,
and deriveState-failure fallback). Resolves require-tests CI failure
on PR #4173.
In step mode, /gsd would run one unit and then silently exit the auto
loop, leaving users with no hint that they should /clear and /gsd again
to run the next step. Emit an info notify before returning "step-wizard"
from postUnitPostVerification so the TUI surfaces the next unit label
and the /clear + /gsd guidance (or /gsd auto to switch to auto mode).
Falls back to a generic message if deriveState throws, and handles the
milestone-complete case with a dedicated review message.
The #4162 refactor removed parseCliArgs' inline --help handler assuming
loader.ts's fast-path covered it, but loader.ts only intercepts --help/-h
as argv[1]. That broke:
- gsd update --help: fell through to runUpdate() (the subcommand help
check sat as dead code below the update handler)
- gsd --unknown --help in non-TTY: tripped the TTY gate and exited 1
Move the subcommand-help check ahead of every subcommand handler and
fall back to general help when no subcommand matches, so --help wins
whenever it appears anywhere in argv.
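
A small sketch of the "--help wins anywhere in argv" rule. The resolveHelp function, the HelpTarget type, and the two-entry subcommand list are hypothetical; the real parser handles more cases.

```typescript
type HelpTarget =
  | { kind: "subcommand"; name: string }
  | { kind: "general" }
  | null;

// Illustrative subset of subcommands for the demo.
const KNOWN_SUBCOMMANDS = new Set(["update", "auto"]);

// Checked before any subcommand handler runs: if --help or -h appears
// anywhere, show subcommand help when a subcommand is present, and
// general help otherwise. Returns null when help was not requested.
function resolveHelp(argv: string[]): HelpTarget {
  const wantsHelp = argv.some((a) => a === "--help" || a === "-h");
  if (!wantsHelp) return null;
  const sub = argv.filter((a) => KNOWN_SUBCOMMANDS.has(a))[0];
  return sub !== undefined ? { kind: "subcommand", name: sub } : { kind: "general" };
}

const helpForUpdate = resolveHelp(["update", "--help"]);
const helpForUnknown = resolveHelp(["--unknown", "--help"]);
const noHelp = resolveHelp(["update"]);
```
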
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- DynamicBorder: verify lastExternalRender tracking suppresses redundant
renders during streaming, and standalone renders fire when idle
- TUI clearOnShrink: verify debounce flag lifecycle — deferred shrink
preserves maxLinesRendered, flag resets when content grows back
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests now expect:
- pauseAuto instead of stopAuto for blocked state (resumable)
- phase:"planning" instead of "blocked" when partial-dep fallback
picks a slice (slice-level only; milestone-level blocked unchanged)
- activeSlice set via fallback instead of null
rebuildChatFromMessages() called populatePinnedFromMessages() which
re-populated the pinned zone with text already present in the chat
history, causing visible duplication during session state changes.
Additionally, the spinner interval at 80ms generated ~12.5 renders/s
for a purely cosmetic animation, and clearOnShrink triggered
unnecessary full redraws during pinned-zone transitions.
- Remove populatePinnedFromMessages() from rebuildChatFromMessages()
and add pinnedMessageContainer.clear() instead — the streaming
lifecycle in chat-controller manages pinned content during active work
- Reduce spinner interval 80ms→200ms with render-batching that skips
redundant renders when streaming already triggers requestRender()
- Debounce clearOnShrink: defer full redraw by one render tick so
pinned-clear→new-streaming transitions avoid a wasted full redraw
- Increase notification widget safety-net timer 5s→30s since the
store subscription already handles push-based updates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Change phase:"blocked" from stopAuto to pauseAuto — sessions are now
resumable instead of requiring manual /gsd auto restart
- Default reassess_after_slice to true — reassessment fires after every
slice completion unless explicitly disabled (was opt-in, causing missed
reassessments in multi-slice milestones)
- Change dispatch no-match fallthrough from level:"info" (hard stop) to
level:"warning" (pause) — unhandled phases are now recoverable
- Add dependency-resolution fallback in resolveSliceDependencies — when
no slice has ALL deps satisfied, picks the one with the most deps met
instead of immediately returning blocked (both DB and file-based paths)
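
The partial-dependency fallback in the last bullet can be sketched like this. The Slice shape, the pickNextSlice name, and the first-wins tie-break are assumptions for illustration; resolveSliceDependencies itself may differ.

```typescript
// Hypothetical slice shape: an id plus the ids it depends on.
interface Slice {
  id: string;
  deps: string[];
}

// Prefers a slice whose deps are ALL satisfied. If none exists, falls
// back to the slice with the most satisfied deps (first wins on ties).
function pickNextSlice(pending: Slice[], done: Set<string>): { slice: Slice | null; viaFallback: boolean } {
  const ready = pending.filter((s) => s.deps.every((d) => done.has(d)))[0];
  if (ready) return { slice: ready, viaFallback: false };

  let best: Slice | null = null;
  let bestMet = -1;
  for (const s of pending) {
    const met = s.deps.filter((d) => done.has(d)).length;
    if (met > bestMet) {
      best = s;
      bestMet = met;
    }
  }
  return { slice: best, viaFallback: best !== null };
}

const pendingSlices: Slice[] = [
  { id: "S3", deps: ["S1", "S2"] },
  { id: "S4", deps: ["S2", "S5"] },
];
// Only S1 is done: no slice is fully ready, S3 has the most deps met.
const fallbackPick = pickNextSlice(pendingSlices, new Set(["S1"]));
// S1 and S2 are done: S3 is fully ready, no fallback needed.
const readyPick = pickNextSlice(pendingSlices, new Set(["S1", "S2"]));
const emptyPick = pickNextSlice([], new Set(["S1"]));
```
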
src/app-paths.js is tracked output from 2022 (commit d93956ba4) that lacks the
modern GSD_HOME env support and webPreferencesPath export present in the .ts
source. No runtime path consumes it, but the test compile script's
copyAssets step overlays src/* onto esbuild output in dist-test, so the
stale .js was shadowing the compiled app-paths and breaking any unit
test transitively importing webPreferencesPath.
Cover the canonical parseCliArgs export in cli-web-branch.ts including
the new mcp mode, worktree flag (boolean and named forms), and existing
short flags, web mode flags, list flags, and positional message handling.
Also remove src/app-paths.js — a stale tracked output (last touched in
2022, missing GSD_HOME and webPreferencesPath exports). The test compile
script copies all of src/ over esbuild's output, so this stale .js was
shadowing the compiled app-paths in dist-test and breaking any test that
transitively imported it. No runtime path uses it (production loads from
dist/app-paths.js; jiti/tsx prefer the .ts source).
Satisfies require-tests.sh on PR #4162.
Pure deletion/deduplication pass on top-level src/*.ts. External behavior
unchanged; all targeted unit tests still pass.
cli.ts (−170 net lines)
- Adopt canonical validateConfiguredModel from startup-model-validation.ts;
delete the drifted local copy with hardcoded model fallbacks.
- Import CliFlags + parseCliArgs from cli-web-branch.ts instead of keeping
a second, 90%-identical parser; pass cliFlags directly into
runWebCliBranch instead of re-parsing process.argv.
- Extract 3 helpers for verbatim duplicates:
* printNonTtyErrorAndExit (TTY gate, 2 call sites)
* printExtensionErrors (extension load errors, 2 call sites)
* reapplyValidatedModelOnFallback (post-createAgentSession fix, 2 sites)
- Factor runHeadlessFromAuto helper shared by the `gsd auto` shorthand
and the auto-piped-stdout redirect.
- Collapse ensureRtkBootstrap from hand-rolled _done flag to a
promise-memoized doRtkBootstrap.
- Drop redundant validateConfiguredModel pre-createAgentSession calls
(the post-createAgentSession call is the correct one per #2626).
- Delete dead --version/-v and --help/-h fast paths (loader.ts already
handles these before cli.ts is imported).
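
For the promise-memoized bootstrap mentioned in the list above, a generic sketch of the pattern follows. memoizeAsync and the demo names are hypothetical; this is not the body of ensureRtkBootstrap or doRtkBootstrap.

```typescript
// Wraps an async function so it runs at most once. Every caller gets
// the same promise, including callers that arrive while it is pending.
function memoizeAsync<T>(fn: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) pending = fn();
    return pending;
  };
}

// Demo: the underlying bootstrap body runs once across repeated calls.
let bootstrapCalls = 0;
const ensureBootstrapDemo = memoizeAsync(async () => {
  bootstrapCalls += 1;
});

const firstBootstrap = ensureBootstrapDemo();
const secondBootstrap = ensureBootstrapDemo();
```

Compared with a boolean done flag, caching the promise also covers the window where a second caller arrives before the first run has finished. One trade-off in this sketch is that a rejected promise stays cached, so a failed bootstrap is not retried.
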
cli-web-branch.ts
- Unify CliFlags with worktree, 'mcp' mode, and _selectedSessionPath.
- Drop unused help?/version? flags (loader.ts intercepts them).
onboarding.ts
- Add runStep<T>() helper with shared cancel/warn handling; collapse 4
near-identical try/catch blocks around runLlmStep, runWebSearchStep,
runRemoteQuestionsStep, runToolKeysStep.
- Delete trivial isCancelError helper (inlined as p.isCancel).
- Rewrite loadPico() adapter to build PicoModule from chalk so we can
drop the redundant picocolors dependency.
package.json / package-lock.json
- Remove picocolors direct dep (chalk remains the single color library).
The two new sub-turn shrink regression tests created a pinned
DynamicBorder (via message_update with pinnable text + tool) but never
emitted message_end, so the spinner's setInterval kept the test process
alive until CI timed out after 15 minutes. Append a message_end to
each test so the module-level pinnedBorder is torn down.
Commit c8c416802 (#4144) introduced module-level renderedSegments state
to track interleaved text/tool components per assistant turn, but never
reset it when an adapter shrinks streamingMessage.content[] back to 0/1
at a provider sub-turn boundary within one assistant lifecycle (the
claude-code adapter does this). Consequence chain: the segment walker
finds the stale text-run entry at startIndex=0, calls updateContent on
it with the new (shrunk) message, and the in-place edit destroys the
prior sub-turn's visible text. New tool blocks at contentIndex=1 then
collide with stale registrations, causing visual ordering corruption.
hasToolsInTurn stays sticky-true and lastPinnedText never clears, so
the pinned "Working - Latest Output" mirror freezes on the pre-shrink
snapshot.
Track lastContentLength explicitly. On shrink, clear renderedSegments,
reset lastPinnedText, and reset lastProcessedContentIndex so the
walker treats the new sub-turn as fresh segments that append after
prior sub-turn children. Prior history stays rendered as frozen
components; pendingTools and the spinner border are untouched.
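
The shrink handling can be sketched as below. The TurnState shape, the onContentUpdate name, and the concrete reset values (empty string, index 0) are assumptions; only the field names come from the description above.

```typescript
// Hypothetical per-turn state, using the field names from the commit.
interface TurnState {
  lastContentLength: number;
  renderedSegments: Map<number, string>;
  lastPinnedText: string;
  lastProcessedContentIndex: number;
}

// Detects a content[] shrink (a sub-turn boundary) and resets the
// per-sub-turn tracking so later blocks are treated as fresh segments.
// Returns true when a shrink was detected.
function onContentUpdate(state: TurnState, contentLength: number): boolean {
  const shrunk = contentLength < state.lastContentLength;
  if (shrunk) {
    state.renderedSegments.clear();
    state.lastPinnedText = "";          // assumed reset value
    state.lastProcessedContentIndex = 0; // assumed reset value
  }
  state.lastContentLength = contentLength;
  return shrunk;
}

const turnState: TurnState = {
  lastContentLength: 0,
  renderedSegments: new Map<number, string>(),
  lastPinnedText: "",
  lastProcessedContentIndex: 0,
};

// First sub-turn grows to three blocks.
const grewFirst = onContentUpdate(turnState, 3);
turnState.renderedSegments.set(0, "text-run");
turnState.lastPinnedText = "first sub-turn text";
turnState.lastProcessedContentIndex = 2;

// Adapter shrinks content[] back to one block at a sub-turn boundary.
const shrankSecond = onContentUpdate(turnState, 1);
```
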
Adds two regression tests in chat-controller-ordering.test.ts: one
verifies prior sub-turn components are not overwritten and new tools
append in content[] order after a shrink, the other verifies the
pinned markdown updates from the first sub-turn's text to the second
sub-turn's text across a shrink boundary.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
reconcileSliceTasks called updateTaskStatus without a completedAt
timestamp, leaving tasks.completed_at NULL for all tasks completed
via the file-existence reconcile path.
Closes #4129
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>