Commit graph

2382 commits

Author SHA1 Message Date
Jeremy McSpadden
3e37264e3f feat(gsd): add v1→v2 command parity — 12 missing commands
Add 12 commands that exist in GSDv1 but had no v2 equivalent:

High priority:
- ship: Create PR from milestone artifacts (title, body, metrics)
- add-slice: Append slice to roadmap via engine updateRoadmap()
- insert-slice: Insert slice at position with reordering
- remove-slice: Remove pending slice (--force for planned slices)
- do: Natural language routing via keyword matching (30 routes)
- session-report: Session cost/tokens/work summary (--json, --save)

Medium priority:
- backlog: Structured backlog with 999.x numbering (add/promote/remove)
- pr-branch: Clean PR branch filtering .gsd/ commits via cherry-pick
- add-tests: LLM-dispatched test generation for completed slices
- map-codebase: Codebase analysis (tech/arch/quality/concerns)

All slice mutations go through the engine's updateRoadmap() command,
preserving the single-writer architecture. No direct markdown edits.

Includes 46 unit tests across 6 test files, 2 prompt templates,
catalog entries with nested completions for all commands.
2026-04-14 17:47:33 -05:00
Jeremy McSpadden
d34adb1e5f Merge pull request #4199 from jeremymcs/claude/workflow-logger-state-system-ZNnYm
Wire workflow-logger through the state system
2026-04-14 17:16:13 -05:00
Jeremy McSpadden
f100b9d00a Merge pull request #4190 from NilsR0711/fix/native-git-bridge-exec-sync-windows
fix(gsd): replace execSync with execFileSync in nativeCommit, nativeIsRepo, nativeResetHard fallbacks
2026-04-14 15:25:34 -05:00
Jeremy McSpadden
c8d7474f57 Merge pull request #4192 from mastertyko/fix/4130-workflow-run-quoted-overrides
fix(gsd): preserve quoted workflow run overrides
2026-04-14 15:18:08 -05:00
Jeremy McSpadden
d749a010ad Merge pull request #4193 from mastertyko/fix/4123-headless-query-open-project-db
fix(gsd): open project DB in headless query
2026-04-14 15:16:17 -05:00
Claude
74ca7cd8cd Wire workflow-logger through the state system
The workflow-logger per-unit buffer API (_resetLogs / drainAndSummarize /
formatForNotification) had zero callers outside tests, so accumulated
warnings never reached users as a consolidated post-unit alert and the
buffer leaked across units in the same Node process. Several state-layer
sites also silently swallowed errors that should have surfaced.

- auto/phases.ts: reset logger in runUnitPhase, drain + ctx.ui.notify in
  runFinalize success path, drain in both finalize timeout branches so
  timed-out unit logs don't bleed into the next iteration
- auto/detect-stuck.ts: enrich stuck reasons with summarizeLogs() so
  recovery has the diagnostic context (read-only peek, no drain)
- auto.ts: call setLogBasePath(base) in startAuto to pin the audit log
  on /clear resume and hot-reload paths that bypass dynamic-tools bootstrap
- workflow-manifest.ts: log snapshotState ROLLBACK failures (split-brain
  signal) instead of silently swallowing them
- state.ts: log reconcileDiskToDb roadmap read failures instead of silent
  continue
- workflow-projections.ts: log renderStateProjection DB handle probe
  failures instead of silent return

New regression tests cover the phases.ts wiring (source-scan), setLogBasePath
in startAuto, detect-stuck enrichment runtime behavior (including the
read-only peek invariant), and the three silent-catch fixes.
2026-04-14 14:01:20 -05:00
Jeremy McSpadden
0d66433b4f Merge pull request #4198 from jeremymcs/claude/single-writer-database-pattern-qOaQb
Enforce single-writer invariant for engine database writes
2026-04-14 13:56:54 -05:00
Jeremy McSpadden
a57be5267d Merge branch 'main' into fix/4130-workflow-run-quoted-overrides 2026-04-14 13:43:01 -05:00
Jeremy McSpadden
c203bf69bf Merge branch 'main' into fix/4123-headless-query-open-project-db 2026-04-14 13:43:00 -05:00
Claude
5e9196e5c9 refactor(gsd): enforce single-writer invariant for engine DB
Route every INSERT/UPDATE/DELETE/REPLACE against .gsd/gsd.db through typed
wrappers in gsd-db.ts and add a structural test that fails CI if a new
bypass appears. Previously 13 call sites across 10 modules reached into
_getAdapter() and issued raw write SQL, making the "single writer"
architecture unenforceable in-process.

New wrappers in gsd-db.ts: deleteDecisionById, deleteRequirementById,
deleteArtifactByPath, clearEngineHierarchy, insertOrIgnoreSlice,
insertOrIgnoreTask, setSliceReplanTriggeredAt, upsertQualityGate,
restoreManifest, bulkInsertLegacyHierarchy, readTransaction, and eight
memory-store helpers (insertMemoryRow, rewriteMemoryId, etc).

workflow-manifest.restore() is lifted verbatim into gsd-db.restoreManifest
with a type-only import of StateManifest to avoid circular runtime deps.
tools/workflow-tool-executors and workflow-manifest.snapshotState swap
their manual BEGIN DEFERRED/COMMIT/ROLLBACK dance for readTransaction().
unit-ownership.ts stays outside the invariant: it writes to a separate
.gsd/unit-claims.db by design.

tests/single-writer-invariant.test.ts walks every .ts file under gsd/
(excluding tests/ and the allowlist) and fails with a grouped violations
list on any regex match for .prepare/.exec raw writes, plus a positive
assertion that gsd-db.ts still exports each expected wrapper so the
structural test can't silently become a no-op.

https://claude.ai/code/session_01FZgXD3bjcddoFYsTEY6JhC
2026-04-14 18:28:24 +00:00
mastertyko
8eb1fa515d fix(gsd): open project DB in headless query 2026-04-14 19:25:51 +02:00
mastertyko
fa507a2375 fix(gsd): preserve quoted workflow run overrides 2026-04-14 19:25:46 +02:00
Nils Reeh
508af42707 fix(gsd): replace execSync with execFileSync in nativeCommit, nativeIsRepo, nativeResetHard fallbacks
On Windows, execSync spawns cmd.exe which cannot resolve git when Git for
Windows is installed via MSYS2/bash but not in cmd.exe's PATH. This caused
every auto-commit to fail silently, leaving all milestone work uncommitted.

All other fallback paths in native-git-bridge already use execFileSync — the
three affected functions were the outliers. execFileSync resolves the binary
directly without a shell intermediary and supports identical stdio/input options.

Closes #4180
2026-04-14 19:17:59 +02:00
Nils Reeh
563a1e1b21 fix(ci): cache dist alongside tsbuildinfo and use workflow-logger in catch blocks
- Include dist/ and packages/*/dist/ in the TypeScript incremental cache
  so that when tsbuildinfo indicates no changes, the compiled output files
  are still present. Without this, tsc with incremental:true skips emission
  when tsbuildinfo exists but dist/ is absent (fresh checkout + cache restore),
  causing downstream packages like @gsd/pi-tui to fail resolving @gsd/native
  subpath exports.
- Also hash source files in the cache key so dist is invalidated on code changes.
- Replace process.stderr.write with logWarning("bootstrap", ...) in catch blocks
  to satisfy the workflow-logger coverage test (#3348).
- Update extension-bootstrap-isolation tests to match the new logWarning pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:49:08 +02:00
Nils Reeh
517e7cf0fe fix(gsd): isolate /gsd command registration from extension bootstrap failures
On Windows, static imports in register-shortcuts.ts (added in v2.72) can
fail at module load time, causing the entire GSD extension to silently fail
to register. This makes /gsd unavailable despite the welcome screen
suggesting it.

Three changes:
- index.ts: register /gsd command before importing register-extension.js,
  wrapped in try-catch so bootstrap failures don't prevent core command
- register-extension.ts: remove duplicate registerGSDCommand call, wrap
  non-critical registrations (tools, shortcuts, hooks) in individual
  try-catch blocks so one failure doesn't prevent others
- Add structural contract tests verifying isolation properties

Closes #4168, closes #4172

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:36:19 +02:00
Jeremy
bc22ce95bc Cap thinking output for tool-bearing assistant turns 2026-04-14 10:15:43 -05:00
Jeremy McSpadden
2cafd66bed Merge pull request #4184 from jeremymcs/claude/refactor-code-cleanup-078AQ
fix: preserve image blocks in Claude Code SDK prompt path
2026-04-14 09:39:03 -05:00
Jeremy
01857ea180 fix(claude-code-cli): forward image blocks in SDK query prompt (#4183) 2026-04-14 09:30:02 -05:00
Jeremy McSpadden
7f77322fe2 Merge pull request #4182 from jeremymcs/claude/refactor-code-cleanup-078AQ
fix: keep assistant text visible when thinking traces are long
2026-04-14 09:17:49 -05:00
Jeremy
759bed7dae test: add regression coverage for thinking/chat visibility
Add a regression test for #4181 to ensure assistant-message caps thinking block height when text content is present.
2026-04-14 09:05:29 -05:00
Jeremy McSpadden
bdef500c85 Merge pull request #4178 from jeremymcs/fix/4175-complete-milestone-false-merge
fix(auto-mode): prevent false milestone merge after complete-milestone failure (#4175)
2026-04-14 07:53:41 -05:00
Jeremy
4a2045d290 fix(state): DB-authoritative milestone completeness (#4179)
Read-side twin of #4175. `deriveStateFromDb` had a SUMMARY-file fallback
that could mark a milestone complete even when the DB row said otherwise,
allowing an orphan SUMMARY.md (crashed complete-milestone turn, partial
merge, manual edit) to cascade into a false auto-merge.

- buildCompletenessSet: drop SUMMARY fallback; only DB status decides.
- buildRegistryAndFindActive: remove SUMMARY from the completeness check;
  still consult SUMMARY as a title fallback for DB-certified milestones.
- allSlicesDone branch: drop `!summaryFile` clause so a terminal-validation
  + orphan-SUMMARY path flows through to `completing-milestone` instead of
  short-circuiting, letting complete-milestone re-run idempotently.

Regression tests: orphan SUMMARY with in-flight slice stays active; orphan
SUMMARY with all-slices-done + validation-terminal lands at
completing-milestone (does not report as already complete).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 07:43:29 -05:00
Jeremy
d07b9bf473 test(#4175): add regression guards for complete-milestone false-merge
Guards the three cooperating fixes shipped in #4178 via source inspection
so a future refactor cannot silently reintroduce the false-merge path:

- stopAuto now uses the DB getMilestone() status as the authoritative
  milestone-complete signal (falls back to SUMMARY presence only when
  the project DB is unavailable).
- postUnitPreVerification pauses auto-mode for complete-milestone after
  retries are exhausted instead of writing a stub blocker placeholder.
- recoverTimedOutUnit pauses complete-milestone instead of writing a
  stub blocker placeholder.

Unblocks the CI lint / require-tests.sh gate on PR #4178.
2026-04-14 07:35:00 -05:00
Jeremy
9957152024 fix(auto-mode): prevent false milestone merge after complete-milestone failure (#4175)
When complete-milestone failed verification, auto-mode could end up merging
the worktree to main anyway and emit a metadata-only merge warning, creating
a misleading near-complete signal while the SUMMARY was never actually written.

The blocker-placeholder path for complete-milestone wrote a stub SUMMARY
without updating DB status, and stopAuto's SUMMARY-presence check treated
the stub as a legitimate completion signal.

- auto-post-unit.ts: skip blocker placeholder and pause auto-mode on
  complete-milestone verification retry exhaustion.
- auto-timeout-recovery.ts: same guard for the idle/hard timeout path.
- auto.ts: make stopAuto Step 4 DB-authoritative (getMilestone.status ===
  "complete") with SUMMARY-presence fallback only for DB-unavailable
  legacy projects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 07:24:12 -05:00
Jeremy McSpadden
4f76d434b7 Merge pull request #4155 from NilsR0711/fix/completed-at-null-on-reconcile
fix(gsd): set completed_at when reconciling task status to complete
2026-04-14 06:44:07 -05:00
Jeremy McSpadden
cdcdb1459e Merge pull request #4176 from jeremymcs/worktree-fix-4094-validate-milestone-loop
fix(auto): pause validate-milestone loop when needs-remediation has no slices (#4094)
2026-04-14 06:42:51 -05:00
Jeremy McSpadden
a09b69e27d Merge pull request #4173 from jeremymcs/claude/gsd-step-guidance-5FNrM
Add user feedback when completing steps in step mode
2026-04-14 06:42:37 -05:00
Jeremy
6b21ccbd54 fix(auto): pause on validate-milestone needs-remediation without slices (#4094)
When validate-milestone wrote VALIDATION.md with verdict=needs-remediation
but the agent failed to call gsd_reassess_roadmap to add remediation
slices, state.ts re-derived phase: validating-milestone indefinitely
because the existing #3596/#3670 guard treats needs-remediation as
non-terminal regardless of whether new work was queued. The stuck
detector only fired after 3 consecutive dispatches (~$3 + ~12 min wasted
per incident). Reproduced on M022 and M024.

Add a post-unit guard in runPostUnitVerification for validate-milestone
units: if VALIDATION.md verdict is needs-remediation and no incomplete
slices exist for the milestone (DB-authoritative via getMilestoneSlices,
filesystem fallback via parseRoadmap), pause auto-mode immediately with
a clear blocker. The legitimate re-validation flow is preserved — when
remediation slices have been queued (any non-closed status), the guard
returns continue and the existing state machine handles the work.

Tests cover: pause on all-closed scenario, skipped-status handled as
closed, continue when a queued remediation slice exists, continue on
verdict=pass, and continue when no VALIDATION file is present.
2026-04-14 06:32:42 -05:00
Jeremy
2cec5a1014 test(gsd): cover step-mode completion message helper
Extracts the step-complete notification text into buildStepCompleteMessage
and STEP_COMPLETE_FALLBACK_MESSAGE so the copy can be unit-tested
directly (milestone complete, mid-flight with next unit, unknown phase,
and deriveState-failure fallback). Resolves require-tests CI failure
on PR #4173.
2026-04-14 06:11:53 -05:00
Jeremy McSpadden
5958184e2a Merge pull request #4162 from jeremymcs/claude/refactor-code-cleanup-078AQ
Refactor CLI arg parsing and consolidate shared helpers
2026-04-14 06:10:21 -05:00
Claude
8fec87b6f2 fix(gsd): notify users what to do next after /gsd step finishes
In step mode, /gsd would run one unit and then silently exit the auto
loop, leaving users with no hint that they should /clear and /gsd again
to run the next step. Emit an info notify before returning "step-wizard"
from postUnitPostVerification so the TUI surfaces the next unit label
and the /clear + /gsd guidance (or /gsd auto to switch to auto mode).
Falls back to a generic message if deriveState throws, and handles the
milestone-complete case with a dedicated review message.

https://claude.ai/code/session_015yrPQbZTyJPqTsM654Ym3s
2026-04-14 11:03:04 +00:00
Jeremy
1a8ba9a43b fix(cli): restore --help handling when it follows a subcommand or unknown flag
The #4162 refactor removed parseCliArgs' inline --help handler assuming
loader.ts's fast-path covered it, but loader.ts only intercepts --help/-h
as argv[1]. That broke:

- gsd update --help — fell through to runUpdate() (subcommand help
  check sat dead-code below the update handler)
- gsd --unknown --help in non-TTY — tripped the TTY gate and exited 1

Move the subcommand-help check ahead of every subcommand handler and
fall back to general help when no subcommand matches, so --help wins
whenever it appears anywhere in argv.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 05:50:47 -05:00
Jeremy McSpadden
78e8665c59 Merge pull request #4163 from deseltrus/fix/auto-mode-premature-stops
fix(auto): prevent premature auto-mode stops on blocked phase + missing reassessment
2026-04-14 05:28:11 -05:00
deseltrus
68bf425606 test: update assertions for blocked-phase behavior change
Tests now expect:
- pauseAuto instead of stopAuto for blocked state (resumable)
- phase:"planning" instead of "blocked" when partial-dep fallback
  picks a slice (slice-level only; milestone-level blocked unchanged)
- activeSlice set via fallback instead of null
2026-04-14 06:20:00 +02:00
deseltrus
73f9434d11 fix(tui): eliminate pinned output duplication and reduce render overhead
rebuildChatFromMessages() called populatePinnedFromMessages() which
re-populated the pinned zone with text already present in the chat
history, causing visible duplication during session state changes.
Additionally, the spinner interval at 80ms generated ~12.5 renders/s
for a purely cosmetic animation, and clearOnShrink triggered
unnecessary full redraws during pinned-zone transitions.

- Remove populatePinnedFromMessages() from rebuildChatFromMessages()
  and add pinnedMessageContainer.clear() instead — the streaming
  lifecycle in chat-controller manages pinned content during active work
- Reduce spinner interval 80ms→200ms with render-batching that skips
  redundant renders when streaming already triggers requestRender()
- Debounce clearOnShrink: defer full redraw by one render tick so
  pinned-clear→new-streaming transitions avoid a wasted full redraw
- Increase notification widget safety-net timer 5s→30s since the
  store subscription already handles push-based updates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 06:17:18 +02:00
deseltrus
8919a07962 fix(auto): prevent premature auto-mode stops on blocked phase + missing reassessment
- Change phase:"blocked" from stopAuto to pauseAuto — sessions are now
  resumable instead of requiring manual /gsd auto restart
- Default reassess_after_slice to true — reassessment fires after every
  slice completion unless explicitly disabled (was opt-in, causing missed
  reassessments in multi-slice milestones)
- Change dispatch no-match fallthrough from level:"info" (hard stop) to
  level:"warning" (pause) — unhandled phases are now recoverable
- Add dependency-resolution fallback in resolveSliceDependencies — when
  no slice has ALL deps satisfied, picks the one with the most deps met
  instead of immediately returning blocked (both DB and file-based paths)
2026-04-14 06:00:25 +02:00
Jeremy
1c2096d8f4 chore: remove stale src/app-paths.js leftover
Tracked output from 2022 (commit d93956ba4) that's missing the modern
GSD_HOME env support and webPreferencesPath export present in the .ts
source. No runtime path consumes it, but the test compile script's
copyAssets step overlays src/* onto esbuild output in dist-test, so the
stale .js was shadowing the compiled app-paths and breaking any unit
test transitively importing webPreferencesPath.
2026-04-13 22:38:53 -05:00
Jeremy
6302c952fe test(cli): add unit tests for parseCliArgs
Cover the canonical parseCliArgs export in cli-web-branch.ts including
the new mcp mode, worktree flag (boolean and named forms), and existing
short flags, web mode flags, list flags, and positional message handling.

Also remove src/app-paths.js — a stale tracked output (last touched in
2022, missing GSD_HOME and webPreferencesPath exports). The test compile
script copies all of src/ over esbuild's output, so this stale .js was
shadowing the compiled app-paths in dist-test and breaking any test that
transitively imported it. No runtime path uses it (production loads from
dist/app-paths.js; jiti/tsx prefer the .ts source).

Satisfies require-tests.sh on PR #4162.
2026-04-13 22:37:14 -05:00
Claude
679b3177a8 refactor(cli): slim down top-level src/ — dedup, unused fallbacks, onboarding
Pure deletion/deduplication pass on top-level src/*.ts. External behavior
unchanged; all targeted unit tests still pass.

cli.ts (−170 net lines)
  - Adopt canonical validateConfiguredModel from startup-model-validation.ts;
    delete the drifted local copy with hardcoded model fallbacks.
  - Import CliFlags + parseCliArgs from cli-web-branch.ts instead of keeping
    a second, 90%-identical parser; pass cliFlags directly into
    runWebCliBranch instead of re-parsing process.argv.
  - Extract 3 helpers for verbatim duplicates:
      * printNonTtyErrorAndExit (TTY gate, 2 call sites)
      * printExtensionErrors (extension load errors, 2 call sites)
      * reapplyValidatedModelOnFallback (post-createAgentSession fix, 2 sites)
  - Factor runHeadlessFromAuto helper shared by the `gsd auto` shorthand
    and the auto-piped-stdout redirect.
  - Collapse ensureRtkBootstrap from hand-rolled _done flag to a
    promise-memoized doRtkBootstrap.
  - Drop redundant validateConfiguredModel pre-createAgentSession calls
    (the post-createAgentSession call is the correct one per #2626).
  - Delete dead --version/-v and --help/-h fast paths (loader.ts already
    handles these before cli.ts is imported).

cli-web-branch.ts
  - Unify CliFlags with worktree, 'mcp' mode, and _selectedSessionPath.
  - Drop unused help?/version? flags (loader.ts intercepts them).

onboarding.ts
  - Add runStep<T>() helper with shared cancel/warn handling; collapse 4
    near-identical try/catch blocks around runLlmStep, runWebSearchStep,
    runRemoteQuestionsStep, runToolKeysStep.
  - Delete trivial isCancelError helper (inlined as p.isCancel).
  - Rewrite loadPico() adapter to build PicoModule from chalk so we can
    drop the redundant picocolors dependency.

package.json / package-lock.json
  - Remove picocolors direct dep (chalk remains the single color library).
2026-04-14 01:51:22 +00:00
Jeremy
a80d9e0edf fix(cli): use junction symlinks in merged node_modules path 2026-04-13 20:44:08 -05:00
Nils Reeh
365b36d96a fix(gsd): set completed_at when reconciling task status to complete
reconcileSliceTasks called updateTaskStatus without a completedAt
timestamp, leaving tasks.completed_at NULL for all tasks completed
via the file-existence reconcile path.

Closes #4129

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 02:13:38 +02:00
Jeremy McSpadden
fc7d195e09 Merge pull request #4150 from jeremymcs/claude/debug-tui-auto-mode-vCnxA
Split Container.clear() into clear() and detachChildren()
2026-04-13 18:39:33 -05:00
Claude
33d9a26dd7 fix(tui): keep AUTO-mode widgets alive and drop duplicate health panel
InteractiveMode.renderWidgets() called Container.clear() on the
widgetContainerAbove/Below render mounts, which disposed every mounted
extension widget and then re-added the now-dead components. In AUTO mode
updateProgressWidget re-registers gsd-progress on every unit dispatch,
so gsd-notifications and gsd-health had their refresh timers and store
subscriptions killed after the first dispatch. Renders kept returning
the widgets' frozen cachedLines, making them look alive but never update
(/gsd notifications clear appeared to do nothing, belowEditor last-commit
went stale while the top-of-screen dashboard stayed correct).

Split detach from dispose: add Container.detachChildren() and use it from
the two widget-mount call sites. clear() still disposes for every other
caller (chat, editor, status, pinned-message containers). The
extensionWidgets* maps remain the single owner of widget disposal via
removeExisting() and clearExtensionWidgets().

While in AUTO, gsd-progress duplicates gsd-health on last commit, cost/
budget, and the health signal. Make gsd-progress the single source of
truth: hide gsd-health from auto-start and re-register it from every
exit point in auto.ts (lock-lost stop, cleanupAfterLoopExit !paused
guard, stopAuto, pauseAuto). gsd-notifications stays visible — it is
independent state and, with the detach fix, its subscription + 5s
refresh actually work again.

Tests: Container.detachChildren()/clear() contract guards added to
packages/pi-tui/src/__tests__/tui.test.ts. health-widget,
notification-{store,widget,overlay}, notifications-handler, notifications,
and auto-paused-ui-cleanup suites all pass.
2026-04-13 23:30:25 +00:00
Nils Reeh
e3e72174fa fix(gsd): use bun for update when installed via Bun (#4145)
When GSD is installed with `bun add -g`, running `gsd update` or
`/gsd update` previously shelled out to `npm install -g`, which fails
with EACCES on systems where npm has no write access to the global
node_modules directory.

Adds `resolveInstallCommand(pkg)` to `update-check.ts` that returns
`bun add -g <pkg>` when `process.versions.bun` is defined (i.e. the
current runtime is Bun), and `npm install -g <pkg>` otherwise.  All
three update paths — `update-cmd.ts`, `commands-handlers.ts`, and the
interactive startup prompt in `update-check.ts` — now use this helper,
including the fallback error message shown to the user.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 00:52:08 +02:00
Jeremy McSpadden
1ec1a8c4c4 Merge pull request #4060 from mastertyko/fix/3917-claude-code-effort
feat(claude-code): pass thinking level as effort
2026-04-13 16:07:59 -05:00
Jeremy
8cf8d2bcf2 fix(gsd): restore isAutoMode plumbing and workflow-logger catch in auto-model-selection
CI on #4141 failed because threading an explicit flatRateCtx parameter
through resolvePreferredModelConfig broke two contracts the test suite
locks in:

  1. interactive-routing-bypass (#3962) asserts that
     resolvePreferredModelConfig is invoked with exactly three positional
     arguments and that its `if (!isAutoMode) return undefined` guard
     lives within the first 600 chars of the function body. The new
     flatRateCtx param + JSDoc pushed the guard past that window and
     lengthened the call site.

  2. silent-catch-diagnostics (#3348) requires migrated files to route
     through workflow-logger instead of leaving empty catch blocks. The
     new buildFlatRateContext() swallowed registry lookup errors with a
     comment-only catch.

Fix both without regressing flat-rate detection:

- Hang the flat-rate context off autoModeStartModel itself via an
  optional `flatRateCtx` field. selectAndApplyModel now enriches
  autoModeStartModel up front (preserving the variable name) and
  resolvePreferredModelConfig reads autoModeStartModel.flatRateCtx —
  signature shrinks back to three params, call site returns to the
  3-arg form the test anchors on.
- Replace the empty catch in buildFlatRateContext() with a
  logWarning("dispatch", ...) that surfaces the lookup failure while
  still falling through with authMode undefined, matching the
  fail-closed policy everywhere else in the file.
2026-04-13 16:00:01 -05:00
Claude
9a93563a64 feat(gsd): extend flat-rate provider detection to custom/externalCli providers
The 3-entry hard-coded FLAT_RATE_PROVIDERS set in auto-model-selection.ts
treated only github-copilot/copilot/claude-code as flat-rate, so dynamic
routing would happily downgrade units on user-registered subscription
proxies and any externalCli CLI wrapper — quality loss with no cost
benefit for users whose provider charges a flat rate per request.

Make isFlatRateProvider extensible by composing three signals:

  1. Built-in list (unchanged, wins first for regression safety).
  2. externalCli auto-detection via ctx.modelRegistry.getProviderAuthMode()
     — any CLI wrapper around the user's subscription is inherently
     flat-rate.
  3. User-declared `flat_rate_providers` preference for private
     subscription-backed proxies, enterprise-gated deployments, and custom
     CLI wrappers the built-in list doesn't know about.

Add a buildFlatRateContext() helper so every call site constructs the
context the same way and degrades gracefully when ctx/prefs/registry are
unavailable (never breaks flat-rate detection).

Thread the context through:

- resolvePreferredModelConfig (routing synthesis guard)
- selectAndApplyModel primary-model and fallback provider checks
- auto-start.ts dynamic-routing banner so the startup message matches
  dispatch-time reality

Preferences:
- Add `flat_rate_providers?: string[]` to GSDPreferences and
  KNOWN_PREFERENCE_KEYS in preferences-types.ts.
- Add a string-array validator in preferences-validation.ts that trims
  whitespace and drops empty entries.

Tests:
- Extend flat-rate-routing-guard.test.ts with 13 new cases covering
  externalCli auto-detection, userFlatRate preference matching
  (case-insensitive), combined signals, buildFlatRateContext() behavior
  (including registry-lookup-throws and non-canonical auth-mode
  responses), plus regression cases for the built-in list.
- Add 5 validator cases in preferences.test.ts for the new
  flat_rate_providers field (string-array accepted, whitespace trimmed,
  non-array rejected, non-string elements rejected, known-key warning
  check).
2026-04-13 20:25:26 +00:00
Jeremy
4fad01694c Merge upstream/main into fix/4122 custom-provider bootstrap 2026-04-13 14:05:12 -05:00
Claude
73558e7557 fix(gsd): preserve custom-model selection on /gsd auto bootstrap (#4122)
When a user picks a custom-provider model via /gsd model (Ollama, vLLM,
LM Studio, OpenAI-compatible proxies — anything defined in
~/.gsd/agent/models.json) and then runs /gsd auto, the bootstrap silently
swaps it out for whichever model PREFERENCES.md happens to list. That
model is invariably a built-in provider (claude-code, anthropic) the user
isn't logged into, so auto-mode immediately fails with
"Not logged in · Please run /login", pauses, and resets the session to
claude-code/claude-sonnet-4-6.

Root cause: #3517 made resolveDefaultSessionModel() (PREFERENCES.md) take
priority over ctx.model (settings.json) in auto-start.ts. That fix was
correct for the scenario where settings.json had a stale built-in default
but PREFERENCES.md was freshly configured, but it has no awareness of
custom providers — PREFERENCES.md cannot reference them, so honoring it
when the session provider is custom always discards the user's explicit
choice.

Add isCustomProvider() to preferences-models.ts which checks whether a
provider is declared in ~/.gsd/agent/models.json (with ~/.pi/agent
fallback). Read the file directly with JSON.parse to avoid pulling in
the model-registry at this call site, and treat any read or parse error
as not-custom so a malformed models.json never breaks bootstrap.

In bootstrapAutoSession(), when the session provider is custom, use
ctx.model directly. Otherwise fall through to the existing #3517
behavior (preferredModel ?? ctx.model).

Tests:
- New behavioral regression in model-isolation.test.ts that mirrors
  the auto-start.ts logic and verifies the four interesting cases:
  custom session beats PREFERENCES.md, built-in session still defers
  to PREFERENCES.md (#3517 preserved), custom session with no
  PREFERENCES.md uses ctx.model, and null ctx.model falls through.
- New string-grep guard in auto-start-model-capture.test.ts that the
  isCustomProvider() call is wired into the snapshot path.
- Updated #3517 grep to allow the new branching shape while still
  asserting preferredModel remains a snapshot source for built-ins.

https://claude.ai/code/session_01QLYCeiXWjSFPEXFxjkSLni
2026-04-13 17:53:32 +00:00
Tom Boucher
3ff8989a62 fix(gsd): address 3 silent-crash secondary issues from #3348 post-#3696 (#4133)
* fix(ci): address 5 pipeline integrity issues from release audit

- version-stamp.mjs: regenerate package-lock.json after dev version stamp
  (mirrors the same fix applied to bump-version.mjs in #4116)

- bump-version.mjs: regenerate root and web/package-lock.json after version
  bump so both lockfiles are always in sync at release time

- pipeline.yml: add post-bump validation step that verifies all package.json
  files parse as valid JSON before the release commit is made

- pipeline.yml: split "Commit, tag, and push" — commit+tag+rebase happen
  before build, but git push is deferred until after build and npm publish
  both succeed, preventing a broken tag from landing on main

- pipeline.yml: emit a :⚠️: annotation when live LLM tests fail so
  failures are visible in the Actions UI instead of silently swallowed

Closes #4118

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(gsd): address 3 silent-crash secondary issues from #3348 post-#3696

Three gaps that remained after the double-fault fix in #3696:

1. unhandledRejection not wired — installEpipeGuard only registered
   uncaughtException; promise rejections that escaped without a catch
   were not handled by the GSD error path. Added _gsdRejectionGuard
   alongside _gsdEpipeGuard.

2. Non-fatal overcorrection — the #3696 fix replaced re-throwing with
   log-and-continue, leaving the process running in an indeterminate
   state after any non-EPIPE/non-ENOENT exception. Replaced with
   writeCrashLog + process.exit(1). writeCrashLog is extracted into
   bootstrap/crash-log.ts (zero deps) so tests can import it without
   pulling in the full extension graph.

3. unit-end not emitted after crash-with-side-effects — hameltomor
   observed that complete-milestone M001 wrote SUMMARY.md and updated
   the DB but never emitted unit-end (#3348 comment-4237533440). Added
   emitCrashRecoveredUnitEnd() in crash-recovery.ts: on the next
   auto-mode startup, if a stale lock references a unit whose
   unit-start has no matching unit-end in the journal, a synthetic
   unit-end with status "crash-recovered" is emitted before the lock
   is cleared. This closes the causal chain for downstream tooling
   and forensics without requiring changes to the lock file schema.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 12:33:16 -04:00