Commit graph

2147 commits

Author SHA1 Message Date
Jeremy McSpadden
76a85300ae fix(gsd): align ADR-009 integration with type-safe builds
Add ADR-009 docs and resolve compile/runtime typing regressions in UOK and extension modules.

Refs #4214
2026-04-14 20:46:46 -05:00
Jeremy McSpadden
bb1b9dce07 Integrate UOK model policy gates and kernel loop adapter 2026-04-14 20:46:46 -05:00
Nils Reeh
15bccca78f feat(graph): implement knowledge graph system (closes #4202)
Ports the v1 graphify system to v2 as a native TypeScript implementation.
The knowledge graph builds semantic relationships between milestones, slices,
tasks, and knowledge entries — and injects relevant subgraphs automatically
into every agent dispatch prompt.

## Core implementation (packages/mcp-server/src/readers/graph.ts)

- `buildGraph(projectDir)` — walks all .gsd/ artifacts (STATE.md,
  milestone PLANs, slice PLANs, KNOWLEDGE.md), extracts nodes and edges
  with confidence tiers (EXTRACTED / INFERRED / AMBIGUOUS). Parse errors
  skip the node rather than crashing.
- `writeGraph(gsdRoot, graph)` — atomic write via tmp file + rename.
- `writeSnapshot(gsdRoot)` — saves a diff baseline before each rebuild.
- `graphQuery(projectDir, term, budget?)` — BFS subgraph search with
  case-insensitive matching on label + description; trims AMBIGUOUS edges
  first, then INFERRED, respecting the token budget (default 4 000).
- `graphStatus(projectDir)` — freshness check; stale = older than 24 h.
- `graphDiff(projectDir)` — compares current graph to last snapshot,
  returns added / removed / changed counts for nodes and edges.

## MCP tool (packages/mcp-server/src/server.ts)

Registers `gsd_graph` immediately after `gsd_knowledge` with four modes:
build | query | status | diff. All errors returned as isError: true.

## CLI subcommand (src/cli.ts, src/help-text.ts)

`gsd graph build|status|query <term>|diff` — follows the established
`if (cliFlags.messages[0] === '...')` dispatch pattern. Uses
`resolveGsdRoot()` for git-root-aware path resolution (not a naive
`.gsd` append). Help text updated with correct positional argument format.

## Auto-rebuild after slice completion
(src/resources/extensions/gsd/tools/complete-slice.ts)

Fire-and-forget `buildGraph → writeGraph` triggered after every slice
completion. Uses `@gsd-build/mcp-server` package import (not a relative
src path) and `resolveGsdRoot()` for correct path resolution in monorepos.

## Graph-aware dispatch injection
(src/resources/extensions/gsd/graph-context.ts,
 src/resources/extensions/gsd/auto-prompts.ts)

`inlineGraphSubgraph(projectDir, term, { budget })` queries the graph and
formats the result as a `### Knowledge Graph Context` markdown block,
consistent with all other inlined context blocks. Adds a stale warning
annotation when the graph is older than 24 h. Returns null (graceful
skip) when graph.json is missing, the query returns zero nodes, or the
import fails — no agent dispatch is ever blocked by graph availability.

Injected into three prompt builders:
- `buildResearchSlicePrompt` — 3 000 token budget
- `buildPlanSlicePrompt`     — 3 000 token budget
- `buildExecuteTaskPrompt`   — 2 000 token budget

## Tests

- 22 tests for the core graph reader (graph.test.ts)
- 14 tests for the dispatch injection helper (graph-context.test.ts)
- All tests use real on-disk fixtures (no module mocking needed)
- Full suite: 6 318 passed, 0 failed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 02:20:49 +02:00
Jeremy McSpadden
f4f365a27a Merge pull request #4028 from mastertyko/fix/3838-closeout-cancelled-units
fix(gsd): close out cancelled auto units
2026-04-14 18:29:38 -05:00
Jeremy
1c2150da03 fix(gsd): harden pr-branch/ship argv-safety and canonical artifact paths
- pr-branch: use execFileSync everywhere, validate --name with git check-ref-format,
  cherry-pick --no-commit + path-level reset to strip .gsd/.planning/PLAN.md from
  mixed commits, and assert no excluded paths leak into the final branch
- ship: argv-safe git push and gh pr create, validate base/current refs, and
  resolve ROADMAP/SUMMARY via resolveMilestoneFile/resolveSliceFile
- add-tests: resolve slice SUMMARY via resolveSliceFile instead of hardcoded path
2026-04-14 18:10:16 -05:00
Jeremy McSpadden
d8a472d4b4 fix: align v1→v2 commands with upstream types, remove engine-dependent slice mutations
- Fix ProjectTotals field names (totalCost→cost, totalDuration→duration, etc.)
- Fix UnitMetrics field names (status→finishedAt, unitId→id)
- Fix ModelAggregate field name (count→units)
- Fix ActiveRef usage (no phase/slices fields)
- Remove commands-slice-mutation.ts (depends on single-writer engine)
- Remove slice mutation routes from catalog, do command, and workflow handler
- Update backlog promote to work without engine slice commands
2026-04-14 17:48:27 -05:00
Jeremy McSpadden
0516a611e3 refactor(gsd): remove /gsd map-codebase command
Drop map-codebase from the v1→v2 parity PR: handler, prompt template,
catalog entry, nested completions, route in ops.ts, and keyword route
in the do command.
2026-04-14 17:48:02 -05:00
Jeremy McSpadden
3e37264e3f feat(gsd): add v1→v2 command parity — 12 missing commands
Add 12 commands that exist in GSDv1 but had no v2 equivalent:

High priority:
- ship: Create PR from milestone artifacts (title, body, metrics)
- add-slice: Append slice to roadmap via engine updateRoadmap()
- insert-slice: Insert slice at position with reordering
- remove-slice: Remove pending slice (--force for planned slices)
- do: Natural language routing via keyword matching (30 routes)
- session-report: Session cost/tokens/work summary (--json, --save)

Medium priority:
- backlog: Structured backlog with 999.x numbering (add/promote/remove)
- pr-branch: Clean PR branch filtering .gsd/ commits via cherry-pick
- add-tests: LLM-dispatched test generation for completed slices
- map-codebase: Codebase analysis (tech/arch/quality/concerns)

All slice mutations go through the engine's updateRoadmap() command,
preserving the single-writer architecture. No direct markdown edits.

Includes 46 unit tests across 6 test files, 2 prompt templates,
catalog entries with nested completions for all commands.
2026-04-14 17:47:33 -05:00
Jeremy McSpadden
d34adb1e5f Merge pull request #4199 from jeremymcs/claude/workflow-logger-state-system-ZNnYm
Wire workflow-logger through the state system
2026-04-14 17:16:13 -05:00
Jeremy McSpadden
f100b9d00a Merge pull request #4190 from NilsR0711/fix/native-git-bridge-exec-sync-windows
fix(gsd): replace execSync with execFileSync in nativeCommit, nativeIsRepo, nativeResetHard fallbacks
2026-04-14 15:25:34 -05:00
Jeremy McSpadden
c8d7474f57 Merge pull request #4192 from mastertyko/fix/4130-workflow-run-quoted-overrides
fix(gsd): preserve quoted workflow run overrides
2026-04-14 15:18:08 -05:00
Claude
74ca7cd8cd Wire workflow-logger through the state system
The workflow-logger per-unit buffer API (_resetLogs / drainAndSummarize /
formatForNotification) had zero callers outside tests, so accumulated
warnings never reached users as a consolidated post-unit alert and the
buffer leaked across units in the same Node process. Several state-layer
sites also silently swallowed errors that should have surfaced.

- auto/phases.ts: reset logger in runUnitPhase, drain + ctx.ui.notify in
  runFinalize success path, drain in both finalize timeout branches so
  timed-out unit logs don't bleed into the next iteration
- auto/detect-stuck.ts: enrich stuck reasons with summarizeLogs() so
  recovery has the diagnostic context (read-only peek, no drain)
- auto.ts: call setLogBasePath(base) in startAuto to pin the audit log
  on /clear resume and hot-reload paths that bypass dynamic-tools bootstrap
- workflow-manifest.ts: log snapshotState ROLLBACK failures (split-brain
  signal) instead of silently swallowing them
- state.ts: log reconcileDiskToDb roadmap read failures instead of silent
  continue
- workflow-projections.ts: log renderStateProjection DB handle probe
  failures instead of silent return

New regression tests cover the phases.ts wiring (source-scan), setLogBasePath
in startAuto, detect-stuck enrichment runtime behavior (including the
read-only peek invariant), and the three silent-catch fixes.
2026-04-14 14:01:20 -05:00
Jeremy McSpadden
0d66433b4f Merge pull request #4198 from jeremymcs/claude/single-writer-database-pattern-qOaQb
Enforce single-writer invariant for engine database writes
2026-04-14 13:56:54 -05:00
Jeremy McSpadden
a57be5267d Merge branch 'main' into fix/4130-workflow-run-quoted-overrides 2026-04-14 13:43:01 -05:00
Claude
5e9196e5c9 refactor(gsd): enforce single-writer invariant for engine DB
Route every INSERT/UPDATE/DELETE/REPLACE against .gsd/gsd.db through typed
wrappers in gsd-db.ts and add a structural test that fails CI if a new
bypass appears. Previously 13 call sites across 10 modules reached into
_getAdapter() and issued raw write SQL, making the "single writer"
architecture unenforceable in-process.

New wrappers in gsd-db.ts: deleteDecisionById, deleteRequirementById,
deleteArtifactByPath, clearEngineHierarchy, insertOrIgnoreSlice,
insertOrIgnoreTask, setSliceReplanTriggeredAt, upsertQualityGate,
restoreManifest, bulkInsertLegacyHierarchy, readTransaction, and eight
memory-store helpers (insertMemoryRow, rewriteMemoryId, etc).

workflow-manifest.restore() is lifted verbatim into gsd-db.restoreManifest
with a type-only import of StateManifest to avoid circular runtime deps.
tools/workflow-tool-executors and workflow-manifest.snapshotState swap
their manual BEGIN DEFERRED/COMMIT/ROLLBACK dance for readTransaction().
unit-ownership.ts stays outside the invariant: it writes to a separate
.gsd/unit-claims.db by design.

tests/single-writer-invariant.test.ts walks every .ts file under gsd/
(excluding tests/ and the allowlist) and fails with a grouped violations
list on any regex match for .prepare/.exec raw writes, plus a positive
assertion that gsd-db.ts still exports each expected wrapper so the
structural test can't silently become a no-op.

https://claude.ai/code/session_01FZgXD3bjcddoFYsTEY6JhC
2026-04-14 18:28:24 +00:00
mastertyko
fa507a2375 fix(gsd): preserve quoted workflow run overrides 2026-04-14 19:25:46 +02:00
Nils Reeh
508af42707 fix(gsd): replace execSync with execFileSync in nativeCommit, nativeIsRepo, nativeResetHard fallbacks
On Windows, execSync spawns cmd.exe which cannot resolve git when Git for
Windows is installed via MSYS2/bash but not in cmd.exe's PATH. This caused
every auto-commit to fail silently, leaving all milestone work uncommitted.

All other fallback paths in native-git-bridge already use execFileSync — the
three affected functions were the outliers. execFileSync resolves the binary
directly without a shell intermediary and supports identical stdio/input options.

Closes #4180
2026-04-14 19:17:59 +02:00
Nils Reeh
563a1e1b21 fix(ci): cache dist alongside tsbuildinfo and use workflow-logger in catch blocks
- Include dist/ and packages/*/dist/ in the TypeScript incremental cache
  so that when tsbuildinfo indicates no changes, the compiled output files
  are still present. Without this, tsc with incremental:true skips emission
  when tsbuildinfo exists but dist/ is absent (fresh checkout + cache restore),
  causing downstream packages like @gsd/pi-tui to fail resolving @gsd/native
  subpath exports.
- Also hash source files in the cache key so dist is invalidated on code changes.
- Replace process.stderr.write with logWarning("bootstrap", ...) in catch blocks
  to satisfy the workflow-logger coverage test (#3348).
- Update extension-bootstrap-isolation tests to match the new logWarning pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:49:08 +02:00
Nils Reeh
517e7cf0fe fix(gsd): isolate /gsd command registration from extension bootstrap failures
On Windows, static imports in register-shortcuts.ts (added in v2.72) can
fail at module load time, causing the entire GSD extension to silently fail
to register. This makes /gsd unavailable despite the welcome screen
suggesting it.

Three changes:
- index.ts: register /gsd command before importing register-extension.js,
  wrapped in try-catch so bootstrap failures don't prevent core command
- register-extension.ts: remove duplicate registerGSDCommand call, wrap
  non-critical registrations (tools, shortcuts, hooks) in individual
  try-catch blocks so one failure doesn't prevent others
- Add structural contract tests verifying isolation properties

Closes #4168, closes #4172

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:36:19 +02:00
Jeremy McSpadden
2cafd66bed Merge pull request #4184 from jeremymcs/claude/refactor-code-cleanup-078AQ
fix: preserve image blocks in Claude Code SDK prompt path
2026-04-14 09:39:03 -05:00
Jeremy
01857ea180 fix(claude-code-cli): forward image blocks in SDK query prompt (#4183) 2026-04-14 09:30:02 -05:00
Jeremy McSpadden
bdef500c85 Merge pull request #4178 from jeremymcs/fix/4175-complete-milestone-false-merge
fix(auto-mode): prevent false milestone merge after complete-milestone failure (#4175)
2026-04-14 07:53:41 -05:00
Jeremy
4a2045d290 fix(state): DB-authoritative milestone completeness (#4179)
Read-side twin of #4175. `deriveStateFromDb` had a SUMMARY-file fallback
that could mark a milestone complete even when the DB row said otherwise,
allowing an orphan SUMMARY.md (crashed complete-milestone turn, partial
merge, manual edit) to cascade into a false auto-merge.

- buildCompletenessSet: drop SUMMARY fallback; only DB status decides.
- buildRegistryAndFindActive: remove SUMMARY from the completeness check;
  still consult SUMMARY as a title fallback for DB-certified milestones.
- allSlicesDone branch: drop `!summaryFile` clause so a terminal-validation
  + orphan-SUMMARY path flows through to `completing-milestone` instead of
  short-circuiting, letting complete-milestone re-run idempotently.

Regression tests: orphan SUMMARY with in-flight slice stays active; orphan
SUMMARY with all-slices-done + validation-terminal lands at
completing-milestone (does not report as already complete).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 07:43:29 -05:00
Jeremy
d07b9bf473 test(#4175): add regression guards for complete-milestone false-merge
Guards the three cooperating fixes shipped in #4178 via source inspection
so a future refactor cannot silently reintroduce the false-merge path:

- stopAuto now uses the DB getMilestone() status as the authoritative
  milestone-complete signal (falls back to SUMMARY presence only when
  the project DB is unavailable).
- postUnitPreVerification pauses auto-mode for complete-milestone after
  retries are exhausted instead of writing a stub blocker placeholder.
- recoverTimedOutUnit pauses complete-milestone instead of writing a
  stub blocker placeholder.

Unblocks the CI lint / require-tests.sh gate on PR #4178.
2026-04-14 07:35:00 -05:00
Jeremy
9957152024 fix(auto-mode): prevent false milestone merge after complete-milestone failure (#4175)
When complete-milestone failed verification, auto-mode could end up merging
the worktree to main anyway and emit a metadata-only merge warning, creating
a misleading near-complete signal while the SUMMARY was never actually written.

The blocker-placeholder path for complete-milestone wrote a stub SUMMARY
without updating DB status, and stopAuto's SUMMARY-presence check treated
the stub as a legitimate completion signal.

- auto-post-unit.ts: skip blocker placeholder and pause auto-mode on
  complete-milestone verification retry exhaustion.
- auto-timeout-recovery.ts: same guard for the idle/hard timeout path.
- auto.ts: make stopAuto Step 4 DB-authoritative (getMilestone.status ===
  "complete") with SUMMARY-presence fallback only for DB-unavailable
  legacy projects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 07:24:12 -05:00
Jeremy McSpadden
4f76d434b7 Merge pull request #4155 from NilsR0711/fix/completed-at-null-on-reconcile
fix(gsd): set completed_at when reconciling task status to complete
2026-04-14 06:44:07 -05:00
Jeremy McSpadden
cdcdb1459e Merge pull request #4176 from jeremymcs/worktree-fix-4094-validate-milestone-loop
fix(auto): pause validate-milestone loop when needs-remediation has no slices (#4094)
2026-04-14 06:42:51 -05:00
Jeremy McSpadden
a09b69e27d Merge pull request #4173 from jeremymcs/claude/gsd-step-guidance-5FNrM
Add user feedback when completing steps in step mode
2026-04-14 06:42:37 -05:00
Jeremy
6b21ccbd54 fix(auto): pause on validate-milestone needs-remediation without slices (#4094)
When validate-milestone wrote VALIDATION.md with verdict=needs-remediation
but the agent failed to call gsd_reassess_roadmap to add remediation
slices, state.ts re-derived phase: validating-milestone indefinitely
because the existing #3596/#3670 guard treats needs-remediation as
non-terminal regardless of whether new work was queued. The stuck
detector only fired after 3 consecutive dispatches (~$3 + ~12 min wasted
per incident). Reproduced on M022 and M024.

Add a post-unit guard in runPostUnitVerification for validate-milestone
units: if VALIDATION.md verdict is needs-remediation and no incomplete
slices exist for the milestone (DB-authoritative via getMilestoneSlices,
filesystem fallback via parseRoadmap), pause auto-mode immediately with
a clear blocker. The legitimate re-validation flow is preserved — when
remediation slices have been queued (any non-closed status), the guard
returns continue and the existing state machine handles the work.

Tests cover: pause on all-closed scenario, skipped-status handled as
closed, continue when a queued remediation slice exists, continue on
verdict=pass, and continue when no VALIDATION file is present.
2026-04-14 06:32:42 -05:00
Jeremy
2cec5a1014 test(gsd): cover step-mode completion message helper
Extracts the step-complete notification text into buildStepCompleteMessage
and STEP_COMPLETE_FALLBACK_MESSAGE so the copy can be unit-tested
directly (milestone complete, mid-flight with next unit, unknown phase,
and deriveState-failure fallback). Resolves require-tests CI failure
on PR #4173.
2026-04-14 06:11:53 -05:00
Claude
8fec87b6f2 fix(gsd): notify users what to do next after /gsd step finishes
In step mode, /gsd would run one unit and then silently exit the auto
loop, leaving users with no hint that they should /clear and /gsd again
to run the next step. Emit an info notify before returning "step-wizard"
from postUnitPostVerification so the TUI surfaces the next unit label
and the /clear + /gsd guidance (or /gsd auto to switch to auto mode).
Falls back to a generic message if deriveState throws, and handles the
milestone-complete case with a dedicated review message.

https://claude.ai/code/session_015yrPQbZTyJPqTsM654Ym3s
2026-04-14 11:03:04 +00:00
Jeremy McSpadden
78e8665c59 Merge pull request #4163 from deseltrus/fix/auto-mode-premature-stops
fix(auto): prevent premature auto-mode stops on blocked phase + missing reassessment
2026-04-14 05:28:11 -05:00
deseltrus
68bf425606 test: update assertions for blocked-phase behavior change
Tests now expect:
- pauseAuto instead of stopAuto for blocked state (resumable)
- phase:"planning" instead of "blocked" when partial-dep fallback
  picks a slice (slice-level only; milestone-level blocked unchanged)
- activeSlice set via fallback instead of null
2026-04-14 06:20:00 +02:00
deseltrus
73f9434d11 fix(tui): eliminate pinned output duplication and reduce render overhead
rebuildChatFromMessages() called populatePinnedFromMessages() which
re-populated the pinned zone with text already present in the chat
history, causing visible duplication during session state changes.
Additionally, the spinner interval at 80ms generated ~12.5 renders/s
for a purely cosmetic animation, and clearOnShrink triggered
unnecessary full redraws during pinned-zone transitions.

- Remove populatePinnedFromMessages() from rebuildChatFromMessages()
  and add pinnedMessageContainer.clear() instead — the streaming
  lifecycle in chat-controller manages pinned content during active work
- Reduce spinner interval 80ms→200ms with render-batching that skips
  redundant renders when streaming already triggers requestRender()
- Debounce clearOnShrink: defer full redraw by one render tick so
  pinned-clear→new-streaming transitions avoid a wasted full redraw
- Increase notification widget safety-net timer 5s→30s since the
  store subscription already handles push-based updates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 06:17:18 +02:00
deseltrus
8919a07962 fix(auto): prevent premature auto-mode stops on blocked phase + missing reassessment
- Change phase:"blocked" from stopAuto to pauseAuto — sessions are now
  resumable instead of requiring manual /gsd auto restart
- Default reassess_after_slice to true — reassessment fires after every
  slice completion unless explicitly disabled (was opt-in, causing missed
  reassessments in multi-slice milestones)
- Change dispatch no-match fallthrough from level:"info" (hard stop) to
  level:"warning" (pause) — unhandled phases are now recoverable
- Add dependency-resolution fallback in resolveSliceDependencies — when
  no slice has ALL deps satisfied, picks the one with the most deps met
  instead of immediately returning blocked (both DB and file-based paths)
2026-04-14 06:00:25 +02:00
Nils Reeh
365b36d96a fix(gsd): set completed_at when reconciling task status to complete
reconcileSliceTasks called updateTaskStatus without a completedAt
timestamp, leaving tasks.completed_at NULL for all tasks completed
via the file-existence reconcile path.

Closes #4129

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 02:13:38 +02:00
Jeremy McSpadden
fc7d195e09 Merge pull request #4150 from jeremymcs/claude/debug-tui-auto-mode-vCnxA
Split Container.clear() into clear() and detachChildren()
2026-04-13 18:39:33 -05:00
Claude
33d9a26dd7 fix(tui): keep AUTO-mode widgets alive and drop duplicate health panel
InteractiveMode.renderWidgets() called Container.clear() on the
widgetContainerAbove/Below render mounts, which disposed every mounted
extension widget and then re-added the now-dead components. In AUTO mode
updateProgressWidget re-registers gsd-progress on every unit dispatch,
so gsd-notifications and gsd-health had their refresh timers and store
subscriptions killed after the first dispatch. Renders kept returning
the widgets' frozen cachedLines, making them look alive but never update
(/gsd notifications clear appeared to do nothing, belowEditor last-commit
went stale while the top-of-screen dashboard stayed correct).

Split detach from dispose: add Container.detachChildren() and use it from
the two widget-mount call sites. clear() still disposes for every other
caller (chat, editor, status, pinned-message containers). The
extensionWidgets* maps remain the single owner of widget disposal via
removeExisting() and clearExtensionWidgets().

While in AUTO, gsd-progress duplicates gsd-health on last commit, cost/
budget, and the health signal. Make gsd-progress the single source of
truth: hide gsd-health from auto-start and re-register it from every
exit point in auto.ts (lock-lost stop, cleanupAfterLoopExit !paused
guard, stopAuto, pauseAuto). gsd-notifications stays visible — it is
independent state and, with the detach fix, its subscription + 5s
refresh actually work again.

Tests: Container.detachChildren()/clear() contract guards added to
packages/pi-tui/src/__tests__/tui.test.ts. health-widget,
notification-{store,widget,overlay}, notifications-handler, notifications,
and auto-paused-ui-cleanup suites all pass.
2026-04-13 23:30:25 +00:00
Nils Reeh
e3e72174fa fix(gsd): use bun for update when installed via Bun (#4145)
When GSD is installed with `bun add -g`, running `gsd update` or
`/gsd update` previously shelled out to `npm install -g`, which fails
with EACCES on systems where npm has no write access to the global
node_modules directory.

Adds `resolveInstallCommand(pkg)` to `update-check.ts` that returns
`bun add -g <pkg>` when `process.versions.bun` is defined (i.e. the
current runtime is Bun), and `npm install -g <pkg>` otherwise.  All
three update paths — `update-cmd.ts`, `commands-handlers.ts`, and the
interactive startup prompt in `update-check.ts` — now use this helper,
including the fallback error message shown to the user.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 00:52:08 +02:00
Jeremy McSpadden
1ec1a8c4c4 Merge pull request #4060 from mastertyko/fix/3917-claude-code-effort
feat(claude-code): pass thinking level as effort
2026-04-13 16:07:59 -05:00
Jeremy
8cf8d2bcf2 fix(gsd): restore isAutoMode plumbing and workflow-logger catch in auto-model-selection
CI on #4141 failed because threading an explicit flatRateCtx parameter
through resolvePreferredModelConfig broke two contracts the test suite
locks in:

  1. interactive-routing-bypass (#3962) asserts that
     resolvePreferredModelConfig is invoked with exactly three positional
     arguments and that its `if (!isAutoMode) return undefined` guard
     lives within the first 600 chars of the function body. The new
     flatRateCtx param + JSDoc pushed the guard past that window and
     lengthened the call site.

  2. silent-catch-diagnostics (#3348) requires migrated files to route
     through workflow-logger instead of leaving empty catch blocks. The
     new buildFlatRateContext() swallowed registry lookup errors with a
     comment-only catch.

Fix both without regressing flat-rate detection:

- Hang the flat-rate context off autoModeStartModel itself via an
  optional `flatRateCtx` field. selectAndApplyModel now enriches
  autoModeStartModel up front (preserving the variable name) and
  resolvePreferredModelConfig reads autoModeStartModel.flatRateCtx —
  signature shrinks back to three params, call site returns to the
  3-arg form the test anchors on.
- Replace the empty catch in buildFlatRateContext() with a
  logWarning("dispatch", ...) that surfaces the lookup failure while
  still falling through with authMode undefined, matching the
  fail-closed policy everywhere else in the file.
2026-04-13 16:00:01 -05:00
Claude
9a93563a64 feat(gsd): extend flat-rate provider detection to custom/externalCli providers
The 3-entry hard-coded FLAT_RATE_PROVIDERS set in auto-model-selection.ts
treated only github-copilot/copilot/claude-code as flat-rate, so dynamic
routing would happily downgrade units on user-registered subscription
proxies and any externalCli CLI wrapper — quality loss with no cost
benefit for users whose provider charges a flat rate per request.

Make isFlatRateProvider extensible by composing three signals:

  1. Built-in list (unchanged, wins first for regression safety).
  2. externalCli auto-detection via ctx.modelRegistry.getProviderAuthMode()
     — any CLI wrapper around the user's subscription is inherently
     flat-rate.
  3. User-declared `flat_rate_providers` preference for private
     subscription-backed proxies, enterprise-gated deployments, and custom
     CLI wrappers the built-in list doesn't know about.

Add a buildFlatRateContext() helper so every call site constructs the
context the same way and degrades gracefully when ctx/prefs/registry are
unavailable (never breaks flat-rate detection).

Thread the context through:

- resolvePreferredModelConfig (routing synthesis guard)
- selectAndApplyModel primary-model and fallback provider checks
- auto-start.ts dynamic-routing banner so the startup message matches
  dispatch-time reality

Preferences:
- Add `flat_rate_providers?: string[]` to GSDPreferences and
  KNOWN_PREFERENCE_KEYS in preferences-types.ts.
- Add a string-array validator in preferences-validation.ts that trims
  whitespace and drops empty entries.

Tests:
- Extend flat-rate-routing-guard.test.ts with 13 new cases covering
  externalCli auto-detection, userFlatRate preference matching
  (case-insensitive), combined signals, buildFlatRateContext() behavior
  (including registry-lookup-throws and non-canonical auth-mode
  responses), plus regression cases for the built-in list.
- Add 5 validator cases in preferences.test.ts for the new
  flat_rate_providers field (string-array accepted, whitespace trimmed,
  non-array rejected, non-string elements rejected, known-key warning
  check).
2026-04-13 20:25:26 +00:00
Jeremy
4fad01694c Merge upstream/main into fix/4122 custom-provider bootstrap 2026-04-13 14:05:12 -05:00
Claude
73558e7557 fix(gsd): preserve custom-model selection on /gsd auto bootstrap (#4122)
When a user picks a custom-provider model via /gsd model (Ollama, vLLM,
LM Studio, OpenAI-compatible proxies — anything defined in
~/.gsd/agent/models.json) and then runs /gsd auto, the bootstrap silently
swaps it out for whichever model PREFERENCES.md happens to list. That
model is invariably a built-in provider (claude-code, anthropic) the user
isn't logged into, so auto-mode immediately fails with
"Not logged in · Please run /login", pauses, and resets the session to
claude-code/claude-sonnet-4-6.

Root cause: #3517 made resolveDefaultSessionModel() (PREFERENCES.md) take
priority over ctx.model (settings.json) in auto-start.ts. That fix was
correct for the scenario where settings.json had a stale built-in default
but PREFERENCES.md was freshly configured, but it has no awareness of
custom providers — PREFERENCES.md cannot reference them, so honoring it
when the session provider is custom always discards the user's explicit
choice.

Add isCustomProvider() to preferences-models.ts which checks whether a
provider is declared in ~/.gsd/agent/models.json (with ~/.pi/agent
fallback). Read the file directly with JSON.parse to avoid pulling in
the model-registry at this call site, and treat any read or parse error
as not-custom so a malformed models.json never breaks bootstrap.

In bootstrapAutoSession(), when the session provider is custom, use
ctx.model directly. Otherwise fall through to the existing #3517
behavior (preferredModel ?? ctx.model).

Tests:
- New behavioral regression in model-isolation.test.ts that mirrors
  the auto-start.ts logic and verifies the four interesting cases:
  custom session beats PREFERENCES.md, built-in session still defers
  to PREFERENCES.md (#3517 preserved), custom session with no
  PREFERENCES.md uses ctx.model, and null ctx.model falls through.
- New string-grep guard in auto-start-model-capture.test.ts that the
  isCustomProvider() call is wired into the snapshot path.
- Updated #3517 grep to allow the new branching shape while still
  asserting preferredModel remains a snapshot source for built-ins.

https://claude.ai/code/session_01QLYCeiXWjSFPEXFxjkSLni
2026-04-13 17:53:32 +00:00
Tom Boucher
3ff8989a62 fix(gsd): address 3 silent-crash secondary issues from #3348 post-#3696 (#4133)
* fix(ci): address 5 pipeline integrity issues from release audit

- version-stamp.mjs: regenerate package-lock.json after dev version stamp
  (mirrors the same fix applied to bump-version.mjs in #4116)

- bump-version.mjs: regenerate root and web/package-lock.json after version
  bump so both lockfiles are always in sync at release time

- pipeline.yml: add post-bump validation step that verifies all package.json
  files parse as valid JSON before the release commit is made

- pipeline.yml: split "Commit, tag, and push" — commit+tag+rebase happen
  before build, but git push is deferred until after build and npm publish
  both succeed, preventing a broken tag from landing on main

- pipeline.yml: emit a :⚠️: annotation when live LLM tests fail so
  failures are visible in the Actions UI instead of silently swallowed

Closes #4118

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(gsd): address 3 silent-crash secondary issues from #3348 post-#3696

Three gaps that remained after the double-fault fix in #3696:

1. unhandledRejection not wired — installEpipeGuard only registered
   uncaughtException; promise rejections that escaped without a catch
   were not handled by the GSD error path. Added _gsdRejectionGuard
   alongside _gsdEpipeGuard.

2. Non-fatal overcorrection — the #3696 fix replaced re-throwing with
   log-and-continue, leaving the process running in an indeterminate
   state after any non-EPIPE/non-ENOENT exception. Replaced with
   writeCrashLog + process.exit(1). writeCrashLog is extracted into
   bootstrap/crash-log.ts (zero deps) so tests can import it without
   pulling in the full extension graph.

3. unit-end not emitted after crash-with-side-effects — hameltomor
   observed that complete-milestone M001 wrote SUMMARY.md and updated
   the DB but never emitted unit-end (#3348 comment-4237533440). Added
   emitCrashRecoveredUnitEnd() in crash-recovery.ts: on the next
   auto-mode startup, if a stale lock references a unit whose
   unit-start has no matching unit-end in the journal, a synthetic
   unit-end with status "crash-recovered" is emitted before the lock
   is cleared. This closes the causal chain for downstream tooling
   and forensics without requiring changes to the lock file schema.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 12:33:16 -04:00
mastertyko
3c44e3d4e2 fix(gsd): tolerate corrupt task arrays (#4056) 2026-04-13 12:09:51 -04:00
mastertyko
5474e99ae2 feat(claude-code): pass thinking level as effort 2026-04-13 18:05:19 +02:00
mastertyko
e6110976e7 fix(gsd): discard milestone DB and worktree state (#4065) 2026-04-13 12:04:38 -04:00
Nils Reeh
1a635ac72c fix(gsd): wire subagent_model preference through to dispatch prompt builders
reactive_execution.subagent_model was validated and stored but never
passed to the prompt builders that generate subagent dispatch instructions.
The executing agent therefore autonomously chose its default model instead
of the configured preference.

- buildReactiveExecutePrompt: add subagentModel? param, inject into
  instruction string; auto-dispatch passes reactiveConfig.subagent_model
  with fallback to resolveModelWithFallbacksForUnit("subagent")
- buildParallelResearchSlicesPrompt: same pattern, resolves from
  models.subagent preference
- buildGateEvaluatePrompt: same pattern
- system-context: inject configured subagent model into system prompt
  so the executing agent always knows which model to use for subagents

Closes #4078

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 14:59:04 +02:00
Iouri Goussev
71f10a0d53 chore(gsd): delete 3 unreferenced dead files and orphaned test (#3728)
Phase 0 of #3631 — remove dead code before screaming architecture reorg.

- auto-observability.ts (72 LOC): zero imports anywhere in codebase
- rtk-status.ts (53 LOC): zero imports anywhere in codebase
- file-watcher.ts (100 LOC): zero imports anywhere in codebase
- file-watcher.test.ts: test for dead file-watcher.ts

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 08:30:32 -04:00