Commit graph

2590 commits

Author SHA1 Message Date
Jeremy McSpadden
adeedef328 Merge pull request #3562 from jeremymcs/fix/harden-flat-rate-guard
fix(gsd): harden flat-rate routing guard against alias/resolution gaps
2026-04-05 15:16:01 -05:00
Jeremy McSpadden
d3a38bb771 Merge pull request #3552 from Tibsfox/fix/disable-routing-copilot
fix(gsd): disable dynamic model routing for flat-rate providers
2026-04-05 15:15:50 -05:00
Jeremy McSpadden
3e09184493 Merge pull request #3566 from jeremymcs/fix/complete-slice-string-coercion
fix(gsd): coerce string arrays to objects in complete-slice/task tools
2026-04-05 13:44:40 -05:00
Jeremy
0b7764349c chore(gsd): remove copyright line from test file
2026-04-05 13:33:13 -05:00
Jeremy
e210b7efdf fix(gsd): follow CONTRIBUTING standards for #3565
- Move new coercion tests to standalone file using node:test +
  node:assert/strict (per CONTRIBUTING testing standards)
- Remove tests from legacy complete-slice.test.ts to avoid mixing
  test frameworks in the same file
2026-04-05 13:32:56 -05:00
Jeremy
6046a31c6f fix(gsd): address Codex adversarial review findings for #3565
- verificationEvidence coercion now uses sentinel values (exitCode: -1,
  verdict: "unknown") instead of fabricating passing results
- String coercion for requirements fields now parses "ID — detail"
  delimiter format to preserve semantic payload
- Added regression tests for sentinel values and delimiter parsing

Closes #3565
2026-04-05 13:30:09 -05:00
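The coercion rules described in 6046a31c6f can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `RequirementRef` and `VerificationEvidence` shapes and both function names are assumptions; only the sentinel values (`exitCode: -1`, `verdict: "unknown"`) and the "ID, dash, detail" delimiter format come from the commit message.

```typescript
// Hypothetical shapes for illustration; the real tool schemas are not shown in the log.
interface RequirementRef { id: string; detail: string }
interface VerificationEvidence { command: string; exitCode: number; verdict: string }

// The "ID <em dash> detail" separator named in the commit, written as an escape.
const DELIMITER = " \u2014 ";

// Accept either the object shape or a plain string and normalize to the object shape.
function coerceRequirement(input: string | RequirementRef): RequirementRef {
  if (typeof input !== "string") return input;
  const at = input.indexOf(DELIMITER);
  if (at === -1) return { id: input.trim(), detail: "" };
  return {
    id: input.slice(0, at).trim(),
    detail: input.slice(at + DELIMITER.length).trim(),
  };
}

// A bare string proves nothing about the outcome, so use sentinels
// instead of fabricating a passing result.
function coerceEvidence(input: string | VerificationEvidence): VerificationEvidence {
  if (typeof input !== "string") return input;
  return { command: input, exitCode: -1, verdict: "unknown" };
}
```

Splitting on the first delimiter only keeps any later dashes inside the detail text.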
Jeremy
0742cf3493 fix(gsd): coerce string arrays to objects in complete-slice/task tools (#3565)
LLMs sometimes pass plain strings instead of the expected object shape
for array fields like filesModified and requires, causing TypeBox
validation to reject the input before the execute function runs. This
adds Type.Union schemas to accept both formats and normalizes strings
to objects with sensible defaults in the execute functions.

Closes #3565
2026-04-05 13:23:30 -05:00
Jeremy
3a1e9e3416 fix(gsd): harden flat-rate routing guard against alias/resolution gaps
The flat-rate provider guard from #3552 can fail open in two scenarios:

1. Provider alias mismatch — isFlatRateProvider only matched the exact
   string "github-copilot", but "copilot" appears as a provider alias
   in the codebase. Case variations could also bypass the check.
   Fix: add "copilot" alias and lowercase input before set membership.

2. Unresolved primary model — when resolveModelId returns undefined
   (stale model ID, registry mismatch), the guard was skipped entirely,
   allowing dynamic routing to downgrade models on a flat-rate backend.
   Fix: fall back to autoModeStartModel.provider and ctx.model.provider
   when primary resolution fails, disabling routing if either indicates
   a flat-rate provider.

Ref: #3453
2026-04-05 13:09:44 -05:00
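A minimal sketch of the two fixes in 3a1e9e3416, under stated assumptions: `isFlatRateProvider` is named in the commit, but its signature, the `RoutingContext` shape, and `shouldDisableDynamicRouting` are hypothetical stand-ins for the real guard.

```typescript
// Provider names come from the commit message; the set itself is illustrative.
const FLAT_RATE_PROVIDERS: ReadonlySet<string> = new Set(["github-copilot", "copilot"]);

// Fix 1: lowercase the input before set membership so case variants cannot bypass the check.
function isFlatRateProvider(provider: string | undefined): boolean {
  if (!provider) return false;
  return FLAT_RATE_PROVIDERS.has(provider.toLowerCase());
}

// Hypothetical context: the provider of the resolved primary model (undefined when
// resolution failed) plus the two fallback sources mentioned in the commit.
interface RoutingContext {
  primaryProvider?: string;
  autoModeStartProvider?: string;
  sessionProvider?: string;
}

// Fix 2: when the primary model did not resolve, consult the fallbacks and
// disable routing if either one indicates a flat-rate provider.
function shouldDisableDynamicRouting(ctx: RoutingContext): boolean {
  if (ctx.primaryProvider !== undefined) return isFlatRateProvider(ctx.primaryProvider);
  return isFlatRateProvider(ctx.autoModeStartProvider) || isFlatRateProvider(ctx.sessionProvider);
}
```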
Tibsfox
9ab675a843 fix(gsd): disable dynamic model routing for flat-rate providers
2026-04-05 10:24:52 -07:00
Jeremy McSpadden
a6b7febc5e Merge pull request #3545 from jeremymcs/feat/ollama-native-chat-provider
feat(ollama): native /api/chat provider with full option exposure
2026-04-05 11:05:47 -05:00
Jeremy McSpadden
092d1c0a9e Merge pull request #3546 from jeremymcs/worktree-issue-3541-ollama-native
fix(gsd): prevent LLM from querying gsd.db directly via bash
2026-04-05 10:51:01 -05:00
Jeremy
563fdae8e2 ci: add scanignore for doctor-heal.md false positive
The prompt injection scan flags "You are now responsible" in
doctor-heal.md as role injection (matches "you are now [a-z]").
This is a pre-existing legitimate prompt instruction, not injection.
2026-04-05 10:22:03 -05:00
Jeremy
bc20104a44 perf(gsd): trim promptGuidelines to 1 line to reduce per-turn token cost
promptGuidelines from every registered tool are injected into the system
prompt on every API call. The return shape details were redundant (the
JSON response is self-describing). Keep only the sqlite3 prohibition.
2026-04-05 10:11:15 -05:00
Jeremy
7d74081434 fix(gsd): address Codex adversarial review findings
1. Replace ensureDbOpen() with isDbAvailable() in gsd_milestone_status
   so the read-only tool cannot create/migrate the DB as a side effect
2. Wrap all reads in a BEGIN/COMMIT transaction for snapshot consistency
   under concurrent WAL writes
3. Broaden negative regex in guardrail tests to catch sqlite3 with
   flags, relative paths, absolute paths, and quoted paths
2026-04-05 09:56:19 -05:00
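Item 3 of 7d74081434 mentions a broadened regex that catches `sqlite3` invoked with flags and with relative, absolute, or quoted paths. The actual pattern is not shown in the log; the following is one possible shape, with a hypothetical helper name, offered only to illustrate the idea.

```typescript
// Assumed pattern: "sqlite3" followed, within the same shell command segment
// (no newline, pipe, semicolon, or ampersand in between), by a path ending in gsd.db.
const DIRECT_DB_ACCESS = /\bsqlite3\b[^\n|;&]*\bgsd\.db\b/;

// Hypothetical helper: true when a bash command appears to open gsd.db via sqlite3.
function touchesGsdDbDirectly(command: string): boolean {
  return DIRECT_DB_ACCESS.test(command);
}
```

Because quotes, slashes, and dashes are all allowed between the binary name and the file name, the same pattern covers the flag and path variants listed in the commit.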
Jeremy
4d9eb9ead0 fix(gsd): prevent LLM from querying gsd.db directly via bash (#3541)
Add 4-layer defense-in-depth to enforce single-writer WAL discipline:

1. Global anti-pattern in system.md protecting all 35+ auto-mode units
2. DB access safety blocks in 5 high-risk prompts (validate-milestone,
   complete-milestone, doctor-heal, forensics, reassess-roadmap)
3. New gsd_milestone_status read-only query tool giving the LLM a
   sanctioned path to inspect milestone/slice/task state
4. 14 regression tests (8 prompt guardrails + 6 tool coverage)

Closes #3541
2026-04-05 09:43:56 -05:00
Jeremy
4ba2d5a219 feat(ollama): native /api/chat provider with full option exposure
Replace the OpenAI-compat shim with a native Ollama /api/chat streaming
provider that exposes all commonly-used Ollama options and surfaces
inference performance metrics.

Key changes:
- Native NDJSON streaming from /api/chat (no more OpenAI shim)
- Known models send num_ctx from capability table; unknown models defer
  to Ollama's default to avoid OOM on constrained hosts
- Exposes: temperature, top_p, top_k, repeat_penalty, seed, num_gpu,
  keep_alive, num_predict via per-model providerOptions
- Extracts <think>...</think> blocks for reasoning models (deepseek-r1, qwq)
- Surfaces InferenceMetrics (tokens/sec, durations) on AssistantMessage
- Adds remove and show actions to ollama_manage LLM tool
- Adds "ollama-chat" to KnownApi, providerOptions to Model<TApi>
- NDJSON parser uses strict mode for chat (fails on malformed frames)
- Mixed content+tool_call chunks handled independently

Closes #3544
2026-04-05 09:01:40 -05:00
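The NDJSON handling described in 4ba2d5a219 (newline-delimited JSON frames, with a strict mode that fails on malformed frames) might look roughly like the sketch below. `createNdjsonParser` and its `strict` flag are illustrative names, not the provider's actual API.

```typescript
// Returns a push function that accepts raw text chunks and yields every complete
// JSON frame seen so far. A trailing partial line is buffered until its newline arrives.
function createNdjsonParser(strict: boolean): (chunk: string) => unknown[] {
  let buffer = "";
  return function push(chunk: string): unknown[] {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // last element is the incomplete tail (or "")
    const frames: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue; // ignore blank keep-alive lines
      try {
        frames.push(JSON.parse(trimmed));
      } catch {
        // Strict mode surfaces the bad frame; lenient mode skips it.
        if (strict) throw new Error(`Malformed NDJSON frame: ${trimmed}`);
      }
    }
    return frames;
  };
}
```

Buffering the tail matters because a network chunk boundary can fall in the middle of a JSON object.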
Jeremy McSpadden
dcf41154b8 Merge pull request #3540 from Tibsfox/fix/seed-requirements-from-markdown
fix(gsd): seed requirements table from REQUIREMENTS.md on first update
2026-04-05 08:11:59 -05:00
Jeremy McSpadden
5c7e5efcf4 Merge pull request #3539 from Tibsfox/fix/inject-slice-context-into-prompts
fix(gsd): inject S##-CONTEXT.md from slice discussion into all prompt builders
2026-04-05 08:07:19 -05:00
Tibsfox
58c19ed48d fix(gsd): seed requirements table from REQUIREMENTS.md on first update
When requirements are authored in REQUIREMENTS.md during the discussion
phase (the standard workflow), the DB requirements table stays empty.
gsd_requirement_update then fails with not_found for every requirement
at milestone completion, burning tokens on retries.

When updateRequirementInDb encounters a requirement ID not in the DB,
it now parses REQUIREMENTS.md via parseRequirementsSections() and seeds
all requirements into the DB before retrying the lookup. This preserves
the original content (class, description, why, source, validation)
instead of creating an empty skeleton.

The seeding is:
- Lazy: only runs on first miss, not on every update
- Collision-safe: skips IDs already in the DB
- Non-blocking: falls through to skeleton if REQUIREMENTS.md is
  missing or unparseable

Adds 1 regression test verifying that updating R005 when the DB is
empty seeds all 3 requirements from REQUIREMENTS.md with their
original content preserved.

Closes #3346
2026-04-05 05:44:06 -07:00
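The three seeding properties listed in 58c19ed48d (lazy, collision-safe, non-blocking) can be sketched with an in-memory map standing in for the DB. Everything here is a simplification: the `Requirement` fields, the `Db` type, `updateRequirement`, and the injected `loadFromMarkdown` callback are assumptions, not the signatures of `updateRequirementInDb` or `parseRequirementsSections()`.

```typescript
interface Requirement { id: string; description: string; status: string }
type Db = Map<string, Requirement>; // stand-in for the requirements table

function updateRequirement(
  db: Db,
  id: string,
  status: string,
  loadFromMarkdown: () => Requirement[] | null, // hypothetical REQUIREMENTS.md parser
): Requirement {
  if (!db.has(id)) {
    // Lazy: the markdown file is parsed only on a miss.
    let parsed: Requirement[] | null = null;
    try {
      parsed = loadFromMarkdown();
    } catch {
      parsed = null; // Non-blocking: a missing or unparseable file is tolerated.
    }
    for (const req of parsed ?? []) {
      if (!db.has(req.id)) db.set(req.id, req); // Collision-safe: never overwrite.
    }
  }
  // Fall through to an empty skeleton if seeding did not produce the row.
  const existing = db.get(id) ?? { id, description: "", status: "active" };
  const updated = { ...existing, status };
  db.set(id, updated);
  return updated;
}
```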
Tibsfox
01a1295e4d fix(gsd): inject S##-CONTEXT.md from slice discussion into all prompt builders
S##-CONTEXT.md files produced by /gsd discuss (require_slice_discussion)
are never injected into downstream prompt builders. Discussed
requirements, acceptance criteria, and design decisions are silently
dropped — the researcher, planner, completer, replanner, and
reassessor never see them.

Add resolveSliceFile(base, mid, sid, "CONTEXT") + inlineFileOptional()
to all 5 affected builders:

  1. buildResearchSlicePrompt
  2. buildPlanSlicePrompt
  3. buildCompleteSlicePrompt
  4. buildReplanSlicePrompt
  5. buildReassessRoadmapPrompt

The slice CONTEXT is placed immediately after the roadmap and before
other context (research, decisions, requirements) so the discussed
scope is visible before detailed planning artifacts.

Uses the existing inlineFileOptional() pattern — if no S##-CONTEXT.md
exists, nothing is injected (zero cost for projects not using slice
discussion).

Adds 5 regression tests verifying each builder resolves and inlines
the slice CONTEXT file.

Closes #3452
2026-04-05 05:41:14 -07:00
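The ordering rule in 01a1295e4d (slice CONTEXT right after the roadmap, nothing injected when the file is absent) is sketched below with plain strings. `inlineOptional` and `buildPrompt` are hypothetical simplifications; the real `inlineFileOptional()` and the five builders read files and take different arguments.

```typescript
// Returns a labeled section, or null when there is no content to inline.
function inlineOptional(label: string, content: string | null): string | null {
  return content === null ? null : `## ${label}\n${content}`;
}

// Hypothetical builder: roadmap first, then slice context, then other artifacts.
function buildPrompt(parts: {
  roadmap: string;
  sliceContext: string | null;
  research: string | null;
}): string {
  const sections = [
    inlineOptional("Roadmap", parts.roadmap),
    inlineOptional("Slice Context", parts.sliceContext),
    inlineOptional("Research", parts.research),
  ];
  // Absent sections cost nothing: they are dropped before joining.
  return sections.filter((s): s is string => s !== null).join("\n\n");
}
```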
Jeremy McSpadden
f64a7c517d Merge pull request #3538 from jeremymcs/docs/documentation-refresh
docs: refresh documentation for v2.63.0
2026-04-05 07:40:18 -05:00
Jeremy McSpadden
8eba02f59f Merge pull request #3537 from jeremymcs/fix/model-fallback-race
fix(pi-coding-agent): resolve model fallback race that ignores configured provider
2026-04-05 07:38:21 -05:00
Jeremy
f4b87bf940 docs: refresh documentation for v2.63.0
Update What's New section from v2.52 to v2.63, expand native engine
docs to cover all 20+ modules, add missing extensions and ADRs to
indexes, update version references and Node.js requirements.
2026-04-05 07:37:31 -05:00
Jeremy
e3cd354d58 fix(cli): guard model re-apply against session restore and async rejection
Address Codex adversarial review findings:

1. Only re-apply the validated model when createAgentSession() signals
   a fallback (modelFallbackMessage is truthy). This prevents silently
   overriding the persisted model of resumed conversations.

2. Use modelRegistry.getAvailable() instead of find() to ensure the
   model's provider is request-ready before calling setModel().

3. Await session.setModel() and wrap in try/catch so provider auth
   failures don't surface as unhandled promise rejections at startup.

Applies to both print-mode and interactive-mode startup paths.
2026-04-05 07:27:26 -05:00
Jeremy
9fe13da3f2 fix(pi-coding-agent): resolve model fallback race that ignores configured provider (#3534)
Extension-provided models (e.g. claude-code/*) were unavailable during
findInitialModel() because pendingProviderRegistrations had not been
flushed yet, causing the fallback chain to select Google Gemini even
when the user explicitly configured claude-code as their default.

Three compounding issues fixed:

(A) Flush pendingProviderRegistrations in createAgentSession() before
    findInitialModel() runs, so extension models are in the registry
    when initial model selection happens.

(B) Re-apply the validated model to the session after
    validateConfiguredModel() in both print and interactive CLI paths.
    Previously, validation updated settingsManager but never called
    session.setModel(), leaving the session on the wrong model.

(C) Update defaultModelPerProvider.anthropic from "claude-opus-4-6[1m]"
    to "claude-opus-4-6" — the [1m] variant was removed from the model
    registry when the base model was upgraded to 1M context, causing the
    Anthropic fallback to silently fail and skip to Google.

Closes #3534
2026-04-05 07:14:24 -05:00
Tom Boucher
261e2a6d5f fix(detection): add xcodegen and Xcode bundle support to project detection (#1882)
* fix: detect Xcode bundles by suffix scan in worktree health check (#1882)

Xcode project directories have project-specific names (e.g. Sudokuxyz.xcodeproj)
that cannot be matched by the exact-filename PROJECT_FILES list. Add a
readdirSync suffix scan for *.xcodeproj and *.xcworkspace so iOS/macOS projects
are not incorrectly treated as greenfield when the health check runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: replace empty catch with debugLog in Xcode bundle scan

The silent-catch-diagnostics test (#3348) bans empty catch blocks in
migrated auto-mode files. Replace the bare `catch { /* best-effort */ }`
with a debugLog call to satisfy the workflow-logger requirement.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 08:13:42 -04:00
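The suffix scan from 261e2a6d5f reduces to a small predicate over directory entries. In this sketch the entry names are passed in directly (the commit uses `readdirSync` to obtain them), and `hasXcodeBundle` is a hypothetical name.

```typescript
// Suffixes taken from the commit message.
const XCODE_BUNDLE_SUFFIXES = [".xcodeproj", ".xcworkspace"];

// True when any entry name ends with a known Xcode bundle suffix. Exact-filename
// lists cannot express this, because the bundle name is project-specific.
function hasXcodeBundle(entries: readonly string[]): boolean {
  return entries.some((name) =>
    XCODE_BUNDLE_SUFFIXES.some((suffix) => name.endsWith(suffix)),
  );
}
```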
Tom Boucher
4cb6252b9b fix(perf): share jiti module cache across extension loads (#3308)
* fix(perf): share jiti module cache across extension loads (#2108)

Each extension was creating a new jiti instance with moduleCache: false,
causing shared dependencies to be recompiled for every extension. Use a
shared singleton with moduleCache: true so shared modules are compiled once.

Export resetExtensionLoaderCache() for test teardown and explicit reload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: correct loader path in extension-load-perf test (4 → 3 levels up)

The test file is at src/tests/ (2 levels deep from repo root), so
fileURLToPath(import.meta.url) + 3x'..' reaches the repo root.
Using 4 levels exits the repo into the GitHub Actions workspace parent,
causing ERR_MODULE_NOT_FOUND for loader.js in dist/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use process.cwd() for loader path in perf test (source/compiled portability)

import.meta.url resolves to different depths in source (src/tests/) vs compiled
(dist-test/src/tests/), so relative '../' navigation produces the wrong path in
the build phase. process.cwd() is always the repo root in CI regardless of
where the test file is compiled to.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 07:59:17 -04:00
Tom Boucher
e9dabdc649 fix(resource-sync): prune removed bundled subdirectory extensions on upgrade (#1972)
* fix(resource-sync): prune removed bundled subdirectory extensions on upgrade

The managed-resources manifest and pruning system only tracked root-level
files, not subdirectory extensions. When a bundled subdirectory extension
like mcporter/ was removed from the bundle in a newer GSD version, the
previously-synced copy in ~/.gsd/agent/extensions/ persisted indefinitely,
causing tool name conflicts with its replacement (mcp-client/).

- Add installedExtensionDirs to the manifest alongside installedExtensionRootFiles,
  recording directory names present in the bundled extensions dir at sync time.
- In pruneRemovedBundledExtensions, diff previous installedExtensionDirs against
  current bundled dirs and rmSync({ recursive: true }) any that were removed.
- Add mcporter to the hardcoded stale-entry list for pre-manifest upgrades.
- Fix extension conflict error prefix: also match "conflicts with" (not just
  "supersedes") so extension-vs-extension conflicts are classified as warnings
  rather than hard errors.

Fixes #1955

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(resource-loader): repair mangled lines from conflict resolution

The Python regex used to resolve cherry-pick conflicts stripped trailing
newlines, causing declarations and comments to merge onto the same line.
Replace the file with the upstream/main version which contains all the
installedExtensionDirs logic correctly formatted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(resource-loader): sweep all installed extension dirs not in current bundle

The manifest-based pruner only removed dirs it had previously recorded.
Extensions installed by pre-manifest versions (or manually) were never
tracked, so they survived upgrades. Add a sweep of the actual installed
extensions directory that removes any subdirectory absent from the current
bundle, regardless of manifest history.

Fixes the mcporter stale-dir regression test (#1972).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: check external-state DB path before symlink-resolved handler (#2952)

The external-state handler added in c609d813 was placed after the generic
symlink-resolved handler, which matches the same /.gsd/projects/<hash>/worktrees/
pattern and short-circuits to the wrong result. Move the external-state check
(which uses the more specific hex-hash regex) first so it takes precedence.

Fixes shared-wal test: external-state worktree path resolves to project state DB.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update db-path-worktree-symlink expectations for external-state (#2952)

/.gsd/projects/<hash>/worktrees/ paths now resolve to <hash>/gsd.db
after the external-state handler from #2952 was placed before the
symlink-resolved handler. On POSIX, getcwd() returns canonical paths so
<proj>/.gsd/projects/<hash>/worktrees/ would in practice appear as
~/.gsd/projects/<hash>/worktrees/ after OS symlink resolution — both
correctly handled by the external-state behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 07:44:51 -04:00
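The pruning logic in e9dabdc649 combines two sources of candidates: directories recorded in the previous manifest, and directories actually present on disk. A rough sketch of the set arithmetic, with a hypothetical function name and with the `rmSync` call left out:

```typescript
// Returns the extension directories to remove: anything recorded in the previous
// manifest or found on disk that is absent from the current bundle.
function dirsToPrune(
  previouslyRecorded: readonly string[],
  installedOnDisk: readonly string[],
  currentlyBundled: readonly string[],
): string[] {
  const keep = new Set(currentlyBundled);
  const candidates = new Set([...previouslyRecorded, ...installedOnDisk]);
  return Array.from(candidates).filter((dir) => !keep.has(dir)).sort();
}
```

Including the on-disk listing is what covers directories installed before the manifest existed, since those were never recorded.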
Tom Boucher
29517f177d refactor(web): consolidate subprocess boilerplate into shared runner (#1899)
* refactor(web): consolidate subprocess boilerplate into shared runner

Extract subprocess-runner.ts with runSubprocess<T>() and resolveModulePaths()
to replace identical execFile+Promise+JSON.parse callback blocks duplicated
across 12 web service files. Adds 30s default timeout to all subprocess calls.

Fixes #1888

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: check external-state DB path before symlink-resolved handler (#2952)

The external-state handler added in c609d813 was placed after the generic
symlink-resolved handler, which matches the same /.gsd/projects/<hash>/worktrees/
pattern and short-circuits to the wrong result. Move the external-state check
(which uses the more specific hex-hash regex) first so it takes precedence.

Fixes shared-wal test: external-state worktree path resolves to project state DB.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update db-path-worktree-symlink expectations for external-state (#2952)

/.gsd/projects/<hash>/worktrees/ paths now resolve to <hash>/gsd.db
after the external-state handler from #2952 was placed before the
symlink-resolved handler. On POSIX, getcwd() returns canonical paths so
<proj>/.gsd/projects/<hash>/worktrees/ would in practice appear as
~/.gsd/projects/<hash>/worktrees/ after OS symlink resolution — both
correctly handled by the external-state behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 07:44:32 -04:00
Tom Boucher
87a0475291 fix: recognize U+2705 checkmark emoji as completion marker in prose roadmaps (#1897)
* fix: recognize ✅ (U+2705) as completion marker in prose roadmaps (#1884)

LLMs naturally use ✅ (U+2705) to mark slices complete, but the parser
only recognized ✓ (U+2713), causing permanent dispatch blocks.

- roadmap-slices.ts: add U+2705 to headerPattern, prefixCheckPattern,
  and title prefix/suffix detection in parseProseSliceHeaders
- roadmap-mutations.ts: recognize U+2705 as "already done" to prevent
  double-marking
- doctor.ts: add prose-format fallback to markSliceDoneInRoadmap so
  the doctor fix works on H3 headers, not just checkbox format

Fixes #1884

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: check external-state DB path before symlink-resolved handler (#2952)

The external-state handler added in c609d813 was placed after the generic
symlink-resolved handler, which matches the same /.gsd/projects/<hash>/worktrees/
pattern and short-circuits to the wrong result. Move the external-state check
(which uses the more specific hex-hash regex) first so it takes precedence.

Fixes shared-wal test: external-state worktree path resolves to project state DB.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update db-path-worktree-symlink expectations for external-state (#2952)

/.gsd/projects/<hash>/worktrees/ paths now resolve to <hash>/gsd.db
after the external-state handler from #2952 was placed before the
symlink-resolved handler. On POSIX, getcwd() returns canonical paths so
<proj>/.gsd/projects/<hash>/worktrees/ would in practice appear as
~/.gsd/projects/<hash>/worktrees/ after OS symlink resolution — both
correctly handled by the external-state behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 07:44:08 -04:00
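A minimal sketch of the marker detection described in 87a0475291, assuming a prose roadmap header carries the mark as a title prefix or suffix. The header format and `isSliceHeaderDone` are assumptions; the real `headerPattern` and `prefixCheckPattern` are not shown in the log.

```typescript
// U+2713 CHECK MARK and U+2705 WHITE HEAVY CHECK MARK, written as escapes.
const DONE_MARKS = "\u2713\u2705";

// Matches a completion mark at the very start or very end of a title.
const DONE_AT_EDGE = new RegExp(`^[${DONE_MARKS}]|[${DONE_MARKS}]$`);

// Hypothetical check: strip the markdown heading prefix, then look for a mark at either edge.
function isSliceHeaderDone(header: string): boolean {
  const title = header.replace(/^#+\s*/, "").trim();
  return DONE_AT_EDGE.test(title);
}
```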
Tom Boucher
ba71db3b28 fix(web): use safePackageRootFromImportUrl for cross-platform package root (#1881) (#1893)
Replace raw fileURLToPath in getDefaultPackageRoot with safePackageRootFromImportUrl
which returns null instead of throwing when the URL is not a valid local file URL.
This prevents the standalone bundle from crashing on Windows when import.meta.url
is baked in at build time with a Linux file:// path.

Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 07:43:46 -04:00
Tom Boucher
b155c92708 fix: isolate CmuxClient stdio to prevent TUI hangs in CMUX (#3306)
* fix(cmux): isolate CmuxClient stdio to prevent TUI hangs (#1922)

Replace execFileAsync (promisify) with spawn in runAsync to allow explicit
stdio isolation. Both runSync and runAsync now set stdio: ["ignore", "pipe",
"pipe"] so the cmux CLI child process cannot inherit the parent's stdin/stderr
and steal keyboard input or corrupt TUI rendering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: retrigger after integration flake

---------

Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 07:43:21 -04:00
Tom Boucher
a81701979f feat(parallel): slice-level parallelism with dependency-aware dispatch (#3315)
* feat(parallel): add slice-level parallelism with dependency-aware dispatch

Fixes #2340

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(parallel): handle missing slice lock, add worktree cleanup, remove dead code

- state.ts: When GSD_SLICE_LOCK is set but the locked slice ID is not
  found in activeMilestoneSlices, log a warning and return a blocked
  state with a clear error message instead of silently continuing with
  activeSlice=undefined. Applied in both DB-backed and legacy paths.

- slice-parallel-orchestrator.ts: Add worktree cleanup via removeWorktree
  in stopSliceParallel (after killing workers) and in the catch block of
  startSliceParallel (for partially created worktrees). Store basePath in
  SliceOrchestratorState so stopSliceParallel can reference it.

- status-guards.ts: isInactiveStatus does not exist on this branch
  (only isClosedStatus is defined), so no removal needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(state): remove duplicate logWarning import after rebase conflict resolution

The rebase merge left two import lines for logWarning from workflow-logger.
Consolidated into a single import including logError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:55:20 -04:00
Tom Boucher
56617891d9 fix: worktree health check walks parent dirs for monorepo support (#3313)
* fix: worktree health check walks parent dirs for monorepo support (#2347)

The health check only looked for project markers (package.json,
Cargo.toml, etc.) in s.basePath directly. In monorepos, these files
live in a parent directory, causing false rejections.

Now walks up parent directories to find project markers before
triggering the greenfield warning.

Fixes #2347

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(health-check): cap parent walk at git repo boundary

The parent directory walk in the worktree health check was unbounded,
walking all the way to the filesystem root. Ancestor directories like ~
or /usr/local may contain unrelated package.json files, causing false
positive health checks. Now stops at the .git boundary.

Also adds a test assertion verifying the .git boundary guard exists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ts): remove duplicate imports introduced during rebase

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:28:05 -04:00
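The bounded parent walk from 56617891d9 can be sketched as below. To keep the example self-contained, the filesystem is injected as an `exists` predicate and paths are POSIX-style strings; `findProjectRoot`, `parentOf`, and the two-entry marker list are illustrative, not the project's code.

```typescript
const PROJECT_MARKERS = ["package.json", "Cargo.toml"]; // subset, for illustration

// Parent of a POSIX path, or null once the filesystem root has been reached.
function parentOf(dir: string): string | null {
  if (dir === "/") return null;
  const cut = dir.lastIndexOf("/");
  return cut <= 0 ? "/" : dir.slice(0, cut);
}

// Walk upward from start until a project marker is found. Stop at the directory
// containing .git so unrelated ancestors (a home directory, say) are never consulted.
function findProjectRoot(start: string, exists: (path: string) => boolean): string | null {
  let dir: string | null = start;
  while (dir !== null) {
    const prefix = dir === "/" ? "" : dir;
    if (PROJECT_MARKERS.some((marker) => exists(`${prefix}/${marker}`))) return dir;
    if (exists(`${prefix}/.git`)) return null; // repo boundary reached without a marker
    dir = parentOf(dir);
  }
  return null;
}
```

Checking for markers before checking for `.git` lets a marker at the repository root itself still count.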
Tom Boucher
5fa34d438b fix(gsd): promote milestone status from queued to active in plan-milestone (#3317)
* fix(gsd): promote milestone status from queued to active in plan-milestone

upsertMilestonePlanning() did not include title or status in its UPDATE
statement. When a milestone row was pre-created by ensureMilestoneDbRow
with status "queued", the INSERT OR IGNORE in insertMilestone() silently
skipped the row, and upsertMilestonePlanning() never updated the status.
This left the milestone permanently stuck as "queued", preventing proper
state-machine phase transitions during milestone completion.

Add title and status columns to the upsertMilestonePlanning() UPDATE
statement and pass them from handlePlanMilestone(). Uses COALESCE with
NULLIF to preserve existing values when empty strings are passed.

Closes #3022

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ts): remove extra title arg from upsertMilestonePlanning call

* fix(test): move title into planning object for 2-param upsertMilestonePlanning

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:27:25 -04:00
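The COALESCE/NULLIF combination mentioned in 5fa34d438b keeps the stored value whenever the incoming value is an empty string or NULL. The sketch below mirrors that semantics in TypeScript; the SQL in the comment is a guess at the statement's shape, and the table and column names in it are assumptions.

```typescript
// Assumed statement shape (not copied from the repository):
//   UPDATE milestones
//      SET title  = COALESCE(NULLIF(:title, ''), title),
//          status = COALESCE(NULLIF(:status, ''), status)
//    WHERE id = :id

// Mirrors COALESCE(NULLIF(incoming, ''), existing) for a single column.
function coalesceNullif(incoming: string | null, existing: string): string {
  const nullified = incoming === "" ? null : incoming; // NULLIF(incoming, '')
  return nullified ?? existing; // COALESCE(nullified, existing)
}
```

With this rule a caller that passes `""` for the title leaves the title untouched, while a caller that passes `"active"` for the status promotes a `"queued"` row.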
Tom Boucher
1c95b94720 fix(worktree): correct merge failure notification command from /complete-milestone to /gsd dispatch complete-milestone (#1901)
Fixes #1891

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:05:12 -04:00
Tom Boucher
2a40f3f35d feat(mcp-client): add OAuth auth provider for HTTP transport (#3295)
* feat(mcp-client): add OAuth auth provider for HTTP transport

MCP HTTP transport was created without any auth options, causing 401
errors when connecting to remote servers (Sentry, Linear, etc.) that
require OAuth or bearer token authentication.

Add a new auth module that builds StreamableHTTPClientTransport options
from two config fields:
- `headers`: static headers with ${VAR} env resolution (for bearer tokens)
- `oauth`: full OAuthClientProvider for servers implementing MCP OAuth

The config is parsed from .mcp.json / .gsd/mcp.json and passed through
to the SDK transport constructor.

Fixes #2160

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:10 -04:00
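The `${VAR}` resolution for static headers described in 2a40f3f35d might be implemented along these lines. `resolveHeaders` is a hypothetical name, the environment is passed in explicitly to keep the sketch testable, and replacing an unset variable with an empty string is an assumption; the commit does not say how unset variables are handled.

```typescript
// Replace each ${NAME} placeholder in every header value with env[NAME].
function resolveHeaders(
  headers: Record<string, string>,
  env: Record<string, string | undefined>,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    resolved[name] = headers[name].replace(
      /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
      (_match: string, varName: string) => env[varName] ?? "", // assumption: unset becomes ""
    );
  }
  return resolved;
}
```

A config entry such as `"Authorization": "Bearer ${SENTRY_TOKEN}"` would then pick up the token from the environment at connection time rather than storing it in the JSON file.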
Tom Boucher
d9cea627bf fix: detect and block Gemini CLI OAuth tokens used as API keys (#3296)
* fix: detect and block Gemini CLI OAuth tokens used as API keys

Users who install Google's standalone Gemini CLI may inadvertently set
GEMINI_API_KEY to an OAuth access token (ya29.*) instead of an AI Studio
API key (AIza*). These tokens fail at the Google API with a confusing
error. This adds early detection at three entry points:

- AuthStorage.set(): throws when storing ya29.* as api_key for "google"
- AuthStorage.getApiKey(): blocks ya29.* from runtime overrides (--api-key)
- AuthStorage.getApiKey(): blocks ya29.* from environment variables

Each path provides a clear error message explaining the issue and
directing users to either get an API key from aistudio.google.com or
use /login google-gemini-cli for OAuth-based access.

Fixes #2157

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:08 -04:00
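The prefix check behind d9cea627bf is simple enough to sketch directly. The two prefixes (`ya29.` and `AIza`) come from the commit message; the function names, the return type, and the error wording are illustrative rather than the `AuthStorage` implementation.

```typescript
type GoogleCredentialKind = "oauth-token" | "api-key" | "unknown";

// Classify a credential string by the prefixes named in the commit.
function classifyGoogleCredential(value: string): GoogleCredentialKind {
  if (value.startsWith("ya29.")) return "oauth-token";
  if (value.startsWith("AIza")) return "api-key";
  return "unknown";
}

// Hypothetical guard: reject OAuth access tokens early with an actionable message
// instead of letting the request fail later at the Google API.
function assertUsableGoogleApiKey(value: string): void {
  if (classifyGoogleCredential(value) === "oauth-token") {
    throw new Error(
      "This looks like an OAuth access token (ya29.*), not an API key. " +
        "Get an API key from aistudio.google.com or use /login google-gemini-cli.",
    );
  }
}
```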
Tom Boucher
290b8e9104 fix(auto): break retry loop on tool invocation errors (malformed JSON) (#3298)
* fix(auto): break retry loop on tool invocation errors (malformed JSON)

When a GSD tool (e.g. gsd_complete_slice) is invoked with truncated or
malformed JSON arguments, the tool execution fails but postUnitPreVerification
only checks artifact existence — it cannot distinguish "tool never ran" from
"tool ran but forgot to write artifact". This causes a stuck retry loop where
the same malformed invocation is re-dispatched indefinitely.

Add isToolInvocationError() classifier to detect JSON validation/parse errors
in tool results. Track the last tool invocation error on AutoSession via the
tool_execution_end hook. In postUnitPreVerification, when artifact verification
fails AND a tool invocation error was recorded, pause auto-mode with a
diagnostic message instead of setting up pendingVerificationRetry.

Closes #2883

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:05 -04:00
Tom Boucher
fa3ca6206e fix(git): use git add -u in symlink .gsd fallback to prevent hang (#3299)
* fix(git): use git add -u in symlink .gsd fallback to prevent hang on large repos (#1977)

When .gsd is a symlink, nativeAddAllWithExclusions falls back from
pathspec exclusions (which git rejects with "beyond a symbolic link").
The previous fallback used nativeAddAll (git add -A), which traverses
the entire working tree — hanging indefinitely on repos with large
untracked data directories (444GB+ observed).

Change the fallback to git add -u, which only stages changes to
already-tracked files. This is O(tracked) instead of O(filesystem),
making it safe for repos with massive untracked trees.

The tradeoff — new untracked files are not staged in the symlink
fallback path — is acceptable because auto-commit primarily captures
changes the agent made to existing tracked files.

Regression of #1712 which introduced the symlink fallback but chose
git add -A. That fix assumed .gitignore covered all large directories,
but scientific/ML repos often have large untracked data dirs not in
.gitignore.

Fixes #1977

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

* fix(test): update symlink autoCommit test to use tracked file modification

The PR changed the symlink fallback from git add -A to git add -u.
git add -u only stages changes to tracked files; untracked files are
skipped. The existing test created a new untracked file which git add -u
ignores, causing autoCommit to return null (nothing to commit).

Pre-commit the source file before the scenario and modify it so git add
-u can stage the tracked change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:03 -04:00
Tom Boucher
c91b6eca9c fix: handle complete-slice context exhaustion to unblock downstream slices (#3300)
* fix: handle complete-slice context exhaustion to unblock downstream slices

Closes #2653

Three interacting bugs caused complete-slice to silently fail on context
exhaustion, permanently blocking all downstream slices:

1. Zero-tool-call guard only covered execute-task — extended to all unit
   types so any unit completing with 0 tool calls (context exhaustion) is
   treated as failed and retried in a fresh context.

2. Verification retry had no max-attempt limit — added a cap of 3 retries
   before escalating to writeBlockerPlaceholder, preventing infinite retry
   loops when the unit never produces its artifact.

3. writeBlockerPlaceholder only updated DB for execute-task — extended to
   also update slice status for complete-slice, breaking the circular
   dependency where verification needs DB status "complete" but only
   gsd_slice_complete sets that.
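
A minimal sketch of how fixes 1 and 2 combine into a single decision, under stated assumptions: the `UnitResult` shape, the `Outcome` labels, and the `decideOutcome` name are hypothetical and do not come from the codebase. Only the cap of 3 and the zero-tool-call rule are taken from the message above.

```typescript
// Hypothetical shapes; the real unit and recovery types are not shown in this commit.
type UnitResult = { unitType: string; toolCalls: number; artifactFound: boolean };
type Outcome = "ok" | "retry" | "blocker-placeholder";

// Cap stated in the commit message.
const MAX_VERIFICATION_RETRIES = 3;

function decideOutcome(result: UnitResult, retriesSoFar: number): Outcome {
  // Zero tool calls is treated as context exhaustion for every unit type,
  // not only execute-task.
  const failed = result.toolCalls === 0 || !result.artifactFound;
  if (!failed) return "ok";
  // Once the cap is reached, escalate instead of retrying forever.
  return retriesSoFar >= MAX_VERIFICATION_RETRIES ? "blocker-placeholder" : "retry";
}
```

In this sketch the "blocker-placeholder" outcome stands in for the call to writeBlockerPlaceholder, which per fix 3 also has to update the slice status so downstream slices are unblocked.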

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: import diagnoseExpectedArtifact from auto-recovery

* fix(auto): replace empty catch comments with logWarning in auto-recovery

The silent-catch-diagnostics test requires migrated files to have no
empty catch bodies. The /* non-fatal */ inline catches in auto-recovery
were flagged as empty since comments don't count as executable code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:05:00 -04:00
Tom Boucher
db7a6372a6 fix: cap consecutive tool validation failures to prevent stuck-loop (#3301)
* fix: cap consecutive tool validation failures to prevent stuck-loop (#2783)

When the LLM repeatedly emits tool calls with arguments that fail schema
validation, the agent loop retries indefinitely — each failed validation
returns an error tool result, the LLM retries with the same broken args,
and the cycle burns budget with no progress.

Add a consecutive-failure counter in runLoop that tracks turns where ALL
tool calls fail. After MAX_CONSECUTIVE_VALIDATION_FAILURES (3) consecutive
all-error turns, the loop emits a diagnostic stop message and terminates
cleanly. The counter resets whenever any tool call in a turn succeeds, so
intermittent failures do not trigger early termination.
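
The counter logic described above can be sketched roughly as follows. This is an illustration, not the runLoop implementation: the `Turn` representation and the `findStopTurn` helper are hypothetical, and treating a turn with no tool calls as a reset is an assumption the commit does not state.

```typescript
// Value stated in the commit message.
const MAX_CONSECUTIVE_VALIDATION_FAILURES = 3;

// Hypothetical per-turn summary: one boolean per tool call, true if validation failed.
type Turn = boolean[];

// Returns the index of the turn at which the loop would stop, or -1 if it never stops.
function findStopTurn(turns: Turn[]): number {
  let consecutive = 0;
  for (let i = 0; i < turns.length; i++) {
    const calls = turns[i] ?? [];
    // Only turns where every tool call failed count toward the cap.
    const allFailed = calls.length > 0 && calls.every((failed) => failed);
    // Any successful call (or, by assumption, a turn with no calls) resets the counter.
    consecutive = allFailed ? consecutive + 1 : 0;
    if (consecutive >= MAX_CONSECUTIVE_VALIDATION_FAILURES) return i;
  }
  return -1;
}
```

Counting only all-error turns is what keeps intermittent failures from ending the run early: a single successful call anywhere in a turn resets the streak.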

Closes #2783

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

* fix(test): repair agent-loop.test.ts — close unclosed blocks, merge imports

Two test suites were concatenated without closing the first suite's
it+describe blocks, placing the second suite's imports inside a function
body and triggering 'Unexpected "{"' from esbuild. Merged into a single
well-structured file with consolidated imports and proper closings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:58 -04:00
Tom Boucher
4f6b3433d6 fix: make enrichment tool params optional for limited-toolcall models (#3302)
* fix: make enrichment tool params optional for limited-toolcall models

Models with limited tool-calling capability (kimi-k2.5, glm-5-turbo)
cannot populate 23 top-level parameters in a single tool call, causing
stuck-loop retries that burn tokens. This reduces required params to
only core identification and content fields (7 for slice_complete, 4
for plan_milestone, 6 for task_complete, 5 for complete_milestone, 4
for plan_slice) by making enrichment/metadata arrays optional with
sensible defaults in the handlers.

Closes #2771

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(complete-task): add null-coalescing defaults to renderSummaryMarkdown

Add regression test verifying handleCompleteTask does not crash when
optional enrichment fields (keyFiles, keyDecisions, verificationEvidence,
blockerDiscovered) are omitted. The code paths already apply ?? defaults
after the #2720 refactor, but no test covered the minimal-params case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

* fix: add null coalescing for optional string params in complete-task

params.deviations and params.knownIssues are optional (string | undefined)
but the task row expects string. Add ?? "" to satisfy TS2322.
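
A reduced illustration of the fix, assuming simplified shapes: `CompleteTaskParams`, `TaskRow`, and `toTaskRow` are hypothetical names that model only the two fields mentioned above.

```typescript
// Hypothetical, reduced shapes; only the two fields named in the commit are modeled.
interface CompleteTaskParams {
  deviations?: string;
  knownIssues?: string;
}

interface TaskRow {
  deviations: string;
  knownIssues: string;
}

function toTaskRow(params: CompleteTaskParams): TaskRow {
  return {
    // Without the ?? "" fallback, assigning string | undefined to string raises TS2322.
    deviations: params.deviations ?? "",
    knownIssues: params.knownIssues ?? "",
  };
}
```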

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:55 -04:00
Tom Boucher
7bbc0dd621 fix: add filesystem safety guard to complete-slice.md (#3304)
* fix: add filesystem safety guard to complete-slice.md (closes #2935)

Port the EISDIR prevention instruction from complete-milestone.md to
complete-slice.md so the LLM never passes a directory path to the
read tool when task summaries are truncated by the 30k-char cap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI (flaky #2912 MERGE_HEAD test)

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:52 -04:00
Tom Boucher
34e73b01d1 fix(extensions): use bundledExtensionKeys for conflict detection instead of broken path heuristic (#3305)
Fixes #2075

The isBuiltIn check in detectExtensionConflicts used a path-based
heuristic that excluded paths containing /.gsd/agent/extensions/.
Since initResources() syncs all bundled extensions into that same
directory, the heuristic could never return true for bundled extensions,
preventing the "supersedes" hint from being appended. This caused
downstream cli.ts to display "Extension load error" instead of the
intended "Extension conflict" for benign precedence collisions.

Replace the path heuristic with an explicit bundledExtensionKeys set
(already computed by buildResourceLoader) threaded through to
detectExtensionConflicts. The set is matched against the extension
directory name extracted from the owner path.
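
A sketch of the set-based check, under assumptions: `extensionDirName` is a hypothetical helper, and the owner path is assumed to point at the extension directory itself. The real detectExtensionConflicts signature is not shown here.

```typescript
// Hypothetical helper: take the last non-empty path segment as the extension directory name.
function extensionDirName(ownerPath: string): string {
  const parts = ownerPath.split(/[\\/]/).filter((part) => part.length > 0);
  return parts[parts.length - 1] ?? "";
}

// Membership in the explicit key set replaces the path-based heuristic.
function isBuiltIn(ownerPath: string, bundledExtensionKeys: ReadonlySet<string>): boolean {
  return bundledExtensionKeys.has(extensionDirName(ownerPath));
}
```

Because the decision no longer depends on where the extension lives on disk, syncing bundled extensions into /.gsd/agent/extensions/ cannot defeat it.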

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:04:50 -04:00
Tom Boucher
85ece7ea69 fix: scope tools during discuss flows to prevent grammar overflow (#3307)
* fix: scope tools during discuss flows to prevent grammar overflow (#2949)

xAI/Grok (and other providers using grammar-based constrained decoding)
reject tool schemas that exceed their grammar complexity limit with HTTP
400 "Grammar is too complex." The full GSD tool set registers ~33 tools
with deeply nested schemas; discuss flows only need a small subset.

Add DISCUSS_TOOLS_ALLOWLIST to constants.ts listing the 10 GSD tools
(5 canonical + 5 aliases) actually referenced by discuss prompts. In
dispatchWorkflow, when unitType starts with "discuss-", filter active
tools to exclude heavy planning/execution/completion tools before
dispatching.
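
The filtering step can be sketched as below. The tool names in the set are placeholders, since this message does not list the 10 allowlisted tools, and `scopeTools` is a hypothetical name. As a simplification, the sketch drops every tool that is not on the allowlist, whereas the real filter may keep non-GSD tools.

```typescript
// Placeholder entries; the actual allowlist contents are not given in the commit message.
const DISCUSS_TOOLS_ALLOWLIST: ReadonlySet<string> = new Set(["tool_a", "tool_b"]);

function scopeTools(unitType: string, activeTools: string[]): string[] {
  // Only discuss flows are scoped; every other unit type keeps the full tool set.
  if (!unitType.startsWith("discuss-")) return activeTools;
  return activeTools.filter((name) => DISCUSS_TOOLS_ALLOWLIST.has(name));
}
```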

Closes #2949

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

* fix: remove resolveModelWithFallbacksForUnit import from guided-flow.ts

This import was incorrectly added as part of the discuss grammar changes.
The regression guard test (#2958) requires this function not be imported
in guided-flow.ts.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:48 -04:00
Tom Boucher
1fe316477d fix(preferences): warn on silent parse failure for non-frontmatter files (#3310)
* fix(preferences): warn when preferences file lacks YAML frontmatter delimiters

parsePreferencesMarkdown() silently returned null for non-frontmatter
content, causing all preferences (including git.isolation: none) to be
ignored. The system fell back to default worktree isolation with no
indication to the user.

Now emits a stderr warning when a non-empty preferences file cannot be
parsed due to missing --- fences, so users know their file was skipped.
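
A reduced sketch of the warning behavior, assuming a simplified parser: `extractFrontmatter` and its injected `warn` callback are hypothetical and stand in for parsePreferencesMarkdown() writing to stderr. The warning wording is illustrative.

```typescript
// Hypothetical reduced parser: returns the raw frontmatter text, or null if fences are missing.
function extractFrontmatter(content: string, warn: (msg: string) => void): string | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  if (match) return match[1] ?? "";
  // Warn only for non-empty files, so an empty preferences file stays silent.
  if (content.trim().length > 0) {
    warn("preferences file does not use YAML frontmatter delimiters; it was skipped");
  }
  return null;
}
```

Passing `(msg) => console.error(msg)` as `warn` would reproduce the stderr output; injecting the callback just keeps the sketch easy to test.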

Fixes #2036

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ci: retrigger after perf test fix

* fix(test): update preferences test to expect heading+list parser result

The heading+list fallback parser (parseHeadingListFormat) now handles
non-frontmatter markdown content like "## Git\n- isolation: none",
so the test should expect a parsed object instead of null.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

* fix: add missing closing brace for test block in preferences.test.ts

The 'unrecognized format warning' test block was missing its closing });
after the finally clause, causing TS1005 syntax error at line 535.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(preferences): include 'unrecognized format' in warn-once message (#2373)

The test filters for "unrecognized format" but the message only said
"does not use YAML frontmatter delimiters". Add the phrase so the
warn-once regression test can find its expected output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:46 -04:00
Tom Boucher
5c2d8988bb fix: track remote-questions in managed-resources manifest (#3312)
* fix: track remote-questions extension in managed-resources manifest

writeManagedResourceManifest only checked for index.js/index.ts when
deciding if a subdirectory is an extension. remote-questions uses mod.ts
as its entry point and was missed, causing it to be pruned on upgrades.

Also check for extension-manifest.json which is the canonical marker for
bundled extensions.
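
A sketch of the widened detection rule, assuming the fix accepts mod.ts alongside the manifest marker. `EXTENSION_MARKERS` and `looksLikeExtension` are hypothetical names, and the directory listing is passed in as plain file names to keep the example free of filesystem access.

```typescript
// Entry points and marker files that identify a subdirectory as an extension (assumed set).
const EXTENSION_MARKERS: ReadonlySet<string> = new Set([
  "index.js",
  "index.ts",
  "mod.ts",
  "extension-manifest.json",
]);

function looksLikeExtension(fileNames: readonly string[]): boolean {
  return fileNames.some((name) => EXTENSION_MARKERS.has(name));
}
```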

Fixes #2367

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:44 -04:00
Tom Boucher
5d194d8701 fix(auto): add timeout guard for postUnitPostVerification in runFinalize (#3314)
* fix(auto): add timeout guard for postUnitPostVerification in runFinalize (#2344)

After plan-slice completes, the auto-loop can hang indefinitely if
postUnitPostVerification() never resolves (e.g., module import deadlock,
SQLite transaction hang). The terminal becomes unresponsive with no
error, no notification, and no recovery path.

Changes:
- Add withTimeout() utility in auto/finalize-timeout.ts that races a
  promise against a configurable timeout, returning a discriminated
  result instead of throwing
- Wrap the postUnitPostVerification() call in runFinalize() with a 60s
  timeout guard — on timeout, log a warning and force-return "next" so
  the loop continues to the next iteration
- Emit iteration-end journal event in the catch block of autoLoop() so
  the journal always records iteration completion, even on error
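
A minimal sketch of a race-based timeout helper in the spirit of withTimeout(). The `TimeoutResult` shape and its field names are assumptions, not the contents of auto/finalize-timeout.ts. In this sketch a rejection from the wrapped promise still propagates; only the timeout is reported as a value.

```typescript
// Hypothetical discriminated result; callers branch on timedOut instead of catching an error.
type TimeoutResult<T> =
  | { timedOut: false; value: T }
  | { timedOut: true };

function withTimeout<T>(promise: Promise<T>, ms: number): Promise<TimeoutResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<TimeoutResult<T>>((resolve) => {
    timer = setTimeout(() => resolve({ timedOut: true }), ms);
  });
  const wrapped = promise.then((value): TimeoutResult<T> => ({ timedOut: false, value }));
  // Whichever settles first wins; the timer is cleared so it cannot keep the process alive.
  return Promise.race([wrapped, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

With a helper like this, the caller in runFinalize() could check `timedOut`, log a warning, and return "next" instead of waiting on a promise that never resolves.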

Fixes #2344

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: retrigger CI

* fix(ts): remove duplicate imports introduced during rebase

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:41 -04:00
Tom Boucher
ff5015431e fix(gsd): handle large markdown parameters in complete-milestone JSON parsing (#3316)
* fix(gsd): sanitize complete-milestone params to handle LLM JSON quirks

When smaller models (e.g. claude-haiku) generate tool-call JSON with large
markdown parameters, the deserialized params can arrive with wrong types:
numbers instead of strings, null instead of arrays, string "true" instead
of boolean true. This caused crashes in handleCompleteMilestone.

Add sanitizeCompleteMilestoneParams() that normalizes all fields before
the handler runs, preventing TypeError crashes on type mismatches.
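
A sketch of the normalization described above. The `toBool` and `toStrArray` helpers, the `sanitizeParams` name, and the `keyDecisions` and `verificationPassed` fields are assumptions made for illustration; the actual field list of sanitizeCompleteMilestoneParams() is not shown in this message.

```typescript
// Nullish values become "", everything else is stringified (numbers included).
function toStr(v: unknown): string {
  return v === null || v === undefined ? "" : String(v);
}

// Accepts the boolean true as well as the string "true".
function toBool(v: unknown): boolean {
  return v === true || v === "true";
}

// null or any non-array value becomes an empty array.
function toStrArray(v: unknown): string[] {
  return Array.isArray(v) ? v.map((item) => toStr(item)) : [];
}

// Hypothetical reduced parameter set.
function sanitizeParams(raw: Record<string, unknown>) {
  return {
    followUps: toStr(raw.followUps),
    deviations: toStr(raw.deviations),
    keyDecisions: toStrArray(raw.keyDecisions),
    verificationPassed: toBool(raw.verificationPassed),
  };
}
```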

Closes #3013

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add type narrowing for optional fields in sanitize-complete-milestone

followUps and deviations are required strings in CompleteMilestoneParams,
so use toStr() directly (which returns "" for nullish values) instead of
conditionally returning undefined.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update test assertion to match toStr empty string default

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
2026-04-05 01:04:38 -04:00