2509 commits

Author SHA1 Message Date
Jeremy McSpadden
12ef95024c Merge pull request #3451 from deseltrus/fix/stale-retries-after-model-switch
fix: cancel stale retries after model switch
2026-04-04 18:27:33 -05:00
Jeremy McSpadden
dbaf37ae78 Merge pull request #3491 from Tibsfox/fix/claude-code-skill-directory-support
fix(gsd): add Claude Code official skill directories to skill resolution
2026-04-04 18:24:51 -05:00
Jeremy McSpadden
677ca806df Merge pull request #3494 from Tibsfox/fix/decision-save-transaction-race
fix(gsd): wrap decision and requirement saves in transaction to prevent ID races
2026-04-04 18:23:15 -05:00
Tibsfox
c70eacea89 fix(gsd): wrap decision and requirement saves in transaction to prevent ID races
nextDecisionId() and nextRequirementId() compute the next ID via
SELECT MAX then pass it to a separate upsertDecision/upsertRequirement
call. When parallel tool calls hit these functions concurrently, both
read the same MAX value and produce the same ID — the second insert
silently overwrites the first.

Move the SELECT MAX + INSERT into a single transaction() call from
gsd-db.ts, which uses BEGIN/COMMIT/ROLLBACK and works on both
better-sqlite3 and node:sqlite providers. The transaction is
re-entrant safe (nested calls skip the BEGIN).

Same fix applied to saveRequirementToDb for consistency.

Closes #3326, closes #3339, closes #3459
2026-04-04 15:16:52 -07:00
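The re-entrant transaction described in c70eacea89 can be sketched roughly as follows. This is a minimal illustration, not the code from gsd-db.ts: the `Db` interface, `makeTransaction`, the in-memory `fakeDb`, and the `D<n>` ID format are all assumptions made so the example runs on its own.

```typescript
// Hypothetical minimal DB surface; the real provider API in gsd-db.ts may differ.
interface Db {
  exec(sql: string): void;
}

// Builds a transaction() helper bound to one connection.
// Nested calls skip BEGIN/COMMIT and run inside the outer transaction.
function makeTransaction(db: Db) {
  let depth = 0;
  return function transaction<T>(fn: () => T): T {
    if (depth > 0) {
      return fn(); // already inside a transaction: no second BEGIN
    }
    db.exec("BEGIN");
    depth = 1;
    try {
      const result = fn();
      db.exec("COMMIT");
      return result;
    } catch (err) {
      db.exec("ROLLBACK");
      throw err;
    } finally {
      depth = 0;
    }
  };
}

// In-memory stand-ins, used only to make the sketch runnable.
const log: string[] = [];
const fakeDb: Db = { exec: (sql) => { log.push(sql); } };
const transaction = makeTransaction(fakeDb);
const ids: number[] = [];

// The read of the current maximum and the insert happen in one transaction,
// so no other writer can observe the same maximum in between.
function saveDecision(): string {
  return transaction(() => {
    const next = ids.reduce((max, id) => Math.max(max, id), 0) + 1; // SELECT MAX
    ids.push(next); // INSERT
    return `D${next}`;
  });
}
```

Calling `saveDecision()` from inside another `transaction(...)` callback reuses the outer BEGIN/COMMIT pair, which is the re-entrancy property the commit message refers to.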
Tibsfox
e107828363 fix(gsd): add Claude Code official skill directories to skill resolution
GSD-2 only searches ~/.agents/skills/ and .agents/skills/ for skills.
Claude Code's official skill directories (~/.claude/skills/ and
.claude/skills/) are not included in the search path, making GSD-2
blind to any skills managed there.

The skills.sh CLI (npx skills list -g) already recognises both
~/.agents/skills/ and ~/.claude/skills/ as valid global skill
directories. This commit aligns GSD-2's resolution logic with
that behaviour.

Affected functions:
- getSkillSearchDirs(): adds ~/.claude/skills/ and .claude/skills/
- captureAvailableSkills(): includes Claude Code dir in telemetry
- detectStaleSkills(): includes Claude Code dir in staleness checks
- detectNewSkills(): resolves SKILL.md from either directory
- isPackInstalled(): checks both dirs before recommending installs
- formatSkillDetail(): finds SKILL.md in either directory
2026-04-04 15:15:29 -07:00
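As a rough illustration of the widened search path in e107828363, the sketch below lists all four directories and resolves SKILL.md from whichever one contains it. The function signatures, the search order, and the `<dir>/<skill>/SKILL.md` layout are assumptions; the commit only names the directories and the affected functions.

```typescript
// Sketch only: the real getSkillSearchDirs() signature and ordering are not shown in the commit.
function getSkillSearchDirs(homeDir: string, projectDir: string): string[] {
  return [
    `${homeDir}/.agents/skills`,
    `${projectDir}/.agents/skills`,
    `${homeDir}/.claude/skills`, // Claude Code global skills
    `${projectDir}/.claude/skills`, // Claude Code project skills
  ];
}

// Returns the first SKILL.md path that exists in any search directory, or undefined.
// `exists` is injected so the sketch does not depend on the file system.
function findSkillFile(
  skillName: string,
  dirs: string[],
  exists: (path: string) => boolean,
): string | undefined {
  for (const dir of dirs) {
    const candidate = `${dir}/${skillName}/SKILL.md`; // assumed layout
    if (exists(candidate)) return candidate;
  }
  return undefined;
}
```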
Jeremy McSpadden
099e6f3120 Merge pull request #3511 from jeremymcs/fix/steer-worktree-path
fix(gsd): steer writes overrides to worktree when active
2026-04-04 16:10:30 -05:00
github-actions[bot]
40fc92a2a6 release: v2.62.0
2026-04-04 21:10:16 +00:00
github-actions[bot]
f6521cd92e release: v2.61.0
2026-04-04 20:53:42 +00:00
Jeremy
ee87924636 fix(gsd): gate steer worktree routing on active session, fix messaging
Address adversarial review findings:

1. [high] Override routing now requires an active auto-mode session
   (in-process or remote via checkRemoteAutoSession) before writing
   to a worktree path. Previously, any existing worktree directory
   would receive the override even if no agent was running there —
   a leftover worktree from a previous session would silently eat
   the override.

2. [medium] Success messages now report the actual resolved override
   location (worktree vs project root .gsd/OVERRIDES.md) so operators
   know exactly where to look during recovery or manual rewrite.

Additional tests cover: inactive worktree fallback, double-gate
(autoRunning + valid .git), and getAutoWorktreePath null on missing .git.

Closes #3476
2026-04-04 15:37:13 -05:00
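The gating rule from ee87924636 can be sketched as a small pure function: the override goes to the worktree only when a worktree path exists and an auto-mode session is active, otherwise to the project root. `SteerContext` and `resolveOverridePath` are hypothetical names; the real handleSteer presumably obtains these values from getAutoWorktreePath and checkRemoteAutoSession.

```typescript
// Hypothetical inputs standing in for what handleSteer would look up.
interface SteerContext {
  projectRoot: string;
  worktreePath: string | null; // stand-in for the getAutoWorktreePath result
  autoSessionActive: boolean; // in-process or remote auto-mode session
}

interface OverrideTarget {
  path: string;
  location: "worktree" | "project root"; // reported in the success message
}

// A leftover worktree without an active session falls back to the project root.
function resolveOverridePath(ctx: SteerContext): OverrideTarget {
  if (ctx.worktreePath !== null && ctx.autoSessionActive) {
    return { path: `${ctx.worktreePath}/.gsd/OVERRIDES.md`, location: "worktree" };
  }
  return { path: `${ctx.projectRoot}/.gsd/OVERRIDES.md`, location: "project root" };
}
```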
Jeremy McSpadden
82906524a8 Merge pull request #3482 from NilsR0711/fix/remote-questions-interactive-mode
fix(remote-questions): fire configured channels in interactive mode
2026-04-04 15:32:19 -05:00
Jeremy
bd863e3e21 fix(gsd): resolve steer overrides to worktree path when worktree is active
handleSteer used process.cwd() as the base path for appendOverride,
which writes to project/.gsd/OVERRIDES.md. When auto-mode runs in a
worktree, it reads from worktree/.gsd/ — so overrides written from a
second terminal were never seen by the agent.

Now checks for an active worktree via getAutoWorktreePath and writes
the override there when one exists, falling back to the project root
when no worktree is active.

Closes #3476
2026-04-04 15:25:26 -05:00
Jeremy McSpadden
af82c37041 Merge pull request #2755 from jeremymcs/feat/capability-aware-model-routing-pr
feat: capability-aware model routing (ADR-004)
2026-04-04 15:23:38 -05:00
Jeremy McSpadden
2acf5292d0 Merge pull request #3508 from jeremymcs/fix/audit-log-hardening
fix(gsd): harden audit log persistence and demote probe warnings
2026-04-04 15:12:14 -05:00
Jeremy McSpadden
1c4219ee2e Merge pull request #3510 from jeremymcs/feat/codebase-map-enhancements
feat(gsd): enhance /gsd codebase with preferences, --collapse-threshold, and auto-init
2026-04-04 15:11:36 -05:00
Jeremy
4ddb9ca8a5 fix(gsd): add codebase validation in validatePreferences so preferences are not silently dropped
The codebase preferences block was accepted as a known key but never
validated or assigned in validatePreferences(), causing all user-configured
codebase defaults to be silently discarded. Adds validation for
exclude_patterns (string[]), max_files (positive int), and collapse_threshold
(positive int) with unknown-key warnings and 4 new tests.
2026-04-04 15:01:15 -05:00
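A validation step of the kind 4ddb9ca8a5 describes might look like the sketch below. Only the three key names and their types come from the commit message; the function name, return shape, and warning wording are assumptions.

```typescript
interface CodebasePreferences {
  exclude_patterns?: string[];
  max_files?: number;
  collapse_threshold?: number;
}

function isPositiveInt(value: unknown): value is number {
  return typeof value === "number" && isFinite(value) && Math.floor(value) === value && value > 0;
}

// Keeps valid keys, warns on invalid values and unknown keys.
// Hypothetical shape; the real validatePreferences() handles many more blocks.
function validateCodebasePreferences(raw: Record<string, unknown>): {
  value: CodebasePreferences;
  warnings: string[];
} {
  const value: CodebasePreferences = {};
  const warnings: string[] = [];
  for (const key of Object.keys(raw)) {
    const v = raw[key];
    if (key === "exclude_patterns") {
      if (Array.isArray(v) && v.every((p) => typeof p === "string")) {
        value.exclude_patterns = v as string[];
      } else {
        warnings.push("codebase.exclude_patterns must be an array of strings");
      }
    } else if (key === "max_files") {
      if (isPositiveInt(v)) value.max_files = v;
      else warnings.push("codebase.max_files must be a positive integer");
    } else if (key === "collapse_threshold") {
      if (isPositiveInt(v)) value.collapse_threshold = v;
      else warnings.push("codebase.collapse_threshold must be a positive integer");
    } else {
      warnings.push(`unknown codebase key: ${key}`);
    }
  }
  return { value, warnings };
}
```

The important property is that validated values are assigned to the result rather than only checked, which is the omission the commit fixes.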
Jeremy
bbe67da02c feat(gsd): enhance /gsd codebase with preferences, --collapse-threshold, and auto-init
Add configurable codebase map options via preferences.md (exclude_patterns,
max_files, collapse_threshold), expose --collapse-threshold as a CLI flag,
and auto-generate CODEBASE.md during project init for instant agent orientation.

Closes #3509
2026-04-04 14:51:51 -05:00
Jeremy McSpadden
5cd25cf5df Remove copyright notice from test file
Removed copyright notice from capability-router tests.
2026-04-04 14:33:50 -05:00
Jeremy
a290708573 fix(test): update db-path-worktree-symlink test for simplified diagnostic logging
The ensureDbOpen catch block now logs via logWarning with error message
instead of structured diagnostic object. Update source-level assertion
to match the new pattern.
2026-04-04 14:33:12 -05:00
Jeremy
6eb532bf9d fix(gsd): update tests for errors-only audit persistence, fix empty catch blocks
Update existing workflow-logger tests to use logError for audit
persistence assertions (warnings are now ephemeral). Add void
expression to empty catch blocks in detectMainBranch to satisfy
the no-empty-catch CI check.
2026-04-04 14:29:00 -05:00
Jeremy
10cd4a12c5 test(gsd): add workflow-logger audit persistence tests
Covers error-only persistence policy, warning ephemeral behavior,
message truncation, context field allowlist sanitization, and
mixed severity filtering.
2026-04-04 14:22:56 -05:00
Jeremy
2396ecf1db fix(gsd): harden audit log persistence — errors-only, sanitized, demote probe warnings
Only persist error-severity entries to audit-log.jsonl (warnings stay
ephemeral in stderr + buffer). Sanitize persisted entries with message
truncation and context field allowlisting. Demote expected main/master
branch probe failures to silent control flow. Remove JSON.stringify of
diagnostic objects embedding cwd/paths in warning messages.

Addresses Codex adversarial review findings on workflow-logger migration.
2026-04-04 14:19:36 -05:00
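The persistence policy in 2396ecf1db (errors only, truncated message, allowlisted context fields) can be sketched as a filter that runs before anything is appended to audit-log.jsonl. The entry shape, the 500-character limit, and the allowlisted field names are hypothetical; the commit does not state them.

```typescript
type Severity = "warning" | "error";

interface LogEntry {
  severity: Severity;
  component: string;
  message: string;
  context?: Record<string, unknown>;
}

const MAX_MESSAGE_LENGTH = 500; // hypothetical limit
const CONTEXT_ALLOWLIST = ["unitId", "phase"]; // hypothetical field names

// Returns the sanitized entry to persist, or null when the entry stays ephemeral.
function toPersistedEntry(entry: LogEntry): LogEntry | null {
  if (entry.severity !== "error") return null; // warnings are not persisted
  const context: Record<string, unknown> = {};
  for (const key of CONTEXT_ALLOWLIST) {
    if (entry.context && key in entry.context) context[key] = entry.context[key];
  }
  return {
    severity: entry.severity,
    component: entry.component,
    message: entry.message.slice(0, MAX_MESSAGE_LENGTH),
    context,
  };
}
```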
Jeremy McSpadden
104d103d14 Merge pull request #3501 from jeremymcs/fix/upgrade-kotlin-lsp
fix: upgrade Kotlin LSP to official Kotlin/kotlin-lsp
2026-04-04 14:04:40 -05:00
Jeremy McSpadden
7a1c6213a0 Merge pull request #3507 from jeremymcs/refactor/workflow-logger-migration
refactor(gsd): migrate all catch blocks to centralized workflow-logger
2026-04-04 14:04:26 -05:00
Jeremy McSpadden
1a21915572 Merge pull request #3505 from jeremymcs/pr-3496
fix(gsd): fail-closed stop guard, harden backtrack parsing, fix prompt params
2026-04-04 13:59:04 -05:00
Jeremy
64fe364fdb fix(gsd): address adversarial review findings on workflow-logger migration
workflow-events.ts: stop logging raw event line content to audit log —
log byte length only to avoid persisting potentially sensitive payload
fragments to .gsd/audit-log.jsonl.

parallel-orchestrator.ts: revert worker NDJSON parse failure to silent
drop — non-JSON lines (progress text, tool output) are expected in
worker stdout and logging each one creates I/O pressure and audit log
bloat in the parallel execution hot path.
2026-04-04 13:53:16 -05:00
Jeremy
3d6d72c04d refactor(gsd): migrate all catch blocks to centralized workflow-logger
Replace raw process.stderr.write(), console.error(), and empty catch
blocks across 50 GSD files with structured logWarning/logError calls
from the centralized workflow-logger system.

Add 13 new LogComponent types to cover all subsystems: recovery,
session, prompt, dashboard, timer, worktree, command, parallel, fs,
bootstrap, guided, registry, renderer.

Every migrated catch block now automatically:
- Shows in terminal (stderr) with component tag
- Gets buffered for auto-loop stuck-detection summary
- Persists to .gsd/audit-log.jsonl for post-mortem analysis

Update regression test to verify catch blocks use workflow-logger
instead of raw stderr/console, covering auto-mode files and all
explicitly migrated infrastructure files.

Closes #3506
Supersedes the approach in #3496
2026-04-04 13:42:55 -05:00
Jeremy
abe887de10 fix(gsd): fail-closed stop guard, harden backtrack parsing, fix prompt params
- Stop/backtrack guard now calls pauseAuto before marking captures executed,
  and returns break on any exception to prevent silently dropping user halt intent
- Backtrack target parsing excludes current milestone ID and rejects ambiguous
  multi-target strings instead of guessing first match
- Fixed gsd_skip_slice parameter names in rethink prompt (milestone_id → milestoneId)
2026-04-04 13:09:16 -05:00
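The hardened backtrack parsing in abe887de10 can be illustrated with a sketch that collects milestone IDs from free text, drops the current milestone, and refuses to guess when more than one target remains. The `M<digits>` ID pattern and the function name are assumptions made for this example.

```typescript
// Returns the single backtrack target, or null when none or several are mentioned.
// Assumes milestone IDs look like "M001"; the real format may differ.
function parseBacktrackTarget(text: string, currentMilestoneId: string): string | null {
  const matches = text.match(/\bM\d+\b/g) ?? [];
  const candidates: string[] = [];
  for (const id of matches) {
    if (id !== currentMilestoneId && candidates.indexOf(id) === -1) {
      candidates.push(id);
    }
  }
  // Ambiguous multi-target strings are rejected instead of taking the first match.
  return candidates.length === 1 ? candidates[0] ?? null : null;
}
```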
Tibsfox
4f896cc561 fix(gsd): add diagnostic logging to empty catch blocks in auto-mode
Auto-mode has empty catch blocks across 11 files that silently
swallow errors. When these operations fail (DB writes, git commands,
file sync, worktree operations), the error is lost and downstream
systems see stale or inconsistent state — leading to stuck loops,
phantom milestones, and silent data loss.

Replace every empty catch with a process.stderr.write() call that
logs the operation context and error message. Format:

  gsd [filename]: <operation> failed: <error.message>

For catches already annotated with /* non-fatal */ or /* best-effort */
comments, the logging is added alongside the annotation to preserve
the original intent while making failures observable.

Adds a regression test that scans all auto-mode source files and
asserts no empty catch blocks remain.

Files modified (11):
  auto-worktree.ts, auto.ts, auto-recovery.ts, auto-prompts.ts,
  auto-dashboard.ts, auto-start.ts, auto-timers.ts, auto-post-unit.ts,
  auto-dispatch.ts, auto-unit-closeout.ts, auto/phases.ts

No behavioral changes — only diagnostic output added.

Addresses #3348, addresses #3345
2026-04-04 10:38:54 -07:00
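A regression check like the one 4f896cc561 mentions could be built on a scan such as the sketch below, which reports the line of every catch block whose body is empty or contains only comments. The regular expression and function name are illustrative assumptions, and a regex scan of this kind does not understand strings or nested syntax the way a real parser would.

```typescript
// Returns 1-based line numbers of catch blocks containing only whitespace or comments.
function findEmptyCatchBlocks(source: string): number[] {
  const pattern = /catch\s*(?:\([^)]*\))?\s*\{(?:\s|\/\*[\s\S]*?\*\/|\/\/[^\n]*)*\}/g;
  const lines: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    // Count newlines before the match to get its line number.
    lines.push(source.slice(0, match.index).split("\n").length);
  }
  return lines;
}
```

A test would read each auto-mode source file, call this function, and assert that the returned array is empty.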
Jeremy McSpadden
d07f573799 Merge pull request #3499 from jeremymcs/test/state-machine-edge-cases
test(gsd): fill state machine E2E verification gaps
2026-04-04 11:57:17 -05:00
Jeremy
4df55a51c8 fix(lsp): add legacy alias for renamed kotlin-language-server key
Users with existing lsp.json overrides referencing the old
"kotlin-language-server" key would silently lose their Kotlin
LSP config after the rename to "kotlin-lsp". LEGACY_ALIASES
map remaps old keys during mergeServers() so overrides still
merge correctly.
2026-04-04 11:45:58 -05:00
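The alias remapping in 4df55a51c8 can be sketched as below. The `ServerConfig` fields and the shallow merge are assumptions; only LEGACY_ALIASES, mergeServers(), and the two key names appear in the commit message.

```typescript
// Hypothetical config shape; the real lsp.json schema is not shown here.
interface ServerConfig {
  command?: string;
  args?: string[];
  disabled?: boolean;
}

const LEGACY_ALIASES: Record<string, string> = {
  "kotlin-language-server": "kotlin-lsp",
};

// Merges user overrides onto defaults, remapping legacy keys first so that
// an override written against the old name still lands on the renamed server.
function mergeServers(
  defaults: Record<string, ServerConfig>,
  overrides: Record<string, ServerConfig>,
): Record<string, ServerConfig> {
  const merged: Record<string, ServerConfig> = {};
  for (const key of Object.keys(defaults)) {
    merged[key] = { ...defaults[key] };
  }
  for (const rawKey of Object.keys(overrides)) {
    const key = LEGACY_ALIASES[rawKey] ?? rawKey;
    merged[key] = { ...merged[key], ...overrides[rawKey] };
  }
  return merged;
}
```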
Jeremy McSpadden
7365b85b4a Merge pull request #3503 from jeremymcs/fix/interview-notes-loop
fix: break infinite notes loop on "None of the above"
2026-04-04 11:37:46 -05:00
Jeremy
1e31ca4b29 ci: trigger CI re-run
2026-04-04 11:23:59 -05:00
Jeremy
e0884375e6 test: add regression test for interview-ui notes loop (#3502)
Exercises the goNextOrSubmit → notes auto-open path to ensure:
- Enter after typing a note advances instead of looping
- Empty notes still trigger the auto-open
- Normal option selection is unaffected

Fixes #3502
2026-04-04 11:22:15 -05:00
Jeremy
f153745c4f fix: break infinite notes loop when selecting "None of the above"
goNextOrSubmit() unconditionally reopened the notes field whenever the
cursor sat on the "None of the above" slot, even after the user had
already typed a note and pressed Enter. This trapped users in an
endless loop where Enter always bounced back to notes mode.

Add a `!states[currentIdx].notes` guard so the auto-open only fires
when notes are still empty.

Fixes #3502
2026-04-04 11:12:17 -05:00
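The guard added in f153745c4f amounts to the condition sketched here. `QuestionState` and `shouldAutoOpenNotes` are hypothetical; the commit only shows the `!states[currentIdx].notes` check.

```typescript
// Hypothetical per-question UI state.
interface QuestionState {
  cursorOnNoneOfTheAbove: boolean;
  notes: string;
}

// Auto-open the notes field only while it is still empty, so pressing Enter
// after typing a note advances instead of reopening notes mode.
function shouldAutoOpenNotes(states: QuestionState[], currentIdx: number): boolean {
  const state = states[currentIdx];
  if (!state) return false;
  return state.cursorOnNoneOfTheAbove && !state.notes;
}
```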
Jeremy
e3288e8dad fix: align defaultRoutingConfig capability_routing to true
The feature branch intends capability_routing to default to true when
routing is enabled. Conflict resolution incorrectly kept the false
default from the earlier commit.
2026-04-04 11:04:27 -05:00
Jeremy
946eec3bd1 docs(01-05): update dynamic-model-routing.md with capability-aware routing features
- Add Capability Profiles section: 7 dimensions, 9 built-in profiles, uniform-50 cold-start
- Add How Scoring Works section: pipeline order, weighted average formula, task requirements table
- Add User Overrides section: modelOverrides JSON example, deep-merge semantics
- Update Configuration section: document capability_routing flag
- Add Verbose Output section: scoring breakdown format, selectionMethod field
- Add Extension Hook section: before_model_select payload, return value, first-override-wins
2026-04-04 10:57:16 -05:00
Jeremy
6dc7c0ec1d test(01-05): add capability-aware routing integration tests
- Full pipeline with capability_routing: true returns capability-scored decision
- capability_routing: false falls back to tier-only with no capabilityScores
- Single eligible model (pinned) skips scoring and uses tier-only
- Unknown model gets uniform score of 50 and competes in scoring
- capabilityOverrides change scoring outcome in scoreEligibleModels
- capabilityOverrides pass through resolveModelForComplexity to STEP 2
- Regression guards: routing disabled, unknown model, no-downgrade-needed all pass
- All 51 tests pass (42 existing + 9 new integration)
2026-04-04 10:56:23 -05:00
Jeremy
1645be072c feat(01-05): fire before_model_select hook, add verbose scoring output, load capability overrides
- Fire pi.emitBeforeModelSelect() in selectAndApplyModel before resolveModelForComplexity
- Hook override bypasses capability scoring entirely with tier-only selectionMethod
- Verbose output shows capability-scored breakdown: model scores sorted descending
- Add loadCapabilityOverrides() to model-router.ts for deep-merge with built-in profiles
- Extend resolveModelForComplexity signature with optional capabilityOverrides parameter
- Pass capabilityOverrides through to scoreEligibleModels in STEP 2
2026-04-04 10:56:22 -05:00
Jeremy
6cc42bb504 feat(01-04): register before_model_select placeholder handler in GSD hooks
- Add before_model_select handler registration inside registerHooks()
- Handler returns undefined (no override) to let capability scoring proceed
- Comment references ADR-004 for traceability
- Serves as documentation and ensures event type is registered for Plan 05 wiring
2026-04-04 10:56:06 -05:00
Jeremy
1cea7fb8bc feat(01-04): add BeforeModelSelectEvent to extension API and wire emission
- Add BeforeModelSelectEvent interface and BeforeModelSelectResult type to types.ts
- Add on('before_model_select') subscription overload to ExtensionAPI interface
- Add emitBeforeModelSelect() method to ExtensionAPI interface and ExtensionRuntimeState
- Implement emitBeforeModelSelect() on ExtensionRunner using invokeHandlers (first-override-wins)
- Bind runner's emitBeforeModelSelect into shared runtime at construction time
- Wire emitBeforeModelSelect delegation through createExtensionAPI in loader.ts
2026-04-04 10:56:06 -05:00
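First-override-wins dispatch, as named in 1cea7fb8bc, can be sketched as a loop that stops at the first handler returning a value. The event and result fields below are placeholders, and the sketch is synchronous for simplicity; the real invokeHandlers may be asynchronous.

```typescript
// Placeholder payload and result shapes; the real interfaces live in types.ts.
interface BeforeModelSelectEvent {
  unitType: string;
  eligibleModels: string[];
}
type BeforeModelSelectResult = { modelId: string } | undefined;
type BeforeModelSelectHandler = (event: BeforeModelSelectEvent) => BeforeModelSelectResult;

// Calls handlers in registration order; the first non-undefined result wins
// and later handlers are not invoked.
function emitBeforeModelSelect(
  handlers: BeforeModelSelectHandler[],
  event: BeforeModelSelectEvent,
): BeforeModelSelectResult {
  for (const handler of handlers) {
    const result = handler(event);
    if (result !== undefined) return result;
  }
  return undefined; // no override: capability scoring proceeds
}
```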
Jeremy
1866ccf781 feat(01-03): wire taskMetadata from selectAndApplyModel to resolveModelForComplexity
- Pass unitType and classification.taskMetadata as 5th and 6th args to resolveModelForComplexity
- Completes end-to-end data pipeline: classifier extracts metadata, attaches to ClassificationResult, auto-model-selection passes through to router for capability scoring
2026-04-04 10:56:06 -05:00
Jeremy
accee43563 feat(01-03): insert STEP 2 capability scoring into resolveModelForComplexity
- Add unitType and taskMetadata optional params to resolveModelForComplexity
- Replace findModelForTier with getEligibleModels for multi-model eligible set
- Insert STEP 2 scoring block: activates when capability_routing enabled, eligible.length > 1, unitType provided
- Add buildFallbackChain helper to deduplicate fallback assembly logic
- Scoring returns capability-scored selectionMethod with capabilityScores and taskRequirements
- Single-model and zero-model paths fall through to tier-only behavior
- All 42 existing tests pass unchanged (backward compat via optional params)
2026-04-04 10:55:28 -05:00
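The three activation conditions for STEP 2 listed in accee43563 can be written as one predicate. The function name and return values are illustrative; only the conditions and the two selectionMethod strings come from the commit messages.

```typescript
type SelectionMethod = "tier-only" | "capability-scored";

// Capability scoring runs only when all three conditions hold;
// every other path keeps the existing tier-only behavior.
function chooseSelectionMethod(
  config: { capability_routing?: boolean },
  eligibleCount: number,
  unitType?: string,
): SelectionMethod {
  const scoringActive =
    config.capability_routing === true && eligibleCount > 1 && unitType !== undefined;
  return scoringActive ? "capability-scored" : "tier-only";
}
```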
Jeremy
bf918d30d5 test(01-02): add unit tests for scoring functions and taskMetadata passthrough
- Add scoreModel, computeTaskRequirements, scoreEligibleModels, getEligibleModels
  describe blocks to model-router.test.ts (27 new tests)
- Add ClassificationResult taskMetadata describe block to complexity-classifier.test.ts
  (4 new tests: execute-task populated, hook undefined, plan-slice undefined, extractTaskMetadata export)
- Add getModelTier unknown-default tests verifying standard tier (not heavy) per D-15
- All 42 model-router tests pass, all 32 complexity-classifier tests pass
- All 36 pre-existing capability-router tests continue to pass
2026-04-04 10:54:02 -05:00
Jeremy
409cd77cbc feat(01-01): add taskMetadata to ClassificationResult and export extractTaskMetadata
- Add taskMetadata?: TaskMetadata to ClassificationResult in complexity-classifier.ts
- Add taskMetadata?: TaskMetadata to ClassificationResult in types.ts (duplicate interface)
- Export extractTaskMetadata() so it can be imported by model-router.ts
- Refactor classifyUnitComplexity() to extract metadata once for execute-task (eliminates double-extraction at adaptive learning step)
- Populate taskMetadata field on ClassificationResult for execute-task units
- Set taskMetadata: undefined explicitly on hook unit fast-path
2026-04-04 10:53:45 -05:00
Jeremy
0ccd3fd8a4 feat(01-01): add capability types, data tables, and scoring functions to model-router
- Import TaskMetadata from complexity-classifier
- Add capability_routing?: boolean to DynamicRoutingConfig
- Add capabilityScores, taskRequirements, selectionMethod fields to RoutingDecision
- Add ModelCapabilities interface (7 dimensions: coding, debugging, research, reasoning, speed, longContext, instruction)
- Add MODEL_CAPABILITY_PROFILES data table with 9 model profiles
- Add BASE_REQUIREMENTS data table with 11 unit type vectors
- Add exported scoreModel() pure function (weighted average, 0-100 range)
- Add exported computeTaskRequirements() with metadata-driven vector refinement
- Add exported scoreEligibleModels() with cost-preferring tie-break sorting
- Add exported getEligibleModels() extracted from findModelForTier() logic
- Add selectionMethod: "tier-only" to all 5 return sites in resolveModelForComplexity()
- Change getModelTier() unknown default from "heavy" to "standard" (per D-15)
- Add capability_routing: true to defaultRoutingConfig()
2026-04-04 10:53:45 -05:00
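The scoring functions introduced in 0ccd3fd8a4 can be sketched as follows, under stated assumptions: the score is the requirement-weighted average of a model's capability values, unknown models use a uniform profile of 50, and ties go to the cheaper model. The seven dimension names come from the commit; the `Candidate` shape, the `cost` field, and the value returned for empty requirements are hypothetical.

```typescript
type Dimension =
  | "coding" | "debugging" | "research" | "reasoning"
  | "speed" | "longContext" | "instruction";
type ModelCapabilities = Record<Dimension, number>; // each value in 0-100
type TaskRequirements = Partial<Record<Dimension, number>>; // weights per dimension

const UNIFORM_50: ModelCapabilities = {
  coding: 50, debugging: 50, research: 50, reasoning: 50,
  speed: 50, longContext: 50, instruction: 50,
};

// Weighted average: sum(weight * capability) / sum(weight).
function scoreModel(caps: ModelCapabilities, reqs: TaskRequirements): number {
  let weighted = 0;
  let total = 0;
  for (const dim of Object.keys(reqs) as Dimension[]) {
    const weight = reqs[dim] ?? 0;
    weighted += weight * caps[dim];
    total += weight;
  }
  return total === 0 ? 50 : weighted / total; // assumption: neutral score when no weights
}

interface Candidate {
  id: string;
  cost: number; // hypothetical relative cost
  caps?: ModelCapabilities; // undefined for unknown models
}

// Sorts by score descending, then by cost ascending as the tie-break.
function scoreEligibleModels(models: Candidate[], reqs: TaskRequirements) {
  return models
    .map((m) => ({ id: m.id, cost: m.cost, score: scoreModel(m.caps ?? UNIFORM_50, reqs) }))
    .sort((a, b) => b.score - a.score || a.cost - b.cost);
}
```

With weights `{ coding: 3, speed: 1 }`, a model rated coding 80 and speed 40 scores (3·80 + 40) / 4 = 70, the same as one rated coding 60 and speed 100, so the cheaper of the two is ranked first.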
Jeremy
e89bf7d18e test(01-01): add failing tests for capability-aware model routing
- Tests for scoreModel weighted average, edge cases (empty/unknown dims)
- Tests for computeTaskRequirements with all branch paths (docs, concurrency, migration, large-file)
- Tests for MODEL_CAPABILITY_PROFILES (9 models, 7 dimensions each)
- Tests for BASE_REQUIREMENTS (all 11 unit types)
- Tests for scoreEligibleModels (sorting, tie-breaking, unknown models, overrides)
- Tests for getEligibleModels (tier filtering, explicit config, empty result)
- Tests for DynamicRoutingConfig.capability_routing and RoutingDecision.selectionMethod
2026-04-04 10:51:31 -05:00
Jeremy
851bb0bebe fix(pi-coding-agent): upgrade Kotlin LSP to official Kotlin/kotlin-lsp
Closes #3493
2026-04-04 10:45:15 -05:00
Jeremy
3f9fa9351f fix(test): use correct RequirementCounts type fields in edge case tests
Replace non-existent `invalidated` field with the correct type fields
(`outOfScope`, `blocked`, `total`) to pass typecheck.
2026-04-04 10:25:00 -05:00
Jeremy
181243a933 chore: init gsd
2026-04-04 10:00:43 -05:00
Jeremy
62cc474002 test(gsd): fill state machine E2E verification gaps (#3498)
Add 102 integration tests across two new files covering state machine
edge cases, runtime failures, and boundary conditions not exercised
by the existing live-validation suite.

Closes #3498
2026-04-04 10:00:07 -05:00