Commit graph

76 commits

Author SHA1 Message Date
Tom Boucher
7afefc73ac fix: add session-level search budget to prevent unbounded native web search (#1309) (#1529)
The Anthropic API's max_uses resets per request — when pause_turn triggers
a resubmit, the model gets a fresh budget each time. This allowed unlimited
total searches across a research unit, overwhelming the TUI render buffer.

Fix:
- Count web_search_tool_result blocks in conversation history on each
  before_provider_request to track cumulative searches per session
- Cap total native searches at 15 per session (3 full turns of 5)
- Dynamically set max_uses to min(5, remaining) — preserves per-turn cap
  while enforcing session ceiling
- When budget exhausted, omit web_search tool entirely instead of letting
  the model hit max_uses_exceeded repeatedly
- Reset counter on session_start (new agent unit)
- Add web search budget guidance to research prompts (defense in depth)
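
The budget arithmetic above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Message` and `ContentBlock` shapes and the function names are assumptions, and only the caps (15 per session, 5 per request) and the `web_search_tool_result` block type come from the commit message.

```typescript
// Hypothetical shapes: the real message and block types are not shown in this log.
interface ContentBlock { type: string }
interface Message { content: ContentBlock[] }

const SESSION_SEARCH_CAP = 15; // session ceiling stated in the commit message
const PER_TURN_CAP = 5;        // per-request max_uses stated in the commit message

// Count web_search_tool_result blocks already present in the conversation history.
function countNativeSearches(history: Message[]): number {
  let used = 0;
  for (const message of history) {
    for (const block of message.content) {
      if (block.type === "web_search_tool_result") used += 1;
    }
  }
  return used;
}

// Returns the max_uses value for the next request, or null when the budget
// is exhausted and the web_search tool should be left out of the request.
function nextMaxUses(history: Message[]): number | null {
  const remaining = SESSION_SEARCH_CAP - countNativeSearches(history);
  if (remaining <= 0) return null;
  return Math.min(PER_TURN_CAP, remaining);
}
```

Deriving the count from history on every request means a resubmit after pause_turn cannot hand the model a fresh budget.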

Tests: 5 new tests covering budget tracking, exhaustion, and reset.
All 35 native-search tests pass.
2026-03-19 20:08:15 -06:00
Jeremy McSpadden
b247c3510e feat: integrate cmux with gsd runtime (#1532) 2026-03-19 20:05:06 -06:00
Tom Boucher
8aa71bfb55 fix: prevent ensureGitignore from adding .gsd when tracked in git (#1364) (#1367)
* rfc: GitOps branching & versioning strategy proposal

Proposes a Git-Flow Lite model with automated integration branches:

  main          ← production-ready, tagged releases only
  next          ← integration branch for next minor (PRs target here)
  release/X.Y   ← stabilization branch, only bugfixes allowed
  hotfix/X.Y.Z  ← emergency fixes cherry-picked to release

Includes:
  - RFC document with lifecycle diagrams, migration path, open questions
  - Workflow scaffolds (in docs/proposals/workflows/, NOT .github/):
    - create-release.yml: manual dispatch to cut release branch from next
    - sync-next.yml: auto-sync next branch after version tags
    - backmerge.yml: auto back-merge release fixes to next

This is an experimental proposal requesting community feedback before
any implementation. The workflow files are inert scaffolds — they do
not run in CI.

* fix: prevent ensureGitignore from adding .gsd when tracked in git (#1364)

CRITICAL DATA-LOSS FIX: ensureGitignore() unconditionally added '.gsd' to
.gitignore even when .gsd/ was a real git-tracked directory, causing git to
report ~889 tracked files as deleted.

Root cause: BASELINE_PATTERNS included '.gsd' unconditionally, and the
gitignore modification ran BEFORE migration checks in auto-start.ts.

Changes:
- Add hasGitTrackedGsdFiles() helper using nativeLsFiles to detect tracked
  .gsd/ content
- ensureGitignore() now skips the '.gsd' pattern when .gsd/ has tracked files
- untrackRuntimeFiles() now skips entirely when .gsd/ has tracked files
- migrateToExternalState() aborts when .gsd/ has tracked files
- Reorder auto-start.ts: migration runs BEFORE gitignore modification
- Add 8 regression tests covering all scenarios

Fixes #1364

* fix: break recursive dialog loop when all milestones complete (#1348)

Two interacting bugs:

1. Recursive dialog loop: When all milestones are complete, bootstrapAutoSession
   calls showSmartEntry → sets pendingAutoStart → checkAutoStartAfterDiscuss
   calls startAuto → bootstrapAutoSession → showSmartEntry → infinite loop.
   The discuss workflow completes without producing a milestone directory, so
   phase stays 'complete' and the cycle never breaks.

   Fix: Add a re-entry counter (_consecutiveCompleteBootstraps) that tracks
   how many times bootstrapAutoSession enters the 'complete' branch without
   advancing. After 2 consecutive attempts, break the loop with a warning
   message and return false.

2. Missing _releaseFunction = null in retry lock onCompromised handler:
   The retry lock path in session-lock.ts set _lockCompromised but didn't
   null out _releaseFunction, which could leave a stale reference that
   masks the compromise detection in validateSessionLock().

Fixes #1348

* fix: self-heal stale roadmap checkbox for interrupted complete-slice (#1350)

When complete-slice is interrupted after writing SUMMARY.md and UAT.md but
before flipping the roadmap checkbox, auto-mode enters an infinite loop —
re-launching the same complete-slice unit because the dispatch loop uses
the roadmap checkbox as the sole 'slice done' signal.

Fix: Add a self-heal case in selfHealRuntimeRecords that detects when
SUMMARY + UAT exist but the roadmap checkbox is unchecked, and auto-fixes
the checkbox. This allows the verification to pass and the dispatch loop
to advance.

Fixes #1350

* fix: add EISDIR guard to complete/validate milestone prompts (#1343)

The LLM was passing tasks/ directory paths to the read tool during
milestone completion, causing EISDIR crashes. Added file system safety
instructions to both complete-milestone and validate-milestone prompts
telling the LLM to use ls/find for directory listing, not the read tool.

Fixes #1343

* feat: improve extension conflict messages with removal guidance (#1347)

When a user extension registers tools/commands that now ship as built-ins,
the conflict message now includes '(built-in tool supersedes — consider
removing <path>)' and the log level is downgraded from 'Extension load error'
to 'Extension conflict'.

Changes:
- resource-loader.ts: detect built-in vs user extension conflicts, add hint
- cli.ts: downgrade severity for superseded-tool conflicts

Fixes #1347

* test: fix always-skipped preferences test, add test:marketplace script

- preferences.test.ts: Replace always-skipped getIsolationMode test with
  a filesystem-independent version that validates the default through
  validatePreferences() instead of reading ~/.gsd/preferences.md.
  Reduces skipped count from 3 → 2.

- package.json: Add test:marketplace script for running marketplace
  contract tests (claude-import-tui, plugin-importer-live,
  marketplace-discovery) with GSD_TEST_CLONE_MARKETPLACES=1.
  These tests need external repos and self-skip in unit test runs.

Remaining 2 skips:
- Marketplace contract test suites (need external repos, run via test:marketplace)
- Windows-only tests in validate-directory.test.ts are platform-conditional
  and correctly skip on macOS

* fix: use execFileSync in regression tests for Windows portability

The regression tests used execSync with shell-dependent constructs:
- '&&' command chaining (works in bash/cmd but fragile)
- Single-quoted commit messages (bash-only, cmd.exe splits on spaces)

Replaced with execFileSync via a git() helper that bypasses the shell
entirely. Each git operation is a separate call with proper argument
arrays, eliminating all shell interpretation issues.
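
A shell-free helper along these lines might look like the sketch below. `runFile` and this `git` wrapper are illustrative names, not the helper from the test suite; the only API relied on is Node's `execFileSync`.

```typescript
import { execFileSync } from "node:child_process";

// Run a program with an argument array. No shell is involved, so quotes,
// spaces, and '&&' inside arguments are passed through literally.
function runFile(command: string, args: string[], cwd?: string): string {
  return execFileSync(command, args, { cwd, encoding: "utf8" }).trim();
}

// Hypothetical git() wrapper in the spirit of the helper described above.
function git(args: string[], cwd?: string): string {
  return runFile("git", args, cwd);
}
```

With a helper like this, `git init && git commit -m '...'` becomes two separate calls, for example `git(["init"], repo)` followed by `git(["commit", "-m", "fix: it's done"], repo)`, where `repo` is a directory path supplied by the test.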

Fixes windows-portability CI failure.

* fix: guard milestone completion against missing slice summaries (#1368)

Auto-mode could report a milestone as complete after executing only the
last slice, skipping earlier unexecuted slices. The milestone completion
signal fired based on roadmap checkbox state, which could be stale or
inconsistent after worktree transitions.

Changes:
- auto-dispatch.ts: Added slice SUMMARY file existence check to both
  validating-milestone and completing-milestone dispatch rules. If any
  slice lacks a SUMMARY file, dispatch stops with a diagnostic error
  instead of proceeding to validation/completion.
- validate-milestone.test.ts: Updated tests to create slice summary
  files (required by the new guard).
- file-watcher.test.ts: Fixed flaky 'auth.json change emits auth-changed
  event' test by adding watcher initialization delay and increasing event
  propagation timeout (race condition when run in full suite).

Fixes #1368

* fix: warn on common misspelled preference keys + verify field guidance (#1373, #1341)

#1373: Users setting 'taskIsolation.mode: none' instead of 'git.isolation: none'
got a generic 'unknown key' warning. Added KEY_MIGRATION_HINTS that map common
misspellings (taskIsolation, task_isolation, isolation, manage_gitignore, auto_push,
main_branch) to their correct git.* equivalents with actionable messages.
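
As a rough sketch of the hint lookup: the table below is an illustrative subset, since this log spells out only one target key (`git.isolation`). Mapping `task_isolation` and `isolation` to that same target is an assumption, and `warnForUnknownKey` is a hypothetical name.

```typescript
// Illustrative subset only. The log names the misspelled keys but gives just
// one explicit target (git.isolation); the other two rows are assumptions.
const KEY_MIGRATION_HINTS = new Map<string, string>([
  ["taskIsolation", "git.isolation"],
  ["task_isolation", "git.isolation"],
  ["isolation", "git.isolation"],
]);

// Build the warning text for a key the validator does not recognize.
function warnForUnknownKey(key: string): string {
  const suggestion = KEY_MIGRATION_HINTS.get(key);
  if (suggestion !== undefined) {
    return `Unknown preference key '${key}': did you mean '${suggestion}'?`;
  }
  return `Unknown preference key '${key}'.`;
}
```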

#1341: Planning agent writes aspirational prose in Verify fields ('Sections 3.1
and 3.2 exist with exact formulas. Zero TBD.') instead of executable commands.
Added explicit verify field rules to the plan template: must be mechanically
executable, with examples of good vs bad patterns for content tasks.

Fixes #1373, partially addresses #1341

* refactor: extract roadmap-mutations.ts + shared test-utils.ts

Consolidation:
- roadmap-mutations.ts: Extracted markSliceDoneInRoadmap() and markTaskDoneInPlan()
  from duplicated implementations in doctor.ts, mechanical-completion.ts, and
  auto-recovery.ts. All three callers used identical regex patterns.
  mechanical-completion.ts and auto-recovery.ts now import the shared utility.
  (doctor.ts deferred — touched by PR #1349)

- test-utils.ts: Shared cross-platform test utilities for GSD extension tests.
  Provides git() helper (execFileSync, no shell), makeTempRepo() with
  core.autocrlf=false, cleanup(), createFile(), safeReadFile(), and
  writeMilestoneFixture(). 12 test files currently define their own versions
  of these helpers — new tests should import from test-utils.ts instead.

Security audit: No injection vectors (sid/tid are alphanumeric from roadmap
parser), no path traversal, no secrets, no new dependencies.

* fix: port conflict false positive on non-Node projects + paused worktree resume (#1381, #1383)

projects without package.json. macOS AirPlay Receiver listens on port 5000,
causing a spurious warning on non-Node projects.
Fix: Skip port checks entirely when no package.json exists. When using
default ports, filter out 5000 on macOS.

in-memory only. Re-entering /gsd started a fresh bootstrap from the project
root instead of the active worktree.
Fix: pauseAuto() now writes paused-session.json to .gsd/runtime/ with
milestoneId, worktreePath, originalBasePath, and stepMode. startAuto()
checks for this file before bootstrap and restores the paused session
context, including worktree re-entry. stopAuto() cleans up the file.

Fixes #1381, #1383

* fix: catch spawn ENOENT in uncaught exception guard + snapshot session lock path (#1384, #1363)

uncaught exception and crashes auto-mode. The EPIPE guard now also catches
ENOENT from spawn syscalls — logs the error and continues instead of
terminating the process.

the lock path differently via gsdRoot() because basePath could be either the
project root or a worktree path. gsdRoot() produces different results for
each, so the lock was written to one path and validated against another.
Fix: Snapshot the resolved lock path (_snapshotLockPath) at acquisition time
and reuse it for all subsequent lock operations within the session.

Fixes #1384, #1363

* fix: suppress false-positive lock compromise + skip migration with active worktrees (#1362, #1337)

because the event loop stall delays the heartbeat mtime update. The handler
now checks elapsed time since acquisition — if within the 30-minute stale
window, it logs a warning and continues instead of setting _lockCompromised.
Real takeovers (past the stale window) still trigger the compromise flag.

even when .gsd/worktrees/ contained active git worktrees with locked
directory handles. This caused EBUSY errors and destructive data loss.
Migration now checks for active worktree directories and skips entirely
if any are found.

Fixes #1362, #1337
2026-03-19 17:06:01 -06:00
TÂCHES
364bd5b65b Merge pull request #1430 from trek-e/fix/1423-session-cost-compaction
fix: accumulate session cost independently of message array (#1423)
2026-03-19 15:39:19 -06:00
John Brahy
c10e42b392 fix(mcp): preserve args for mcp_call tool invocations (#1354) 2026-03-19 15:29:19 -06:00
TÂCHES
d761e45a41 M001: The Minimal Machine — linear auto-loop, sole-authority state, sidecar queue, WorktreeResolver (#1419)
* refactor: replace recursive auto-dispatch with linear autoLoop, delete ~3k lines of dead code

Replace the complex recursive dispatch system (dispatchNextUnit, reentrancy
guards, stall detection, idempotency tracking, skip-depth machinery) with a
simple linear while(s.active) loop in auto-loop.ts.

Key changes:
- New auto-loop.ts with autoLoop(), runUnit(), resolveAgentEnd()
- Deleted auto-idempotency.ts, auto-stuck-detection.ts, session-lock.ts,
  mechanical-completion.ts, progress-score.ts, auto-constants.ts, unit-id.ts
- Extracted WorktreeResolver class for worktree path resolution
- Added auto-worktree-sync.ts for worktree synchronization
- Simplified auto.ts from ~1400 lines to ~400 lines
- Fixed 9 TypeScript errors (NotifyCtx type widening, capture typing)
- Comprehensive test coverage: 32 auto-loop tests + worktree resolver/DB tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address 6 audit findings in auto-loop refactor

1. CRITICAL: Move pendingResolve to AutoSession + queue orphaned agent_end
   events instead of silently dropping them. Prevents permanent stalls when
   error-recovery sendMessage retries fire between loop iterations.

2. HIGH: Scope pendingResolve per-session via _activeSession ref, preventing
   concurrent /gsd auto sessions from corrupting each other's promises.

3. HIGH: Replace console.log in dispatchHookUnit with debugLog to prevent
   hook prompt content (potentially containing secrets) from leaking to stdout.

4. HIGH: Restore parked milestone handling in state.ts — Phase 1 skips
   parked milestones so they don't satisfy depends_on, Phase 2 registers
   them as 'parked' status. Add 'parked' to MilestoneRegistryEntry type.

5. MEDIUM: Restore queuePhaseActive parameter in shouldBlockContextWrite
   and re-export setQueuePhaseActive for guided-flow-queue.ts consumers.

6. MEDIUM: Add MAX_LOOP_ITERATIONS (500) lifetime cap to autoLoop to prevent
   runaway loops when units alternate between IDs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve build breakers, add correctness fixes, and graduated recovery

Build breakers (CRITICAL):
- Restore unit-id.ts (deleted but still imported by complexity-classifier.ts, metrics.ts)
- Restore progress-score.ts (deleted but still imported by commands.ts, dashboard-overlay.ts, doctor.ts)
- Rewrite worktree-sync-milestones.test.ts to use new syncProjectRootToWorktree API

Correctness fixes (MEDIUM):
- Cap pendingAgentEndQueue to 3 entries to prevent unbounded growth from stale events
- Add milestoneId path traversal validation in WorktreeResolver
- Clear depthVerificationDone on session_start to prevent cross-session leaks in RPC mode
- Add verification gate for non-hook sidecar units (triage, quick-tasks)
- Remove dead handleAgentEnd import from index.ts

Graduated recovery (Jeremy's feedback):
- Blanket try/catch around loop body — one bad iteration no longer kills the session
- Graduated stuck recovery: at count 3 try artifact verification + cache invalidation,
  at count 5 hard stop (was: binary stop at 5 with no recovery attempt)
- Graduated error recovery: 1st error retries, 2nd invalidates caches, 3rd stops
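
The two escalation ladders above could be expressed as small pure functions. This is a sketch under assumptions: the action names are invented for illustration, and the behavior at stuck count 4 (keep going) is inferred, since the message only names counts 3 and 5.

```typescript
type ErrorAction = "retry" | "invalidate-caches" | "stop";

// consecutiveErrors is 1 for the first failure in a row.
function errorRecoveryAction(consecutiveErrors: number): ErrorAction {
  if (consecutiveErrors <= 1) return "retry";
  if (consecutiveErrors === 2) return "invalidate-caches";
  return "stop";
}

type StuckAction = "continue" | "verify-and-invalidate" | "stop";

// stuckCount is how many times in a row the same unit was dispatched.
function stuckRecoveryAction(stuckCount: number): StuckAction {
  if (stuckCount >= 5) return "stop";                    // hard stop at count 5
  if (stuckCount === 3) return "verify-and-invalidate";  // recovery attempt at count 3
  return "continue";                                     // assumed for all other counts
}
```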

Test results: 32/32 auto-loop, 28/28 worktree-resolver, 11/11 sidecar-queue, tsc clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore copyWorktreeDb/reconcileWorktreeDb exports and fix loadToolApiKeys import

Two missing exports caused ~90% of the 120 pre-existing test failures:

1. copyWorktreeDb + reconcileWorktreeDb — imported by auto-worktree.ts but
   never added to gsd-db.ts. Restored with the original implementations.
2. loadToolApiKeys — moved to commands-config.ts but index.ts still imported
   from commands.ts. Fixed the import path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: move loadToolApiKeys import to commands-config.js

loadToolApiKeys was moved to commands-config.ts but index.ts still
imported it from commands.ts, causing runtime failures in all tests
that transitively load the extension entry point.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: fix provider error assertion on windows

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 14:56:00 -06:00
Tom Boucher
8b0727c0e5 fix: accumulate session cost independently of message array (#1423)
getSessionStats() calculated cost by summing usage from assistant messages
in state.messages. After auto-compaction, pre-compaction messages are
replaced by a compactionSummary with no usage field — dropping the cost.

Fix: Added cumulative accumulators (_cumulativeCost, _cumulativeInputTokens,
_cumulativeOutputTokens, _cumulativeToolCalls) that are incremented on
every assistant message event, independent of the message array.
getSessionStats() now returns max(array-sum, cumulative) to ensure
monotonically non-decreasing values.
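
A minimal sketch of the accumulator idea, tracking cost only. The `AssistantMessage` shape and the `SessionCostTracker` class are assumptions made for illustration; the real code keeps separate accumulators for tokens and tool calls as listed above.

```typescript
// Hypothetical message shape: a compaction summary would carry no usage field.
interface AssistantMessage { usage?: { cost: number } }

class SessionCostTracker {
  private cumulativeCost = 0;

  // Called on every assistant message event, independent of the message array.
  onAssistantMessage(message: AssistantMessage): void {
    this.cumulativeCost += message.usage?.cost ?? 0;
  }

  // messages is whatever currently remains in the array (possibly compacted).
  getSessionCost(messages: AssistantMessage[]): number {
    const arraySum = messages.reduce((sum, m) => sum + (m.usage?.cost ?? 0), 0);
    // Taking the max keeps the reported cost from dropping after compaction.
    return Math.max(arraySum, this.cumulativeCost);
  }
}
```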

Fixes #1423
2026-03-19 12:44:11 -04:00
Tom Boucher
f0fe4b2443 fix: emit agent_end after abort during tool execution (#1414) (#1417)
* fix: sync worktree completion artifacts back to external state before merge (#1412)

When a worktree's .gsd/ was a real directory (not symlinked to external
state), milestone completion artifacts (SUMMARY, VALIDATION, updated
ROADMAP) were written locally but never synced back. The project root's
deriveState() read from external state and found no SUMMARY — reporting
the milestone as incomplete.

Changes:
- auto-worktree.ts: Added syncWorktreeStateBack() that copies milestone
  and slice .md files from worktree .gsd/ to the main external state dir
- auto.ts: Call syncWorktreeStateBack() in tryMergeMilestone before the
  git merge, ensuring artifacts are visible from the project root

Fixes #1412

* fix: emit agent_end after abort during tool execution (#1414)

When a user aborts a turn while a tool call is running, the abort RPC
succeeds but agent_end was never emitted. RPC consumers tracking turn
lifecycle via events got stuck in a 'streaming' state permanently.

Fix: After abort() + waitForIdle(), emit a synthetic agent_end if the
agent is no longer streaming. This ensures consumers always see the
turn-complete signal regardless of how the turn ended.

Fixes #1414
2026-03-19 10:24:39 -06:00
Jeremy McSpadden
d7bf3d4e72 Improve startup performance with lazy extension loading (#1336) 2026-03-19 07:38:50 -06:00
TÂCHES
5a36c131a9 fix: detect stale bundled resources via content fingerprint (#1193)
initResources() only re-synced when the GSD version changed. This meant
same-version content fixes (e.g. the subagent bundled-extension-paths.js
import fix in a2a701b1) never reached ~/.gsd/agent/extensions/ because
the version-only check saw 2.28.0 == 2.28.0 and skipped the sync.

Add a lightweight content fingerprint (sha256 of file paths + sizes) to
the managed-resources.json manifest. On startup, if the version matches
but the fingerprint doesn't, resources are re-synced. This covers:

- npm link dev workflows where source changes without version bumps
- hotfixes within a release that change bundled extension content
- upgrades from manifests without contentHash (treated as stale)

Cost: ~1ms of stat calls on ~100 files — no file reads needed.
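
One way such a fingerprint and staleness check could look is sketched below. The function names, the `path:size` line encoding, and the manifest shape are assumptions; the log only states that the hash is a sha256 over file paths and sizes and that a manifest without contentHash is treated as stale.

```typescript
import { createHash } from "node:crypto";

// Path and size as returned by a stat call; no file contents are read.
interface FileStat { path: string; size: number }

function contentFingerprint(files: FileStat[]): string {
  const hash = createHash("sha256");
  // Sort by path so the result does not depend on directory walk order.
  const sorted = [...files].sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  for (const file of sorted) {
    hash.update(`${file.path}:${file.size}\n`);
  }
  return hash.digest("hex");
}

// Hypothetical manifest shape; contentHash is optional for older manifests.
interface Manifest { version: string; contentHash?: string }

function needsResync(manifest: Manifest, runningVersion: string, currentHash: string): boolean {
  if (manifest.version !== runningVersion) return true;
  // A missing contentHash never equals currentHash, so old manifests count as stale.
  return manifest.contentHash !== currentHash;
}
```

Hashing sizes rather than contents misses an edit that leaves every file size unchanged; that appears to be the trade-off accepted for avoiding file reads.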
2026-03-18 11:06:09 -06:00
Jeremy McSpadden
d24095971c feat: add pre-commit secret scanner and CI secret detection (#1148)
* feat: add pre-commit secret scanner and CI secret detection

Add a comprehensive secret scanning system to prevent accidental
credential leaks in commits and pull requests:

- scripts/secret-scan.sh: ERE-based scanner (macOS/Linux compatible)
  that detects AWS keys, API tokens, private keys, database URLs,
  GitHub/GitLab/Slack/Stripe/Google/npm tokens, and hardcoded passwords
- scripts/install-hooks.sh: one-command git pre-commit hook installer
- .secretscanignore: allowlist for known false positives (test fixtures,
  env var references, placeholder values)
- CI job: secret-scan step in ci.yml scans PR diffs against origin/main
- npm scripts: test:secret-scan, secret-scan, secret-scan:install-hook
- 17 tests covering detection, non-detection, binary skipping, CI mode

* fix: exclude secret-scan test file from CI scanning

The test file contains intentional fake secrets as test inputs.
Add it to .secretscanignore so CI doesn't flag them.

* fix: skip secret-scan tests on Windows (requires bash/POSIX grep)
2026-03-18 08:33:17 -06:00
Jeremy McSpadden
ce1ad35706 perf: skip initResources when version matches, consolidate startup I/O (#1052)
- Add version-match early return to initResources() — skips ~800ms of
  synchronous rmSync + cpSync when managed-resources.json already matches
  the running GSD version (steady-state on every launch)
- Consolidate package.json reads in loader.ts from 3 to 1 — single read
  reused for --version, --help, banner, and GSD_VERSION env var
- Replace blocking checkAndPromptForUpdates() with passive checkForUpdates()
  to avoid blocking startup on npm registry fetch + user prompt (up to 5s)
- Cache bundled extension keys in resource-loader to avoid redundant
  filesystem scan in buildResourceLoader()
- Use GSD_VERSION env var in getBundledGsdVersion() to skip package.json
  re-read from resource-loader.ts
- Add test verifying version-skip behavior: marker file survives when
  versions match, gets cleaned on mismatch
2026-03-17 21:57:13 -06:00
TÂCHES
94be09482f fix: add barrel files for remote-questions, ttsr, and shared extensions (#1048)
* fix: add barrel files for remote-questions, ttsr, and shared extensions

Centralizes public API surface for three extension directories behind
index.ts barrel files. External consumers now import from the barrel
instead of reaching into internal module files, reducing coupling and
making future refactors safer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rename barrel files to mod.ts to avoid extension loader auto-discovery

The extension loader auto-discovers extensions by looking for index.ts files
inside extensions/*/ directories. remote-questions/ and shared/ are utility
directories, not extensions — their index.ts barrel files caused load failures.

Renamed to mod.ts which the loader ignores, and updated all import paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:48:32 -06:00
TÂCHES
13f9d5585d fix(search): consolidate duplicate Brave API helpers (#1010)
* fix(search): consolidate duplicate Brave API helper functions

getBraveApiKey() and braveHeaders() were duplicated across provider.ts,
tool-llm-context.ts, and tool-search.ts. Export both from provider.ts
and import in the tool files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): update provider export count to include braveHeaders

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:06:23 -06:00
Juan Francisco Lebrero
fe0f4f35e6 feat: add --events flag for JSONL stream filtering (#1000)
Allow orchestrators to filter the JSONL event stream to specific event
types, reducing stdout noise. The filter applies only to output —
internal processing (completion detection, supervised mode, answer
injection) is unaffected.

- New `--events <types>` flag (comma-separated, implies `--json`)
- Filter applied at stdout write point, all events still processed internally
- Updated help-text and SKILL.md with examples
- Tests for argument parsing and filter matching logic
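
The parsing and matching logic could be as small as the sketch below. Function names and the `StreamEvent` shape are assumptions, not the project's actual code.

```typescript
// Parse the comma-separated value of --events into a set of event type names.
function parseEventsFlag(value: string): Set<string> {
  return new Set(
    value
      .split(",")
      .map((t) => t.trim())
      .filter((t) => t.length > 0),
  );
}

// Hypothetical event shape: only the type field matters for filtering.
interface StreamEvent { type: string }

// A null filter means --events was not given, so every event is written.
// This check belongs at the stdout write point only; internal handlers
// still receive every event.
function shouldWrite(event: StreamEvent, filter: Set<string> | null): boolean {
  return filter === null || filter.has(event.type);
}
```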
2026-03-17 17:35:44 -06:00
TÂCHES
89ee5e439a fix: unify extension discovery logic (#995)
* fix: unify extension discovery between loader.ts and resource-loader.ts

Extract shared extension discovery logic (resolveExtensionEntries,
discoverExtensionEntryPaths) into extension-discovery.ts. Both loader.ts
and resource-loader.ts now use the same algorithm, which supports
package.json pi.extensions declarations in addition to index.ts/index.js
fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update test to match refactored extension discovery imports

Test checked for readdirSync which was replaced by discoverExtensionEntryPaths.
Updated import path from resource-loader.ts to extension-discovery.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:14:04 -06:00
TÂCHES
2df7a2320b fix: remove dead github-client.ts (never imported) (#990)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:05:44 -06:00
Lex Christopherson
6114ede489 fix: remove duplicate marketplace-discovery test
Migrated unique resolvePluginRoot and inspectPlugin tests from the older
file into the comprehensive contract test file at src/tests/, then
deleted the duplicate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 16:18:45 -06:00
Lex Christopherson
2e013d70b5 merge: resolve 12 conflicts with main — integrate continueHere feature into refactored closeoutUnit
Conflicts arose because main added continueHereHandle cleanup and
buildSnapshotOpts (with continueHereFired) while the PR extracted
inline closeout code into closeoutUnit(). Resolution: use closeoutUnit()
with buildSnapshotOpts() to pass all fields including continueHereFired.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:20:20 -06:00
Colin Johnson
7733e12413 fix: reap orphan-prone child processes across session churn (#920)
* fix: reap orphan-prone child processes across session churn

* test: make bg-shell cleanup test shell-safe
2026-03-17 13:14:51 -06:00
Jeremy McSpadden
3c9762d9b7 refactor: extract 7 focused modules from auto.ts (#898)
Break up the 3,975-line auto.ts into focused, single-concern modules:

- auto-budget.ts: Pure budget alert level and enforcement functions
- auto-tool-tracking.ts: In-flight tool call tracking for idle detection
- auto-observability.ts: Pre-dispatch observability validation and repair
- auto-unit-closeout.ts: Consolidated metrics/activity/memory closeout helper
- auto-direct-dispatch.ts: Manual /gsd dispatch phase command handling
- auto-timeout-recovery.ts: Idle and hard timeout recovery with escalation
- auto-model-selection.ts: Model routing, complexity classification, fallback chains

auto.ts retains orchestration (start/stop/pause, handleAgentEnd, dispatchNextUnit)
and drops from 3,975 to 3,256 lines (-719 lines, -18%).

All extractions are pure moves with re-exports — no behavior changes.
All 1,092 unit tests and 30 integration tests pass.
2026-03-17 11:03:01 -05:00
Tom Boucher
ea9c5f7ee2 fix: headless mode exits early on progress notifications containing 'complete' (#879) (#888)
isTerminalNotification() used broad substring matching against
['complete', 'stopped', 'blocked']. Any notification containing these
words triggered early exit — including progress messages like:
  'All slices are complete — nothing to discuss.'
  'Override(s) resolved — rewrite-docs completed.'
  'Skipped 5+ completed units. Yielding to UI before continuing.'

Fix: Replace substring matching with prefix matching against the actual
stop signals emitted by stopAuto():
  'Auto-mode stopped...'
  'Step-mode stopped...'

These are the ONLY notifications that indicate auto-mode has genuinely
terminated. All other notifications (slice completion, override
resolution, skip yielding) are progress events and must not trigger
exit.

Also tighten isBlockedNotification to match 'blocked:' (with colon)
instead of bare 'blocked' to avoid false positives from unrelated
messages.
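
The matching rules described above might reduce to something like this sketch. Taking a plain message string is an assumption (the real functions likely receive an event object), and the case-insensitive substring check for 'blocked:' is one possible reading of the change.

```typescript
// Prefixes of the stop signals named in the commit message.
const TERMINAL_PREFIXES = ["Auto-mode stopped", "Step-mode stopped"];

// Prefix match: progress messages that merely contain 'complete' or
// 'stopped' somewhere in the text no longer count as terminal.
function isTerminalNotification(message: string): boolean {
  return TERMINAL_PREFIXES.some((prefix) => message.startsWith(prefix));
}

// Requires the colon, so a bare 'blocked' in unrelated text does not match.
function isBlockedNotification(message: string): boolean {
  return message.toLowerCase().includes("blocked:");
}
```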

Added 15 regression tests covering:
- All real terminal notification variants
- 6 false-positive cases from the issue report
- Non-notify event rejection
- Blocked detection with and without colon
2026-03-17 09:11:34 -06:00
Tom Boucher
ae8b8eeca3 fix: limit native web_search to max_uses:5 per response (#817) (#869) 2026-03-17 08:23:05 -06:00
Tom Boucher
4c7c64f7f5 fix: completed milestone with summary re-entered as active on resume (#864) (#868) 2026-03-17 08:22:16 -06:00
Jeremy McSpadden
d06e9ca12e fix: auto-heal STATE.md missing in preDispatchHealthGate (#862) 2026-03-17 08:20:10 -06:00
Jeremy McSpadden
2d037249c4 test: expand E2E smoke tests with 14 new CLI verification tests
Add comprehensive black-box smoke tests covering command routing,
graceful error handling, headless mode validation, and help completeness.

New tests:
- Command routing: headless --help, sessions --help
- Flag aliases: -v (--version), -h (--help)
- Error handling: no-TTY clean exit, unknown flags resilience
- Headless mode: missing .gsd/ dir, missing --context, invalid/negative --timeout
- Help completeness: all subcommands listed, all key flags listed
- Edge cases: --version ignores trailing args, headless positional help

All tests run without API keys and use temp directory isolation.
2026-03-17 00:02:26 -05:00
TÂCHES
193b2b32a5 fix: strip clack UI from postinstall, keep silent Playwright download (#783)
npm ≥7 suppresses lifecycle script output by default, so the clack
banner/spinner was invisible during `npm install -g`. The user-facing
onboarding experience already lives at first `gsd` launch (onboarding.ts),
making the postinstall UI redundant dead code.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 21:35:04 -06:00
Jeremy McSpadden
6ed9cd5359 fix: resolve CI failures in VS Code extension PR
- Fix Windows MCP test failures: use pathToFileURL() instead of bare
  join() paths for dynamic imports, fixing ERR_UNSUPPORTED_ESM_URL_SCHEME
  on Windows where D:\ paths are not valid ESM import specifiers

- Remove parallel orchestration code that was WIP from another feature
  branch and not part of the VS Code extension scope (commands.ts,
  preferences.ts, types.ts changes reverted to main)

- Rebase cleanly onto main, resolving mcp-server.ts merge conflict by
  keeping main's dynamic import approach with PR's exported interface
  and JSDoc documentation
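
For the first fix in the list above, a minimal sketch of building a portable import specifier with Node's `pathToFileURL`. The helper name and the file name in the usage note are hypothetical.

```typescript
import { join } from "node:path";
import { pathToFileURL } from "node:url";

// Convert a filesystem path into a file:// URL string. A bare Windows path
// such as D:\dist\x.js is not a valid ESM specifier, but its file URL is.
function toImportSpecifier(dir: string, file: string): string {
  return pathToFileURL(join(dir, file)).href;
}
```

A test would then load the module with `await import(toImportSpecifier(distDir, "mcp-server.js"))`, where `distDir` is a directory the test supplies.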
2026-03-16 16:53:34 -05:00
Jeremy McSpadden
48feced87d feat: add VS Code extension scaffold and MCP server compiled module
- Add vscode-extension/ with full MVP scaffold:
  - GsdClient: spawns gsd --mode rpc, JSON line communication
  - @gsd Chat participant: forward messages to agent, stream responses
  - Sidebar panel: connection status, model info, start/stop controls
  - Command palette: gsd.start, gsd.stop, gsd.newSession, gsd.sendMessage
  - Extension config: gsd.binaryPath setting
- Add compiled MCP server module at src/mcp-server.ts for tsc output
- Add MCP server tests verifying module import and instantiation
2026-03-16 16:46:20 -05:00
Ryan Harrington
f87b4938ca fix/gsd-bg-shell-stale-cwd: normalize bg-shell worktree cwd detection 2026-03-16 17:02:58 -04:00
Ryan Harrington
8b8ba0d207 fix/gsd-bg-shell-stale-cwd: resync bg-shell cwd after auto-worktree exit 2026-03-16 16:45:21 -04:00
TÂCHES
966e5e80fb Merge pull request #673 from jeremymcs/feat/v2.20-phase2-3-features
feat: v2.20 Phase 2-4 — skills, integrations, MCP server
2026-03-16 14:29:07 -06:00
Jeremy McSpadden
062b5c65eb fix: skip environment-dependent tests in CI
- Skip E2E --print test when no API key is configured (process hangs
  waiting for onboarding wizard input in non-TTY CI environments)
- Skip file-watcher extensions subdirectory test on Windows (chokidar
  subdirectory event delivery is unreliable in Windows CI runners)
2026-03-16 14:38:59 -05:00
Jeremy McSpadden
3690e7a8ca fix: stabilize file-watcher and E2E smoke tests for CI
- Increase file-watcher extension directory test delay to 1500ms with
  500ms settle time (Windows filesystem events are slower)
- Make E2E --print test more permissive on exit code 1: check for
  unhandled crash indicators instead of specific error messages
  (error text varies by CI environment)
2026-03-16 14:32:25 -05:00
TÂCHES
07effd64cc Merge pull request #471 from Jamie-BitFlight/feat/claude-import-skills-plugins
feat: import Claude marketplace plugins with namespaced components
2026-03-16 13:32:09 -06:00
Jeremy McSpadden
ce6e684899 fix: increase file-watcher test delay for CI stability 2026-03-16 14:00:47 -05:00
Jeremy McSpadden
8d56ab2893 feat: add MCP server mode, /lint skill, E2E smoke tests
- Add native MCP server mode (--mode mcp): exposes GSD's tools via
  Model Context Protocol over stdin/stdout for Claude Desktop, VS Code,
  and other MCP-compatible clients. Uses @modelcontextprotocol/sdk.
- Add /lint skill: auto-detects ESLint, Biome, Prettier, rustfmt,
  gofmt, Black, Ruff and runs with structured output
- Add 6 E2E smoke tests: --version, --help, config --help, update
  --help, --list-models, and --mode text --print startup
- Fix diff-context.ts stdio type for CI compatibility
- Fix token-counter.ts tiktoken import for extensions typecheck
- Update help text and CLI to include --mode mcp
2026-03-16 13:56:31 -05:00
Jeremy McSpadden
973b8992e5 feat: add GitHub API client, diff-aware context, tiktoken token counting
- Add GitHub API integration via @octokit/rest: createGitHubClient,
  getRepoInfo (parses HTTPS/SSH remotes), createPullRequest,
  getPullRequest, listPullRequestReviews, createIssueComment
- Add diff-aware context module: getRecentlyChangedFiles,
  getChangedFilesWithContext, rankFilesByRelevance — prioritizes
  recently-changed files for context window budget allocation
- Add accurate token counting via tiktoken: countTokens (async),
  countTokensSync, initTokenCounter — falls back to chars/4 heuristic
  when tiktoken is unavailable
- 27 new tests across 3 test files
2026-03-16 13:50:00 -05:00
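The remote parsing mentioned for getRepoInfo above (HTTPS and SSH remotes) could look roughly like the following. This is a minimal sketch, not the project's code: the function name `parseGitHubRemote`, the `RepoInfo` shape, and the restriction to github.com are assumptions made for illustration.

```typescript
// Hypothetical helper: extracts owner/repo from a GitHub remote URL.
// Handles the two common forms only:
//   https://github.com/<owner>/<repo>(.git)
//   git@github.com:<owner>/<repo>(.git)
interface RepoInfo {
  owner: string;
  repo: string;
}

const HTTPS_REMOTE = /^https?:\/\/(?:[^@/]+@)?github\.com\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/;
const SSH_REMOTE = /^git@github\.com:([^/]+)\/([^/]+?)(?:\.git)?$/;

function parseGitHubRemote(remote: string): RepoInfo | null {
  const trimmed = remote.trim();
  const match = HTTPS_REMOTE.exec(trimmed) ?? SSH_REMOTE.exec(trimmed);
  if (!match) return null; // unrecognized remote format
  return { owner: match[1], repo: match[2] };
}
```

Returning `null` for unrecognized remotes lets a caller decide whether to skip GitHub features or report an error.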
Jeremy McSpadden
0b3163d297 feat: add /review skill, /test skill, chokidar file watcher, subcommand help
- Add /review skill: reviews staged/unstaged/commit changes for security,
  performance, bugs, and quality with structured findings by severity
- Add /test skill: auto-detects test framework, generates comprehensive
  tests for source files, or runs suites with failure analysis
- Add chokidar file watcher: watches ~/.gsd/agent/ for config changes
  (settings.json, auth.json, models.json, extensions/) with debounced
  events on an EventBus
- Add --help per subcommand: `gsd config --help` and `gsd update --help`
  show subcommand-specific usage information
- 8 new file-watcher tests (start/stop, event emission, debouncing,
  unrelated file filtering)
2026-03-16 13:47:25 -05:00
Jeremy McSpadden
ebbcbe363a security: add SSRF protection to fetch_page tool
Block requests to private/internal addresses in the fetch_page tool:
- Private IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, 169.254.x)
- Cloud metadata endpoints (metadata.google.internal, instance-data)
- localhost
- Non-HTTP protocols (file://, ftp://)
- IPv6 private ranges (::1, fc00:, fd, fe80:)

Add isBlockedUrl() to url-utils.ts with 11 new tests covering all
blocked and allowed URL patterns.
2026-03-16 13:35:48 -05:00
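The blocking rules listed in the commit above can be sketched as follows. This is an illustrative approximation, not the actual isBlockedUrl() from url-utils.ts: the helper names, the exact hostname list, and the choice to treat unparseable URLs as blocked are assumptions.

```typescript
// Hypothetical sketch of an SSRF guard based on the rules in the commit message.
const BLOCKED_HOSTNAMES = new Set([
  "localhost",
  "metadata.google.internal",
  "instance-data",
]);

function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // link-local, incl. cloud metadata IP
  );
}

function isPrivateIPv6(host: string): boolean {
  // URL.hostname keeps the brackets around IPv6 literals, e.g. "[::1]".
  if (!host.startsWith("[") || !host.endsWith("]")) return false;
  const addr = host.slice(1, -1).toLowerCase();
  return (
    addr === "::1" ||                 // loopback
    /^f[cd][0-9a-f]{2}:/.test(addr) || // unique local addresses (fc00::/7)
    addr.startsWith("fe80:")          // link-local
  );
}

function isBlockedUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return true; // assumption: unparseable input is rejected
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname.toLowerCase();
  return BLOCKED_HOSTNAMES.has(host) || isPrivateIPv4(host) || isPrivateIPv6(host);
}
```

A check like this only inspects the URL text. A public hostname that resolves to a private address would still pass, so a complete defense would also need to validate the resolved IP.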
Jeremy McSpadden
2c926c12e3 fix: Phase 1 quick wins — bug fixes, security hardening, and performance
- Fix loadStoredEnvKeys divergent provider lists: add telegram_bot and
  custom-openai to wizard.ts (the canonical copy used by CLI), remove
  dead duplicate from onboarding.ts
- Security: add SAFE_COMMAND_PREFIXES allowlist to resolveConfigValue
  to prevent arbitrary RCE via settings.json shell commands
- Security: add TOFU (Trust On First Use) model for project-local
  extensions — skip untrusted .pi/extensions/ with stderr warning
- Performance: debounce sql.js MemoryStorage persistence (500ms window)
  so rapid mutations coalesce into a single db.export()+writeFileSync
- Fix double lstatSync call in tool-bootstrap.ts isRegularFile
- Add 26 new tests covering all changes
2026-03-16 13:18:02 -05:00
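The debounced persistence described in the commit above (rapid mutations coalescing into a single write within a 500ms window) can be illustrated with a small trailing-edge debouncer. This is a sketch under assumptions: the class name `DebouncedPersister` and its `schedule`/`flush` methods are hypothetical, and the `persist` callback stands in for the db.export()+writeFileSync step.

```typescript
// Hypothetical trailing-edge debouncer: many schedule() calls inside the
// window result in a single persist() call once the window elapses.
class DebouncedPersister {
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly persist: () => void;
  private readonly delayMs: number;

  constructor(persist: () => void, delayMs = 500) {
    this.persist = persist;
    this.delayMs = delayMs;
  }

  // Call after every mutation; restarts the window each time.
  schedule(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.persist();
    }, this.delayMs);
  }

  // Write immediately if a write is pending (e.g. on shutdown).
  flush(): void {
    if (this.timer === undefined) return;
    clearTimeout(this.timer);
    this.timer = undefined;
    this.persist();
  }
}
```

An explicit `flush()` matters with this pattern: without it, a process that exits inside the debounce window would lose the last batch of mutations.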
Lex Christopherson
3d2f294f6a fix: google-search OAuth test mock and Windows path separator in smoke test
- google-search test: mock getApiKeyForProvider to return JSON string
  matching real OAuth provider behavior (token+projectId), instead of
  using AuthStorage.inMemory which bypasses the OAuth getApiKey transform
- smoke test: split on /[/\\]/ for Windows path separator compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 11:56:46 -06:00
Lex Christopherson
20e0fe2460 Merge remote-tracking branch 'origin/main' into test/extension-smoke-test 2026-03-16 11:48:44 -06:00
Lex Christopherson
d187a1ed2d fix: use file:// URL for dynamic imports in smoke test (Windows compat)
On Windows, raw paths like D:\... are interpreted as protocol "d:" by
the ESM loader. Convert via pathToFileURL before dynamic import.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 11:46:04 -06:00
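The fix described in the commit above relies on `pathToFileURL` from Node's `node:url` module. A minimal sketch of the pattern follows; the helper names `toImportSpecifier` and `importFromPath` are hypothetical and not taken from the smoke test itself.

```typescript
import { pathToFileURL } from "node:url";

// Converts an absolute filesystem path into a file:// URL string so the
// ESM loader does not mistake a Windows drive letter for a URL scheme.
function toImportSpecifier(absolutePath: string): string {
  return pathToFileURL(absolutePath).href;
}

// Hypothetical wrapper: dynamically imports a module given its absolute path.
async function importFromPath(absolutePath: string): Promise<unknown> {
  return import(toImportSpecifier(absolutePath));
}
```

On POSIX systems the conversion is harmless (the path simply gains a `file://` prefix), so the same code path can be used on every platform.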
Jamie McGregor Nelson
13730189f2 test(claude-import): use portable fixtures for marketplace discovery tests 2026-03-16 11:47:28 -04:00
Jamie McGregor Nelson
60413c4ea9 test: skip marketplace contract tests when local repos are absent 2026-03-16 11:47:28 -04:00
Jamie McGregor Nelson
2e3fa903b1 feat: import Claude marketplace plugins with namespaced components 2026-03-16 11:47:28 -04:00
Lex Christopherson
40dd80a41e fix(voice): replace __dirname with import.meta.dirname for ESM compat
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 19:35:07 -06:00
Lex Christopherson
15462a7da7 test: add extension smoke test to catch import failures in CI
Dynamically discovers all bundled extensions and verifies they can be
imported without throwing. Catches missing imports, circular deps, and
broken module resolution that tsc cannot detect since extensions are
loaded at runtime via jiti.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 19:27:39 -06:00
Harald Heckmann
186a1de406 feat: Use google search via Google OAuth if available 2026-03-15 11:13:36 +01:00