Commit graph

2208 commits

Author SHA1 Message Date
Jeremy McSpadden
de065094ea Merge pull request #4006 from jeremymcs/fix/workflow-events-toctou
fix: TOCTOU file locking race conditions in event log and custom workflow graph
2026-04-11 16:27:37 -05:00
Jeremy
01b4177041 test(gsd): add file-lock TOCTOU fallback coverage 2026-04-11 16:15:51 -05:00
Jeremy McSpadden
31e88c99d2 Merge pull request #4007 from jeremymcs/refactor/state-derive-god-function
refactor: extract deriveStateFromDb logic into composable helpers
2026-04-11 16:11:45 -05:00
Jeremy
647056aa7d test(state): add tests for extracted deriveStateFromDb helpers
Cover the composable helpers extracted from deriveStateFromDb:
reconcileDiskToDb, buildCompletenessSet, buildRegistryAndFindActive,
handleNoActiveMilestone, resolveSliceDependencies, reconcileSliceTasks,
detectBlockers, checkReplanTrigger, checkInterruptedWork, and queue
order sorting.
2026-04-11 16:00:28 -05:00
Jeremy
c63220ab72 refactor: extract deriveStateFromDb logic into composable helpers
Extracts the monolithic deriveStateFromDb function into distinct,
composable helper functions (reconcileDiskToDb, resolveSliceDependencies,
detectBlockers, etc.) inside state.ts.

Resolves technical debt identified during the code quality audit by
drastically reducing cyclomatic complexity while preserving the exact
type signature and logical behavior.

Also removes duplicate disk->DB reconciliation that could overwrite
milestone statuses.
2026-04-11 15:19:48 -05:00
Jeremy
68b638a588 fix: TOCTOU file locking race conditions in event log and custom workflow graph
Implements file-level advisory locking via proper-lockfile to ensure
atomic read-modify-write sequences in:
- compactMilestoneEvents (event-log.jsonl)
- custom-workflow-engine reconcile (GRAPH.yaml)

Fixes silent data loss when concurrent auto-mode or dashboard
sessions overlap in these operations.
2026-04-11 14:49:19 -05:00
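The read-modify-write protection described in this commit can be sketched as follows. This is an illustration only: the commit uses proper-lockfile, while the sketch swaps in a plain exclusive-create lock file built on Node's `fs` module so that it stays self-contained. The names `withFileLock` and `compactLog` and the sample log lines are hypothetical.

```typescript
import { closeSync, existsSync, mkdtempSync, openSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Run `criticalSection` while holding `<target>.lock`.
// The "wx" flag makes creation fail if the lock file already exists,
// so only one holder can enter at a time.
function withFileLock<T>(target: string, criticalSection: () => T): T {
  const lockPath = `${target}.lock`;
  const fd = openSync(lockPath, "wx"); // throws EEXIST when another holder has the lock
  try {
    return criticalSection();
  } finally {
    closeSync(fd);
    rmSync(lockPath, { force: true });
  }
}

// Hypothetical compaction: the read, the filter, and the write all happen under the lock.
function compactLog(logPath: string, keep: (line: string) => boolean): number {
  return withFileLock(logPath, () => {
    const lines = readFileSync(logPath, "utf8").split("\n").filter((line) => line.length > 0);
    const kept = lines.filter(keep);
    writeFileSync(logPath, kept.map((line) => `${line}\n`).join(""));
    return lines.length - kept.length;
  });
}

// Demo against a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "lock-sketch-"));
const demoLog = join(demoDir, "event-log.jsonl");
writeFileSync(demoLog, '{"kind":"keep"}\n{"kind":"drop"}\n{"kind":"keep"}\n');
const droppedCount = compactLog(demoLog, (line) => line.includes('"keep"'));
```

Unlike proper-lockfile, this sketch does not retry or detect stale locks; a crashed holder would leave the lock file behind.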
Jeremy
bf4bcfadde fix(claude-code): harden MCP elicitation schema handling 2026-04-11 13:26:24 -05:00
Jeremy
1495e711e1 fix(claude-code): accept secure_env_collect MCP elicitation forms 2026-04-11 13:18:27 -05:00
Jeremy
74fee9ed48 fix(interactive): keep MCP tool output ordered and restore secure prompt fallback 2026-04-11 12:47:41 -05:00
Jeremy McSpadden
23f1e72868 Merge pull request #3975 from jeremymcs/fix/windows-portability-sweep
fix(gsd): harden Claude Code workflow MCP bootstrap and guidance
2026-04-11 10:49:23 -05:00
Jeremy
9e5203fdf8 fix(gsd): resolve workflow MCP test typing regressions 2026-04-11 10:34:51 -05:00
Jeremy McSpadden
902c020590 Merge pull request #3979 from jeremymcs/fix/mcp-tool-iserror-flag
fix(mcp): return isError flag on workflow tool execution failures
2026-04-11 10:31:58 -05:00
Jeremy
818bf97c36 test(gsd): add regression for plan_slice isError failures 2026-04-11 10:18:20 -05:00
Jeremy
d572e372a1 fix(mcp): return isError flag on workflow tool execution failures
This fixes an issue where MCP workflow tools (e.g. gsd_plan_slice) would
return error details in their JSON response, but without setting the
'isError: true' flag at the top level of the tool response payload. This
caused MCP clients (like Claude Code) to interpret failed validations
(like empty tasks arrays) as successes and get stuck in infinite
validation failure loops.
2026-04-11 09:51:03 -05:00
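A minimal sketch of the behavior this commit describes, assuming a tool result shaped like an MCP tool response (a `content` array plus an optional top-level `isError`). The names `runWorkflowTool` and `planSlice` are hypothetical and are not taken from the repository.

```typescript
// Assumed shape of a tool result: text content plus an optional top-level error flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical wrapper: run a handler and flag failures at the top level,
// so a client cannot mistake an error payload for a success.
function runWorkflowTool(handler: () => unknown): ToolResult {
  try {
    const value = handler();
    return { content: [{ type: "text", text: JSON.stringify(value) }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      content: [{ type: "text", text: JSON.stringify({ error: message }) }],
      isError: true,
    };
  }
}

// Hypothetical validation mirroring the "empty tasks array" case from the commit message.
function planSlice(tasks: string[]): { planned: number } {
  if (tasks.length === 0) throw new Error("tasks must not be empty");
  return { planned: tasks.length };
}

const okResult = runWorkflowTool(() => planSlice(["write tests"]));
const failedResult = runWorkflowTool(() => planSlice([]));
```

With the flag set, a client can branch on `failedResult.isError` instead of parsing the JSON text to discover the failure.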
Jeremy
d853992cb7 fix(discuss): add structuredQuestionsAvailable conditional to all gates
The depth verification gate and multi-milestone readiness gate
unconditionally referenced ask_user_questions. On transports where
structuredQuestionsAvailable is false, this would trap the flow in
re-ask loops.

Add {{structuredQuestionsAvailable}} conditionals to:
- Depth verification (true: ask_user_questions, false: plain text)
- Multi-milestone Phase 3 readiness gate
- Per-milestone technical assumption verification
2026-04-11 09:44:08 -05:00
Jeremy
6f27e514ca fix(discuss): add multi-round questioning to new-project discuss phase
The discuss.md prompt (used for /gsd new project creation) only asked
"What's the vision?" once and relied on LLM judgment for follow-ups.
This led to the agent asking a single question then jumping straight
to planning without gathering enough context.

Add explicit Question Rounds section with:
- 1-3 questions per round structure
- Conditional ask_user_questions vs plain text support
- Incremental persistence (CONTEXT-DRAFT save every 2 rounds)
- Depth-matching rule (1-2 rounds for simple, 4+ for large visions)
- Round cadence that drives toward the depth enforcement checklist

Thread structuredQuestionsAvailable through buildDiscussPrompt() and
prepareAndBuildDiscussPrompt() so the template variable resolves
correctly at runtime.

Closes #3976
2026-04-11 09:32:20 -05:00
Jeremy
4301a72522 fix(gsd): harden claude-code workflow MCP bootstrap
Ensure startup/session/init paths auto-prepare workflow MCP for Claude Code.

Disable native AskUserQuestion in Claude Code SDK options to avoid broken host prompts.

Add explicit /gsd mcp init . guidance when workflow MCP is missing.

Refs #3964
2026-04-11 08:35:19 -05:00
Jeremy McSpadden
679587bee2 Merge pull request #3970 from jeremymcs/fix/ask-user-question-stream-order
fix(web): preserve only final ask_user_questions text
2026-04-11 07:36:30 -05:00
Jeremy
bd91186e2f fix(web): drop provisional pre-tool question text 2026-04-11 07:20:18 -05:00
Jeremy McSpadden
1e6919d92b Merge pull request #3963 from jeremymcs/fix/3962-model-routing-transparency
fix(routing): skip dynamic routing for interactive dispatches (#3962)
2026-04-10 22:54:10 -05:00
Jeremy
66710f2b28 fix(routing): address codex review — complete interactive bypass and accurate banner
resolvePreferredModelConfig now skips routing-ceiling synthesis when
isAutoMode=false, preventing interactive dispatches from silently switching
to the tier_models.heavy model. The auto-start banner now reflects
effective routing state by checking flat-rate provider suppression and
using the actual ceiling from tier_models.heavy when configured.
2026-04-10 22:18:42 -05:00
Jeremy McSpadden
19cbb17683 Merge pull request #3961 from jeremymcs/fix/windows-portability-sweep
fix: harden Windows portability across runtime and tooling
2026-04-10 21:53:10 -05:00
Jeremy
839cd8d55b fix(routing): skip dynamic routing for interactive dispatches, always show model changes (#3962)
Dynamic routing silently downgraded models in interactive commands (guided-flow),
overriding the user's /model selection. Now routing only applies in auto-mode where
cost optimization is expected. Model downgrade notifications always fire regardless
of verbose setting, and auto-mode shows routing status upfront on start.
2026-04-10 21:51:18 -05:00
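The rule in this commit (routing applies only in auto-mode, and a model change is always reported) might look roughly like the sketch below. The function `resolveDispatchModel`, its input shape, and the placeholder model names are hypothetical; only the `isAutoMode` idea comes from the commit messages.

```typescript
interface DispatchInput {
  isAutoMode: boolean;    // true only for auto-mode dispatches
  selectedModel: string;  // the model the user picked via /model
  routedModel?: string;   // what dynamic routing would choose, if anything
}

interface DispatchOutcome {
  model: string;
  notice?: string; // surfaced whenever the model changes, regardless of verbosity
}

function resolveDispatchModel(input: DispatchInput): DispatchOutcome {
  // Interactive dispatches keep the user's selection untouched.
  if (!input.isAutoMode) return { model: input.selectedModel };
  // In auto-mode, routing may pick a different model, and the change is reported.
  if (input.routedModel && input.routedModel !== input.selectedModel) {
    return {
      model: input.routedModel,
      notice: `model changed: ${input.selectedModel} -> ${input.routedModel}`,
    };
  }
  return { model: input.selectedModel };
}

const interactiveOutcome = resolveDispatchModel({
  isAutoMode: false,
  selectedModel: "model-heavy",
  routedModel: "model-light",
});
const autoOutcome = resolveDispatchModel({
  isAutoMode: true,
  selectedModel: "model-heavy",
  routedModel: "model-light",
});
```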
Jeremy
153ed252fe fix(ci): unblock windows portability follow-up 2026-04-10 20:45:51 -05:00
Jeremy
61204ce771 fix(windows): harden portability across runtime and tooling 2026-04-10 20:33:18 -05:00
Jeremy McSpadden
8fcaee6d5a Merge pull request #3885 from mastertyko/fix/3856-doctor-scope-db-unavailable
fix(gsd): surface scoped doctor health warnings
2026-04-10 20:31:47 -05:00
Jeremy McSpadden
b5ffff50ee Merge pull request #3886 from mastertyko/fix/3806-update-registry-version-fallback
fix(update): fetch latest version from registry
2026-04-10 20:31:06 -05:00
Jeremy McSpadden
854ab38498 Merge pull request #3894 from mastertyko/fix/3892-double-backtick-preexec
fix(gsd): handle doubled-backtick pre-exec paths
2026-04-10 20:29:46 -05:00
Jeremy McSpadden
999c2fc576 Merge pull request #3931 from mastertyko/fix/3912-skip-skipped-summaries
fix(gsd): skip skipped slices in milestone prompts
2026-04-10 20:29:27 -05:00
Jeremy
e26c5cff56 fix(auto): use pathToFileURL for cross-platform import and reconcile regression test
Convert resource-loader import path to file URL via pathToFileURL() to
fix Windows ERR_UNSUPPORTED_ESM_URL_SCHEME. Update existing regression
test to validate the GSD_PKG_ROOT + pathToFileURL contract.
2026-04-10 20:05:32 -05:00
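The path-to-URL conversion can be sketched as below. `pathToFileURL` is the real Node.js API from `node:url`; the helper `resourceLoaderSpecifier` is hypothetical, and the sketch assumes the module lives at `dist/resource-loader.js` under the package root.

```typescript
import { join } from "node:path";
import { pathToFileURL } from "node:url";

// Hypothetical helper: build an import specifier for a module under a package root.
// An absolute Windows path such as C:\pkg\dist\x.js is not a valid ESM specifier,
// so the path is converted to a file:// URL before being passed to import().
function resourceLoaderSpecifier(pkgRoot: string): string {
  return pathToFileURL(join(pkgRoot, "dist", "resource-loader.js")).href;
}

// The current working directory stands in for the package root in this demo.
const specifier = resourceLoaderSpecifier(process.cwd());
// A dynamic import would then take the URL string: await import(specifier)
```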
Jeremy
7b2601e6a0 fix(auto): resolve resource-loader.js from GSD_PKG_ROOT on resume (#3949)
Auto-mode resume crashed with "Cannot find module" because the relative
import ../../../resource-loader.js only works from the source tree, not
from the deployed path at ~/.gsd/agent/extensions/gsd/auto.js.

Expose GSD_PKG_ROOT from loader.ts and use it in auto.ts to construct
an absolute path to dist/resource-loader.js that works in both contexts.
2026-04-10 20:00:46 -05:00
Jeremy
3c6093cf37 test remove secret-like onboarding fixture 2026-04-10 19:30:38 -05:00
Jeremy
d64056f833 fix claude code mcp elicitation bridge 2026-04-10 19:24:51 -05:00
Jeremy
5a940856c1 fix claude-code discuss question fallback 2026-04-10 19:12:45 -05:00
mastertyko
07da34dadb fix(gsd): surface scoped doctor health warnings 2026-04-11 01:49:05 +02:00
Jeremy
7ee1fa0c46 fix(pi-ai): remove Anthropic OAuth flow for TOS compliance
Delete the Anthropic OAuth module, remove it from the built-in provider
registry, strip the OAuth client branch from the Anthropic streaming
provider, and replace the daemon orchestrator's token refresh with a
simple ANTHROPIC_API_KEY requirement.

Anthropic access is now API key or local Claude Code CLI only.

Closes #3952
2026-04-10 17:33:34 -05:00
Jeremy McSpadden
7a44ca7aed Merge pull request #3948 from jeremymcs/fix/cmux-auto-enable
fix(gsd): auto-enable cmux when detected
2026-04-10 17:16:46 -05:00
Jeremy McSpadden
0575d2cb58 Merge pull request #3946 from jeremymcs/feat/mcp-ask-user-questions-elicitation
feat(mcp-server): expose ask_user_questions via elicitation
2026-04-10 16:58:34 -05:00
Jeremy
6d1e7952a5 test(gsd): accept source workflow MCP module paths 2026-04-10 16:45:09 -05:00
mastertyko
c2de4eb17b fix(gsd): skip skipped slices in milestone prompts 2026-04-10 23:19:25 +02:00
Jeremy
6c2964c502 test(mcp-server): cover ask_user_questions elicitation 2026-04-10 16:14:34 -05:00
Jeremy
28decda937 test(gsd): add regression tests for cmux auto-enable
Covers: successful auto-enable, missing prefs file fallback,
and preservation of existing cmux sub-preferences.
2026-04-10 16:12:35 -05:00
Jeremy
fffec6174a fix(gsd): auto-enable cmux when detected instead of prompting
When running inside a cmux terminal, GSD now automatically enables cmux
in project preferences instead of showing a manual enable prompt. Users
who explicitly disabled cmux (enabled: false) are still respected.

Closes gsd-build/gsd-2#3947
2026-04-10 15:46:36 -05:00
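The decision described here (enable automatically when detected, unless the user explicitly opted out) can be sketched as a small pure function. The `CmuxPrefs` shape and `autoEnableCmux` are hypothetical; only the `enabled: false` opt-out comes from the commit message.

```typescript
interface CmuxPrefs {
  enabled?: boolean;
  [subPreference: string]: unknown; // other cmux sub-preferences pass through untouched
}

// `detected` stands for "running inside a cmux terminal".
function autoEnableCmux(detected: boolean, prefs: CmuxPrefs | undefined): CmuxPrefs | undefined {
  if (!detected) return prefs;                 // nothing to do outside cmux
  if (prefs?.enabled === false) return prefs;  // explicit opt-out is respected
  return { ...prefs, enabled: true };          // missing prefs or unset flag: enable
}

const fromMissingPrefs = autoEnableCmux(true, undefined);
const fromOptOut = autoEnableCmux(true, { enabled: false });
const fromExistingPrefs = autoEnableCmux(true, { notifications: true });
```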
Jeremy
b3275a182d feat(mcp-server): expose ask_user_questions via elicitation 2026-04-10 15:44:08 -05:00
Jeremy McSpadden
4e84196bdb Merge pull request #3941 from jeremymcs/fix/codebase-generator-excludes
fix(gsd): add missing dirs to codebase generator exclude list
2026-04-10 14:17:44 -05:00
Jeremy
89bdc1e7f9 test(gsd): add regression test for .agents/ and tooling dir exclusion
Verifies that .agents/, .bg-shell/, .idea/, .cache/, tmp/, target/,
and venv/ are excluded from the codebase map during generation.
2026-04-10 13:46:53 -05:00
Jeremy
099a7723a3 fix(gsd): add missing directories to codebase generator exclude list
.agents/, .bg-shell/, .idea/, venv/, target/, .cache/, and tmp/ were
missing from DEFAULT_EXCLUDES. This caused /gsd-new-project to scan
skill and agent definition files as project code, confusing researcher
agents during project initialization.

Aligns the exclude list with the gitignore patterns in gitignore.ts.
2026-04-10 13:40:45 -05:00
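As a rough illustration of the effect, the sketch below filters paths by directory segment. The list holds only the directories named in this commit message; the real `DEFAULT_EXCLUDES` may contain more entries and a different matching strategy. The helper `isExcluded` and the sample paths are hypothetical.

```typescript
// Directories named in the commit message (not the full exclude list).
const ADDED_EXCLUDES: readonly string[] = [
  ".agents", ".bg-shell", ".idea", "venv", "target", ".cache", "tmp",
];

// Hypothetical filter: a path is excluded when any of its directory segments
// (everything except the final file name) matches an excluded name.
function isExcluded(relativePath: string, excludes: readonly string[] = ADDED_EXCLUDES): boolean {
  const segments = relativePath.split(/[\\/]/);
  return segments.slice(0, -1).some((segment) => excludes.includes(segment));
}

const scannedPaths = [
  ".agents/skills/review.md",
  "src/index.ts",
  "tmp/build.log",
  "docs/tmp.md",
].filter((path) => !isExcluded(path));
```

Here `docs/tmp.md` survives because only directory segments are compared, not file names.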
Jeremy
2ad315b9fb fix(gsd): wire ADR-005 infrastructure into live paths
Addresses Codex adversarial review findings — the ADR-005 registries
and filters were built but not connected to the actual model selection
and provider adapter paths.

Fix 1+2: Tool filtering applied after model selection
- selectAndApplyModel() now calls adjustToolSet() after pi.setModel()
- Incompatible tools are removed via pi.setActiveTools()
- adjust_tool_set hook fires to allow extension overrides
- Verbose output reports filtered tools with provider context

Fix 3: ProviderSwitchReport wired through all 6 provider adapters
- New transformMessagesWithReport() convenience wrapper creates report,
  passes it to transformMessages(), and logs non-empty reports to stderr
  when GSD_VERBOSE=1 or PI_VERBOSE=1
- All adapters updated: anthropic, google, openai-responses,
  openai-completions, mistral, bedrock
2026-04-10 12:49:49 -05:00
Jeremy
b1c0dafc70 feat(gsd): implement ADR-005 multi-model provider and tool strategy
Implements all 4 phases of ADR-005 (issue #2790):

Phase 1: Provider Capabilities Registry
- Declarative ProviderCapabilities interface and PROVIDER_CAPABILITIES
  registry covering all 12 API providers
- Consolidates scattered *-shared.ts knowledge into queryable registry
- Unknown providers get permissive defaults (backward compatible)

Phase 2: Tool Compatibility Metadata
- ToolCompatibility interface (producesImages, schemaFeatures, minCapabilityTier)
- compatibility field on ToolDefinition
- Tool compatibility registry with pre-populated built-in tools
- Auto-registration from registerTool() and MCP tool defaults

Phase 3: Tool-Compat Filter + ProviderSwitchReport
- ProviderSwitchReport tracks thinking blocks dropped/downgraded,
  tool call IDs remapped, synthetic results inserted, thought
  signatures dropped during cross-provider message transformation
- isToolCompatibleWithProvider(), filterToolsForProvider(), adjustToolSet()
  functions in model router
- filteredTools field on RoutingDecision
- Verbose output for filtered tools in auto-model-selection

Phase 4: adjustToolSet Extension Hook
- AdjustToolSetEvent and AdjustToolSetResult interfaces
- emitAdjustToolSet() on ExtensionAPI and ExtensionRuntime
- Default no-op handler in register-hooks.ts

Includes 47 new tests (20 provider caps + 10 switch report + 17 tool compat)

Closes #2790
2026-04-10 12:33:40 -05:00
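The registry-plus-filter idea from Phases 1 to 3 can be sketched as below. The identifiers `ProviderCapabilities`, `PROVIDER_CAPABILITIES`, `ToolCompatibility`, `isToolCompatibleWithProvider`, and `filterToolsForProvider` appear in the commit message, but every field type, function body, provider name, and tool name here is an assumption made for illustration, and `minCapabilityTier` is left out for brevity.

```typescript
// Assumed capability fields; the commit names the interface but not its members.
interface ProviderCapabilities {
  supportsImages: boolean;
  schemaFeatures: string[];
}

// Field names come from the commit message; their types are assumed.
interface ToolCompatibility {
  producesImages?: boolean;
  schemaFeatures?: string[];
}

// Placeholder registry with a single made-up provider.
const PROVIDER_CAPABILITIES: Record<string, ProviderCapabilities> = {
  "text-only-provider": { supportsImages: false, schemaFeatures: ["enum"] },
};

function isToolCompatibleWithProvider(tool: ToolCompatibility, provider: string): boolean {
  const caps = PROVIDER_CAPABILITIES[provider];
  if (!caps) return true; // unknown providers get permissive defaults
  if (tool.producesImages && !caps.supportsImages) return false;
  return (tool.schemaFeatures ?? []).every((feature) => caps.schemaFeatures.includes(feature));
}

// Returns the names of the tools that remain usable with the given provider.
function filterToolsForProvider(tools: Record<string, ToolCompatibility>, provider: string): string[] {
  return Object.entries(tools)
    .filter(([, compat]) => isToolCompatibleWithProvider(compat, provider))
    .map(([name]) => name);
}

const sampleTools: Record<string, ToolCompatibility> = {
  screenshot: { producesImages: true },
  read: {},
  query: { schemaFeatures: ["oneOf"] },
};
const keptForTextOnly = filterToolsForProvider(sampleTools, "text-only-provider");
const keptForUnknown = filterToolsForProvider(sampleTools, "some-unlisted-provider");
```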
Jeremy
f96bc91014 feat(gsd): complete ADR-004 capability-aware model routing implementation
Close three remaining gaps from ADR-004:

1. Add modelOverrides to GSDPreferences type — removes unsafe type cast
   in auto-model-selection.ts, enables TypeScript validation for user
   capability override config.

2. Add profile completeness lint test — two tests in capability-router
   that fail if MODEL_CAPABILITY_TIER and MODEL_CAPABILITY_PROFILES
   drift out of sync (catches stale profiles on new model additions).

3. Add capability profiles for all 24 missing tier-mapped models — goes
   from 9 to 33 profiles, organized by provider. Values reflect each
   model family's known strengths (o-series high reasoning, nano/spark
   high speed, codex variants high coding).

Closes #2659
2026-04-10 12:10:29 -05:00
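The completeness lint in item 2 amounts to comparing the key sets of two maps. A minimal sketch, assuming both `MODEL_CAPABILITY_TIER` and `MODEL_CAPABILITY_PROFILES` are keyed by model id; the helper `findProfileDrift` and the placeholder entries are hypothetical.

```typescript
// Report keys present in one map but not the other.
function findProfileDrift(
  tiers: Record<string, unknown>,
  profiles: Record<string, unknown>,
): { missingProfiles: string[]; staleProfiles: string[] } {
  const missingProfiles = Object.keys(tiers).filter((model) => !(model in profiles)).sort();
  const staleProfiles = Object.keys(profiles).filter((model) => !(model in tiers)).sort();
  return { missingProfiles, staleProfiles };
}

// Placeholder data standing in for the real maps.
const exampleTiers = { "model-a": "heavy", "model-b": "light", "model-c": "light" };
const exampleProfiles = { "model-a": { reasoning: 9 }, "model-b": { reasoning: 4 }, "model-old": { reasoning: 2 } };

const drift = findProfileDrift(exampleTiers, exampleProfiles);
```

A lint test would then fail whenever either list in `drift` is non-empty, which is how a newly added model without a profile gets caught.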