Commit graph

605 commits

Author SHA1 Message Date
TÂCHES
3812c2d322 Merge branch 'main' into feat-google-oauth-search 2026-03-15 08:17:11 -06:00
TÂCHES
85b9c10265 Merge pull request #465 from deseltrus/fix/loop-recovery-all-unit-types
fix: verify artifacts on disk before bailing on dispatch loop limit
2026-03-15 08:06:01 -06:00
Harald Heckmann
186a1de406 feat: Use google search via Google OAuth if available 2026-03-15 11:13:36 +01:00
deseltrus
271ab39576 fix: verify artifacts on disk before bailing on dispatch loop limit
The loop detection in dispatchNextUnit stops auto-mode when a unit has
been dispatched MAX_UNIT_DISPATCHES (3) times. Previously, only
execute-task had reconciliation logic to check whether the artifact
actually exists on disk before bailing. All other unit types
(complete-slice, plan-slice, research-slice, etc.) would immediately
stop — even if the Nth attempt successfully produced the artifact.

This is a race between the dispatch counter and disk verification:
the counter increments at dispatch time, but artifact verification
only runs during closeout of the NEXT unit. If the last allowed
attempt succeeds, the counter is already at the limit when the next
dispatch tries to run, and nobody checks disk state.

Reproduction scenario:
1. complete-slice dispatched 3 times (LLM missed writing UAT on
   attempts 1-2, succeeded on attempt 3)
2. Attempt 3 produces both SUMMARY and UAT — auto-committed to disk
3. Dispatch 4 fires: prevCount (3) >= MAX_UNIT_DISPATCHES (3)
4. No disk check for complete-slice → pipeline stops with
   'Expected artifact not found' despite artifacts existing

Fix: add a general verifyExpectedArtifact() check after the
execute-task-specific reconciliation and before the final bail-out.
If artifacts exist on disk, clear the counter and advance. If not,
same error as before — no behavior change for genuinely stuck units.
2026-03-15 10:52:50 +01:00
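The reconciliation order described in 271ab39576 can be sketched roughly as follows. This is a minimal illustration, not the repository's code: only `MAX_UNIT_DISPATCHES` and the name `verifyExpectedArtifact` come from the commit message. `decideDispatch`, the `DispatchDecision` type, and the counter map are hypothetical.

```typescript
// Hypothetical sketch: only MAX_UNIT_DISPATCHES and verifyExpectedArtifact
// are named in the commit message; everything else is illustrative.
const MAX_UNIT_DISPATCHES = 3;

type DispatchDecision = "dispatch" | "advance" | "stop";

function decideDispatch(
  unitKey: string,
  counts: Map<string, number>,
  verifyExpectedArtifact: (unitKey: string) => boolean,
): DispatchDecision {
  const prevCount = counts.get(unitKey) ?? 0;
  if (prevCount >= MAX_UNIT_DISPATCHES) {
    // Before bailing, check whether the last allowed attempt
    // actually left the artifact on disk.
    if (verifyExpectedArtifact(unitKey)) {
      counts.delete(unitKey); // clear the counter and advance
      return "advance";
    }
    return "stop"; // genuinely stuck: same outcome as before the fix
  }
  counts.set(unitKey, prevCount + 1);
  return "dispatch";
}
```

With this ordering, a unit whose third attempt succeeded advances on the fourth call instead of stopping the pipeline.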
0xggoma
88e6957f64 fix: persist completion key in loop-recovery/self-repair to prevent infinite dispatch loops
When loop-recovery or self-repair reconciliation succeeds (artifacts exist on
disk), the dispatch counter is reset but the unit is never marked complete in
completed-units.json. If deriveState() continues returning the same unit, the
cycle repeats indefinitely: 3 dispatches → stuck detection → reconciliation
→ counter reset → 3 more dispatches...

This was observed in production burning $93.87 on 103 dispatches of a single
already-completed task over 4.9 hours.

Changes:
1. Persist completed key (persistCompletedKey + completedKeySet.add) in both
   the loop-recovery and self-repair success paths, so the idempotency check
   at the top of dispatchNextUnit prevents re-dispatch.
2. Add invalidateStateCache() after reconciliation writes to ensure the next
   deriveState() call sees fresh disk state.
3. Add a hard lifetime dispatch counter (unitLifetimeDispatches) that survives
   counter resets from reconciliation paths. Caps any single unit at 6 total
   dispatches across all reconciliation cycles.

Fixes #462
2026-03-15 01:39:19 -07:00
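A small sketch of how the three changes in 88e6957f64 interact, under stated assumptions. The names `completedKeySet` and `unitLifetimeDispatches` and the cap of 6 come from the commit message. The `DispatchTracker` class, its methods, and the in-memory storage are hypothetical (the real code persists to completed-units.json).

```typescript
// Hypothetical sketch of the per-cycle counter, lifetime counter, and
// completion set. Persistence to disk is deliberately left out.
const MAX_UNIT_DISPATCHES = 3;
const MAX_LIFETIME_DISPATCHES = 6;

class DispatchTracker {
  private readonly completedKeySet = new Set<string>();
  private readonly unitDispatches = new Map<string, number>();
  private readonly unitLifetimeDispatches = new Map<string, number>();

  /** Returns true if the unit may be dispatched now. */
  tryDispatch(key: string): boolean {
    if (this.completedKeySet.has(key)) return false; // idempotency check
    const cycle = this.unitDispatches.get(key) ?? 0;
    if (cycle >= MAX_UNIT_DISPATCHES) return false; // per-cycle loop limit
    const lifetime = this.unitLifetimeDispatches.get(key) ?? 0;
    if (lifetime >= MAX_LIFETIME_DISPATCHES) return false; // hard cap
    this.unitDispatches.set(key, cycle + 1);
    this.unitLifetimeDispatches.set(key, lifetime + 1);
    return true;
  }

  /** Reconciliation found the artifacts: reset the counter AND mark complete. */
  reconcileSuccess(key: string): void {
    this.unitDispatches.delete(key);
    this.completedKeySet.add(key);
  }

  /** The pre-fix behavior: counter reset without marking the unit complete. */
  resetCounterOnly(key: string): void {
    this.unitDispatches.delete(key);
  }
}
```

Even if some path only resets the per-cycle counter, the lifetime counter stops the unit after six dispatches in total.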
deseltrus
3b914033f4 fix(test): update draft-promotion test for expanded checkAutoStartAfterDiscuss
The static assertion searched the first 1200 chars of checkAutoStartAfterDiscuss
for CONTEXT-DRAFT and unlinkSync references, but the function grew to 4164 chars
after adding Gates 2-4 (STATE.md, PROJECT.md, manifest validation). The search
window now extends to the next export statement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 09:23:26 +01:00
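The widened search window from 3b914033f4 (slice from the function up to the next export statement, rather than a fixed 1200 characters) might look something like this. `extractFunctionSource` is a hypothetical helper. The real test's matching logic is not shown in the commit message, and this version assumes each `export` starts at the beginning of a line.

```typescript
// Hypothetical helper: returns the source text from `function <name>(`
// up to (but not including) the next line that starts with `export `.
function extractFunctionSource(source: string, name: string): string {
  const start = source.indexOf(`function ${name}(`);
  if (start === -1) return "";
  const nextExport = source.indexOf("\nexport ", start + 1);
  return nextExport === -1 ? source.slice(start) : source.slice(start, nextExport);
}
```

A static assertion can then search the returned text for `CONTEXT-DRAFT` or `unlinkSync` without depending on how long the function has grown.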
deseltrus
f27ed34fc0 feat(discuss): add discussion manifest for mechanical process verification
Closes the remaining gap in multi-milestone enforcement: the code
previously validated only the END STATE (files exist) but not the
PROCESS (each gate was presented to the user).

New mechanism:
- discuss.md instructs the LLM to write .gsd/DISCUSSION-MANIFEST.json
  after EACH Phase 3 gate decision, tracking gates_completed vs total
- checkAutoStartAfterDiscuss() Gate 4: BLOCKS auto-start if
  gates_completed < total (not just a warning)
- Manifest is deleted after auto-start (only needed during discussion)
- Single-milestone discussions don't use manifest (backward-compatible)
- DISCUSSION-MANIFEST.json added to baseline gitignore patterns

This creates a three-layer enforcement:
  Layer 1 (Prompt): ask_user_questions calls at each gate
  Layer 2 (Files):  CONTEXT.md/DRAFT/directory existence check
  Layer 3 (Manifest): gates_completed == total process verification

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 09:15:56 +01:00
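Gate 4 from f27ed34fc0 could be approximated as below. The field names `gates_completed` and `total` are taken from the commit message. The function name `manifestAllowsAutoStart`, the string-or-null input, and the choice to block on an unreadable manifest are assumptions made for illustration.

```typescript
// Hypothetical sketch of the manifest gate. A null input stands for
// "no manifest file", as in single-milestone discussions.
interface DiscussionManifest {
  gates_completed: number;
  total: number;
}

function isManifest(value: unknown): value is DiscussionManifest {
  if (typeof value !== "object" || value === null) return false;
  const m = value as Record<string, unknown>;
  return typeof m["gates_completed"] === "number" && typeof m["total"] === "number";
}

function manifestAllowsAutoStart(manifestJson: string | null): boolean {
  if (manifestJson === null) return true; // backward-compatible: no manifest
  let parsed: unknown;
  try {
    parsed = JSON.parse(manifestJson);
  } catch {
    return false; // assumption: an unreadable manifest blocks auto-start
  }
  if (!isManifest(parsed)) return false;
  return parsed.gates_completed >= parsed.total; // block while gates remain
}
```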
deseltrus
ccb2a08d67 feat(discuss): harden multi-milestone gates with two-layer enforcement
Layer 1 (Prompt): discuss.md now enforces:
- Document ingestion rule: read ALL user-provided files before reflection
- Mandatory milestone confirmation gate via ask_user_questions
- 1M context awareness: prefer discussing all milestones in-session
- Phase 3 gates marked MANDATORY with progress tracking
- Default-recommend "Discuss now" over "Draft for later"

Layer 2 (Code): checkAutoStartAfterDiscuss() now validates:
- Gate 1: Primary CONTEXT.md exists
- Gate 2: STATE.md exists (written last in Phase 4, prevents
  premature auto-start during Phase 3 readiness gates)
- Gate 3: Multi-milestone completeness check against PROJECT.md
  milestone sequence — warns if milestones are missing from filesystem

Also fixes conflict markers in discuss.md from gsd/M005/S05 merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 09:15:56 +01:00
deseltrus
d154d992dd feat(config): session-internal /gsd config + fix key hydration
Three fixes for config/setup UX:

1. cli.ts: Add missing loadStoredEnvKeys() call in gsd config flow.
   Previously, gsd config showed keys as "not configured" even when
   they existed in auth.json because env vars weren't hydrated first.

2. commands.ts: New /gsd config slash command that lets users configure
   API keys (Tavily, Brave, Context7, Jina, Groq) from within a running
   session. Keys are saved to auth.json and activated immediately.
   No need to exit the session and run gsd config externally.

3. command-search-provider.ts: Show native Anthropic web search status
   when using Claude models, so users know search works even without
   Brave/Tavily keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 08:35:27 +01:00
deseltrus
9aeacc803c feat(prefs): model selection via select list instead of free-text input
The preferences wizard now shows available models from the model
registry as a selectable list instead of requiring users to manually
type model IDs. Falls back to text input when no authenticated
models are available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 07:33:12 +01:00
deseltrus
be2492b48d fix(prefs): break parse/serialize cycle for empty arrays and objects
The preferences parser treated [] and {} as strings instead of empty
array/object. On next serialize, yamlSafeString quoted them as "[]"
and "{}", permanently corrupting the preferences file. This caused
the wizard to show empty fields (models, auto_supervisor, etc.).

Fix: parseScalar now recognizes [] and {} (quoted or unquoted) as
empty array/object. Serializer omits empty values entirely instead
of writing key: [] or key: {}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 07:19:13 +01:00
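The parse/serialize cycle fixed in be2492b48d can be illustrated with a simplified sketch. `parseScalar` is named in the commit message, but its real signature is not shown. This version handles only strings and the two empty literals, and `serializePrefs` is a hypothetical stand-in for the serializer.

```typescript
// Simplified sketch: the real parser presumably handles more scalar types.
type PrefValue = string | unknown[] | Record<string, unknown>;

function parseScalar(raw: string): PrefValue {
  const trimmed = raw.trim();
  // Strip one layer of matching quotes so "[]" and [] are treated alike.
  const unquoted = /^(["']).*\1$/.test(trimmed) ? trimmed.slice(1, -1) : trimmed;
  if (unquoted === "[]") return [];
  if (unquoted === "{}") return {};
  return unquoted;
}

function isEmptyValue(value: PrefValue): boolean {
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === "object") return Object.keys(value).length === 0;
  return false;
}

// Hypothetical serializer: empty arrays/objects are omitted entirely,
// so they can never be re-read as the strings "[]" or "{}".
function serializePrefs(prefs: Record<string, PrefValue>): string {
  return Object.entries(prefs)
    .filter(([, value]) => !isEmptyValue(value))
    .map(([key, value]) =>
      `${key}: ${typeof value === "string" ? value : JSON.stringify(value)}`)
    .join("\n");
}
```

Omitting empty values on write means a round trip through the file cannot reintroduce the quoted forms.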
deseltrus
5377cfad50 fix(ux): launch prefs wizard directly from /gsd prefs
Instead of just showing an "Edit file" notification, /gsd prefs now
ensures the preferences file exists and immediately launches the
interactive wizard. This matches user expectation — typing "prefs"
should let you edit preferences, not just show a file path.

/gsd prefs status is still available for file path info without the wizard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 07:13:23 +01:00
deseltrus
0c9cbf6b4c fix(ux): differentiate skill diagnostics and improve prefs discoverability
Split skill diagnostics into [Skill conflicts] (actual collisions) and
[Skill issues] (validation warnings like missing description) so users
aren't misled by the label. Add wizard hint to /gsd prefs output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 06:54:11 +01:00
Colin Johnson
fe03743b08 fix: guard against newer synced resources (#445)
Co-authored-by: TÂCHES <afromanguy@me.com>
2026-03-14 22:58:18 -06:00
TÂCHES
87364ed089 Merge branch 'main' into fix/discuss-needs-discussion-deadlock-440 2026-03-14 22:19:17 -06:00
Flux Labs
16364c7dba fix: prevent web_search tool injection for non-Anthropic providers serving Claude models (#444) (#446)
GitHub Copilot users with Claude models got 400 errors because the native
Anthropic web_search_20250305 tool was injected into requests to Copilot's
API proxy, which doesn't support it. The root cause was that model_select
never fires before the first API request on new sessions, so the fallback
heuristic (model name starts with "claude-") couldn't distinguish direct
Anthropic from proxied providers.

Fix: pass the resolved Model object through to the before_provider_request
event so extensions can check model.provider directly instead of relying on
model name heuristics.
2026-03-14 22:15:00 -06:00
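The difference between the name heuristic and the provider check in 16364c7dba comes down to something like the following. The `Model` shape and the provider identifiers `"anthropic"` and `"github-copilot"` are assumptions for illustration. The commit message only states that extensions can now read `model.provider`.

```typescript
// Hypothetical Model shape; only the existence of a `provider` field
// is implied by the commit message.
interface Model {
  id: string;
  provider: string;
}

// Old fallback heuristic: cannot tell direct Anthropic from a proxy.
function looksLikeClaude(modelId: string): boolean {
  return modelId.startsWith("claude-");
}

// New check: inject the native web search tool only for the provider
// that supports it ("anthropic" is an assumed identifier).
function shouldInjectNativeWebSearch(model: Model): boolean {
  return model.provider === "anthropic";
}
```

For a Claude model served through a proxy, the first function returns true while the second returns false, which is the case that produced the 400 errors.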
Flux Labs
e5732ca4cb fix: /gsd discuss routes to draft discussion when phase is needs-discussion
When a milestone is in 'needs-discussion' phase (has CONTEXT-DRAFT.md
but no ROADMAP.md yet), /gsd discuss was incorrectly hitting the
'No roadmap yet' guard and returning early — creating the deadlock
reported in #440.

Fix: add an early check in showDiscuss() for the needs-discussion phase.
When detected, it shows the same draft discussion menu that the main
/gsd wizard shows (discuss_draft / discuss_fresh / skip_milestone),
bypassing the roadmap guard entirely. The discussion IS how the roadmap
gets created, so requiring it to already exist was wrong.

Fixes #440
2026-03-14 22:37:38 -05:00
Copilot
6d2ff3d4a5 Fix: em dash and slash in milestone/slice titles corrupt GSD state management (#426)
* Initial plan

* chore: establish baseline before implementing em-dash fix

Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>

* fix: validate milestone titles against delimiter characters (em dash, slash) that break state management

- Changed STATE.md separator from em dash to colon in buildStateMarkdown and state.md template
- Removed ambiguous '— Context' suffix from context.md H1 template
- Added validateTitle() function to detect problematic delimiter characters
- Added delimiter_in_title doctor issue code for milestone/slice title validation
- Added tests for validateTitle() and doctor delimiter detection
- Added em-dash-in-title cases to regex-hardening test

Fixes: milestone titles containing '—' caused state corruption when the LLM
misread the ambiguous STATE.md separator format and wrote incorrect planning files.

Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: glittercowboy <186001655+glittercowboy@users.noreply.github.com>
2026-03-14 21:03:55 -06:00
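A title check in the spirit of the `validateTitle()` function mentioned in 6d2ff3d4a5 might look like this. The commit message names the function and the two delimiter characters (em dash and slash). The return type and message wording here are assumptions.

```typescript
// Hypothetical sketch: returns one message per problematic delimiter found.
// "\u2014" is the em dash.
const TITLE_DELIMITERS: ReadonlyArray<{ char: string; label: string }> = [
  { char: "\u2014", label: "em dash" },
  { char: "/", label: "slash" },
];

function validateTitle(title: string): string[] {
  return TITLE_DELIMITERS
    .filter(({ char }) => title.includes(char))
    .map(({ label }) => `title contains a delimiter character: ${label}`);
}
```

An empty result means the title is safe to embed in state files that use these characters as separators.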
Flux Labs
ecf8125e39 feat: add Ollama Cloud as model and web tool provider (#430) (#434)
Add Ollama Cloud (ollama.com) as a built-in provider with both model
hosting and web search/fetch capabilities.

Model provider:
- 13 curated models via OpenAI-compatible API (Llama 3.1, Qwen 3,
  DeepSeek R1, Gemma 3, Mistral, Phi-4, GPT-OSS)
- Auth via OLLAMA_API_KEY environment variable
- Registered in onboarding, env hydration, and model resolver

Web tool provider:
- Search via POST ollama.com/api/web_search
- Page fetch via POST ollama.com/api/web_fetch (fallback after Jina)
- Added as third search provider option alongside Tavily and Brave
- /search-provider command updated with ollama option

Closes #430
2026-03-14 21:03:31 -06:00
gkd67pjznr-ctrl
3c931b2e19 fix(guided-flow): add self-heal for stale runtime records on wizard start (#436)
auto.ts has selfHealRuntimeRecords() which cleans up stale .gsd/runtime/units/
records when /gsd auto starts. However, guided-flow.ts (used by /gsd manual
mode) had zero awareness of runtime records — it only checked auto.lock.

This means if auto-mode crashes mid-unit, the stale runtime records persist
until the next /gsd auto run. Users who alternate between manual and auto
mode, or who only use manual mode after a crash, would accumulate stale
records that could cause spurious re-dispatch or confusing state.

Add selfHealRuntimeRecords() to guided-flow.ts that:
- Clears records where the expected artifact already exists (completed but
  closeout didn't finish)
- Clears records stuck in dispatched or timeout phase (process died mid-unit)
- Notifies the user how many stale records were cleaned

Called in showSmartEntry() before the crash lock check so the wizard always
starts from a clean state regardless of how the previous session ended.

Co-authored-by: Thomas <twilliams1234@gmail.com>
2026-03-14 20:54:16 -06:00
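The cleanup rules listed in 3c931b2e19 can be read as a filter over runtime records. In this sketch the phase names `dispatched` and `timeout` come from the commit message. The `RuntimeRecord` shape, the `running` phase used in the usage note, and `selectStaleRecords` are hypothetical.

```typescript
// Hypothetical record shape; the real files live under .gsd/runtime/units/.
interface RuntimeRecord {
  unitKey: string;
  phase: string;
}

// Selects the records that the self-heal step would clear.
function selectStaleRecords(
  records: RuntimeRecord[],
  artifactExists: (unitKey: string) => boolean,
): RuntimeRecord[] {
  return records.filter((record) =>
    artifactExists(record.unitKey) || // completed, but closeout didn't finish
    record.phase === "dispatched" ||  // process died mid-unit
    record.phase === "timeout");
}
```

The length of the returned array is what a caller would report to the user as the number of stale records cleaned. A record in some other phase (say, a hypothetical `running`) whose artifact is missing is left alone.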
Flux Labs
0e3284215a fix: bg_shell ready_port timeout and error handling (#428) (#435)
When a server fails to bind to the configured ready_port, the process
would stay in "starting" status indefinitely after the probing interval
cleared, with no error surfaced to the agent. This fixes the hang by:

- Transitioning process to "error" status when port probing times out
- Detecting process exit during port polling and reporting stderr context
- Adding ready_timeout parameter for custom timeout values
- Including stderr output in waitForReady timeout/error responses
- Registering SIGTERM/SIGINT handlers to clean up bg processes on exit

Closes #428
2026-03-14 20:51:02 -06:00
Flux Labs
96ced0357b fix: clear cachedReaddir before dispatch and artifact verification (#431) (#432)
The directory listing cache in paths.ts has no TTL and was never cleared
in production, causing dispatchNextUnit to re-dispatch the same unit
when files written by the previous unit weren't visible to deriveState.

Add clearPathCache() calls at the top of dispatchNextUnit (before
deriveState) and verifyExpectedArtifact so each dispatch cycle and
artifact check sees fresh disk state.

Closes #431
2026-03-14 20:48:43 -06:00
TÂCHES
8c45a0dda3 Merge pull request #424 from gsd-build/perf/inline-static-templates
perf: inline static templates into prompt builders to eliminate ~44 READ tool calls per milestone
2026-03-14 18:18:14 -06:00
TÂCHES
9f56049509 Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-03-14 17:55:23 -06:00
TÂCHES
36a810be8a Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-03-14 17:55:05 -06:00
Lex Christopherson
3551d2291b perf: inline static templates into prompt builders to eliminate ~44 READ tool calls per milestone
Add loadTemplate() and inlineTemplate() to prompt-loader.ts, then use
them in all 7 auto.ts builder functions and ~9 guided-flow.ts callsites
to inject template content at prompt-build time. Update 16 prompt .md
files to reference inlined templates instead of instructing agents to
read them from disk.

Over a typical 3-slice/15-task milestone run, this eliminates ~44
unnecessary READ tool calls (~45-90s latency, ~5-9k wasted tokens).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 17:34:42 -06:00
TÂCHES
73c0fd8043 Merge pull request #416 from fluxlabs/feat/post-unit-hooks-140
feat: extensible hook system for auto-mode state machine
2026-03-14 17:14:31 -06:00
Lex Christopherson
9c82a1b79f fix(auto): clear parse and path caches alongside state cache
Ensures auto-mode reads fresh file data after unit completion,
slice merges, and self-healing — prevents stale cached parses
from the memoized deriveState pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 17:02:32 -06:00
TÂCHES
4334f3a27f Merge pull request #415 from fluxlabs/perf/repo-hotpath-optimizations 2026-03-14 15:58:04 -06:00
Flux Labs
cafc36f16c feat: add extensible hook system for auto-mode state machine (#140)
Implements post-unit hooks, pre-dispatch hooks, state persistence, and
a /gsd hooks status command — all configured via preferences.md without
code changes. Enables code review loops, simplify passes, convention
enforcement, and custom unit interception as opt-in extensions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:52:43 -05:00
Flux Labs
f981d5aa79 perf: optimize discovery and interactive hot paths 2026-03-14 16:03:44 -05:00
Lex Christopherson
8a64a1c1da fix(tests): invalidate both state and path caches before assertions expecting fresh disk state
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 14:47:29 -06:00
Lex Christopherson
e739c4cab7 perf: memoize deriveState() per dispatch cycle
deriveState() was called ~7 times per dispatch cycle, each call re-reading
the entire .gsd/milestones/ tree from disk (~50-60 file reads per call,
~350-420 redundant reads per cycle). Add a 100ms TTL cache keyed by
basePath so repeated calls within the same dispatch cycle return the
cached result. Expose invalidateStateCache() and call it at every
mutation boundary in auto.ts: handleAgentEnd start, post-merge
re-derivations, and resume-from-pause.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 14:38:19 -06:00
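The memoization described in e739c4cab7 (a 100ms TTL cache keyed by basePath, plus an explicit invalidation hook) can be sketched as follows. `deriveState` and `invalidateStateCache` are named in the commit message. The factory function, the generic state type, and the injectable clock are assumptions added to keep the example self-contained.

```typescript
// Hypothetical sketch of a TTL cache around an expensive derive function.
const STATE_CACHE_TTL_MS = 100;

interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

function createStateCache<T>(
  derive: (basePath: string) => T,
  now: () => number = Date.now, // injectable clock, for testing only
) {
  const cache = new Map<string, CacheEntry<T>>();
  return {
    deriveState(basePath: string): T {
      const hit = cache.get(basePath);
      if (hit !== undefined && now() < hit.expiresAt) return hit.value;
      const value = derive(basePath); // the expensive disk walk
      cache.set(basePath, { value, expiresAt: now() + STATE_CACHE_TTL_MS });
      return value;
    },
    // Call at every mutation boundary so the next read sees fresh state.
    invalidateStateCache(): void {
      cache.clear();
    },
  };
}
```

Repeated calls within the TTL reuse the cached result. Any code path that writes to disk has to call `invalidateStateCache()` itself, which is the class of bug several later commits in this log address.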
TÂCHES
86f705c8e4 Merge pull request #398 from gsd-build/perf/path-resolution-cache
perf: session-scoped directory listing cache for path resolution
2026-03-14 14:36:59 -06:00
Lex Christopherson
d7faf8a4e5 fix(tests): invalidate path cache between deriveState calls that expect fresh disk state
Tests that write files and immediately call deriveState() got stale results
because the path resolution cache (dirEntryCache/dirListCache) returned
cached directory listings that didn't include newly written files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 14:14:40 -06:00
TÂCHES
5b62a3c7b3 Merge pull request #407 from fluxlabs/feat/model-provider-preferences-350 2026-03-14 14:13:56 -06:00
TÂCHES
4139e0d194 Merge pull request #406 from fluxlabs/fix/auto-mode-openrouter-model-resolution 2026-03-14 14:12:47 -06:00
TÂCHES
6004b5caaa Merge pull request #405 from fluxlabs/fix/git-svn-noise-404 2026-03-14 14:12:13 -06:00
TÂCHES
6ec179c743 Merge pull request #408 from fluxlabs/feat/custom-openai-compatible-endpoint 2026-03-14 14:11:14 -06:00
TÂCHES
5e0d333ea5 Merge pull request #400 from gsd-build/perf/prompt-dedup 2026-03-14 14:10:19 -06:00
TÂCHES
a4999a406e Merge pull request #399 from gsd-build/perf/parse-cache 2026-03-14 14:09:54 -06:00
TÂCHES
879c476df5 Merge pull request #394 from ASRagab/main 2026-03-14 14:09:34 -06:00
TÂCHES
d22226e3b8 Merge pull request #391 from deseltrus/refactor/dynamic-extension-discovery 2026-03-14 14:09:22 -06:00
Flux Labs
595c778250 feat: add custom OpenAI-compatible endpoint option to onboarding wizard (#335)
Adds a "Custom (OpenAI-compatible)" provider option to the API key
flow in the onboarding wizard. When selected, prompts for base URL,
API key, and model ID, then writes the config to models.json.
2026-03-14 15:07:47 -05:00
Flux Labs
c1dcca6820 feat: allow specifying model provider in preferences (#350)
Add explicit provider targeting for model preferences when the same
model ID exists across multiple providers (e.g., claude-sonnet-4-6
on both Anthropic and Bedrock).

Two formats supported:
- String: "bedrock/claude-sonnet-4-6"
- Object: { model: claude-sonnet-4-6, provider: bedrock }

The provider/model string format already worked in the resolution
code but was undocumented. This adds the provider field to the
object format and documents both approaches.
2026-03-14 15:06:49 -05:00
Flux Labs
8abbccea54 fix: resolve OpenRouter model IDs in auto-mode and show active model per phase
OpenRouter models use slash-separated IDs (e.g. "moonshotai/kimi-k2.5") where the
full string is the model ID on the "openrouter" provider. The auto-mode model
switcher incorrectly split on the first slash and treated the prefix as a provider
name, causing all OpenRouter preference models to fail resolution and fall back to
the default model for every phase.

Now the resolver first checks whether the slash-prefix is a known provider, and if
not (or if no match is found), falls back to matching the full string as a model ID
— consistent with model-resolver.ts.

Also improves the progress widget and notifications to show [PHASE] and
provider/model so users can confirm the correct model is active.

Closes #402
2026-03-14 15:00:03 -05:00
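The resolution order described in 8abbccea54 (try the slash prefix as a provider first, then fall back to the full string as a model ID) can be sketched like this. The `Model` shape, the registry array, and the function name `resolveModel` are hypothetical. The example IDs are the ones quoted in the commit messages.

```typescript
// Hypothetical registry entry shape.
interface Model {
  id: string;
  provider: string;
}

function resolveModel(preference: string, registry: Model[]): Model | undefined {
  const slash = preference.indexOf("/");
  if (slash > 0) {
    const provider = preference.slice(0, slash);
    const id = preference.slice(slash + 1);
    // Only treat the prefix as a provider if it is a known one.
    if (registry.some((m) => m.provider === provider)) {
      const match = registry.find((m) => m.provider === provider && m.id === id);
      if (match !== undefined) return match;
    }
  }
  // Fallback: the whole string is the model ID (the OpenRouter case).
  return registry.find((m) => m.id === preference);
}
```

With this order, `"bedrock/claude-sonnet-4-6"` resolves to the Bedrock entry, while `"moonshotai/kimi-k2.5"` resolves as a full model ID because `moonshotai` is not a registered provider.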
Flux Labs
9eb5a00f4f fix: suppress git-svn noise that causes confusing errors on affected systems (#404)
Systems with a buggy git-svn Perl module (notably Arch Linux) emit
"Duplicate specification" warnings on every git invocation. Filter
these from error messages and suppress git-svn loading via GIT_SVN_ID.
Also update repository URLs from stale glittercowboy/gsd-pi to
gsd-build/gsd-2.
2026-03-14 14:59:02 -05:00
deseltrus
aca53c5853 refactor: replace hardcoded extension list with dynamic discovery in loader
loader.ts previously maintained a hardcoded list of bundled extension paths
for GSD_BUNDLED_EXTENSION_PATHS. This required manual updates whenever
extensions were added or removed, and created a consistency gap with
buildResourceLoader() which already discovers extensions dynamically.

Replace with runtime directory scanning that mirrors the discovery rules
in resource-loader.ts:
- Top-level .ts/.js files → extension entry point
- Directories with index.ts or index.js → extension entry point
- Directories without either (shared/, remote-questions/) → skipped

Benefits:
- Adding a new extension no longer requires editing loader.ts
- GSD_BUNDLED_EXTENSION_PATHS stays in sync with what buildResourceLoader()
  loads in the main process — subagents now receive the same extensions
- Fixes: 5 extensions (google-search, mcporter, ttsr, universal-config,
  voice) were loaded in the main process but missing from
  GSD_BUNDLED_EXTENSION_PATHS, meaning subagents did not receive them
- Eliminates a common source of merge conflicts for contributors and forks
  that add custom extensions
2026-03-14 20:37:18 +01:00
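The three discovery rules listed in aca53c5853 can be expressed over an in-memory directory listing. This sketch avoids real filesystem calls so it stays self-contained. The `DirEntry` shape and `discoverExtensionEntryPoints` are hypothetical, and preferring `index.ts` over `index.js` when both exist is an assumption.

```typescript
// Hypothetical in-memory stand-in for a directory scan.
interface DirEntry {
  name: string;
  isDirectory: boolean;
  children?: string[]; // file names inside a directory entry
}

function discoverExtensionEntryPoints(entries: DirEntry[]): string[] {
  const found: string[] = [];
  for (const entry of entries) {
    if (!entry.isDirectory) {
      // Rule 1: top-level .ts/.js files are entry points.
      if (entry.name.endsWith(".ts") || entry.name.endsWith(".js")) found.push(entry.name);
      continue;
    }
    const children = entry.children ?? [];
    // Rule 2: directories with index.ts or index.js are entry points.
    if (children.includes("index.ts")) found.push(`${entry.name}/index.ts`);
    else if (children.includes("index.js")) found.push(`${entry.name}/index.js`);
    // Rule 3: directories without either are skipped.
  }
  return found;
}
```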
Lex Christopherson
c3b518457a perf: deduplicate transitive dependency summaries in prompt builders
Prevent duplicate slice/dependency summaries from being inlined into
prompts when the same ID appears more than once. Uses a Set to track
already-included IDs in inlineDependencySummaries and
buildCompleteMilestonePrompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:35:34 -06:00
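The Set-based deduplication mentioned in c3b518457a is a standard pattern. A minimal version, with the hypothetical helper name `uniqueInOrder`, might be:

```typescript
// Keeps the first occurrence of each ID and preserves input order.
function uniqueInOrder(ids: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const id of ids) {
    if (seen.has(id)) continue; // already inlined once
    seen.add(id);
    result.push(id);
  }
  return result;
}
```

Preserving first-seen order keeps the prompt layout stable when a dependency is reachable through more than one path.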
Lex Christopherson
3a705cb501 perf: add content-hash-keyed parse cache for file parsing
parseRoadmap, parsePlan, parseSummary, and parseContinue are pure
functions that get called repeatedly with the same content during
deriveState dispatch cycles. A module-scoped Map keyed by a fast
composite key (length + first 100 chars + last 100 chars) avoids
redundant parsing. Cache caps at 50 entries and clears when full.
Exports clearParseCache() for explicit invalidation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:34:40 -06:00
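The cache described in 3a705cb501 can be sketched as below. The composite key (length plus first and last 100 characters), the 50-entry cap, the clear-when-full policy, and the name `clearParseCache` follow the commit message. The factory wrapper and the `size` accessor are assumptions for illustration.

```typescript
// Hypothetical sketch of a parse cache with a cheap composite key.
const PARSE_CACHE_MAX_ENTRIES = 50;

function compositeKey(content: string): string {
  return `${content.length}:${content.slice(0, 100)}:${content.slice(-100)}`;
}

function createCachedParser<T>(parse: (content: string) => T) {
  const cache = new Map<string, T>();
  return {
    parse(content: string): T {
      const key = compositeKey(content);
      if (cache.has(key)) return cache.get(key) as T;
      if (cache.size >= PARSE_CACHE_MAX_ENTRIES) cache.clear(); // clear when full
      const result = parse(content);
      cache.set(key, result);
      return result;
    },
    clearParseCache(): void {
      cache.clear();
    },
    size(): number {
      return cache.size;
    },
  };
}
```

One caveat of this key: two inputs of equal length that differ only in the middle (beyond the first and last 100 characters) map to the same entry, so it trades strict correctness for speed compared with a full content hash.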