1262 commits

Author SHA1 Message Date
Jeremy McSpadden
2d8fdcc0ab fix: match both milestoneId and sliceId when filtering duplicate blocker cards
The high-risk card filter in buildBlockersSection only compared sliceId,
causing false positives when different milestones had slices with the
same ID (e.g. M001/S01 and M002/S01). Now matches on both milestoneId
and sliceId to correctly deduplicate.
2026-03-17 23:19:42 -05:00
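
The deduplication fix in 2d8fdcc0ab can be sketched roughly as follows. This is a minimal illustration, not the code in buildBlockersSection: the `BlockerCard` shape and the helper names `cardKey` and `filterDuplicateCards` are assumptions made for the example.

```typescript
// Hypothetical card shape; the real type used by buildBlockersSection may differ.
interface BlockerCard {
  milestoneId: string;
  sliceId: string;
  title: string;
}

// Composite key so that M001/S01 and M002/S01 count as different slices.
function cardKey(card: Pick<BlockerCard, "milestoneId" | "sliceId">): string {
  return `${card.milestoneId}/${card.sliceId}`;
}

// Drop high-risk cards whose (milestoneId, sliceId) pair already has a blocker card.
function filterDuplicateCards(
  highRisk: BlockerCard[],
  blockers: BlockerCard[],
): BlockerCard[] {
  const seen = new Set(blockers.map(cardKey));
  return highRisk.filter((card) => !seen.has(cardKey(card)));
}

const blockers: BlockerCard[] = [
  { milestoneId: "M001", sliceId: "S01", title: "blocked verification" },
];
const highRisk: BlockerCard[] = [
  { milestoneId: "M001", sliceId: "S01", title: "duplicate of the blocker" },
  { milestoneId: "M002", sliceId: "S01", title: "same slice ID, other milestone" },
];
const remaining = filterDuplicateCards(highRisk, blockers);
```

Comparing on sliceId alone would have dropped both high-risk cards here; with the composite key only the true duplicate is removed.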
Lex Christopherson
c1bc65bcca fix: switch alibaba-coding-plan to OpenAI-compat endpoint with proper compat flags (#1003) (#1057)
Co-Authored-By: Tom Boucher <trek-e@users.noreply.github.com>
2026-03-17 22:09:55 -06:00
TÂCHES
965028e219 Merge pull request #1077 from jeremymcs/feat/token-optimization-suite
feat: token optimization suite — caching, compression, smart context selection
2026-03-17 22:07:52 -06:00
Tom Boucher
41186ad9b0 docs: document /gsd config for global API keys (#1079)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* docs: document /gsd config for global API keys

Added Global API Keys section to configuration.md explaining:
- /gsd config saves keys to ~/.gsd/agent/auth.json
- Keys apply to all projects automatically
- Three supported keys: Tavily, Brave, Context7
- How precedence works (env vars > saved keys)
- Anthropic models don't need search keys

Updated commands.md gsd config entry to link to the new section.
Added Set up API keys section to getting-started.md for first-run.
2026-03-17 22:03:10 -06:00
Jeremy McSpadden
288b399f88 fix: add dispatch stall guards to prevent auto-mode pause after slice completion (#1073) (#1076)
* fix: prevent summarizing phase stall by retrying dropped agent_end events (#1072)

When handleAgentEnd dispatches a sub-unit (via hooks, triage, or quick-task
early-dispatch paths) and that unit completes before handleAgentEnd returns,
the resulting agent_end event is silently dropped by the reentrancy guard.
This leaves auto-mode active but permanently stalled — no unit running, no
watchdog set, process at high CPU doing nothing.

Add a pendingAgentEndRetry flag to AutoSession that the reentrancy guard sets
when it drops an agent_end event. The finally block in handleAgentEnd checks
this flag and schedules a deferred retry via setImmediate, ensuring the
completed unit's agent_end is always processed.

* fix: add dispatch stall guards to prevent auto-mode pause after slice completion (#1073)

After a slice completes all tasks, auto-mode can stall if newSession()
hangs or dispatchNextUnit gets permanently blocked at any await point.
The existing gap watchdog only fires AFTER dispatchNextUnit returns, so
it cannot recover from hangs inside the function itself.

- Wrap newSession() with Promise.race timeout (30s) to prevent permanent
  hangs from session manager deadlocks or network issues
- Add pre-dispatch hang guard (60s) in handleAgentEnd that starts the
  gap watchdog if dispatchNextUnit hasn't completed — catches hangs at
  any await point (model selection, session creation, etc.)
- Add better diagnostics: notify user when session creation times out
  or fails, with specific unit type/ID for debugging
2026-03-17 22:02:10 -06:00
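
The Promise.race timeout described in 288b399f88 could look something like the sketch below. The helper name `withTimeout` and the never-settling `hungSession` promise are stand-ins invented for this example; only the idea of racing session creation against a timer comes from the commit message, and the 50 ms value is shortened from the 30 s mentioned there.

```typescript
// Reject if `work` has not settled within `ms`; always clear the timer afterwards.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Hypothetical stand-in for a newSession() call that hangs forever.
const hungSession = new Promise<void>(() => {});

// Resolves to "created" on success, or to the timeout message on failure.
const outcome: Promise<string> = withTimeout(hungSession, 50, "newSession")
  .then(() => "created")
  .catch((err: Error) => err.message);
```

Note that Promise.race only stops waiting; it does not cancel the underlying operation, so a caller would still need its own recovery path (the commit mentions notifying the user with the unit type and ID).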
Jeremy McSpadden
b2befe3628 fix: prevent summarizing phase stall by retrying dropped agent_end events (#1072) (#1074)
When handleAgentEnd dispatches a sub-unit (via hooks, triage, or quick-task
early-dispatch paths) and that unit completes before handleAgentEnd returns,
the resulting agent_end event is silently dropped by the reentrancy guard.
This leaves auto-mode active but permanently stalled — no unit running, no
watchdog set, process at high CPU doing nothing.

Add a pendingAgentEndRetry flag to AutoSession that the reentrancy guard sets
when it drops an agent_end event. The finally block in handleAgentEnd checks
this flag and schedules a deferred retry via setImmediate, ensuring the
completed unit's agent_end is always processed.
2026-03-17 22:01:58 -06:00
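
A stripped-down sketch of the retry mechanism described in b2befe3628 follows. `SessionSketch` is a hypothetical stand-in for AutoSession, and the `handling` flag, the `processed` list, and the `"retry"` label are assumptions for illustration; only `pendingAgentEndRetry`, the `finally` block, and `setImmediate` are taken from the commit message.

```typescript
// Hypothetical, simplified stand-in for AutoSession; only the two flags matter here.
class SessionSketch {
  handling = false; // reentrancy guard
  pendingAgentEndRetry = false; // set when the guard drops an event
  processed: string[] = [];
}

const session = new SessionSketch();

function handleAgentEnd(unitId: string, dispatchSubUnit?: () => void): void {
  if (session.handling) {
    // Reentrant call: record that an event was dropped instead of losing it.
    session.pendingAgentEndRetry = true;
    return;
  }
  session.handling = true;
  try {
    session.processed.push(unitId);
    dispatchSubUnit?.();
  } finally {
    session.handling = false;
    if (session.pendingAgentEndRetry) {
      session.pendingAgentEndRetry = false;
      // Defer the retry so it runs after the current call stack unwinds.
      setImmediate(() => handleAgentEnd("retry"));
    }
  }
}

// A sub-unit that completes before the outer handler returns.
handleAgentEnd("unit-1", () => handleAgentEnd("sub-unit"));
```

Without the flag, the inner call would return silently and nothing would ever process the completed sub-unit, which matches the stall described above.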
Tom Boucher
38b79d75a7 refactor: remove redundant test file, identify consolidation targets (#1070)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* refactor: remove auto-draft-pause.test.ts — redundant with auto-dashboard.test.ts

auto-draft-pause.test.ts tested describeNextUnit() for needs-discussion,
pre-planning, and executing phases. All of these are already covered by
auto-dashboard.test.ts which has proper node:test structure.

The removed file also had fragile structural tests (string-matching source
code) that break on refactors. The behavioral coverage is complete in the
existing file.

1296 tests pass, 0 fail.
2026-03-17 22:01:20 -06:00
Jeremy McSpadden
60dfaabe03 fix: use atomic writes for completed-units.json and invalidate caches in db-writer (#1069)
Addresses state safety issues found during #1062 deep dive:

1. completed-units.json writes in auto-worktree.ts and auto-worktree-sync.ts
   used plain writeFileSync which could produce truncated/corrupt files on
   crash, losing completion keys and causing unit re-dispatch. Switched to
   atomicWriteSync (temp file + rename) for crash safety.

2. Plan file checkbox reconciliation in auto-worktree.ts also switched to
   atomicWriteSync to prevent partial PLAN.md writes on crash.

3. db-writer.ts functions (saveDecisionToDb, updateRequirementInDb,
   saveArtifactToDb) wrote markdown files via saveFile() without invalidating
   caches afterward. Added targeted cache invalidation (state + path + parse)
   so deriveState() always sees fresh data. Uses individual invalidation
   functions rather than invalidateAllCaches() to avoid clearing the artifacts
   table that was just written to.
2026-03-17 22:01:08 -06:00
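
The "temp file + rename" pattern named in 60dfaabe03 can be sketched as below. This is an illustration of the general technique, not the project's atomicWriteSync: the temp-file naming scheme is an assumption, and a production version might also fsync before renaming.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a sibling temp file, then rename it over the target.
// A rename within one filesystem replaces the target in a single step on POSIX,
// so a crash leaves either the old file or the new one, never a truncated mix.
function atomicWriteSync(filePath: string, data: string): void {
  const tmpPath = `${filePath}.${process.pid}.tmp`; // naming scheme is illustrative
  writeFileSync(tmpPath, data, "utf8");
  renameSync(tmpPath, filePath);
}

const dir = mkdtempSync(join(tmpdir(), "atomic-"));
const target = join(dir, "completed-units.json");
atomicWriteSync(target, JSON.stringify(["M001/S01/T01"]));
const roundTrip: string[] = JSON.parse(readFileSync(target, "utf8"));
```

The temp file lives next to the target on purpose: a rename across filesystems is not atomic.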
Jeremy McSpadden
668f12b97f fix: reject prose Verify: fields from being executed as shell commands (#1066) (#1068)
The verification gate's discoverCommands() was passing prose descriptions
from task plan Verify: fields through sanitizeCommand(), which only checked
for shell injection characters. English prose like "Document exists, contains
all 5 scale names..." passed the filter and was executed via spawnSync,
causing exit code 127 false negatives.

Added isLikelyCommand() heuristic that distinguishes executable commands
from prose descriptions by checking:
- Known command prefixes (npm, node, tsc, eslint, etc.)
- Path-like first tokens (./script.sh, /usr/bin/check)
- Flag-like tokens (-v, --check)
- Uppercase-initial words with 4+ tokens (prose pattern)
- Comma-space clause separators (prose pattern)

Prose Verify: fields now fall through to package.json scripts or "none"
instead of being executed. Valid commands continue to work as before.

Closes #1066
2026-03-17 22:00:52 -06:00
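
One way to read the checks listed in 668f12b97f is sketched here. The prefix list, the regular expressions, and the order in which the checks run are assumptions; the commit message names the signals but not how they are combined.

```typescript
// Illustrative prefix list; the real one is not shown in the commit message.
const KNOWN_PREFIXES = new Set(["npm", "npx", "node", "tsc", "eslint", "pnpm", "yarn"]);

function isLikelyCommand(text: string): boolean {
  const tokens = text.trim().split(/\s+/).filter(Boolean);
  const first = tokens[0];
  if (first === undefined) return false;

  // Positive signals: known tool names and path-like first tokens.
  if (KNOWN_PREFIXES.has(first)) return true;
  if (/^(\.{1,2}\/|\/)/.test(first)) return true;

  // Prose signals: comma-space clause separators, or a capitalized
  // first word followed by at least three more tokens.
  if (text.includes(", ")) return false;
  if (/^[A-Z]/.test(first) && tokens.length >= 4) return false;

  // Otherwise require at least one flag-like token (-v, --check).
  return tokens.some((token) => /^--?[A-Za-z]/.test(token));
}

const prose = "Document exists, contains all 5 scale names";
const results = {
  npm: isLikelyCommand("npm run test"),
  script: isLikelyCommand("./script.sh"),
  flagged: isLikelyCommand("mytool --check"),
  prose: isLikelyCommand(prose),
};
```

In this sketch anything that fails every check is treated as prose, which errs on the side of not executing unknown text.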
Jeremy McSpadden
9083d86766 fix: restore session model on error instead of reading stale global prefs (#1065) (#1067)
When a model fails during auto-mode and the fallback chain is exhausted
(or absent), the error recovery path previously fell through to pause
without attempting to restore the session's original model. Meanwhile,
the fallback chain itself was read fresh from disk via
loadEffectiveGSDPreferences(), which could pick up models configured by
a different concurrent GSD session sharing the same global preferences
file.

This adds a session model recovery step between fallback exhaustion and
pause. After the existing fallback chain logic, we now check whether the
current model has diverged from the model captured at auto-mode start
(autoModeStartModel). If so, we restore the session model and retry
before giving up and pausing.

Changes:
- auto.ts: export getAutoModeStartModel() getter for the session's
  captured start model
- index.ts: add session model recovery block after fallback chain
  exhaustion, using the session-scoped model instead of re-reading
  global preferences from disk
- model-isolation.test.ts: add 4 tests covering cross-session leakage
  detection, divergence checks, and null safety
2026-03-17 22:00:33 -06:00
Jeremy McSpadden
306c205dfc fix: prevent run-uat re-dispatch loop when roadmap checkbox update fails (#1063) (#1064)
Two compounding bugs caused auto-mode to re-dispatch run-uat indefinitely
after UAT passed:

1. markSliceDoneInRoadmap regex required dash at line start (^-) but the
   roadmap parser accepts optional leading whitespace (^\s*-). When LLMs
   indented checklist items, the doctor could never mark them done.

2. After run-uat completed, handleAgentEnd ran doctor with fixLevel:"task"
   which explicitly excluded slice-level completion transitions. Since
   run-uat is the terminal unit for a slice, the roadmap checkbox stayed
   unchecked, causing deriveState to return the same slice indefinitely.

Fix: Update markSliceDoneInRoadmap and markTaskDoneInPlan regexes to
accept leading whitespace (matching the parser), preserving indentation
in the replacement. Add run-uat to the set of unit types that use
fixLevel:"all" in handleAgentEnd closeout.
2026-03-17 22:00:19 -06:00
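
The regex change in 306c205dfc amounts to capturing the leading whitespace and writing it back. The sketch below assumes a checklist line format of `- [ ] **S01: ...**`; the function name `markSliceDone` and the helper `escapeRegExp` are hypothetical, and the project's actual patterns are not shown in the commit message.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Tick the checkbox for one slice, accepting optional indentation and preserving it.
function markSliceDone(roadmap: string, sliceId: string): string {
  const id = escapeRegExp(sliceId);
  const pattern = new RegExp(
    `^([ \\t]*-[ \\t]+)\\[ \\]([ \\t]+\\*{0,2}${id}\\b)`,
    "m",
  );
  return roadmap.replace(pattern, "$1[x]$2");
}

const roadmap = [
  "## Slices",
  "  - [ ] **S01: Auth**", // indented, as an LLM might write it
  "- [ ] **S02: Billing**",
].join("\n");
const updated = markSliceDone(roadmap, "S01");
```

A pattern anchored at `^-` would not match the indented S01 line at all, which is the first of the two bugs described above.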
Tom Boucher
55769392af refactor: batch 2 — consolidate preferences, convert 8 more files to node:test (#1061)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* refactor: batch 2 — consolidate preferences tests, convert 7 more files to node:test

Preferences (6 files → 1):
  preferences-{git,hooks,mode,models,schema-validation,wizard-fields}.test.ts
  → preferences.test.ts (28 tests)

Converted to node:test (custom runner → node:test):
  - discuss-prompt.test.ts (1 test)
  - auto-preflight.test.ts (1 test)
  - next-milestone-id.test.ts (4 tests)
  - plan-slice-prompt.test.ts (3 tests)
  - workspace-index.test.ts (1 test)
  - roadmap-slices.test.ts (5 tests)
  - in-flight-tool-tracking.test.ts (5 tests)

Net: -933 lines, -6 files. Full suite: 1325 pass, 0 fail.

* refactor: convert dispatch-guard.test.ts to node:test

Net: 1 more file converted. Total this branch: 14 files converted/consolidated, 6 deleted.

* fix: add null guards for parsePreferencesMarkdown in tests

Add assert.ok(prefs) after each parsePreferencesMarkdown() call to
narrow the GSDPreferences | null return type before property access.
Fixes TS18047 errors in CI typecheck.
2026-03-17 22:00:04 -06:00
Tom Boucher
8dfa7d058c refactor: consolidate tests by area, standardize on node:test (#1059)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* refactor: consolidate tests by area, standardize on node:test

Consolidated 10 test files into 4, standardizing on node:test.

Provider errors (3 files → 1): provider-errors.test.ts (34 tests)
Metrics (2 files → 1): metrics.test.ts (13 tests, converted from custom runner)
Activity log (2 files → 1): activity-log.test.ts (11 tests, converted from custom runner)
Complexity (2 files → 1): removed redundant structural string checks

Net: -694 lines, -6 files.
2026-03-17 21:59:50 -06:00
Jeremy McSpadden
3d4f77b2ee fix: inline compareSemver in gsd extension to fix broken relative import (#1058)
The /gsd update command imported compareSemver from ../../../update-check.js,
a relative path that resolves correctly in the source tree (src/resources/
extensions/gsd/ → src/update-check.js) but breaks when extensions are synced
to ~/.gsd/agent/extensions/gsd/ (where ../../../ points to ~/.gsd/ which has
no update-check.js).

This caused the error:
  Extension "command:gsd" error: Cannot find module '../../../update-check.js'

Fix: inline a local compareSemverLocal() function in commands.ts, eliminating
the cross-tree import. The function is small (10 lines) and already well-tested
via update-check.test.ts.
2026-03-17 21:59:25 -06:00
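
A small local comparison function of the kind 3d4f77b2ee describes might look like this. It is a sketch under assumptions: the -1/0/1 return convention and the handling of a leading `v` are guesses, and pre-release tags are deliberately ignored.

```typescript
// Compare dotted numeric versions. Returns 1 if a > b, -1 if a < b, 0 if equal.
// Missing segments count as 0, so "2.28" equals "2.28.0".
function compareSemverLocal(a: string, b: string): number {
  const pa = a.replace(/^v/, "").split(".").map(Number);
  const pb = b.replace(/^v/, "").split(".").map(Number);
  const length = Math.max(pa.length, pb.length);
  for (let i = 0; i < length; i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x > y ? 1 : -1;
  }
  return 0;
}

const newer = compareSemverLocal("2.28.0", "2.27.9"); // 1
const numeric = compareSemverLocal("2.9.0", "2.10.0"); // -1 (numeric, not lexical)
const padded = compareSemverLocal("v2.28", "2.28.0"); // 0
```

Keeping the helper inside the extension avoids any import that depends on where the extension directory ends up after syncing.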
Tom Boucher
41ebc6b643 docs: recommend pi-dashscope extension for DashScope models (#1056)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* docs: recommend pi-dashscope extension for DashScope models

The built-in alibaba-coding-plan provider uses the Anthropic-compat
endpoint and lacks per-model thinking format and compatibility flags,
causing issues like #1003 (MiniMax-M2.5 thinking loop).

The community pi-dashscope extension uses the correct OpenAI-compat
endpoint, sets thinkingFormat per model (qwen/zai), includes compat
flags (supportsDeveloperRole, supportsReasoningEffort), and provides
an interactive /dashscope-configure command.

Added Community Provider Extensions section to configuration docs
recommending pi-dashscope over the built-in provider.
2026-03-17 21:59:01 -06:00
Tom Boucher
85d48d3c97 fix: disable reasoning for MiniMax-M2.5 in alibaba-coding-plan provider (#1003) (#1055)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* fix: disable reasoning for MiniMax-M2.5 in alibaba-coding-plan provider (#1003)

MiniMax-M2.5 via DashScope's Anthropic-compatible API does not
properly support extended thinking, causing the model to get stuck
in a thinking loop. Set reasoning: false for this model entry in
the alibaba-coding-plan provider.
2026-03-17 21:58:38 -06:00
Tom Boucher
004f0ac861 docs: update README and docs for v2.28.0 release (#1054)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* docs: update README and docs for v2.28.0 release

- README: add 'What's New in v2.28' section with key features
- commands.md: add /gsd update, /gsd export --html --all, and
  Export section with usage examples
- auto-mode.md: add --all flag to export, add Failure Recovery
  (v2.28) section documenting reliability hardening
- getting-started.md: mention /gsd update as in-session option
2026-03-17 21:58:07 -06:00
Jeremy McSpadden
45bff3456c feat(gsd): add directory safeguards for system/home paths (#1053)
* feat(gsd): add directory safeguards to prevent running in system/home paths

GSD previously had no protection against being launched from dangerous
directories like $HOME, /, /usr, or /etc. This adds layered validation:

- Blocked system paths (hard stop): /, /usr, /etc, /var, $HOME, tmpdir, etc.
- High entry count heuristic (>200 entries triggers confirmation dialog)
- Symlink resolution via realpathSync to prevent bypass
- Integrated at three chokepoints: projectRoot(), showSmartEntry(), bootstrapGsdDirectory()

Includes 19 tests covering all blocked categories, boundary conditions, and
the assertSafeDirectory throw/return behavior.

* fix: make directory safeguard tests cross-platform (Windows CI)

- Skip Unix-specific blocked path tests on Windows (/, /usr, /etc, etc.)
- Add Windows-specific blocked path tests (C:\, C:\Windows)
- Use platform-appropriate path separator in trailing slash test
- Fix root path normalization for Windows drive letters (C:\ not C:)
2026-03-17 21:57:53 -06:00
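
The blocked-path check with symlink resolution from 45bff3456c can be approximated as follows. The blocked list is an illustrative subset, `resolveReal` and `isBlockedDirectory` are hypothetical helper names, and the entry-count heuristic and Windows paths are left out of this sketch.

```typescript
import { mkdtempSync, realpathSync } from "node:fs";
import { homedir, tmpdir } from "node:os";
import { join, resolve } from "node:path";

// Resolve symlinks when the path exists; fall back to a plain absolute path otherwise.
function resolveReal(dir: string): string {
  const absolute = resolve(dir);
  try {
    return realpathSync(absolute);
  } catch {
    return absolute;
  }
}

// Illustrative subset of the blocked locations named in the commit message.
const BLOCKED_DIRECTORIES = new Set(
  ["/", "/usr", "/etc", "/var", homedir(), tmpdir()].map(resolveReal),
);

function isBlockedDirectory(dir: string): boolean {
  return BLOCKED_DIRECTORIES.has(resolveReal(dir));
}

// Throws for blocked paths, mirroring the "hard stop" behavior described above.
function assertSafeDirectory(dir: string): void {
  if (isBlockedDirectory(dir)) {
    throw new Error(`Refusing to run in protected directory: ${resolveReal(dir)}`);
  }
}

const projectDir = mkdtempSync(join(tmpdir(), "project-"));
const homeBlocked = isBlockedDirectory(homedir());
const projectBlocked = isBlockedDirectory(projectDir);
```

Both the blocked list and the candidate path go through the same resolution step, so a symlink pointing at a blocked directory compares equal to it.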
Jeremy McSpadden
ce1ad35706 perf: skip initResources when version matches, consolidate startup I/O (#1052)
- Add version-match early return to initResources() — skips ~800ms of
  synchronous rmSync + cpSync when managed-resources.json already matches
  the running GSD version (steady-state on every launch)
- Consolidate package.json reads in loader.ts from 3 to 1 — single read
  reused for --version, --help, banner, and GSD_VERSION env var
- Replace blocking checkAndPromptForUpdates() with passive checkForUpdates()
  to avoid blocking startup on npm registry fetch + user prompt (up to 5s)
- Cache bundled extension keys in resource-loader to avoid redundant
  filesystem scan in buildResourceLoader()
- Use GSD_VERSION env var in getBundledGsdVersion() to skip package.json
  re-read from resource-loader.ts
- Add test verifying version-skip behavior: marker file survives when
  versions match, gets cleaned on mismatch
2026-03-17 21:57:13 -06:00
Jeremy McSpadden
326cef0b2d feat: enhance HTML report with derived metrics, visualizations, and interactivity (#1078)
* feat: enhance HTML report with derived metrics, visualizations, and interactivity

Add 13 features to the HTML report generator across 6 implementation waves:

Wave 1 - Summary enhancements:
- Executive summary paragraph with project completion %, cost, and budget context
- ETA calculation based on completion rate and remaining slices
- Cost/slice and Tokens/tool efficiency metrics in KV grid
- Cache hit ratio percentage
- Milestone scope indicator when scoped to a milestone

Wave 2 - Metrics visualizations:
- Cost over time inline SVG area chart with grid lines and axis labels
- Duration by slice bar chart (third chart using existing buildBarChart)
- Budget burndown horizontal stacked bar (spent/projected/overshoot)
- Chart row CSS changed to auto-fit for flexible multi-column layout

Wave 3 - Blockers section:
- New section with card-based layout for blocker verifications and high-risk
  incomplete slices, added to sections array and TOC nav

Wave 4 - Gantt chart:
- SVG horizontal bar timeline grouped by slice with done/active/pending
  coloring and time axis labels

Wave 5 - Interactive JS features:
- Timeline filter input for text-based row filtering
- Collapsible sections with toggle buttons (localStorage persisted)
- Dark/light theme toggle in header (localStorage persisted)

Wave 6 - Mobile responsiveness:
- 768px and 480px breakpoints with stacked layouts and compressed padding

All changes in a single file (export-html.ts). No data layer changes needed.
30 new tests covering all features and edge cases.

* fix: correct Phase type literal in export-html-enhancements test

Change "execution" to "executing" to match the Phase type definition.
2026-03-17 21:46:51 -06:00
Tom Boucher
50bea6e73a feat: auto-extract lessons to KNOWLEDGE.md on slice/milestone completion (#711) (#1081)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* feat: auto-extract lessons to KNOWLEDGE.md on slice/milestone completion (#711)

Added knowledge extraction steps to completion prompts:

- complete-slice.md step 9: review task summaries for patterns,
  gotchas, and non-obvious lessons → append to KNOWLEDGE.md
- complete-milestone.md step 9: review all slice summaries for
  cross-cutting insights → append to KNOWLEDGE.md

Combined with the existing execute-task step 13 (which already
tells agents to append discoveries during execution), this creates
a three-layer extraction pipeline: task → slice → milestone.
2026-03-17 21:45:55 -06:00
Tom Boucher
c5739f1282 feat: auto-create PR on milestone completion (#687) (#1084)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* feat: auto-create PR on milestone completion (#687)

New git preferences:
- git.auto_pr (boolean, default false): create a PR when a
  milestone completes via gh CLI
- git.pr_target_branch (string, default main branch): target
  branch for auto-created PRs (e.g. develop, qa, staging)

Implementation:
- GitPreferences: added auto_pr and pr_target_branch fields
- preferences.ts: added validation for both fields
- auto-worktree.ts: after push, pushes milestone branch and
  creates PR via 'gh pr create' (non-fatal on failure)

Documentation:
- configuration.md: added fields to git config block, table,
  and new git.auto_pr section with requirements and flow
- git-strategy.md: added Automatic Pull Requests section with
  Gitflow example config
2026-03-17 21:45:29 -06:00
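
Validation for the two new git preferences in c5739f1282 could be sketched like this. The field names and types come from the commit message; the `ValidationResult` shape, the function name, and the error strings are assumptions, and defaults (false, main branch) are left to the caller.

```typescript
interface GitPreferences {
  auto_pr?: boolean;
  pr_target_branch?: string;
}

// Hypothetical result shape for this sketch.
interface ValidationResult {
  preferences: GitPreferences;
  errors: string[];
}

function validateGitPreferences(raw: Record<string, unknown>): ValidationResult {
  const preferences: GitPreferences = {};
  const errors: string[] = [];

  const autoPr = raw["auto_pr"];
  if (typeof autoPr === "boolean") preferences.auto_pr = autoPr;
  else if (autoPr !== undefined) errors.push("git.auto_pr must be a boolean");

  const target = raw["pr_target_branch"];
  if (typeof target === "string" && target.trim() !== "") {
    preferences.pr_target_branch = target.trim();
  } else if (target !== undefined) {
    errors.push("git.pr_target_branch must be a non-empty string");
  }

  return { preferences, errors };
}

const valid = validateGitPreferences({ auto_pr: true, pr_target_branch: "develop" });
const invalid = validateGitPreferences({ auto_pr: "yes" });
```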
Tom Boucher
792b166ce6 fix: improve LSP diagnostics when no servers detected (#1082) (#1086)
* docs: add Node LTS pinning guide for macOS Homebrew users

New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.

* fix: improve LSP diagnostics when no servers detected (#1082)

When lsp status returns 'No language servers configured', the output
now includes diagnostics:
- Which project markers were detected (e.g. package.json found)
- Which server commands are missing (e.g. typescript-language-server)
- Install instructions

Also added LSP troubleshooting section to docs/troubleshooting.md
with common install commands per language.
2026-03-17 21:45:11 -06:00
Jeremy McSpadden
4e7b3d486f test: add end-to-end token optimization benchmark
Benchmark validates all optimization modules with realistic GSD content:
- Structured data: 20% decisions savings, 7% requirements savings
- Prompt compression: 5-17% across light/moderate/aggressive levels
- Semantic chunking: 73% content reduction via TF-IDF selection
- Summary distillation: 73% savings preserving structured fields
- Combined pipeline: 43% total savings on realistic dispatch prompt
- Cache efficiency: 94% cacheable prefix, 85% estimated Anthropic savings
- Provider-aware: 14% budget accuracy improvement for Anthropic vs OpenAI
2026-03-17 22:10:58 -05:00
Jeremy McSpadden
d65da6c927 feat: wire semantic chunking, add preferences, metrics, and docs
- Wire semantic chunker into inlineFileSmart() for large file context selection
- Use inlineFileSmart for knowledge file in buildExecuteTaskPrompt (TF-IDF relevance)
- Add compression_strategy and context_selection preferences with profile defaults
- Add resolveCompressionStrategy() and resolveContextSelection() resolvers
- Add cacheHitRate and compressionSavings to UnitMetrics
- Add aggregateCacheHitRate() for session-wide cache performance
- Update token-optimization.md with compression, chunking, and distillation docs
- Add 12 integration tests for optimization preferences and modules
2026-03-17 22:07:05 -05:00
Jeremy McSpadden
39b3daee6f feat: add token optimization suite for prompt caching, compression, and smart context selection
Introduces six new modules that work together to reduce token usage across
the dispatch pipeline while preserving semantic content quality:

- Provider-aware token counting with per-provider char/token ratios
- Prompt cache optimizer for maximizing Anthropic/OpenAI cache hit rates
- Structured data formatter (compact notation for decisions/requirements/tasks)
- Deterministic prompt compressor (light/moderate/aggressive levels)
- Semantic chunker with TF-IDF relevance scoring for context selection
- Summary distiller for condensed dependency summaries

Integration points:
- inlineDependencySummaries uses distillation before truncation (3+ deps)
- inlineDecisionsFromDb/inlineRequirementsFromDb use compact format at non-full levels
- buildExecuteTaskPrompt compresses carry-forward when it exceeds 40% of budget
- context-budget.reduceToFit combines compression with section-boundary truncation
- computeBudgets accepts optional provider for accurate char/token ratios

All existing 1475 unit tests + 30 integration tests pass with zero regressions.
157 new tests cover all optimization modules.
2026-03-17 22:02:27 -05:00
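
To make "TF-IDF relevance scoring for context selection" in 39b3daee6f concrete, here is a generic sketch of the technique. It is not the project's semantic chunker: the tokenizer, the smoothed IDF formula (one common variant), and the function names are all assumptions.

```typescript
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Score each chunk against the query with TF-IDF and keep the top `k`,
// returned in their original order.
function selectRelevantChunks(chunks: string[], query: string, k: number): string[] {
  const docs = chunks.map(tokenize);
  const queryTerms = Array.from(new Set(tokenize(query)));

  const scores = docs.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length / Math.max(doc.length, 1);
      const df = docs.filter((d) => d.includes(term)).length;
      const idf = Math.log((1 + docs.length) / (1 + df)) + 1; // smoothed IDF
      score += tf * idf;
    }
    return score;
  });

  return chunks
    .map((chunk, index) => ({ chunk, index, score: scores[index] ?? 0 }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .sort((a, b) => a.index - b.index)
    .map((entry) => entry.chunk);
}

const chunks = [
  "Retry logic for network timeouts",
  "CSS theme colors and fonts",
  "Timeout handling in session creation",
];
const selected = selectRelevantChunks(chunks, "session timeout", 1);
```

The sketch does no stemming, so "timeouts" does not match "timeout"; only the third chunk contains either query term.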
Jeremy McSpadden
68a999ebde fix: prevent summarizing phase stall by retrying dropped agent_end events (#1072)
When handleAgentEnd dispatches a sub-unit (via hooks, triage, or quick-task
early-dispatch paths) and that unit completes before handleAgentEnd returns,
the resulting agent_end event is silently dropped by the reentrancy guard.
This leaves auto-mode active but permanently stalled — no unit running, no
watchdog set, process at high CPU doing nothing.

Add a pendingAgentEndRetry flag to AutoSession that the reentrancy guard sets
when it drops an agent_end event. The finally block in handleAgentEnd checks
this flag and schedules a deferred retry via setImmediate, ensuring the
completed unit's agent_end is always processed.
2026-03-17 21:49:39 -05:00
Tom Boucher
d252168de5 fix: switch alibaba-coding-plan to OpenAI-compat endpoint with proper compat flags (#1003)
The alibaba-coding-plan provider was using the Anthropic-compatible
endpoint (/apps/anthropic) with anthropic-messages API, which caused
issues with thinking mode on several models (MiniMax-M2.5 thinking
loop, missing thinkingFormat for Qwen/GLM models).

Changes for all 8 models:
- API: anthropic-messages → openai-completions
- Endpoint: /apps/anthropic → /v1 (OpenAI-compatible)
- Added per-model compat flags:
  - Qwen models: thinkingFormat: 'qwen', supportsDeveloperRole: false
  - GLM models: thinkingFormat: 'qwen', supportsDeveloperRole: false
  - MiniMax-M2.5: supportsReasoningEffort: true, maxTokensField: 'max_tokens'
  - Kimi K2.5: thinkingFormat: 'zai', supportsDeveloperRole: false
- Enabled reasoning for qwen3-max (was incorrectly false)
- Fixed context windows to match tested values
- Fixed MiniMax-M2.5 maxTokens: 24576 → 65536
2026-03-17 21:11:18 -04:00
Tom Boucher
7377a08d8e docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.

Added notice banner in README linking to the guide.
2026-03-17 20:49:51 -04:00
TÂCHES
94be09482f fix: add barrel files for remote-questions, ttsr, and shared extensions (#1048)
* fix: add barrel files for remote-questions, ttsr, and shared extensions

Centralizes public API surface for three extension directories behind
index.ts barrel files. External consumers now import from the barrel
instead of reaching into internal module files, reducing coupling and
making future refactors safer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rename barrel files to mod.ts to avoid extension loader auto-discovery

The extension loader auto-discovers extensions by looking for index.ts files
inside extensions/*/ directories. remote-questions/ and shared/ are utility
directories, not extensions — their index.ts barrel files caused load failures.

Renamed to mod.ts which the loader ignores, and updated all import paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:48:32 -06:00
TÂCHES
27e79f76b3 refactor: centralize magic numbers into constants.ts (#1044)
Extracts 11 hardcoded timeout, retry, compaction, and tool-default
values from 9 source files into a single constants.ts module. Each
source file now imports from the central definition, eliminating
duplicated literals and making tuning a single-file change.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:45:43 -06:00
TÂCHES
6a2a6a9e2c fix: consolidate frontmatter parsing into shared module (#1040)
* fix: consolidate frontmatter parsing into shared module

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: strip quotes from frontmatter scalar values

The shared parseFrontmatterMap was missing quote-stripping that the old
rule-loader had, causing 3 test failures in ttsr-rule-loader.test.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:44:25 -06:00
Tom Boucher
7ba993cbfb docs: add troubleshooting for oh-my-zsh gsd alias collision (#722) (#1051)
oh-my-zsh's git plugin defines alias gsd='git svn dcommit' which
shadows the GSD binary. Added troubleshooting section to
getting-started.md with two solutions: unalias in .zshrc, or use
the gsd-cli alternative binary name.
2026-03-17 18:41:04 -06:00
Tom Boucher
1f9da9ed5f fix: always ensure tasks/ directory exists for slice units (#900) (#1050)
ensurePreconditions() had two branches: create-slice (which included
tasks/) and slice-exists (which conditionally created tasks/). The
conditional path could miss cases where a slice dir was created
manually or by a previous run without the tasks/ subdirectory.

Simplified to: create slice dir if missing, then always check and
create tasks/ unconditionally. Removes the branching that could
leave tasks/ missing.
2026-03-17 18:39:54 -06:00
TÂCHES
8382e5bfcc refactor(resource-loader): extract syncResourceDir to eliminate triplicated sync logic (#1036)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:37:59 -06:00
TÂCHES
40f277a65f refactor(bg-shell): split 1604-line god file into tool, command, and lifecycle modules (#1049)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:36:24 -06:00
TÂCHES
87cd612542 refactor(headless): split 772-line god file into events, UI, and context modules (#1047)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:36:20 -06:00
TÂCHES
665121537d refactor(gsd): extract safeCopy/safeMkdir helpers to replace repetitive try/catch FS patterns (#1043)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:36:17 -06:00
TÂCHES
4f10e9bdc4 refactor(gsd): extract atomicWriteSync utility to replace 6 duplicate write-tmp-rename patterns (#1046)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:36:14 -06:00
TÂCHES
9488865b9e refactor(gsd): unify duplicate padRight/truncate into shared format-utils (#1045)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:36:11 -06:00
TÂCHES
518ccaf8a8 refactor(loader): consolidate 5 duplicate package.json version reads into cached helper (#1042)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:35:49 -06:00
TÂCHES
15b72fc738 refactor(headless): remove duplicate jsonLine, use serializeJsonLine from pi-coding-agent (#1039)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:35:00 -06:00
Tom Boucher
7d1fc013e4 test: add AutoSession encapsulation invariant tests (#1035)
9 tests that enforce the encapsulation of auto-mode state in AutoSession:

1. No module-level let declarations in auto.ts
2. No module-level var declarations in auto.ts
3. Exactly one AutoSession singleton
4. reset() covers every instance property
5. toJSON() includes key diagnostic properties
6. Module-level consts are only constants/accessors (no mutable state)
7-9. session.ts exports AutoSession with reset() and toJSON()

Added maintenance comments to auto.ts and auto/session.ts explaining
the invariant and linking to these tests. Any PR that adds a module-level
mutable variable to auto.ts will fail CI.
2026-03-17 18:33:20 -06:00
TÂCHES
f5bf03c504 fix: centralize GSD timeout and cache constants (#1038)
Move scattered timeout and cache-size constants (DEFAULT_COMMAND_TIMEOUT_MS,
DEFAULT_BASH_TIMEOUT_SECS, DIR_CACHE_MAX, CACHE_MAX) into a single
constants.ts module within the GSD extension.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:33:13 -06:00
TÂCHES
6240926ab6 fix: improve RemotePromptRecord.ref type safety (#1041)
Split RemotePromptRecord into a discriminated union of PendingPromptRecord
(ref is undefined) and DispatchedPromptRecord (ref is required). This
makes the type system enforce that ref is always present after dispatch.

Also removes a redundant truthiness check on dispatch.ref in manager.ts,
since RemoteDispatchResult.ref is already non-optional.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:33:08 -06:00
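
The discriminated union described in 6240926ab6 might be shaped roughly as follows. The type names come from the commit message; the `status` discriminant, the `id` field, and `describeRecord` are assumptions added so the example compiles on its own.

```typescript
// Before dispatch there is no ref.
interface PendingPromptRecord {
  status: "pending"; // hypothetical discriminant
  id: string;
  ref?: undefined;
}

// After dispatch the ref is required.
interface DispatchedPromptRecord {
  status: "dispatched"; // hypothetical discriminant
  id: string;
  ref: string;
}

type RemotePromptRecord = PendingPromptRecord | DispatchedPromptRecord;

function describeRecord(record: RemotePromptRecord): string {
  switch (record.status) {
    case "pending":
      return `${record.id}: not dispatched yet`;
    case "dispatched":
      // Narrowed to DispatchedPromptRecord: `ref` is a string, no truthiness check needed.
      return `${record.id}: dispatched as ${record.ref}`;
  }
}

const description = describeRecord({ status: "dispatched", id: "p1", ref: "msg-42" });
```

With a single interface and an optional `ref`, every consumer would have to re-check the field; the union moves that check into the type system.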
TÂCHES
9201c0ce16 fix: document silent catch handlers in browser-tools (#1037)
Add descriptive comments to all empty catch blocks explaining why the
error is intentionally swallowed. Covers networkidle timeouts, optional
screenshots, best-effort file writes, response body reads, route
cleanup, and page metadata refreshes.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:33:00 -06:00
TÂCHES
236e576d1e fix: use literal union types in RuntimeErrorJSON for type safety (#1034)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:30:17 -06:00
TÂCHES
edda01e438 fix: extract sanitizeError to shared module and apply to ask-user-questions (#1033)
Closes a security gap where ask-user-questions errorResult() could leak
tokens in error messages. The sanitizeError function and TOKEN_PATTERNS
are now in shared/sanitize.ts, imported by both manager.ts and
ask-user-questions.ts.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:30:05 -06:00
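
A shared sanitizer of the kind edda01e438 describes could be sketched like this. The two patterns and the `[REDACTED]` placeholder are illustrative assumptions; the actual TOKEN_PATTERNS in shared/sanitize.ts are not shown in the commit message.

```typescript
// Illustrative patterns only; the real TOKEN_PATTERNS list is not shown in the commit.
const TOKEN_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]{8,}\b/g, // API-key-like strings
  /\bBearer\s+[A-Za-z0-9._-]+/g, // Authorization header values
];

// Convert any thrown value to a message and mask token-like substrings.
function sanitizeError(error: unknown): string {
  const message = error instanceof Error ? error.message : String(error);
  return TOKEN_PATTERNS.reduce(
    (text, pattern) => text.replace(pattern, "[REDACTED]"),
    message,
  );
}

const sanitized = sanitizeError(new Error("401 for key sk-abcdef1234567890"));
```

Routing every error path through one function means a new token format only has to be added in one place.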
TÂCHES
4407c24522 fix: deduplicate formatDateShort into shared/format-utils (#1032)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:29:54 -06:00
TÂCHES
5657f302a6 refactor: fix unicode regex discrepancy and standardize function naming (#1031)
Add missing \u1680 (Ogham space mark) to UNICODE_SPACES in path-utils.ts
and loader.ts. Make edit-diff.ts import the shared constant from
path-utils.ts instead of maintaining an inline copy.

Rename hashlineParseText to parseHashlineText to follow the parseX()
convention used across the codebase.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:29:44 -06:00
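
For context on the \u1680 fix in 5657f302a6, a shared constant covering the Unicode space separators might look like the sketch below. The exact character set used in path-utils.ts is not shown in the commit message; this version lists the Unicode "Zs" category minus the ASCII space, and `normalizeSpaces` is a hypothetical consumer.

```typescript
// Unicode space separators (general category Zs) other than the ASCII space:
// U+00A0, U+1680 (Ogham space mark), U+2000 to U+200A, U+202F, U+205F, U+3000.
const UNICODE_SPACES = /[\u00A0\u1680\u2000-\u200A\u202F\u205F\u3000]/g;

// Replace every Unicode space with a plain ASCII space.
function normalizeSpaces(input: string): string {
  return input.replace(UNICODE_SPACES, " ");
}

const normalized = normalizeSpaces("my\u1680file\u00A0name.txt");
```

Importing one constant everywhere avoids the drift the commit describes, where an inline copy was missing a character the shared version had.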