Addresses state safety issues found during #1062 deep dive:
1. completed-units.json writes in auto-worktree.ts and auto-worktree-sync.ts
used plain writeFileSync which could produce truncated/corrupt files on
crash, losing completion keys and causing unit re-dispatch. Switched to
atomicWriteSync (temp file + rename) for crash safety.
2. Plan file checkbox reconciliation in auto-worktree.ts also switched to
atomicWriteSync to prevent partial PLAN.md writes on crash.
3. db-writer.ts functions (saveDecisionToDb, updateRequirementInDb,
saveArtifactToDb) wrote markdown files via saveFile() without invalidating
caches afterward. Added targeted cache invalidation (state + path + parse)
so deriveState() always sees fresh data. Uses individual invalidation
functions rather than invalidateAllCaches() to avoid clearing the artifacts
table that was just written to.
The verification gate's discoverCommands() was passing prose descriptions
from task plan Verify: fields through sanitizeCommand(), which only checked
for shell injection characters. English prose like "Document exists, contains
all 5 scale names..." passed the filter and was executed via spawnSync,
causing exit code 127 false negatives.
Added isLikelyCommand() heuristic that distinguishes executable commands
from prose descriptions by checking:
- Known command prefixes (npm, node, tsc, eslint, etc.)
- Path-like first tokens (./script.sh, /usr/bin/check)
- Flag-like tokens (-v, --check)
- Uppercase-initial words with 4+ tokens (prose pattern)
- Comma-space clause separators (prose pattern)
Prose Verify: fields now fall through to package.json scripts or "none"
instead of being executed. Valid commands continue to work as before.
Closes#1066
When a model fails during auto-mode and the fallback chain is exhausted
(or absent), the error recovery path previously fell through to pause
without attempting to restore the session's original model. Meanwhile,
the fallback chain itself was read fresh from disk via
loadEffectiveGSDPreferences(), which could pick up models configured by
a different concurrent GSD session sharing the same global preferences
file.
This adds a session model recovery step between fallback exhaustion and
pause. After the existing fallback chain logic, we now check whether the
current model has diverged from the model captured at auto-mode start
(autoModeStartModel). If so, we restore the session model and retry
before giving up and pausing.
Changes:
- auto.ts: export getAutoModeStartModel() getter for the session's
captured start model
- index.ts: add session model recovery block after fallback chain
exhaustion, using the session-scoped model instead of re-reading
global preferences from disk
- model-isolation.test.ts: add 4 tests covering cross-session leakage
detection, divergence checks, and null safety
Two compounding bugs caused auto-mode to re-dispatch run-uat indefinitely
after UAT passed:
1. markSliceDoneInRoadmap regex required dash at line start (^-) but the
roadmap parser accepts optional leading whitespace (^\s*-). When LLMs
indented checklist items, the doctor could never mark them done.
2. After run-uat completed, handleAgentEnd ran doctor with fixLevel:"task"
which explicitly excluded slice-level completion transitions. Since
run-uat is the terminal unit for a slice, the roadmap checkbox stayed
unchecked, causing deriveState to return the same slice indefinitely.
Fix: Update markSliceDoneInRoadmap and markTaskDoneInPlan regexes to
accept leading whitespace (matching the parser), preserving indentation
in the replacement. Add run-uat to the set of unit types that use
fixLevel:"all" in handleAgentEnd closeout.
The /gsd update command imported compareSemver from ../../../update-check.js,
a relative path that resolves correctly in the source tree (src/resources/
extensions/gsd/ → src/update-check.js) but breaks when extensions are synced
to ~/.gsd/agent/extensions/gsd/ (where ../../../ points to ~/.gsd/ which has
no update-check.js).
This caused the error:
Extension "command:gsd" error: Cannot find module '../../../update-check.js'
Fix: inline a local compareSemverLocal() function in commands.ts, eliminating
the cross-tree import. The function is small (10 lines) and already well-tested
via update-check.test.ts.
* docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.
Added notice banner in README linking to the guide.
* docs: recommend pi-dashscope extension for DashScope models
The built-in alibaba-coding-plan provider uses the Anthropic-compat
endpoint and lacks per-model thinking format and compatibility flags,
causing issues like #1003 (MiniMax-M2.5 thinking loop).
The community pi-dashscope extension uses the correct OpenAI-compat
endpoint, sets thinkingFormat per model (qwen/zai), includes compat
flags (supportsDeveloperRole, supportsReasoningEffort), and provides
an interactive /dashscope-configure command.
Added Community Provider Extensions section to configuration docs
recommending pi-dashscope over the built-in provider.
* docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.
Added notice banner in README linking to the guide.
* fix: disable reasoning for MiniMax-M2.5 in alibaba-coding-plan provider (#1003)
MiniMax-M2.5 via Dashscope's Anthropic-compatible API does not
properly support extended thinking, causing the model to get stuck
in a thinking loop. Set reasoning: false for this model entry in
the alibaba-coding-plan provider.
* docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.
Added notice banner in README linking to the guide.
* docs: update README and docs for v2.28.0 release
- README: add 'What's New in v2.28' section with key features
- commands.md: add /gsd update, /gsd export --html --all, and
Export section with usage examples
- auto-mode.md: add --all flag to export, add Failure Recovery
(v2.28) section documenting reliability hardening
- getting-started.md: mention /gsd update as in-session option
* feat(gsd): add directory safeguards to prevent running in system/home paths
GSD previously had no protection against being launched from dangerous
directories like $HOME, /, /usr, or /etc. This adds layered validation:
- Blocked system paths (hard stop): /, /usr, /etc, /var, $HOME, tmpdir, etc.
- High entry count heuristic (>200 entries triggers confirmation dialog)
- Symlink resolution via realpathSync to prevent bypass
- Integrated at three chokepoints: projectRoot(), showSmartEntry(), bootstrapGsdDirectory()
Includes 19 tests covering all blocked categories, boundary conditions, and
the assertSafeDirectory throw/return behavior.
* fix: make directory safeguard tests cross-platform (Windows CI)
- Skip Unix-specific blocked path tests on Windows (/, /usr, /etc, etc.)
- Add Windows-specific blocked path tests (C:\, C:\Windows)
- Use platform-appropriate path separator in trailing slash test
- Fix root path normalization for Windows drive letters (C:\ not C:)
- Add version-match early return to initResources() — skips ~800ms of
synchronous rmSync + cpSync when managed-resources.json already matches
the running GSD version (steady-state on every launch)
- Consolidate package.json reads in loader.ts from 3 to 1 — single read
reused for --version, --help, banner, and GSD_VERSION env var
- Replace blocking checkAndPromptForUpdates() with passive checkForUpdates()
to avoid blocking startup on npm registry fetch + user prompt (up to 5s)
- Cache bundled extension keys in resource-loader to avoid redundant
filesystem scan in buildResourceLoader()
- Use GSD_VERSION env var in getBundledGsdVersion() to skip package.json
re-read from resource-loader.ts
- Add test verifying version-skip behavior: marker file survives when
versions match, gets cleaned on mismatch
* feat: enhance HTML report with derived metrics, visualizations, and interactivity
Add 13 features to the HTML report generator across 6 implementation waves:
Wave 1 - Summary enhancements:
- Executive summary paragraph with project completion %, cost, and budget context
- ETA calculation based on completion rate and remaining slices
- Cost/slice and Tokens/tool efficiency metrics in KV grid
- Cache hit ratio percentage
- Milestone scope indicator when scoped to a milestone
Wave 2 - Metrics visualizations:
- Cost over time inline SVG area chart with grid lines and axis labels
- Duration by slice bar chart (third chart using existing buildBarChart)
- Budget burndown horizontal stacked bar (spent/projected/overshoot)
- Chart row CSS changed to auto-fit for flexible multi-column layout
Wave 3 - Blockers section:
- New section with card-based layout for blocker verifications and high-risk
incomplete slices, added to sections array and TOC nav
Wave 4 - Gantt chart:
- SVG horizontal bar timeline grouped by slice with done/active/pending
coloring and time axis labels
Wave 5 - Interactive JS features:
- Timeline filter input for text-based row filtering
- Collapsible sections with toggle buttons (localStorage persisted)
- Dark/light theme toggle in header (localStorage persisted)
Wave 6 - Mobile responsiveness:
- 768px and 480px breakpoints with stacked layouts and compressed padding
All changes in a single file (export-html.ts). No data layer changes needed.
30 new tests covering all features and edge cases.
* fix: correct Phase type literal in export-html-enhancements test
Change "execution" to "executing" to match the Phase type definition.
* docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.
Added notice banner in README linking to the guide.
* feat: auto-extract lessons to KNOWLEDGE.md on slice/milestone completion (#711)
Added knowledge extraction steps to completion prompts:
- complete-slice.md step 9: review task summaries for patterns,
gotchas, and non-obvious lessons → append to KNOWLEDGE.md
- complete-milestone.md step 9: review all slice summaries for
cross-cutting insights → append to KNOWLEDGE.md
Combined with the existing execute-task step 13 (which already
tells agents to append discoveries during execution), this creates
a three-layer extraction pipeline: task → slice → milestone.
* docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.
Added notice banner in README linking to the guide.
* feat: auto-create PR on milestone completion (#687)
New git preferences:
- git.auto_pr (boolean, default false): create a PR when a
milestone completes via gh CLI
- git.pr_target_branch (string, default main branch): target
branch for auto-created PRs (e.g. develop, qa, staging)
Implementation:
- GitPreferences: added auto_pr and pr_target_branch fields
- preferences.ts: added validation for both fields
- auto-worktree.ts: after push, pushes milestone branch and
creates PR via 'gh pr create' (non-fatal on failure)
Documentation:
- configuration.md: added fields to git config block, table,
and new git.auto_pr section with requirements and flow
- git-strategy.md: added Automatic Pull Requests section with
Gitflow example config
* docs: add Node LTS pinning guide for macOS Homebrew users
New doc (docs/node-lts-macos.md) explains how to pin Node 24 LTS
via Homebrew to avoid running on odd-numbered development releases.
Covers brew install/link/pin, version managers as alternatives,
and verification steps.
Added notice banner in README linking to the guide.
* fix: improve LSP diagnostics when no servers detected (#1082)
When lsp status returns 'No language servers configured', the output
now includes diagnostics:
- Which project markers were detected (e.g. package.json found)
- Which server commands are missing (e.g. typescript-language-server)
- Install instructions
Also added LSP troubleshooting section to docs/troubleshooting.md
with common install commands per language.
* fix: add barrel files for remote-questions, ttsr, and shared extensions
Centralizes public API surface for three extension directories behind
index.ts barrel files. External consumers now import from the barrel
instead of reaching into internal module files, reducing coupling and
making future refactors safer.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: rename barrel files to mod.ts to avoid extension loader auto-discovery
The extension loader auto-discovers extensions by looking for index.ts files
inside extensions/*/ directories. remote-questions/ and shared/ are utility
directories, not extensions — their index.ts barrel files caused load failures.
Renamed to mod.ts which the loader ignores, and updated all import paths.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extracts 11 hardcoded timeout, retry, compaction, and tool-default
values from 9 source files into a single constants.ts module. Each
source file now imports from the central definition, eliminating
duplicated literals and making tuning a single-file change.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: consolidate frontmatter parsing into shared module
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: strip quotes from frontmatter scalar values
The shared parseFrontmatterMap was missing quote-stripping that the old
rule-loader had, causing 3 test failures in ttsr-rule-loader.test.ts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
oh-my-zsh's git plugin defines alias gsd='git svn dcommit' which
shadows the GSD binary. Added troubleshooting section to
getting-started.md with two solutions: unalias in .zshrc, or use
the gsd-cli alternative binary name.
ensurePreconditions() had two branches: create-slice (which included
tasks/) and slice-exists (which conditionally created tasks/). The
conditional path could miss cases where a slice dir was created
manually or by a previous run without the tasks/ subdirectory.
Simplified to: create slice dir if missing, then always check and
create tasks/ unconditionally. Removes the branching that could
leave tasks/ missing.
9 tests that enforce the encapsulation of auto-mode state in AutoSession:
1. No module-level let declarations in auto.ts
2. No module-level var declarations in auto.ts
3. Exactly one AutoSession singleton
4. reset() covers every instance property
5. toJSON() includes key diagnostic properties
6. Module-level consts are only constants/accessors (no mutable state)
7-9. session.ts exports AutoSession with reset() and toJSON()
Added maintenance comments to auto.ts and auto/session.ts explaining
the invariant and linking to these tests. Any PR that adds a module-level
mutable variable to auto.ts will fail CI.
Move scattered timeout and cache-size constants (DEFAULT_COMMAND_TIMEOUT_MS,
DEFAULT_BASH_TIMEOUT_SECS, DIR_CACHE_MAX, CACHE_MAX) into a single
constants.ts module within the GSD extension.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split RemotePromptRecord into a discriminated union of PendingPromptRecord
(ref is undefined) and DispatchedPromptRecord (ref is required). This
makes the type system enforce that ref is always present after dispatch.
Also removes a redundant truthiness check on dispatch.ref in manager.ts,
since RemoteDispatchResult.ref is already non-optional.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add descriptive comments to all empty catch blocks explaining why the
error is intentionally swallowed. Covers networkidle timeouts, optional
screenshots, best-effort file writes, response body reads, route
cleanup, and page metadata refreshes.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes a security gap where ask-user-questions errorResult() could leak
tokens in error messages. The sanitizeError function and TOKEN_PATTERNS
are now in shared/sanitize.ts, imported by both manager.ts and
ask-user-questions.ts.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add missing \u1680 (Ogham space mark) to UNICODE_SPACES in path-utils.ts
and loader.ts. Make edit-diff.ts import the shared constant from
path-utils.ts instead of maintaining an inline copy.
Rename hashlineParseText to parseHashlineText to follow the parseX()
convention used across the codebase.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Align pi-tui chalk from ^5.5.0 to ^5.6.2 (matches root, pi-ai, pi-coding-agent)
- Convert @mistralai/mistralai and openai to caret ranges (^1.14.1, ^6.26.0)
in both root and pi-ai — no intentional pin rationale found in git history,
versions were just hoisted as-is from workspace deps
- Keep gaxios@7.1.4 override pinned — intentionally set in 5c64f99 to
eliminate glob@10.5.0 deprecation warnings from transitive deps
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cap parallel async operations to prevent memory spikes when processing
large numbers of items:
- session-manager.ts: limit file loading to 10 concurrent reads
- pipeline.ts: limit job execution to 5 concurrent LLM calls
- discovery.ts: limit tool scanning to 5 concurrent scanners
Uses an inline pLimit utility in each file to avoid adding a dependency.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate six per-subdirectory makeTreeWritable calls (pre+post copy
for extensions, agents, and skills) into two calls on the parent agentDir:
one before all copies to unlock existing Nix-store read-only files, and
one after to ensure newly copied files are writable for subsequent runs.
Eliminates ~4 redundant recursive stat+chmod walks, saving 100-300ms on
every CLI launch that triggers resource sync.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add argument completions for /thinking command with all 6 levels
(off, minimal, low, medium, high, xhigh) and descriptions
- Add descriptions to all GSD 2nd-level subcommand completions across
14 subcommand groups (auto, mode, parallel, setup, prefs, remote,
next, history, undo, export, cleanup, knowledge, doctor, dispatch)
- Add 35 new tests for autocomplete and fuzzy matching systems
gsd_generate_milestone_id scans disk for existing milestone dirs.
When called multiple times before any artifacts are written, it
returned the same ID (e.g. M001) every time because no dirs existed
yet.
Added an in-memory reservation set that tracks IDs returned by the
tool. Subsequent calls merge reserved IDs with on-disk IDs before
computing the next sequential ID, ensuring M001, M002, M003 are
returned in sequence even without intermediate disk writes.
* fix: consolidate duplicate formatting functions
- Rename private formatDuration in verification-evidence.ts to
formatDurationSecs to clarify its distinct fixed-seconds behavior
- Replace formatUptime implementation in bg-shell/utilities.ts with
import of shared formatDuration (functionally equivalent for >=1s)
- Replace inline fmt lambda in cli.ts with shared formatTokenCount
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: revert cli.ts import from extensions path that breaks at runtime
The extensions directory uses a separate compilation/bundle path and
cannot be imported from cli.ts. Restores the inline fmt lambda.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(gsd): delete orphaned complexity.ts (superseded by complexity-classifier.ts)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): update complexity-routing tests for complexity-classifier.ts
The test file imported from the deleted complexity.ts. Removed tests
for the defunct classifyTaskComplexity function and updated all source
reads to reference complexity-classifier.ts with its actual exports.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(search): consolidate duplicate Brave API helper functions
getBraveApiKey() and braveHeaders() were duplicated across provider.ts,
tool-llm-context.ts, and tool-search.ts. Export both from provider.ts
and import in the tool files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): update provider export count to include braveHeaders
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>