interview-ui.ts saveEditorToState() was calling getText(), which returns
paste markers like '[paste #1 2033 chars]' for content >1000 chars or
>10 lines. The actual pasted content was stored in the Editor's paste
map but never expanded back.
This silently discards user input in ask_user_questions notes — any
substantive response (voice transcripts, detailed explanations, extended
enrichments) that exceeds the paste threshold gets replaced with a
marker string. The LLM receives '[paste #1 N chars]' instead of the
user's actual words.
Fix: getText() → getExpandedText() — the Editor already has the method
that expands paste markers to their stored content. One-line change.
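A minimal sketch of what the expansion step amounts to, assuming the marker format quoted above. The `expandPasteMarkers` name and the `Map<number, string>` shape are hypothetical stand-ins for illustration, not the Editor's actual internals:

```typescript
// Hypothetical stand-in for the Editor's paste map: paste id -> original content.
type PasteMap = Map<number, string>;

// Matches markers of the form "[paste #1 2033 chars]".
const PASTE_MARKER = /\[paste #(\d+) \d+ chars\]/g;

// Replace each marker with its stored content; unknown markers are left as-is.
function expandPasteMarkers(text: string, pastes: PasteMap): string {
  return text.replace(PASTE_MARKER, (marker: string, id: string) => {
    const stored = pastes.get(Number(id));
    return stored !== undefined ? stored : marker;
  });
}

const examplePastes: PasteMap = new Map([[1, "a long voice transcript"]]);
const expandedNote = expandPasteMarkers("Notes: [paste #1 23 chars]", examplePastes);
```

Leaving unknown markers untouched is a design choice of this sketch: text that merely looks like a marker is never silently dropped.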
- /exit now calls stopAuto() before exiting to save activity log and clear locks
- Added new /kill command for immediate exit without cleanup
- Fixes issue #132: /exit terminates too abruptly and leaves terminal state dirty
The ESM resolve hook was rewriting .js imports from vendored Pi
packages (packages/*/dist/) to .ts, breaking test resolution.
Compiled dist/ files need their .js specifiers left intact.
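The guard can be pictured as a small predicate consulted by the resolve hook. This is a sketch under stated assumptions: the function name, the regex, and the choice to test the importing module's URL are all hypothetical and not taken from the actual hook.

```typescript
// Assumed layout: vendored packages compile to <repo>/packages/<name>/dist/.
const VENDORED_DIST = /\/packages\/[^/]+\/dist\//;

// Hypothetical predicate: should a ".js" specifier be rewritten to ".ts"?
// parentURL is the URL of the importing module, as a resolve hook would see it.
function shouldRewriteJsToTs(specifier: string, parentURL: string | undefined): boolean {
  if (!specifier.endsWith(".js")) return false;
  // Only relative imports are candidates; bare package names resolve normally.
  if (!specifier.startsWith("./") && !specifier.startsWith("../")) return false;
  // Imports that originate in compiled vendored output keep their .js specifiers.
  if (parentURL !== undefined && VENDORED_DIST.test(parentURL)) return false;
  return true;
}
```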
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Vendor all 4 Pi packages (tui, ai, agent-core, coding-agent) from
pi-mono v0.57.1 as @gsd/* workspace packages under packages/. This
replaces the compiled npm dependency (@mariozechner/pi-coding-agent)
and patch-package workflow, giving direct source access for
modifications.
- Copy Pi source from pi-mono v0.57.1 into packages/
- Create workspace package.json + tsconfig.json for each package
- Rename ~240 imports from @mariozechner/pi-* to @gsd/pi-*
- Apply existing patches as source edits (setModel persist, VT input)
- Remove @mariozechner/pi-coding-agent dep and patch-package
- Update build pipeline to build packages in dependency order
- Add pi-upstream git remote for future selective syncing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Doctor's fix:true mode was creating summary stubs and marking slices
done in the roadmap during the post-hook after every task. This
short-circuited the complete-slice dispatch unit — by the time
dispatchNextUnit ran, the slice was already 'done' and the merge
guard merged it to main, so complete-slice (which writes the real
compressed summary) never got a chance to run.
Root cause: doctor conflated two responsibilities — task-level
bookkeeping (marking checkboxes) and completion state transitions
(summary stubs, roadmap marking). The post-hook should only do
the former.
Fix: added fixLevel option to runGSDDoctor. fixLevel:'task' (used
by post-hook) skips completion transition codes. fixLevel:'all'
(default, used by manual gsd doctor and resume) preserves existing
recovery behavior.
Completion transition codes gated by fixLevel:
- all_tasks_done_missing_slice_summary
- all_tasks_done_missing_slice_uat
- all_tasks_done_roadmap_not_checked
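The gating described above can be sketched as follows. The three codes come from the list above; the `shouldFix` helper and the sample task-level code in the usage note are hypothetical, and the real runGSDDoctor signature may differ.

```typescript
type FixLevel = "task" | "all";

// Completion transition codes named in this commit message.
const COMPLETION_TRANSITION_CODES = new Set<string>([
  "all_tasks_done_missing_slice_summary",
  "all_tasks_done_missing_slice_uat",
  "all_tasks_done_roadmap_not_checked",
]);

// Hypothetical helper: may doctor auto-fix this issue code at this fix level?
function shouldFix(code: string, fixLevel: FixLevel = "all"): boolean {
  if (fixLevel === "task" && COMPLETION_TRANSITION_CODES.has(code)) return false;
  return true;
}
```

With this shape, the post-hook would pass `"task"` and still fix any other code (for example a made-up `task_checkbox_unchecked`), while manual runs keep the default `"all"`.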
Restores main_branch field on GitPreferences (removed in a prior merge conflict
resolution) and adds VALID_BRANCH_NAME validation in getMainBranch(). Implements
runPreMergeCheck with auto-detection from package.json test scripts and support
for custom commands via prefs.pre_merge_check string values.
Fixes 5 pre-existing test failures in git-service.test.ts (158/158 now pass).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The crash loop: stale state → unit redispatched → activity log grows →
retry diagnostic reads full log → prompt grows → replaceAll on huge
string → V8 heap exhaustion. Cap both the read path (10MB JSONL parse
limit) and the injection path (50K char prompt cap) to break the cycle.
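A rough sketch of the two caps, assuming "10MB" means 10 × 1024 × 1024 bytes and that over-long diagnostics are truncated with a notice. The helper names and the notice text are hypothetical:

```typescript
const MAX_LOG_BYTES = 10 * 1024 * 1024; // read-path cap (assumed binary megabytes)
const MAX_PROMPT_CHARS = 50000;         // injection-path cap

// Read path: refuse to parse an activity log that exceeds the byte cap.
function canParseLog(sizeInBytes: number): boolean {
  return sizeInBytes <= MAX_LOG_BYTES;
}

// Injection path: never hand more than `limit` characters to the prompt builder.
function capForPrompt(text: string, limit: number = MAX_PROMPT_CHARS): string {
  if (text.length <= limit) return text;
  const notice = "\n[truncated]";
  return text.slice(0, limit - notice.length) + notice;
}
```

Capping both ends matters because either one alone leaves a path by which the prompt can keep growing across retries.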
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pre-switch auto-commits were including .gsd/ planning artifacts (roadmaps, STATE.md)
on both sides of a branch switch, causing reliable merge conflicts when squash-merging
slices back to main. Now pre-switch auto-commits exclude the entire .gsd/ directory,
while post-task auto-commits continue to include them normally.
Also restores VALID_BRANCH_NAME export removed in a prior merge conflict resolution.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Front-load API key collection into GSD's planning phase so auto-mode
runs uninterrupted. Planning prompts forecast secrets into a manifest,
auto-mode collects pending keys before dispatching the first slice.
- getManifestStatus() queries manifest state against env
- collectSecretsFromManifest() orchestrates summary, collection, manifest update
- showSecretsSummary() read-only TUI summary with status indicators
- collectOneSecret() enhanced with guidance display above masked input
- Secrets gate in startAuto() — non-fatal, inherited by guided flow
- 19 new tests (manifest-status, collect-from-manifest, auto-secrets-gate)
- All 10 requirements (R001-R010) validated
mergeSliceToMain now runs git reset --hard if git merge --squash fails,
restoring a clean working tree instead of leaving conflict markers.
The merge guard catch block in auto.ts now:
1. Detects leftover conflicted state (UU/AA/UD in porcelain status)
2. Resets the working tree if conflicts remain
3. Stops auto-mode with a clear error instead of continuing with
corrupted .gsd/ state files that cause an infinite dispatch loop
Also fixes conflict markers in loader.ts, logo.ts, and postinstall.js
that were baked into main from a prior bad merge resolution.
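Step 1 of the merge guard can be sketched as a scan over `git status --porcelain` output, where each line begins with a two-character status code. The function name is hypothetical, and only the three codes named above are checked:

```typescript
// Status codes this commit message names as leftover conflict state.
const CONFLICT_CODES = new Set<string>(["UU", "AA", "UD"]);

// Returns true if any porcelain status line reports one of the conflict codes.
function hasConflictedFiles(porcelain: string): boolean {
  return porcelain
    .split("\n")
    .some((line) => CONFLICT_CODES.has(line.slice(0, 2)));
}
```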
Remove task/project/product classification taxonomy from discuss prompt.
The LLM now sizes work based on judgment, not labels.
Key changes:
- discuss.md: Replace 3-tier classification with judgment-based sizing.
Remove hard minimum question rounds (2 for task, 4 for project).
Questioning depth now matches actual scope.
- plan-milestone.md: Add right-sizing doctrine. Single-slice milestones
now write the S01 plan + task plans inline, eliminating separate
research-slice and plan-slice sessions.
- plan-slice.md: Add right-sizing guidance. Make Proof Level,
Integration Closure, and Observability sections conditional —
omit entirely for simple slices instead of filling with 'none'.
Consolidate self-audit from 10 items to 7 (remove duplicates).
- auto.ts: Skip research-slice for S01 when milestone research exists.
Update peekNext label for plan-milestone.
- complete-slice.md: Add effort-matching guidance. Lighten observability
verification for simple slices.
- execute-task.md: Make observability steps conditional on task plan
content rather than always required.
- templates (plan.md, task-plan.md): Add comments making heavyweight
sections explicitly optional for simple work.
Pipeline reduction for simple 1-slice milestone:
Before: 9-10 sessions (research-M, plan-M, research-S, plan-S,
execute×N, complete-S, reassess, complete-M)
After: 5-6 sessions (research-M, plan-M [+S01 inline],
execute×N, complete-S, complete-M)
Adds support for specifying fallback models in GSD preferences. When a
primary model fails to switch (provider unavailable, rate limited, etc.),
GSD automatically tries the next model in the fallbacks list.
Changes:
- Add GSDPhaseModelConfig interface for per-phase model with fallbacks
- Add resolveModelWithFallbacksForUnit() function
- Update model switching in auto.ts to try fallbacks in order
- Update preferences-reference.md with fallback examples
Example usage:
```yaml
models:
  planning:
    model: claude-opus-4-6
    fallbacks:
      - openrouter/z-ai/glm-5
      - openrouter/minimax/minimax-m2.5
```
This enables cost-optimized configurations with resilience against
provider outages or credit exhaustion.
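The try-in-order behaviour can be sketched like this. The interface and function names are hypothetical (not the real GSDPhaseModelConfig or resolveModelWithFallbacksForUnit), and the switch is modelled as a synchronous callback for brevity, whereas a real switch is likely asynchronous:

```typescript
// Hypothetical shape mirroring the YAML example: a primary model plus ordered fallbacks.
interface PhaseModelSketch {
  model: string;
  fallbacks?: string[];
}

// Try the primary model first, then each fallback in order.
// trySwitch is injected and returns true when the switch succeeded.
function switchWithFallbacks(
  config: PhaseModelSketch,
  trySwitch: (modelId: string) => boolean,
): string | undefined {
  const candidates = [config.model, ...(config.fallbacks ?? [])];
  for (const modelId of candidates) {
    if (trySwitch(modelId)) return modelId;
  }
  return undefined; // every candidate failed
}
```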
Covers VALID_BRANCH_NAME regex validation, correct return of a configured
preference, fallback to auto-detection, and rejection of injection attempts.
Closes #108
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes to the dispatch loop:
1. Don't mark a unit complete when the next dispatch is the same unit
(retry scenario) — let the retry mechanism handle it instead of
persisting a false completion.
2. Verify expected artifact exists on disk before marking a unit
complete. Uses resolveExpectedArtifactPath + existsSync to gate
persistCompletedKey calls.
3. Cross-validate idempotency: when skipping a "completed" unit, verify
the artifact actually exists. If missing, remove the stale record
from completed-units.json and re-run the unit.
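Fix 3 can be illustrated with a small reconciliation function. This is a sketch, not the code in auto.ts: the names are hypothetical, the completed set stands in for completed-units.json, and the existence check is injected rather than calling existsSync directly.

```typescript
type Decision = "skip" | "rerun";

// Decide what to do with a unit before dispatch, trusting the completion
// record only when the expected artifact is actually present.
function reconcileCompletedUnit(
  unitKey: string,
  completed: Set<string>,
  artifactExists: (unitKey: string) => boolean,
): Decision {
  if (!completed.has(unitKey)) return "rerun";
  if (artifactExists(unitKey)) return "skip";
  completed.delete(unitKey); // stale record: artifact is missing on disk
  return "rerun";
}
```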
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cut 48% of system prompt token cost while preserving all load-bearing content:
- Remove Activity Logs section (agent never interacts with these)
- Remove Investigation escalation ladder (redundant with tool-routing)
- Remove Context economy section (obvious/redundant)
- Remove Web research vs browser execution (compressed into playbooks)
- Compress tool-routing to non-obvious entries only (scout, bg_shell, Context7)
- Compress Ask vs infer to core rule
- Compress Code structure to 5 key principles
- Compress Verification to inline task-type table
- Compress Agent-First Observability (character block carries the why)
- Compress Background processes playbook from 30 to 5 lines
- Compress Web behavior playbook from 25 to 6 lines
- Compress Libraries and Current facts into single section
- Remove BRAVE_API_KEY config (user-facing, not agent-facing)
Three additions to the GSD character block:
- Security/performance/elegance as craft instinct, not checkbox compliance
- Anti-laziness: finish what you start, no stubs, no 80% features, no skipped error handling
- Self-debugging awareness: you write code you will debug later with no memory of writing it
Replace the generic agent intro with a craftsman-engineer character
definition: curious about problems, warm but terse, co-owner during
planning, committed executor during auto-mode. Consolidate the
scattered Communication and Writing Style + Work Narration sections
into a single focused Communication section that preserves all
calibration signals (pushback triggers, narration examples, uncertainty
handling).
- Prevent duplicate Brave tool entries when toggling providers repeatedly
by filtering already-active tools before re-adding (BUG-1)
- Remove single quotes from test glob patterns in package.json so Windows
shell expands them correctly (BUG-2)
- Fix test mock fire() to call all handlers instead of short-circuiting
on first match, matching real framework behavior (BUG-3)
- Suppress "Native Anthropic web search active" notification on session
restore (source: "restore") to reduce UX noise (BUG-4)
- Add regression tests for all 4 bugs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The model_select event doesn't reliably fire on startup, so Brave tools
remained visible to Claude even without a key. Now before_provider_request
filters search-the-web and search_and_read from the payload directly,
ensuring Claude only sees the native web_search tool.
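As an illustration only, the payload filter might look like the following. The tool names come from this message; the `ToolDefSketch` shape and the function name are assumptions, not the extension's real types.

```typescript
// Minimal assumed shape of a tool entry in the request payload.
interface ToolDefSketch {
  name: string;
}

// Custom search tools to hide from the model.
const CUSTOM_SEARCH_TOOLS = new Set<string>(["search-the-web", "search_and_read"]);

// Return the tool list without the custom search tools; other tools pass through.
function filterCustomSearchTools<T extends ToolDefSketch>(tools: T[]): T[] {
  return tools.filter((tool) => !CUSTOM_SEARCH_TOOLS.has(tool.name));
}
```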
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Pi SDK's streaming parser drops server_tool_use and
web_search_tool_result content blocks. When the conversation is replayed,
assistant messages are incomplete, causing the Anthropic API to reject
requests with "thinking blocks cannot be modified."
Fix: stripThinkingFromHistory() removes thinking/redacted_thinking blocks
from all assistant messages before sending, since they're all from stored
history. The model generates fresh thinking for each new turn.
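A sketch of the stripping step, assuming assistant content is an array of typed blocks. The message and block shapes here are simplified assumptions, and the function is deliberately not named after the real stripThinkingFromHistory():

```typescript
// Simplified assumed shapes; real messages may carry more fields.
interface BlockSketch {
  type: string;
  [key: string]: unknown;
}
interface MessageSketch {
  role: "user" | "assistant";
  content: BlockSketch[];
}

// Drop thinking and redacted_thinking blocks from assistant messages only.
function stripThinkingSketch(messages: MessageSketch[]): MessageSketch[] {
  return messages.map((message) =>
    message.role === "assistant"
      ? {
          ...message,
          content: message.content.filter(
            (block) => block.type !== "thinking" && block.type !== "redacted_thinking",
          ),
        }
      : message,
  );
}
```

The sketch returns new message objects rather than mutating stored history, so the original transcript stays intact on disk.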
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Inject the web_search_20250305 server-side tool into Anthropic API
requests, eliminating the BRAVE_API_KEY requirement for Anthropic models.
When the provider is Anthropic and no Brave key is set, the custom search
tools are disabled so the LLM is not offered tools that cannot work.
fetch_page (Jina) is unaffected.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Theme proxy throws when accessed in RPC mode since initTheme() is
never called without a TUI. Wrap header rendering in try/catch so
the GSD extension loads cleanly in both TUI and RPC modes.
Closes #121
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a Work Narration section to system.md and per-phase hints to
research, plan, and execute prompts. Instructs the LLM to emit brief
status messages between tool calls covering decisions, discoveries,
phase transitions, and verification results — without narrating
routine reads or trivial commands.
Parallel mode was slicing each agent's output to 200 characters before
returning to the parent agent, destroying researcher/scout findings.
Single and chain modes already return full output; this change brings
parallel mode in line with them.
Closes #116
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Terminals like macOS Terminal.app and JetBrains IDEs don't support
the Kitty keyboard protocol, so Ctrl+Alt shortcuts silently fail.
Shortcut descriptions now detect unsupported terminals and surface
the equivalent slash command (e.g. /gsd status, /bg, /voice).
Closes #100, closes #104
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ROADMAP.md was the only fatal requirement for .planning → .gsd migration,
but the transformer already had a null-roadmap fallback that infers
milestones from the phases/ directory. Downgrade to warning so partial
v1 projects can migrate successfully.
Closes #93
Closes #90
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: worktree branch namespacing and fresh-start flow
- Namespace slice branches by worktree name (gsd/<wt>/<M>/<S>) to prevent
git checkout conflicts when multiple worktrees work on the same milestone
- getMainBranch() returns worktree/<name> inside a worktree so slice merges
target the worktree branch instead of main (which is checked out elsewhere)
- Add continue/fresh-start prompt when creating a worktree with existing milestones
- Restyle all worktree command output with consistent semantic color palette
- Add parseSliceBranch() and SLICE_BRANCH_RE for robust branch name parsing
- Fix duplicate getCurrentBranch import in auto.ts
- Add 40-assertion integration test covering full worktree lifecycle
* fix: branch slice from current branch, not main
ensureSliceBranch always branched from getMainBranch() (main/master),
but planning artifacts (CONTEXT, ROADMAP, etc.) may only exist on the
working branch (e.g. "developer"). The slice branch would lose all
planning artifacts, causing deriveState to see pre-planning and the
rebuildState post-hook to overwrite STATE.md with a blank state.
Now branches from the current branch when it is not itself a slice
branch. Falls back to main when on a slice branch to avoid chaining.
Adds regression tests for both cases.
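The base-branch rule can be sketched as below. The regex is an assumption about what slice branches look like (`gsd/<M>/<S>` or `gsd/<wt>/<M>/<S>`, with IDs such as `M001` and `S01`); it is not the project's actual SLICE_BRANCH_RE, and the function name is hypothetical.

```typescript
// Assumed slice branch shapes: gsd/<M>/<S> or gsd/<worktree>/<M>/<S>.
const SLICE_BRANCH_SKETCH = /^gsd\/(?:[^/]+\/)?M\d+\/S\d+$/;

// Branch from the current branch unless it is itself a slice branch,
// in which case fall back to main to avoid chaining slice branches.
function pickSliceBase(currentBranch: string, mainBranch: string): string {
  return SLICE_BRANCH_SKETCH.test(currentBranch) ? mainBranch : currentBranch;
}
```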
- Fix auto.ts indentation: properly indent inner if-else chain inside summarizing guard
- Clear unitDispatchCount on resume path (paused → active) to prevent stale counts
- Add parseSummary regression tests for #91: bare scalar "none" coerced to string[]
- Add parseSummary test for missing frontmatter fields yielding empty arrays
- Verify .slice().join() works on coerced arrays (the original crash pattern)
Test results: 273 passed, 0 failed (24 new assertions)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CLI routing (#81, #107):
- Import and route --mode rpc to runRpcMode() instead of silently falling through to runPrintMode
- Add TTY guard before interactive mode — exit with helpful message when stdin is not a TTY
- Add --version and --help flags
Auto-mode infinite loop (#96):
- Move summarizing/complete-slice dispatch before reassessment check (D1) — ensures mergeSliceToMain always runs
- Add per-unit dispatch counter to detect alternating loops like A→B→A→B (D3)
Windows shell escaping (#106, #98):
- Platform-aware escapeShellArg() in mcporter extension — double quotes on Windows, single quotes on Unix
parseSummary crash (#91):
- Add asStringArray() helper to safely coerce YAML bare scalars (e.g. "none") to string arrays
- Applied to all 7 frontmatter fields that expect string[]
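A possible shape for such a helper, shown as a sketch rather than the actual asStringArray() implementation. Treating `null`, `undefined`, and the empty string as an empty array is an assumption of this example:

```typescript
// Coerce a parsed YAML value into string[] so array methods are always safe.
function asStringArraySketch(value: unknown): string[] {
  if (Array.isArray(value)) return value.map((item) => String(item));
  if (value === undefined || value === null || value === "") return [];
  return [String(value)]; // bare scalar such as "none"
}

// The pattern that used to crash on a bare scalar now works:
const joinedSummary = asStringArraySketch("none").slice().join(", ");
```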
Google Search model (#99):
- Replace hardcoded gemini-3-flash-preview with env var GEMINI_SEARCH_MODEL (default: gemini-2.5-flash)
Worktree branch collision (#84):
- Check git worktree list before checkout to detect branches already in use by another worktree
Migration UX (#90, #93):
- Improve error messages to distinguish migration from new project setup, suggest /gsd:new-project
Keyboard shortcuts (#100, #104):
- Document terminal protocol requirement in shortcut descriptions — Ctrl+Alt combos need Kitty/modifyOtherKeys
Closes #81, #84, #91, #96, #99, #106, #107
Addresses #90, #93, #95, #98, #100, #104
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three defects in auto.ts + worktree.ts combined to produce an infinite
alternating loop and unreliable unit closeout in GSD auto-mode.
**D1 (auto.ts)** — `state.phase === "summarizing"` is now the first
branch in the dispatch if-else chain, evaluated before `needsRunUat`
and `needsReassess`. Previously, if an execute-task agent wrote
slice-level artifacts early, `needsReassess` fired instead and
`mergeSliceToMain` was permanently skipped.
**D2 (worktree.ts)** — New slice branches are now created from the
current HEAD instead of `main`. When a prior slice merge was skipped,
the new branch would inherit a stale ROADMAP from main, creating
divergent state that drove the A→B→A→B alternation.
**D3 (auto.ts)** — Replaced `lastUnit`/`retryCount` consecutive-repeat
detection with a `unitDispatchCount` map that tracks total dispatches
per unit key. The old guard reset to 0 on every ID change; the map
catches alternating-loop patterns and stops after MAX_UNIT_DISPATCHES=3.
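To see why a per-key total catches what a consecutive-repeat counter misses, consider this sketch. The factory function is hypothetical, and it assumes a unit may be dispatched up to MAX_UNIT_DISPATCHES times before the guard refuses; the real boundary condition may differ.

```typescript
const MAX_UNIT_DISPATCHES = 3;

// Returns a guard that counts total dispatches per unit key.
// The guard returns false once a key exceeds its dispatch budget.
function createDispatchGuard(maxDispatches: number = MAX_UNIT_DISPATCHES) {
  const unitDispatchCount = new Map<string, number>();
  return function allowDispatch(unitKey: string): boolean {
    const next = (unitDispatchCount.get(unitKey) ?? 0) + 1;
    unitDispatchCount.set(unitKey, next);
    return next <= maxDispatches;
  };
}
```

In an A→B→A→B sequence the key changes on every dispatch, so a consecutive counter never rises above 1, while the map reaches the limit for A on its fourth dispatch.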
**Atomic closeout (auto.ts)** — `persistCompletedKey` writes the unit
key to `.gsd/completed-units.json` before any in-memory update. A crash
mid-closeout is now recoverable: on next start `loadPersistedKeys`
re-populates `completedKeySet` and the idempotency guard skips already-
completed units.
**Persistent idempotency (auto.ts)** — `completedKeySet` is loaded from
disk on `startAuto` and checked before every dispatch, preventing re-
dispatch of units completed in a prior session even after a restart.
**Startup self-heal (auto.ts + unit-runtime.ts)** — `selfHealRuntimeRecords`
runs on start and resume; it scans all on-disk runtime records, checks
whether each unit's expected artifact exists, and clears any orphaned
records. Added `listUnitRuntimeRecords` to unit-runtime.ts to support
this scan.
**Recovery backoff (auto.ts)** — `recoverTimedOutUnit` now tracks
cross-invocation recovery attempts per unit in `unitRecoveryCount` and
applies exponential backoff (1s→2s→4s…30s cap) between attempts.
Attempt number is included in all recovery notify messages for
traceability.
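The delay schedule described above works out as follows. This sketch assumes attempts are numbered from 1 and that the delay doubles each time; the constant and function names are hypothetical:

```typescript
const BASE_DELAY_MS = 1000;  // first retry waits 1s
const MAX_DELAY_MS = 30000;  // delays are capped at 30s

// Exponential backoff: 1s, 2s, 4s, 8s, 16s, then capped at 30s.
function recoveryDelayMs(attempt: number): number {
  const uncapped = BASE_DELAY_MS * Math.pow(2, attempt - 1);
  return Math.min(uncapped, MAX_DELAY_MS);
}
```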
Closes #96
Closes #109
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Tavily Search API as an alternative backend for search-the-web and
search_and_read tools. Tavily is selected automatically when TAVILY_API_KEY
is set, and is preferred over Brave when both keys are present. Existing
Brave Search paths are completely unchanged.
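The automatic selection rule can be sketched as below. The function and type names are hypothetical, the environment is passed in as a plain record, and the explicit override via /search-provider is not modelled:

```typescript
type SearchProviderSketch = "tavily" | "brave";

// Prefer Tavily when its key is set; otherwise use Brave if its key is set.
function pickSearchProvider(
  env: Record<string, string | undefined>,
): SearchProviderSketch | undefined {
  if (env["TAVILY_API_KEY"]) return "tavily";
  if (env["BRAVE_API_KEY"]) return "brave";
  return undefined; // no search key configured
}
```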
Motivation: Brave Search API signup requires Stripe payment which may
not be available in all regions. Tavily offers a free tier and also
provides a Deep Research API for future expansion.
Changes:
- Auth: Tavily API key in wizard, auth.json storage, env hydration
- search-the-web: Tavily POST backend with response normalization
- search_and_read: Tavily advanced search with client-side token budgeting
- /search-provider: slash command for explicit provider switching
- 61 new tests covering all Tavily integration paths
- Zero changes to existing Brave code paths