Three additions to the GSD character block:
- Security/performance/elegance as craft instinct, not checkbox compliance
- Anti-laziness: finish what you start, no stubs, no 80% features, no skipped error handling
- Self-debugging awareness: you write code you will debug later with no memory of writing it
Replace the generic agent intro with a craftsman-engineer character
definition: curious about problems, warm but terse, co-owner during
planning, committed executor during auto-mode. Consolidate the
scattered Communication and Writing Style + Work Narration sections
into a single focused Communication section that preserves all
calibration signals (pushback triggers, narration examples, uncertainty
handling).
- Prevent duplicate Brave tool entries when toggling providers repeatedly
by filtering already-active tools before re-adding (BUG-1)
- Remove single quotes from test glob patterns in package.json so Windows
shell expands them correctly (BUG-2)
- Fix test mock fire() to call all handlers instead of short-circuiting
on first match, matching real framework behavior (BUG-3)
- Suppress "Native Anthropic web search active" notification on session
restore (source: "restore") to reduce UX noise (BUG-4)
- Add regression tests for all 4 bugs
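
The BUG-1 fix amounts to an idempotent merge of tool names. A minimal sketch, assuming active tools are tracked as a list of names; `addToolsOnce` is a hypothetical helper, not necessarily the function used in the extension:

```typescript
// Hypothetical helper: add tool names to the active list, skipping any
// name that is already present so repeated toggles cannot create duplicates.
function addToolsOnce(active: readonly string[], toAdd: readonly string[]): string[] {
  const seen = new Set(active);
  const result = [...active];
  for (const name of toAdd) {
    if (!seen.has(name)) {
      seen.add(name);
      result.push(name);
    }
  }
  return result;
}

// Toggling the provider twice leaves the list unchanged after the first add.
const once = addToolsOnce(["read"], ["search-the-web", "search_and_read"]);
const twice = addToolsOnce(once, ["search-the-web", "search_and_read"]);
```

Order of the existing entries is preserved, so the tool list stays stable across toggles.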
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The model_select event doesn't reliably fire on startup, so Brave tools
remained visible to Claude even without a key. Now before_provider_request
filters search-the-web and search_and_read from the payload directly,
ensuring Claude only sees the native web_search tool.
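
The payload filtering can be pictured roughly as follows. This is a sketch only: the payload shape and the `filterBraveTools` name are assumptions, while the two tool names come from the message above.

```typescript
// Assumed shapes; the real request payload may carry more fields.
type ToolDef = { name: string; [key: string]: unknown };
type RequestPayload = { tools?: ToolDef[]; [key: string]: unknown };

const BRAVE_TOOL_NAMES = new Set(["search-the-web", "search_and_read"]);

// Hypothetical hook body: return a copy of the payload without Brave-backed tools.
function filterBraveTools(payload: RequestPayload): RequestPayload {
  if (!payload.tools) return payload;
  return {
    ...payload,
    tools: payload.tools.filter((tool) => !BRAVE_TOOL_NAMES.has(tool.name)),
  };
}
```

Filtering at request time does not depend on any startup event having fired, which is the point of the change.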
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Pi SDK's streaming parser drops server_tool_use and
web_search_tool_result content blocks. When the conversation is replayed,
assistant messages are incomplete, causing the Anthropic API to reject
requests with "thinking blocks cannot be modified."
Fix: stripThinkingFromHistory() removes thinking/redacted_thinking blocks
from all assistant messages before sending, since they're all from stored
history. The model generates fresh thinking for each new turn.
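
A minimal sketch of what such a stripping pass might look like. The function name is taken from the message above; the message and block shapes are simplified assumptions rather than the SDK's real types.

```typescript
// Simplified, assumed shapes for illustration.
type ContentBlock = { type: string; [key: string]: unknown };
type Message = { role: "user" | "assistant"; content: string | ContentBlock[] };

// Return a copy of the history with thinking blocks removed from assistant turns.
function stripThinkingFromHistory(messages: Message[]): Message[] {
  return messages.map((msg) => {
    if (msg.role !== "assistant" || typeof msg.content === "string") {
      return msg; // user turns and plain-string content are left untouched
    }
    const content = msg.content.filter(
      (block) => block.type !== "thinking" && block.type !== "redacted_thinking",
    );
    return { ...msg, content };
  });
}
```

The sketch does not mutate its input, so the stored history keeps its original blocks.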
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
12 tests covering: tool injection for claude models, non-claude passthrough,
double-injection prevention, tool deactivation/reactivation on model switch,
and session_start diagnostics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Inject the web_search_20250305 server-side tool into Anthropic API
requests, eliminating the BRAVE_API_KEY requirement for Anthropic models.
When the provider is Anthropic and no Brave key is set, custom search
tools are disabled to avoid confusing the LLM with broken tools.
fetch_page (Jina) is unaffected.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Theme proxy throws when accessed in RPC mode since initTheme() is
never called without a TUI. Wrap header rendering in try/catch so
the GSD extension loads cleanly in both TUI and RPC modes.
Closes #121
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a Work Narration section to system.md and per-phase hints to
research, plan, and execute prompts. Instructs the LLM to emit brief
status messages between tool calls covering decisions, discoveries,
phase transitions, and verification results — without narrating
routine reads or trivial commands.
Replace the plain-text API-key-only wizard with a branded, clack-based
onboarding experience that guides first-launch users through LLM provider
authentication (OAuth or API key), optional tool API keys, and a summary.
- Create src/logo.ts as single source of truth for ASCII logo
- Create src/onboarding.ts with shouldRunOnboarding() and runOnboarding()
- Trim src/wizard.ts to env hydration only (loadStoredEnvKeys)
- Wire onboarding into src/cli.ts, add `gsd config` subcommand
- Remove duplicate first-launch banner from src/loader.ts
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Parallel mode was slicing each agent's output to 200 characters before
returning to the parent agent, destroying researcher/scout findings.
Single and chain modes already return full output; parallel mode now matches them.
Closes #116
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Terminals like macOS Terminal.app and JetBrains IDEs don't support
the Kitty keyboard protocol, so Ctrl+Alt shortcuts silently fail.
Shortcut descriptions now detect unsupported terminals and surface
the equivalent slash command (e.g. /gsd status, /bg, /voice).
Closes #100, closes #104
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ROADMAP.md was the only fatal requirement for .planning → .gsd migration,
but the transformer already had a null-roadmap fallback that infers
milestones from the phases/ directory. Downgrade to warning so partial
v1 projects can migrate successfully.
Closes #93
Closes #90
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: worktree branch namespacing and fresh-start flow
- Namespace slice branches by worktree name (gsd/<wt>/<M>/<S>) to prevent
git checkout conflicts when multiple worktrees work on the same milestone
- getMainBranch() returns worktree/<name> inside a worktree so slice merges
target the worktree branch instead of main (which is checked out elsewhere)
- Add continue/fresh-start prompt when creating a worktree with existing milestones
- Restyle all worktree command output with consistent semantic color palette
- Add parseSliceBranch() and SLICE_BRANCH_RE for robust branch name parsing
- Fix duplicate getCurrentBranch import in auto.ts
- Add 40-assertion integration test covering full worktree lifecycle
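
For illustration, branch parsing along these lines could look like the sketch below. The names `parseSliceBranch` and `SLICE_BRANCH_RE` come from the list above, but the regex itself and the ID shapes (`M001`, `S01`) are assumptions.

```typescript
// Assumed ID shapes: milestones like "M001", slices like "S01".
// Matches both gsd/<M>/<S> and the worktree-namespaced gsd/<wt>/<M>/<S>.
const SLICE_BRANCH_RE = /^gsd\/(?:([^/]+)\/)?(M\d+)\/(S\d+)$/;

interface SliceBranch {
  worktree: string | null; // null when the branch is not namespaced
  milestone: string;
  slice: string;
}

function parseSliceBranch(branch: string): SliceBranch | null {
  const match = SLICE_BRANCH_RE.exec(branch);
  if (!match) return null;
  return { worktree: match[1] ?? null, milestone: match[2], slice: match[3] };
}
```

Returning `null` for non-slice branches lets callers tell slice branches apart from `main` or `worktree/<name>` without string slicing.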
* fix: branch slice from current branch, not main
ensureSliceBranch always branched from getMainBranch() (main/master),
but planning artifacts (CONTEXT, ROADMAP, etc.) may only exist on the
working branch (e.g. "developer"). The slice branch would lose all
planning artifacts, causing deriveState to see pre-planning and the
rebuildState post-hook to overwrite STATE.md with a blank state.
Now branches from the current branch when it is not itself a slice
branch. Falls back to main when on a slice branch to avoid chaining.
Adds regression tests for both cases.
- Fix auto.ts indentation: properly indent inner if-else chain inside summarizing guard
- Clear unitDispatchCount on resume path (paused → active) to prevent stale counts
- Add parseSummary regression tests for #91: bare scalar "none" coerced to string[]
- Add parseSummary test for missing frontmatter fields yielding empty arrays
- Verify .slice().join() works on coerced arrays (the original crash pattern)
Test results: 273 passed, 0 failed (24 new assertions)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CLI routing (#81, #107):
- Import and route --mode rpc to runRpcMode() instead of silently falling through to runPrintMode
- Add TTY guard before interactive mode — exit with helpful message when stdin is not a TTY
- Add --version and --help flags
Auto-mode infinite loop (#96):
- Move summarizing/complete-slice dispatch before reassessment check (D1) — ensures mergeSliceToMain always runs
- Add per-unit dispatch counter to detect alternating loops like A→B→A→B (D3)
Windows shell escaping (#106, #98):
- Platform-aware escapeShellArg() in mcporter extension — double quotes on Windows, single quotes on Unix
CRASH: parseSummary (#91):
- Add asStringArray() helper to safely coerce YAML bare scalars (e.g. "none") to string arrays
- Applied to all 7 frontmatter fields that expect string[]
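
A rough sketch of such a coercion helper. The name `asStringArray` is from the list above; the exact behaviour (wrapping a bare scalar in a one-element array, mapping missing values to an empty array) is an assumption.

```typescript
// Coerce whatever the YAML parser produced into a string[].
function asStringArray(value: unknown): string[] {
  if (value === null || value === undefined) return []; // missing field
  if (Array.isArray(value)) return value.map((item) => String(item));
  return [String(value)]; // bare scalar such as "none"
}

// The pattern that used to crash on a bare scalar now works on any input.
const joined = asStringArray("none").slice().join(", ");
```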
Google Search model (#99):
- Replace hardcoded gemini-3-flash-preview with env var GEMINI_SEARCH_MODEL (default: gemini-2.5-flash)
Worktree branch collision (#84):
- Check git worktree list before checkout to detect branches already in use by another worktree
Migration UX (#90, #93):
- Improve error messages to distinguish migration from new project setup, suggest /gsd:new-project
Keyboard shortcuts (#100, #104):
- Document terminal protocol requirement in shortcut descriptions — Ctrl+Alt combos need Kitty/modifyOtherKeys
Closes #81, #84, #91, #96, #99, #106, #107
Addresses #90, #93, #95, #98, #100, #104
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three defects in auto.ts + worktree.ts combined to produce an infinite
alternating loop and unreliable unit closeout in GSD auto-mode.
**D1 (auto.ts)** — `state.phase === "summarizing"` is now the first
branch in the dispatch if-else chain, evaluated before `needsRunUat`
and `needsReassess`. Previously, if an execute-task agent wrote
slice-level artifacts early, `needsReassess` fired instead and
`mergeSliceToMain` was permanently skipped.
**D2 (worktree.ts)** — New slice branches are now created from the
current HEAD instead of `main`. When a prior slice merge was skipped,
the new branch would inherit a stale ROADMAP from main, creating
divergent state that drove the A→B→A→B alternation.
**D3 (auto.ts)** — Replaced `lastUnit`/`retryCount` consecutive-repeat
detection with a `unitDispatchCount` map that tracks total dispatches
per unit key. The old guard reset to 0 on every ID change; the map
catches alternating-loop patterns and stops after MAX_UNIT_DISPATCHES=3.
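
To see why a per-key total catches alternation where a consecutive-repeat counter does not, consider this sketch. `DispatchGuard` is a hypothetical wrapper, and it assumes the fourth dispatch of a key is the one that gets refused.

```typescript
const MAX_UNIT_DISPATCHES = 3;

// Hypothetical wrapper around the unitDispatchCount map.
class DispatchGuard {
  private unitDispatchCount = new Map<string, number>();

  // Record one dispatch of unitKey; false means the unit has looped too often.
  tryDispatch(unitKey: string): boolean {
    const count = (this.unitDispatchCount.get(unitKey) ?? 0) + 1;
    this.unitDispatchCount.set(unitKey, count);
    return count <= MAX_UNIT_DISPATCHES;
  }

  // Clear all counts, e.g. when resuming from a pause.
  reset(): void {
    this.unitDispatchCount.clear();
  }
}

const guard = new DispatchGuard();
// A→B→A→B→A→B is allowed; the next A is refused even though B ran in between.
const results = ["A", "B", "A", "B", "A", "B", "A"].map((key) => guard.tryDispatch(key));
```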
**Atomic closeout (auto.ts)** — `persistCompletedKey` writes the unit
key to `.gsd/completed-units.json` before any in-memory update. A crash
mid-closeout is now recoverable: on next start `loadPersistedKeys`
re-populates `completedKeySet` and the idempotency guard skips already-
completed units.
**Persistent idempotency (auto.ts)** — `completedKeySet` is loaded from
disk on `startAuto` and checked before every dispatch, preventing re-
dispatch of units completed in a prior session even after a restart.
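
The ordering that makes closeout recoverable (write to disk first, update memory second) might be sketched like this. The `KeyStore` interface is a hypothetical stand-in for reads and writes of `.gsd/completed-units.json`, and the class is illustrative only.

```typescript
// Hypothetical storage abstraction standing in for the JSON file on disk.
interface KeyStore {
  read(): string[];
  write(keys: string[]): void;
}

class CompletedUnits {
  private completedKeySet = new Set<string>();

  constructor(private store: KeyStore) {
    // Equivalent of loadPersistedKeys: repopulate the set on start.
    for (const key of store.read()) this.completedKeySet.add(key);
  }

  // Equivalent of persistCompletedKey: persist first, then update memory.
  markCompleted(key: string): void {
    const persisted = new Set(this.store.read());
    persisted.add(key);
    this.store.write([...persisted]);
    this.completedKeySet.add(key);
  }

  // Idempotency guard checked before every dispatch.
  isCompleted(key: string): boolean {
    return this.completedKeySet.has(key);
  }
}
```

If the process dies between the write and the in-memory update, the next start still sees the key because it was persisted before anything else changed.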
**Startup self-heal (auto.ts + unit-runtime.ts)** — `selfHealRuntimeRecords`
runs on start and resume; it scans all on-disk runtime records, checks
whether each unit's expected artifact exists, and clears any orphaned
records. Added `listUnitRuntimeRecords` to unit-runtime.ts to support
this scan.
**Recovery backoff (auto.ts)** — `recoverTimedOutUnit` now tracks
cross-invocation recovery attempts per unit in `unitRecoveryCount` and
applies exponential backoff (1s→2s→4s…30s cap) between attempts.
Attempt number is included in all recovery notify messages for
traceability.
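
The delay schedule described above (1s, 2s, 4s, ... capped at 30s) can be computed as in this sketch. `recoveryBackoffMs` is a hypothetical name, and attempts are assumed to be numbered from 1.

```typescript
// Delay before the given recovery attempt (1-based), doubling each time
// and never exceeding capMs.
function recoveryBackoffMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  const exponent = Math.max(0, attempt - 1);
  return Math.min(capMs, baseMs * Math.pow(2, exponent));
}

const schedule = [1, 2, 3, 4, 5, 6, 7].map((attempt) => recoveryBackoffMs(attempt));
// 1000, 2000, 4000, 8000, 16000, 30000, 30000
```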
Closes #96
Closes #109
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Tavily Search API as an alternative backend for search-the-web and
search_and_read tools. Tavily is selected automatically when TAVILY_API_KEY
is set (preferred over Brave when both keys present). Existing Brave
Search paths are completely unchanged.
Motivation: Brave Search API signup requires Stripe payment which may
not be available in all regions. Tavily offers a free tier and also
provides a Deep Research API for future expansion.
Changes:
- Auth: Tavily API key in wizard, auth.json storage, env hydration
- search-the-web: Tavily POST backend with response normalization
- search_and_read: Tavily advanced search with client-side token budgeting
- /search-provider: slash command for explicit provider switching
- 61 new tests covering all Tavily integration paths
- Zero changes to existing Brave code paths
On Windows, cmd.exe does not strip single quotes like Unix shells.
This caused MCP tools (mcp_servers, mcp_discover, mcp_call) to fail
with 'Unknown command list' errors because mcporter received
literal 'list' instead of just list.
The fix uses execFile with shell=true on Windows, which properly
passes arguments without the shell interpreting quotes.
Closes #98
Co-authored-by: OpenClaw AI <ai@openclaw.dev>
Line 5's 'When to read this' guidance updated to reflect the actual
mechanism — the file is injected programmatically by /gsd, not read
directly by the agent.
Line 659's context-pressure resume instruction updated from:
"read @GSD-WORKFLOW.md - what's next?"
to:
"run /gsd to pick up where you left off, or /gsd auto to resume in
auto-execution mode."
The read @GSD-WORKFLOW.md instruction was broken — the file is not
accessible via the read tool; it only enters context through
dispatchWorkflow(). Users who followed the old instruction got nothing.
Relates to #38 (same file, different problem).
gemini-3-flash-preview is not available on Vertex AI and has lower
rate limits on the Gemini Developer API. gemini-2.5-flash is the
stable model available on both Vertex AI and Gemini API.
Handle both plain number and { total: number } shapes for msg.usage.cost
in snapshotUnitMetrics, and coerce formatCost input to prevent crashes
when cost is null/undefined/NaN from corrupted ledger data.
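
A sketch of the kind of coercion involved. `toCostNumber` is a hypothetical helper, and the dollar format in `formatCost` is an assumption about presentation, not the actual output format.

```typescript
// Accept a plain number, a { total: number } object, or corrupted input,
// and always return a finite number.
function toCostNumber(cost: unknown): number {
  const raw =
    typeof cost === "object" && cost !== null
      ? (cost as { total?: unknown }).total
      : cost;
  const value = typeof raw === "number" ? raw : Number(raw);
  return isFinite(value) ? value : 0;
}

// Assumed display format: dollars with four decimals.
function formatCost(cost: unknown): string {
  return `$${toCostNumber(cost).toFixed(4)}`;
}
```

Falling back to 0 keeps the ledger total computable even when individual entries are damaged.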
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
During auto-mode, the built-in footer is hidden entirely via setFooter()
and all its info is moved into the progress widget:
- pwd + git branch shown inside the widget
- Token stats (↑/↓/R/W) from current unit session
- Cumulative cost from metrics ledger (survives across unit resets)
- Context window usage with color coding (warning >70%, error >90%)
- Model name right-aligned
- Footer restored to built-in on pause or stop
- No model duplication (removed from hints)
The built-in write, read, and edit tools capture process.cwd() once at
startup. When /worktree switch calls process.chdir() into a worktree,
these tools still resolve relative paths against the original launch
directory. This caused GSD auto-mode to write .gsd/ artifacts to the
main project instead of the worktree.
The bash tool was already patched with a spawnHook for dynamic CWD.
Apply the same pattern to write, read, and edit: each execute() call
creates a fresh tool instance with the current process.cwd(), so
relative paths always resolve against the active working directory.
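
The per-call instantiation pattern can be sketched generically. The `Tool` interface, `withDynamicCwd`, and the injected `getCwd` are all hypothetical; in the real code the working directory would come from `process.cwd()`.

```typescript
// Minimal assumed tool shape.
interface Tool<Args, Result> {
  execute(args: Args): Result;
}

// Wrap a tool factory so every execute() builds a fresh instance bound to
// the working directory at call time, not at startup.
function withDynamicCwd<Args, Result>(
  createTool: (cwd: string) => Tool<Args, Result>,
  getCwd: () => string,
): Tool<Args, Result> {
  return {
    execute: (args) => createTool(getCwd()).execute(args),
  };
}

// Usage with a fake cwd that changes, as /worktree switch would cause.
let fakeCwd = "/repo";
const resolveTool = withDynamicCwd(
  (cwd) => ({ execute: (relativePath: string) => `${cwd}/${relativePath}` }),
  () => fakeCwd,
);
const before = resolveTool.execute(".gsd/STATE.md");
fakeCwd = "/repo-worktrees/feature-x";
const after = resolveTool.execute(".gsd/STATE.md");
```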
Replace the narrow 'if currentUnit === complete-slice' merge check with a
general merge guard that detects any completed slice branch and merges it
to main before dispatching the next unit.
The old check only triggered merges after the complete-slice unit type.
When the LLM or the doctor post-hook completed slice bookkeeping during
task execution, complete-slice was skipped entirely, leaving the slice
branch unmerged. On milestone transition, the next slice branch (forked
from main) couldn't see the prior milestone's summary, causing deriveState
to oscillate between milestones in an infinite loop.
The new guard checks: are we on a gsd/MID/SID branch where the roadmap
entry is [x]? If so, merge to main and re-derive state before dispatching.
Patch SDK setModel() to accept { persist: false } so per-unit model
switching in auto-mode no longer overwrites the user's global default.
Add state rebuild + doctor on resume, guard logging for silent dispatch
failures, and active-state check before prompt injection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Validate channel IDs via isValidChannelId() before URL interpolation
in setup wizard, preventing SSRF during test-send
- Add 15s fetch timeout to setup API calls (fetchJson, Discord test-send)
- Sanitize Discord error responses before surfacing to user
- Remove dead send.ts + channels.ts (unused parallel implementation)
- Add poll retry tolerance in manager.ts (1 transient error before fail)
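
As an illustration of validating before interpolating, here is a sketch for the Discord case. The name `isValidChannelId` comes from the list above, but the digits-only pattern is an assumption based on Discord snowflake IDs, and `buildTestSendUrl` is hypothetical.

```typescript
// Assumption: a Discord channel ID is a snowflake, i.e. 17 to 20 decimal digits.
function isValidChannelId(id: string): boolean {
  return /^\d{17,20}$/.test(id);
}

// Hypothetical helper: refuse to build a URL from an unvalidated ID so that
// input like "../../other/path" can never reach the request path.
function buildTestSendUrl(channelId: string): string {
  if (!isValidChannelId(channelId)) {
    throw new Error("Invalid channel ID");
  }
  return `https://discord.com/api/v10/channels/${channelId}/messages`;
}
```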
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ensureSliceBranch() now auto-commits dirty files before git checkout,
preventing "would be overwritten" errors when doctor/STATE.md rebuild
leaves uncommitted changes between slice dispatches. (closes #63)
On startup, migrate any .jsonl session files from the flat
~/.gsd/sessions/ directory into the per-cwd subdirectory so /resume
can find sessions created before per-directory scoping was added.
(closes #64)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Apple's on-device speech recognition resets bestTranscription after
silence gaps, discarding previous text. The Swift recognizer now
detects these resets (word count drop / different starting word) and
accumulates finalized segments so speech continues appending instead
of overwriting. TS side simplified to pass through the already-
accumulated text from the Swift process.
Provides a `google_search` tool as an alternative to Brave-based web
search for users with Google Cloud / Gemini API credits. Sends queries
to Gemini 3 Flash with `googleSearch: {}` grounding enabled, returning
an AI-synthesized answer with source URLs from grounding metadata.
Features:
- In-session caching (keyed by normalized query)
- Defensive truncation via truncateHead
- Classified error handling (auth, rate-limit, general)
- Custom TUI rendering for call/result display
- Session start warning if GEMINI_API_KEY is missing
- macOS-only (SFSpeechRecognizer), no-op on other platforms
- /voice command and Ctrl+Alt+V shortcut to toggle
- Streams partial transcription results directly into editor input
- Custom footer with flashing red dot + 'transcribing' indicator on row 1
- Enter to stop and keep text, Esc to cancel
- Ships precompiled Swift binary (60KB)
Run doctor (fix mode) and rebuild STATE.md after each unit completes
in handleAgentEnd. Catches missed checkboxes and stub summaries the
LLM may have skipped, and keeps STATE.md in sync with disk state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- /gsd next: same state machine as /gsd auto but pauses between units
with a wizard showing what completed and what's next
- /gsd (bare): now defaults to step mode instead of the old guided flow
- /gsd auto: unchanged — continuous execution without pausing
- Deleted /gsd-run slash command (redundant with /gsd auto)
- Step mode persists through the discuss → auto-start transition
- User can switch from step → auto mid-session via wizard option
- Progress widget shows NEXT/AUTO based on current mode
The idle watchdog checked lastProgressAt to detect stalled agents, but
nothing updated that timestamp during normal execution. Any task taking
more than 10 minutes triggered false idle recovery and steering messages,
and was eventually skipped, even while the agent was actively writing code.
Add detectWorkingTreeActivity() check before recovery: if git reports
uncommitted changes, the agent is working. Bump lastProgressAt and
skip recovery. Genuinely idle agents (clean working tree) still get
recovered as before.
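
The decision logic reads roughly like the sketch below. It assumes the activity check is fed the text output of `git status --porcelain`; the signature of `detectWorkingTreeActivity`, the `WatchdogState` shape, and `shouldRecover` are illustrative assumptions.

```typescript
interface WatchdogState {
  lastProgressAt: number; // epoch milliseconds
}

const IDLE_TIMEOUT_MS = 10 * 60 * 1000;

// Any line in porcelain output means there are uncommitted changes.
function detectWorkingTreeActivity(porcelainOutput: string): boolean {
  return porcelainOutput.trim().length > 0;
}

// True only when the agent looks idle AND the working tree is clean.
function shouldRecover(state: WatchdogState, now: number, porcelainOutput: string): boolean {
  if (now - state.lastProgressAt < IDLE_TIMEOUT_MS) return false;
  if (detectWorkingTreeActivity(porcelainOutput)) {
    state.lastProgressAt = now; // the agent is working: bump and skip recovery
    return false;
  }
  return true;
}
```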
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Merge improvements:
- Auto-detect current worktree: /worktree merge (bare) and /worktree merge main
work from inside a worktree without specifying the worktree name
- Full repo diffs: preview and LLM prompt show all changed files, not just .gsd/
- Accurate preview: direct diff (main vs branch) shows actual merge impact
- Per-file line stats: +N/-N shown for each file in merge preview
- CWD fix: chdir to main tree before dispatching merge to prevent broken CWD
after worktree cleanup
- Prompt includes explicit paths so the LLM knows where to read/write
Create/switch:
- /worktree create <name> works as alias for create-or-switch behavior
- Guard against creating a worktree when the branch is already in use
Remove:
- /worktree remove <name> validates the name exists before attempting removal
- /worktree remove <name> confirms before deleting
- /worktree remove all removes every worktree after confirmation prompt
Reload resilience:
- Detects if CWD is inside a worktree on extension init and restores
originalCwd tracking, surviving /reload without losing worktree state
Command descriptions:
- /worktree shows '(also /wt)' in description
- /wt shows 'Alias for /worktree'
Replace the inline union cast in renderResult with a proper
discriminated union (LocalResultDetails | RemoteResultDetails)
keyed on the `remote` field. Improves type safety and makes
the rendering logic self-documenting.
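
For readers unfamiliar with the pattern, a discriminated union keyed on `remote` looks roughly like this. The type names come from the message above; every field other than `remote`, and the `describeResult` function, are hypothetical.

```typescript
// Fields other than `remote` are invented for illustration.
interface LocalResultDetails {
  remote: false;
  exitCode: number;
}

interface RemoteResultDetails {
  remote: true;
  host: string;
  status: string;
}

type ResultDetails = LocalResultDetails | RemoteResultDetails;

// Checking the discriminant narrows the type, so no casts are needed.
function describeResult(details: ResultDetails): string {
  if (details.remote) {
    return `remote ${details.host}: ${details.status}`;
  }
  return `local exit ${details.exitCode}`;
}
```

Accessing `details.host` in the local branch would be a compile-time error, which is the type-safety gain over an inline cast.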
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>