When a user presses Escape during streaming, the abort flow clears all
queued messages indiscriminately. User messages typed during streaming
are silently discarded. This adds a QueueEntry wrapper in the Agent class
to track message origin ("user" vs "system"), so that clearQueue() can
preserve user-typed messages while discarding system-generated ones.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* rfc: GitOps branching & versioning strategy proposal
Proposes a Git-Flow Lite model with automated integration branches:
main ← production-ready, tagged releases only
next ← integration branch for next minor (PRs target here)
release/X.Y ← stabilization branch, only bugfixes allowed
hotfix/X.Y.Z ← emergency fixes cherry-picked to release
Includes:
- RFC document with lifecycle diagrams, migration path, open questions
- Workflow scaffolds (in docs/proposals/workflows/, NOT .github/):
- create-release.yml: manual dispatch to cut release branch from next
- sync-next.yml: auto-sync next branch after version tags
- backmerge.yml: auto back-merge release fixes to next
This is an experimental proposal requesting community feedback before
any implementation. The workflow files are inert scaffolds — they do
not run in CI.
* fix: prevent ensureGitignore from adding .gsd when tracked in git (#1364)
CRITICAL DATA-LOSS FIX: ensureGitignore() unconditionally added '.gsd' to
.gitignore even when .gsd/ was a real git-tracked directory, causing git to
report ~889 tracked files as deleted.
Root cause: BASELINE_PATTERNS included '.gsd' unconditionally, and the
gitignore modification ran BEFORE migration checks in auto-start.ts.
Changes:
- Add hasGitTrackedGsdFiles() helper using nativeLsFiles to detect tracked
.gsd/ content
- ensureGitignore() now skips the '.gsd' pattern when .gsd/ has tracked files
- untrackRuntimeFiles() now skips entirely when .gsd/ has tracked files
- migrateToExternalState() aborts when .gsd/ has tracked files
- Reorder auto-start.ts: migration runs BEFORE gitignore modification
- Add 8 regression tests covering all scenarios
Fixes#1364
* fix: break recursive dialog loop when all milestones complete (#1348)
Two interacting bugs:
1. Recursive dialog loop: When all milestones are complete, bootstrapAutoSession
calls showSmartEntry → sets pendingAutoStart → checkAutoStartAfterDiscuss
calls startAuto → bootstrapAutoSession → showSmartEntry → infinite loop.
The discuss workflow completes without producing a milestone directory, so
phase stays 'complete' and the cycle never breaks.
Fix: Add a re-entry counter (_consecutiveCompleteBootstraps) that tracks
how many times bootstrapAutoSession enters the 'complete' branch without
advancing. After 2 consecutive attempts, break the loop with a warning
message and return false.
2. Missing _releaseFunction = null in retry lock onCompromised handler:
The retry lock path in session-lock.ts set _lockCompromised but didn't
null out _releaseFunction, which could leave a stale reference that
masks the compromise detection in validateSessionLock().
Fixes#1348
* fix: self-heal stale roadmap checkbox for interrupted complete-slice (#1350)
When complete-slice is interrupted after writing SUMMARY.md and UAT.md but
before flipping the roadmap checkbox, auto-mode enters an infinite loop —
re-launching the same complete-slice unit because the dispatch loop uses
the roadmap checkbox as the sole 'slice done' signal.
Fix: Add a self-heal case in selfHealRuntimeRecords that detects when
SUMMARY + UAT exist but the roadmap checkbox is unchecked, and auto-fixes
the checkbox. This allows the verification to pass and the dispatch loop
to advance.
Fixes#1350
* fix: add EISDIR guard to complete/validate milestone prompts (#1343)
The LLM was passing tasks/ directory paths to the read tool during
milestone completion, causing EISDIR crashes. Added file system safety
instructions to both complete-milestone and validate-milestone prompts
telling the LLM to use ls/find for directory listing, not the read tool.
Fixes#1343
* feat: improve extension conflict messages with removal guidance (#1347)
When a user extension registers tools/commands that now ship as built-ins,
the conflict message now includes '(built-in tool supersedes — consider
removing <path>)' and the log level is downgraded from 'Extension load error'
to 'Extension conflict'.
Changes:
- resource-loader.ts: detect built-in vs user extension conflicts, add hint
- cli.ts: downgrade severity for superseded-tool conflicts
Fixes#1347
* test: fix always-skipped preferences test, add test:marketplace script
- preferences.test.ts: Replace always-skipped getIsolationMode test with
a filesystem-independent version that validates the default through
validatePreferences() instead of reading ~/.gsd/preferences.md.
Reduces skipped count from 3 → 2.
- package.json: Add test:marketplace script for running marketplace
contract tests (claude-import-tui, plugin-importer-live,
marketplace-discovery) with GSD_TEST_CLONE_MARKETPLACES=1.
These tests need external repos and self-skip in unit test runs.
Remaining 2 skips:
- Marketplace contract test suites (need external repos, run via test:marketplace)
- Windows-only tests in validate-directory.test.ts are platform-conditional
and correctly skip on macOS
* fix: use execFileSync in regression tests for Windows portability
The regression tests used execSync with shell-dependent constructs:
- '&&' command chaining (works in bash/cmd but fragile)
- Single-quoted commit messages (bash-only, cmd.exe splits on spaces)
Replaced with execFileSync via a git() helper that bypasses the shell
entirely. Each git operation is a separate call with proper argument
arrays, eliminating all shell interpretation issues.
Fixes windows-portability CI failure.
* fix: guard milestone completion against missing slice summaries (#1368)
Auto-mode could report a milestone as complete after executing only the
last slice, skipping earlier unexecuted slices. The milestone completion
signal fired based on roadmap checkbox state, which could be stale or
inconsistent after worktree transitions.
Changes:
- auto-dispatch.ts: Added slice SUMMARY file existence check to both
validating-milestone and completing-milestone dispatch rules. If any
slice lacks a SUMMARY file, dispatch stops with a diagnostic error
instead of proceeding to validation/completion.
- validate-milestone.test.ts: Updated tests to create slice summary
files (required by the new guard).
- file-watcher.test.ts: Fixed flaky 'auth.json change emits auth-changed
event' test by adding watcher initialization delay and increasing event
propagation timeout (race condition when run in full suite).
Fixes#1368
* fix: warn on common misspelled preference keys + verify field guidance (#1373, #1341)
#1373: Users setting 'taskIsolation.mode: none' instead of 'git.isolation: none'
got a generic 'unknown key' warning. Added KEY_MIGRATION_HINTS that map common
misspellings (taskIsolation, task_isolation, isolation, manage_gitignore, auto_push,
main_branch) to their correct git.* equivalents with actionable messages.
#1341: Planning agent writes aspirational prose in Verify fields ('Sections 3.1
and 3.2 exist with exact formulas. Zero TBD.') instead of executable commands.
Added explicit verify field rules to the plan template: must be mechanically
executable, with examples of good vs bad patterns for content tasks.
Fixes#1373, partially addresses #1341
* refactor: extract roadmap-mutations.ts + shared test-utils.ts
Consolidation:
- roadmap-mutations.ts: Extracted markSliceDoneInRoadmap() and markTaskDoneInPlan()
from duplicated implementations in doctor.ts, mechanical-completion.ts, and
auto-recovery.ts. All three callers used identical regex patterns.
mechanical-completion.ts and auto-recovery.ts now import the shared utility.
(doctor.ts deferred — touched by PR #1349)
- test-utils.ts: Shared cross-platform test utilities for GSD extension tests.
Provides git() helper (execFileSync, no shell), makeTempRepo() with
core.autocrlf=false, cleanup(), createFile(), safeReadFile(), and
writeMilestoneFixture(). 12 test files currently define their own versions
of these helpers — new tests should import from test-utils.ts instead.
Security audit: No injection vectors (sid/tid are alphanumeric from roadmap
parser), no path traversal, no secrets, no new dependencies.
* fix: port conflict false positive on non-Node projects + paused worktree resume (#1381, #1383)
projects without package.json. macOS AirPlay Receiver listens on port 5000,
causing a spurious warning on non-Node projects.
Fix: Skip port checks entirely when no package.json exists. When using
default ports, filter out 5000 on macOS.
in-memory only. Re-entering /gsd started a fresh bootstrap from the project
root instead of the active worktree.
Fix: pauseAuto() now writes paused-session.json to .gsd/runtime/ with
milestoneId, worktreePath, originalBasePath, and stepMode. startAuto()
checks for this file before bootstrap and restores the paused session
context, including worktree re-entry. stopAuto() cleans up the file.
Fixes#1381, #1383
* fix: catch spawn ENOENT in uncaught exception guard + snapshot session lock path (#1384, #1363)
uncaught exception and crashes auto-mode. The EPIPE guard now also catches
ENOENT from spawn syscalls — logs the error and continues instead of
terminating the process.
the lock path differently via gsdRoot() because basePath could be either the
project root or a worktree path. gsdRoot() produces different results for
each, so the lock was written to one path and validated against another.
Fix: Snapshot the resolved lock path (_snapshotLockPath) at acquisition time
and reuse it for all subsequent lock operations within the session.
Fixes#1384, #1363
* fix: suppress false-positive lock compromise + skip migration with active worktrees (#1362, #1337)
because the event loop stall delays the heartbeat mtime update. The handler
now checks elapsed time since acquisition — if within the 30-minute stale
window, it logs a warning and continues instead of setting _lockCompromised.
Real takeovers (past the stale window) still trigger the compromise flag.
even when .gsd/worktrees/ contained active git worktrees with locked
directory handles. This caused EBUSY errors and destructive data loss.
Migration now checks for active worktree directories and skips entirely
if any are found.
Fixes#1362, #1337
- Extract emitMessagePair() to consolidate 6 message_start/message_end push pairs in agent-loop.ts
- Extract emitErrorSequence() to deduplicate identical catch blocks in agentLoop and agentLoopContinue
- Export ZERO_USAGE constant and reuse it in agent.ts instead of inline object literals
- Merge identical message_start/message_update switch cases in Agent._runLoop
- Extract Agent._updatePendingToolCalls() to consolidate tool_execution_start/end Set mutation
Replace 30+ repetitive setter method bodies with four generic private
helpers: setGlobalSetting, setScopedSetting, setNestedGlobalSetting,
and setProjectSetting. Each setter method retains its original public
signature and behavior — only the internal implementation is consolidated.
Methods with custom logic (setEditorPaddingX, setAutocompleteMaxVisible,
setSearchExcludeDirs, setFallbackChain, removeFallbackChain,
setDefaultModelAndProvider) are left unchanged or minimally adapted.
Net reduction: 79 lines (164 deleted, 85 added).
- Delete theme-schema.json (335 lines): redundant with the TypeBox
schema already defined in theme.ts, only referenced via $schema URLs
in the JSON files for editor autocomplete.
- Delete dark.json (85 lines) and light.json (84 lines): move built-in
theme definitions into themes.ts as TypeScript objects, eliminating
runtime filesystem reads and the getThemesDir() dependency.
- Export ThemeJson type from theme.ts so themes.ts can reference it.
- Net reduction: ~319 lines removed.
Move overlay positioning (resolveOverlayLayout, resolveAnchorRow/Col),
line compositing (compositeLineAt, compositeOverlays, applyLineResets),
cursor extraction, and size parsing into overlay-layout.ts. These are
pure functions with no TUI state dependencies, reducing tui.ts from
1,200 to 899 lines.
Move slash command dispatch logic and 12 individual command handlers
(/export, /share, /copy, /name, /session, /changelog, /hotkeys,
/compact, /thinking, /edit-mode, /arminsayshi, plus showThinkingSelector)
into a new slash-command-handlers.ts module.
InteractiveMode now delegates to dispatchSlashCommand() via a
SlashCommandContext interface, keeping the integration surface minimal.
Handlers that are also invoked from keybindings/events remain on
InteractiveMode and are accessed through the context.
Reduces interactive-mode.ts from 4,783 to 4,272 lines (-511).
* refactor: replace recursive auto-dispatch with linear autoLoop, delete ~3k lines of dead code
Replace the complex recursive dispatch system (dispatchNextUnit, reentrancy
guards, stall detection, idempotency tracking, skip-depth machinery) with a
simple linear while(s.active) loop in auto-loop.ts.
Key changes:
- New auto-loop.ts with autoLoop(), runUnit(), resolveAgentEnd()
- Deleted auto-idempotency.ts, auto-stuck-detection.ts, session-lock.ts,
mechanical-completion.ts, progress-score.ts, auto-constants.ts, unit-id.ts
- Extracted WorktreeResolver class for worktree path resolution
- Added auto-worktree-sync.ts for worktree synchronization
- Simplified auto.ts from ~1400 lines to ~400 lines
- Fixed 9 TypeScript errors (NotifyCtx type widening, capture typing)
- Comprehensive test coverage: 32 auto-loop tests + worktree resolver/DB tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address 6 audit findings in auto-loop refactor
1. CRITICAL: Move pendingResolve to AutoSession + queue orphaned agent_end
events instead of silently dropping them. Prevents permanent stalls when
error-recovery sendMessage retries fire between loop iterations.
2. HIGH: Scope pendingResolve per-session via _activeSession ref, preventing
concurrent /gsd auto sessions from corrupting each other's promises.
3. HIGH: Replace console.log in dispatchHookUnit with debugLog to prevent
hook prompt content (potentially containing secrets) from leaking to stdout.
4. HIGH: Restore parked milestone handling in state.ts — Phase 1 skips
parked milestones so they don't satisfy depends_on, Phase 2 registers
them as 'parked' status. Add 'parked' to MilestoneRegistryEntry type.
5. MEDIUM: Restore queuePhaseActive parameter in shouldBlockContextWrite
and re-export setQueuePhaseActive for guided-flow-queue.ts consumers.
6. MEDIUM: Add MAX_LOOP_ITERATIONS (500) lifetime cap to autoLoop to prevent
runaway loops when units alternate between IDs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve build breakers, add correctness fixes, and graduated recovery
Build breakers (CRITICAL):
- Restore unit-id.ts (deleted but still imported by complexity-classifier.ts, metrics.ts)
- Restore progress-score.ts (deleted but still imported by commands.ts, dashboard-overlay.ts, doctor.ts)
- Rewrite worktree-sync-milestones.test.ts to use new syncProjectRootToWorktree API
Correctness fixes (MEDIUM):
- Cap pendingAgentEndQueue to 3 entries to prevent unbounded growth from stale events
- Add milestoneId path traversal validation in WorktreeResolver
- Clear depthVerificationDone on session_start to prevent cross-session leaks in RPC mode
- Add verification gate for non-hook sidecar units (triage, quick-tasks)
- Remove dead handleAgentEnd import from index.ts
Graduated recovery (Jeremy's feedback):
- Blanket try/catch around loop body — one bad iteration no longer kills the session
- Graduated stuck recovery: at count 3 try artifact verification + cache invalidation,
at count 5 hard stop (was: binary stop at 5 with no recovery attempt)
- Graduated error recovery: 1st error retries, 2nd invalidates caches, 3rd stops
Test results: 32/32 auto-loop, 28/28 worktree-resolver, 11/11 sidecar-queue, tsc clean.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: restore copyWorktreeDb/reconcileWorktreeDb exports and fix loadToolApiKeys import
Two missing exports caused ~90% of the 120 pre-existing test failures:
1. copyWorktreeDb + reconcileWorktreeDb — imported by auto-worktree.ts but
never added to gsd-db.ts. Restored with the original implementations.
2. loadToolApiKeys — moved to commands-config.ts but index.ts still imported
from commands.ts. Fixed the import path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: move loadToolApiKeys import to commands-config.js
loadToolApiKeys was moved to commands-config.ts but index.ts still
imported it from commands.ts, causing runtime failures in all tests
that transitively load the extension entry point.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: fix provider error assertion on windows
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract two self-contained subsystems from agent-session.ts (3,367 -> 2,737 lines):
- RetryHandler: auto-retry with exponential backoff, credential rotation,
and cross-provider fallback logic
- CompactionOrchestrator: manual/auto compaction, overflow recovery, and
extension integration for custom compaction providers
Also add shared getErrorMessage() utility to replace repeated
`err instanceof Error ? err.message : String(err)` patterns.
The extracted modules receive AgentSession state via dependency injection
interfaces, avoiding state duplication. AgentSession remains the coordinator
that delegates to these modules.
markdown.ts: Extract renderCodeBlock() helper to eliminate duplicated
code block rendering between renderToken and renderListItem.
keys.ts: Consolidate LEGACY_KEY_SEQUENCES, LEGACY_SHIFT_SEQUENCES, and
LEGACY_CTRL_SEQUENCES into a single LEGACY_SEQUENCES structure with
plain/shift/ctrl variants per key. Auto-generate LEGACY_SEQUENCE_KEY_IDS
from this structure instead of hand-maintaining a 50+ entry reverse
lookup. Extract hasKittyEventType() to deduplicate isKeyRelease and
isKeyRepeat. Net reduction of ~80 lines.
Extract the duplicated commandContextActions implementations from
rpc-mode.ts and print-mode.ts into a shared
createDefaultCommandContextActions() factory in
modes/shared/command-context-actions.ts.
Both headless modes (RPC and print) had identical implementations of
waitForIdle, newSession, fork, navigateTree, switchSession, and reload
that simply delegate to AgentSession. The shared factory provides these
defaults; interactive mode continues to layer TUI-specific behavior on
top via its own overrides.
Also fixes a subtle redundancy in print mode where newSession manually
called options.setup after session.newSession(), even though
session.newSession() already handles the setup callback internally.
Extract duplicated implementations of startCallbackServer, parseRedirectUrl,
getUserEmail, and token refresh logic from google-antigravity.ts and
google-gemini-cli.ts into a shared google-oauth-utils.ts module.
Extract the duplicated file lock mechanism from auth-storage.ts and
session-manager.ts into a shared lock-utils.ts module.
- acquireLockSyncWithRetry(): throwing variant (used by auth-storage)
- tryAcquireLockSync(): non-throwing variant (used by session-manager)
- acquireLockAsync(): async lock with retries and staleness detection
Removes ~55 lines of duplicated retry-loop logic. The shared module
also provides a foundation for deduplicating identical patterns in
settings-manager.ts and models-json-writer.ts.
- Replace dedupePrompts() and dedupeThemes() with generic dedupeResources<T>()
that accepts getName/getPath/resourceType callbacks
- Replace discoverSystemPromptFile() and discoverAppendSystemPromptFile() with
generic discoverFileInSearchPaths(filename)
- Import ResourceCollision type for use in dedupeResources signature
- Net reduction of 24 lines (868 → 844) with elimination of duplicated logic
Move duplicated patterns from compaction.ts and branch-summarization.ts
into shared utilities in utils.ts:
- getMessageFromEntry(): unified entry-to-message conversion with
optional toolResult skipping for branch summarization
- collectMessages(): replaces three identical for-loops that collect
AgentMessages from entry ranges
- extractTextContent(): replaces five instances of the
.filter(text).map(text).join() pattern
- createSummarizationMessage(): replaces three identical user-message
construction blocks for LLM summarization calls
Net reduction of ~90 lines of duplication.
- toPosixPath: remove private copies in skills.ts and package-manager.ts,
import from canonical utils/path-display.ts
- ZERO_USAGE: export from agent-loop.ts, replace inline zero-usage
objects in agent.ts and proxy.ts
- shortenPath: extract to shared modes/interactive/utils/shorten-path.ts,
import in tool-execution.ts and session-selector.ts
Extract the duplicated loop pattern (create context, iterate extensions,
iterate handlers, try/catch, emitError) into a private invokeHandlers()
helper. Each emit method is now a thin wrapper that delegates the
iteration to invokeHandlers and provides only its result-processing
callback. Net reduction of ~124 lines with identical runtime behavior.
Replace 7 individual ToolResultEvent type guards (isBashToolResult,
isReadToolResult, etc.) with a unified isToolResultEventType() function,
mirroring the existing isToolCallEventType() pattern.
Inline 14 handler type aliases (SendMessageHandler, SetModelHandler, etc.)
directly into the ExtensionActions interface since they were only used there
and added no semantic value.
Update documentation examples to use the new unified guard.
On older Linux distributions (e.g., RHEL 8 with older glibc), the native
Rust addon fails to load. The proxy throws on every function call, but
wrapTextWithAnsi and visibleWidth in pi-tui had no JS fallback — causing
an uncaught crash during TUI rendering.
Fix: Both functions now catch native throws and fall back to JS
implementations (simple word-wrap and ANSI-strip length).
Fixes#1418
getSessionStats() calculated cost by summing usage from assistant messages
in state.messages. After auto-compaction, pre-compaction messages are
replaced by a compactionSummary with no usage field — dropping the cost.
Fix: Added cumulative accumulators (_cumulativeCost, _cumulativeInputTokens,
_cumulativeOutputTokens, _cumulativeToolCalls) that are incremented on
every assistant message event, independent of the message array.
getSessionStats() now returns max(array-sum, cumulative) to ensure
monotonically non-decreasing values.
Fixes#1423
Added scripts/generate-openrouter-models.mjs that fetches the full model
list from OpenRouter's API and generates TypeScript entries matching the
existing models.generated.ts format. Run with:
node scripts/generate-openrouter-models.mjs > /tmp/openrouter.ts
Updated the OpenRouter section in models.generated.ts from 241 → 350
models, including all nvidia/nemotron variants requested in the issue.
Fixes#1407
* fix: sync worktree completion artifacts back to external state before merge (#1412)
When a worktree's .gsd/ was a real directory (not symlinked to external
state), milestone completion artifacts (SUMMARY, VALIDATION, updated
ROADMAP) were written locally but never synced back. The project root's
deriveState() read from external state and found no SUMMARY — reporting
the milestone as incomplete.
Changes:
- auto-worktree.ts: Added syncWorktreeStateBack() that copies milestone
and slice .md files from worktree .gsd/ to the main external state dir
- auto.ts: Call syncWorktreeStateBack() in tryMergeMilestone before the
git merge, ensuring artifacts are visible from the project root
Fixes#1412
* fix: emit agent_end after abort during tool execution (#1414)
When a user aborts a turn while a tool call is running, the abort RPC
succeeds but agent_end was never emitted. RPC consumers tracking turn
lifecycle via events got stuck in a 'streaming' state permanently.
Fix: After abort() + waitForIdle(), emit a synthetic agent_end if the
agent is no longer streaming. This ensures consumers always see the
turn-complete signal regardless of how the turn ended.
Fixes#1414