Add 3 new tests covering editor↔selector/input component swaps that
happen during /gsd prefs, /gsd migrate, and /gsd setup:
- editor-to-selector swap: verifies cursor tracking when editor with
CURSOR_MARKER is replaced by a selector without one
- selector-to-editor swap: verifies cursor restores to CURSOR_MARKER
position when editor returns after selector dismissal
- input component swap: verifies typing in prefs wizard text input
produces correct cursor movement without jumps
All tests confirm that the hardwareCursorRow baseline yields correct
movement deltas for these interactive component transitions.
Two bugs prevented subscription users from routing through Claude Code CLI:
1. Retry handler regex only matched "third-party" errors, but the actual
error is "You're out of extra usage", so the fallback never triggered
2. auto-model-selection actively rerouted bare model IDs back to anthropic
even after startup migration set claude-code as the session provider
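A minimal sketch of the broadened match from item 1, assuming the retry
handler classifies errors by message text. The helper name and the exact
pattern are illustrative, not the shipped regex.

```typescript
// Hypothetical helper: the real retry handler's regex and name may differ.
// Matches both the older "third-party" wording and the message that
// subscription users actually receive.
const SUBSCRIPTION_BLOCK_RE = /third-party|out of extra usage/i;

function isSubscriptionBlockError(message: string): boolean {
  return SUBSCRIPTION_BLOCK_RE.test(message);
}
```

Any message that matches would trigger the fallback to the claude-code
provider; unrelated errors are left to the normal retry path.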
Anthropic now blocks third-party apps from using Pro/Max subscription
quotas via direct API calls. This change makes the claude-code provider
(which delegates to the local claude CLI binary) the default path for
Anthropic subscription users — TOS-compliant because requests flow
through Anthropic's own infrastructure.
Changes:
- Enhanced readiness check to verify CLI auth status (not just binary)
- Startup migration: auto-switch anthropic → claude-code when CLI ready
- Error recovery: auto-switch on third-party 400 block error
- Onboarding: removed Anthropic from OAuth, added Claude CLI option
- Added claude-code to flat-rate providers (no dynamic routing benefit)
Closes #3772
PRs #3744 and #3765 introduced contentCursorRow, which diverges from the
actual terminal cursor position after IME repositioning. computeLineDiff
emits ANSI escape movements, which are relative to where the cursor
physically is: that baseline must be hardwareCursorRow, not a phantom
position.
Remove contentCursorRow entirely and revert computeLineDiff baseline to
hardwareCursorRow. The ghost-line test was asserting wrong movement
direction (UP from phantom position vs DOWN from actual cursor).
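To illustrate why the baseline matters, here is a small sketch of a
relative row move. The function name is hypothetical and
computeLineDiff's real signature is not shown in this message; only the
CUU/CUD escape sequences are standard.

```typescript
// Illustrative only. CUU (ESC[nA) and CUD (ESC[nB) move relative to the
// terminal's actual cursor, so the "from" row must be the row the cursor
// physically sits on (hardwareCursorRow).
function moveToRow(hardwareCursorRow: number, targetRow: number): string {
  const delta = targetRow - hardwareCursorRow;
  if (delta === 0) return "";
  return delta < 0 ? `\x1b[${-delta}A` : `\x1b[${delta}B`;
}
```

If the "from" row is a phantom position above or below the real cursor,
the sign of delta can flip, which is the UP versus DOWN mismatch the
ghost-line test was asserting.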
Closes #3764
- Added queryKnowledge() for keyword-based KNOWLEDGE.md section filtering
- Added formatRoadmapExcerpt() for slice-scoped roadmap excerpts
- Added inlineKnowledgeScoped() and inlineRoadmapExcerpt() to auto-prompts
- Context reduction: ~65% for knowledge, ~67% for roadmap excerpts
- Also includes deriveSliceScope() fix for unit IDs and process words
Squash merge of milestone/M005
Verify that contentCursorRow is correctly maintained across renders
and that IME repositioning does not cause spurious cursor jumps
during normal typing or content shrinking.
Refs #3764
The workflow-logger coverage test (#3348) requires all catch blocks in
migrated files to include logging. Add logWarning for the expected
failure case when nativeWorktreeRemove fails on orphaned directories.
Refs #3739
Address adversarial review findings:
1. Timed-out pre/post verification continues running in background and
can mutate s.currentUnit for the wrong unit. Fix: null out
s.currentUnit on timeout so late async completions are harmless
(all side effects in postUnitPreVerification guard on s.currentUnit).
2. Finalize timeouts were treated as successful iterations, resetting
consecutiveErrors and enabling silent infinite churn. Fix: add
consecutiveFinalizeTimeouts counter to LoopState, increment on each
timeout, hard-stop auto-mode after MAX_FINALIZE_TIMEOUTS (3)
consecutive timeouts. Reset to 0 on successful finalize.
Both fixes apply symmetrically to pre and post verification timeouts.
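The counter logic from item 2 can be sketched as follows. Only the names
consecutiveFinalizeTimeouts and MAX_FINALIZE_TIMEOUTS come from this
change; the helper and the reduced LoopState shape are assumptions, as is
stopping on the third consecutive timeout rather than after it.

```typescript
// Hypothetical helper; the real LoopState has more fields.
const MAX_FINALIZE_TIMEOUTS = 3;

interface LoopState {
  consecutiveFinalizeTimeouts: number;
}

// Returns true when auto-mode should hard-stop.
function recordFinalizeResult(state: LoopState, timedOut: boolean): boolean {
  if (!timedOut) {
    state.consecutiveFinalizeTimeouts = 0; // successful finalize resets the streak
    return false;
  }
  state.consecutiveFinalizeTimeouts += 1;
  return state.consecutiveFinalizeTimeouts >= MAX_FINALIZE_TIMEOUTS;
}
```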
Refs #3757
postUnitPostVerification already has a 60s timeout guard (#2344) but
postUnitPreVerification was called with bare await — if any async
operation inside it never resolves (browser teardown, worktree sync,
safety harness validation), the auto-loop freezes permanently with no
error, notification, or recovery.
Wrap postUnitPreVerification in the same withTimeout() pattern with a
dedicated FINALIZE_PRE_TIMEOUT_MS constant. On timeout, log a warning
and force-continue to the next iteration.
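For readers unfamiliar with the pattern, a rough sketch of a timeout
wrapper follows. The result shape and signature are assumptions; the
project's withTimeout() may differ.

```typescript
// Assumed shape of a withTimeout() helper, for illustration only.
type TimeoutResult<T> = { timedOut: false; value: T } | { timedOut: true };

function withTimeout<T>(work: Promise<T>, ms: number): Promise<TimeoutResult<T>> {
  return new Promise<TimeoutResult<T>>((resolve, reject) => {
    // Resolve with a timeout marker if the work has not settled in time.
    const timer = setTimeout(() => resolve({ timedOut: true }), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve({ timedOut: false, value });
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

Note that a timed-out promise is not cancelled. The wrapped work keeps
running in the background, so the caller has to tolerate late
completions.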
Closes #3757
Keyboard shortcut hints were hardcoded as Ctrl+Alt+X everywhere except
auto-dashboard.ts which had an inline platform check. On macOS these
should render as ⌃⌥X.
- Add formatShortcut() to files.ts — converts Ctrl/Alt/Shift/Cmd
modifiers to macOS symbols (⌃/⌥/⇧/⌘) when process.platform is darwin
- Replace all inline platform checks and hardcoded hints with
formatShortcut() calls
- Use template variables in system.md for shortcut hints
- Update comments in overlay files for consistency
- Add 7 tests covering all modifier conversions and passthrough
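A minimal sketch of the conversion described above. The message says the
conversion applies when process.platform is darwin; here the platform is
passed as a parameter so the example stays pure, which is an assumption
about the signature rather than the code in files.ts.

```typescript
// Sketch only; the real formatShortcut() may differ in signature.
const MAC_SYMBOLS: Record<string, string> = {
  Ctrl: "⌃",
  Alt: "⌥",
  Shift: "⇧",
  Cmd: "⌘",
};

function formatShortcut(shortcut: string, platform: string): string {
  if (platform !== "darwin") return shortcut; // passthrough on other platforms
  return shortcut
    .split("+")
    .map((part) => MAC_SYMBOLS[part] ?? part) // non-modifier keys stay as-is
    .join("");
}
```

For example, formatShortcut("Ctrl+Alt+X", "darwin") yields "⌃⌥X", while
the same call with "linux" returns the input unchanged.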
Closes #3753
The tool_result handler called markDepthVerified() whenever
ask_user_questions returned any response with a depth_verification
question ID — without checking what the user actually selected.
Selecting "Not quite", "None of the above", or garbage input all
unlocked the gate.
- Extract isDepthConfirmationAnswer() into write-gate.ts with structural
validation: cross-references selected answer against the question's
defined options, only accepting an exact match of the first option
(confirmation by convention). Rejects free-form "Other" text and
decouples from any specific label substring.
- Harden block message with explicit anti-bypass language
- Add anti-bypass instructions to all three discuss prompts
- Add 8 new tests covering: structural validation, free-form bypass
rejection, label-drift resilience, fallback behavior, edge cases
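The structural check can be sketched roughly as below. The Question and
QuestionOption shapes are hypothetical, and the handling of a question
with no options is a guess; this message mentions fallback behavior but
does not specify it.

```typescript
// Hypothetical shapes; the real types in write-gate.ts are not shown here.
interface QuestionOption {
  label: string;
}
interface Question {
  id: string;
  options: QuestionOption[];
}

function isDepthConfirmationAnswer(question: Question, selected: string): boolean {
  const first = question.options[0];
  // Assumption: with no defined options there is nothing to confirm against.
  if (first === undefined) return false;
  // Only an exact match of the first option counts as confirmation, so
  // "Not quite" and free-form "Other" text are rejected regardless of label.
  return selected === first.label;
}
```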
Closes #3749
Use the rendered content row as the shrink diff baseline instead of
reusing the IME hardware cursor row. Add a focused TUI regression test
that reproduces the ghost-line cleanup path when autocomplete shrinks.
Closes #3721
Strip planner-style path annotations before pre-execution checks compare
inputs and expected outputs. This keeps existing files, prior outputs,
and ordering checks aligned even when task-plan entries include inline
descriptions.
Closes #3742
The function signature changed from boolean to "present" | "absent" |
"unknown" but three test assertions still compared against true/false.
Update assertions to match the new return type.
When a milestone completes but the session ends before teardown runs,
the milestone branch and worktree directory are orphaned — the DB says
complete so auto-mode won't re-enter, and the teardown is never retried.
Adds auditOrphanedMilestoneBranches(), which runs after DB open during
bootstrap. For each milestone/* branch where the DB status is complete:
- If already merged into main → deletes the branch + cleans worktree dir
- If NOT merged → preserves the branch and warns the user
Includes 9 regression tests covering merged/unmerged/active/none-mode
scenarios.
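The per-branch decision reduces to a small table, sketched here with
hypothetical names. The real audit also performs the git and filesystem
work, which is omitted.

```typescript
// Hypothetical decision helper; action names are illustrative.
type OrphanAction = "delete-branch-and-worktree" | "preserve-and-warn" | "skip";

function decideOrphanAction(dbStatus: string, mergedIntoMain: boolean): OrphanAction {
  // Only milestones the DB already marks complete are candidates.
  if (dbStatus !== "complete") return "skip";
  return mergedIntoMain ? "delete-branch-and-worktree" : "preserve-and-warn";
}
```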
Blocking on "unknown" from hasImplementationArtifacts broke real-world
auto-mode in projects without clean git merge-bases (single-branch,
fresh repos, detached HEAD). The auto-loop silently stopped at
completing-milestone with no visible error.
Reverted to warn-and-proceed for "unknown" — only "absent" (confirmed
no implementation files) blocks completion. This matches the original
fail-open behavior for inconclusive git checks.
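A sketch of the resulting gate, assuming the caller maps the tri-state
result to a block/warn decision. The helper name and return shape are
illustrative.

```typescript
type ArtifactStatus = "present" | "absent" | "unknown";

// Hypothetical helper: only a confirmed "absent" blocks completion;
// "unknown" warns and proceeds (fail-open).
function completionDecision(status: ArtifactStatus): { block: boolean; warn: boolean } {
  switch (status) {
    case "absent":
      return { block: true, warn: false };
    case "unknown":
      return { block: false, warn: true };
    case "present":
      return { block: false, warn: false };
  }
}
```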
1. hasImplementationArtifacts "unknown" now blocks completion instead of
warn-and-proceed. Both auto-dispatch.ts and auto-recovery.ts updated
to treat "unknown" as a stop condition, preventing milestone completion
when git status cannot be verified.
2. Audit log SAFE_KEYS allowlist expanded to include "id", "error", and
"count" fields. SPLIT BRAIN logError entries now persist the entity ID
and rollback error details to audit-log.jsonl for triage/repair.
Adds a pre-write guard in reconcileWorktreeLogs: re-reads the event
log before overwriting and retries if it grew since the initial read.
Prevents appendEvent calls between read and rewrite from being silently
dropped by the atomic overwrite.
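The guard can be sketched as a read, transform, re-read loop. The LogIO
interface, function name, and retry limit are assumptions made to keep
the example self-contained; reconcileWorktreeLogs works on the real event
log file.

```typescript
// Hypothetical I/O interface standing in for the event log file.
interface LogIO {
  read(): string[]; // current event lines
  overwrite(lines: string[]): void; // atomic replace
}

function rewriteWithGuard(
  io: LogIO,
  transform: (lines: string[]) => string[],
  maxRetries = 3, // assumed limit
): boolean {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const snapshot = io.read();
    const next = transform(snapshot);
    // Re-read just before writing: if the log grew, start over from fresh data.
    if (io.read().length === snapshot.length) {
      io.overwrite(next);
      return true;
    }
  }
  return false; // gave up; the log kept growing
}
```

In this sketch the check narrows the window in which an append can be
lost; a concurrent append between the re-read and the overwrite is still
possible.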
1. Paused session file deletion deferred until after lock acquisition.
Previously the file was deleted before acquireSessionLock — if the
lock failed, the pause metadata was lost on disk and in memory,
making the session unresumable. Now the file path is stored in
s.pausedSessionFile and only deleted after successful lock.
2. Lock failure path preserves pause file for retry.
1. plan_task and plan_slice replay now use strict INSERT OR IGNORE
instead of calling insertTask/insertSlice which use ON CONFLICT
DO UPDATE. Prevents replay of older plan events from downgrading
progressed task/slice status back to pending.
2. Type guard on cmd normalization: non-string cmd values are skipped
with a warning instead of throwing.
3. Type guard on extractEntityKey for consistency.
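The difference between the two insert modes in item 1 can be modelled
with an in-memory map. This is an analogy, not the project's SQL; the
status values and function names are hypothetical.

```typescript
type Status = "pending" | "in-progress" | "complete";
const tasks = new Map<string, Status>();

// Analogous to ON CONFLICT DO UPDATE: an existing row is overwritten.
function upsertTask(id: string, status: Status): void {
  tasks.set(id, status);
}

// Analogous to INSERT OR IGNORE: an existing row is left untouched.
function insertTaskIfAbsent(id: string, status: Status): void {
  if (!tasks.has(id)) tasks.set(id, status);
}
```

Replaying an old plan event through insertTaskIfAbsent leaves a task that
has already progressed alone, whereas upsertTask would reset it to
pending.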
1. Type guard on cmd normalization: non-string cmd values are now
skipped with a warning instead of throwing, preventing replay
from crashing on malformed event lines.
2. complete_milestone replay now validates all slices are closed
before marking milestone complete. Prevents a reordered/partial
event stream from closing a milestone with incomplete work.
3. Type guard on extractEntityKey cmd normalization for consistency.
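A minimal sketch of the type guard from items 1 and 3. The function name
is hypothetical, and normalizing underscores to hyphens is an assumption;
the real normalizer may use the opposite canonical form.

```typescript
// Hypothetical normalizer: returns null for malformed events instead of
// throwing, so one bad line does not abort the whole replay.
function normalizeCmd(cmd: unknown): string | null {
  if (typeof cmd !== "string") {
    console.warn("skipping event with non-string cmd");
    return null;
  }
  // Assumption: hyphenated form is canonical.
  return cmd.replace(/_/g, "-");
}
```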
Addresses Codex adversarial review findings:
1. Migration backup now flushes WAL via PRAGMA wal_checkpoint(TRUNCATE)
before copyFileSync. Without this, the backup could miss committed
data that only exists in the -wal file. Backup failure is now logged
via logWarning instead of silently swallowed.
2. Wave 5 regression tests strengthened:
- Added behavior-level test for skipped/blocked/pending status mapping
to checkbox rendering (not just isClosedStatus helper)
- Added extractEntityKey round-trip tests for underscored cmd formats
- Added unknown cmd → null safety test
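The ordering in item 1 can be sketched as below. The Db interface, the
injected copy and logWarning callbacks, and the function name are
assumptions so the example runs without a SQLite binding; only the PRAGMA
statement is taken from this message.

```typescript
// Minimal assumed database handle; real bindings expose a richer API.
interface Db {
  exec(sql: string): void;
}

function backupDatabase(
  db: Db,
  copy: () => void, // stands in for the copyFileSync call
  logWarning: (msg: string) => void,
): boolean {
  try {
    // Flush committed pages from the -wal file into the main DB file first.
    db.exec("PRAGMA wal_checkpoint(TRUNCATE)");
    copy();
    return true;
  } catch (err) {
    // Failure is reported rather than silently swallowed.
    logWarning(`migration backup failed: ${String(err)}`);
    return false;
  }
}
```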
Tests isClosedStatus coverage for projections, upsertDecision seq
preservation (ON CONFLICT DO UPDATE vs INSERT OR REPLACE), and
event schema versioning (v:2 field in new events).
Adds tests for plan event entity key extraction and unknown cmd handling.
Fixes empty catch blocks in auto-recovery.ts appendEvent calls that failed
the "no empty catch blocks" CI lint.
Covers event log cmd format normalization (hyphens + underscores),
extractEntityKey for complete-milestone, and isClosedStatus
including skipped status.