* refactor: TUI dashboard cleanup, dedup, and feature improvements
- Extract shared format-utils.ts: formatDuration, padRight, joinColumns,
centerLine, fitColumns, sparkline, stripAnsi — eliminating 3× duplication
across dashboard-overlay, visualizer-views, and auto-dashboard
- Use shared STATUS_GLYPH/STATUS_COLOR from ui.ts consistently across all
overlay and view files instead of hardcoded Unicode glyphs
- Fix redundant dynamic import('node:fs') in visualizer-data.ts (statSync
already imported at top level)
- Replace (entry as any) casts with proper SessionMessageEntry type narrowing
- Add mtime-based file content cache for visualizer data loader to avoid
re-parsing unchanged roadmap/plan files on every refresh
- Increase visualizer refresh interval from 2s to 5s (with mtime cache,
unchanged files are effectively free)
- Fix sparkline to use loop-based max instead of Math.max(...values) to
avoid stack overflow on large arrays
- Add ETA/time-remaining estimate to progress widget and dashboard overlay
based on average unit duration from metrics ledger
- Show warning glyph for budget-pressured units in completed units list
(continueHereFired units now show ⚠ instead of ✓)
- Add terminal resize (SIGWINCH) handling to both overlays — invalidates
cache and re-renders on window size change
- Fix dispose race in dashboard overlay close path — now calls dispose()
before onClose() to prevent timer callbacks firing after teardown
- Add 23 unit tests for format-utils.ts (including 100k-element sparkline)
- Add 2 tests for estimateTimeRemaining
- Add source-contract tests for resize handler and shared imports
* fix: use STATUS_GLYPH.warning instead of STATUS_GLYPH.statusWarning
STATUS_GLYPH is keyed by ProgressStatus ("warning"), not by GLYPH
property name ("statusWarning"). Fixes typecheck failure in CI.
After the skip-loop breaker evicts a completion key, the fallback path
at the bottom of dispatchNextUnit re-persists it because the expected
artifact exists on disk. This recreates the exact loop the breaker was
trying to break:
evict key → dispatch → verifyArtifact(true) → re-persist key → skip → evict → repeat
Fix: Track recently-evicted keys in a Set. The fallback artifact-check
path skips re-persistence for keys that were just evicted by the
skip-loop breaker. Set is cleared on stopAuto.
When a slice plan (S03-PLAN.md) was pre-created during roadmapping
but plan-slice never ran to generate per-task files (tasks/T01-PLAN.md),
deriveState returned 'executing' phase. execute-task then failed because
the task plan didn't exist, creating an infinite restart loop.
Fix: In deriveState, when the tasks directory exists but has zero .md
files and the slice plan references tasks, return 'planning' phase
instead of 'executing'. This causes plan-slice to dispatch and generate
the missing task plans.
Tests updated: 6 test files that create synthetic state fixtures now
include a stub task plan file so their 'executing' phase assertions
remain valid.
* fix: reduce CPU usage on long auto-mode sessions
Seven targeted fixes for compounding process/timer/I/O issues that cause
high CPU during multi-hour /gsd auto sessions:
1. Wrap idle watchdog and hard timeout async callbacks in try-catch to
prevent unhandled rejections from orphaning intervals
2. Cache nativeHasChanges fallback (10s TTL) to avoid spawning a new
git process every 15 seconds when native module is unavailable
3. Call clearUnitTimeout() before dispatchNextUnit() in all recovery
paths to prevent stale idle watchdog from firing alongside new timers
4. Add 10-second timeout to subagent worktree cleanup to prevent hangs
when git worktree remove blocks indefinitely
5. Prune dead bg-shell processes after each unit completion to free
retained output buffers (~500KB-1MB per dead process)
6. Throttle STATE.md rebuilds to at most once per 30 seconds (was every
unit completion at 100-400ms each)
7. Increase progress widget refresh interval from 5s to 15s to reduce
synchronous file I/O on the hot path
* fix: reset nativeHasChanges cache in worktree test
The 10s TTL cache on nativeHasChanges was causing the worktree test
to return stale "no changes" when checking a freshly dirtied repo
within the cache window. Reset the cache before the dirty-repo
assertion so the test correctly detects new changes.
Conflicts arose because main added continueHereHandle cleanup and
buildSnapshotOpts (with continueHereFired) while the PR extracted
inline closeout code into closeoutUnit(). Resolution: use closeoutUnit()
with buildSnapshotOpts() to pass all fields including continueHereFired.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
resolveModelId narrowed the return type to { id, provider } which lost
the full Model<Api> shape needed by pi.setModel(). Using a generic <T>
preserves the actual type from the registry's getAvailable() call.
Cherry-picks the worktree-sync concept from #902, adapted to the flat
auto-*.ts naming convention with fixes:
- Uses gsdVersion (not syncedAt) for resource staleness detection (#804)
- Uses ESM imports (not require()) for node:fs/node:os
- Includes both sync directions (project→worktree and worktree→project)
- Includes escapeStaleWorktree and cleanStaleRuntimeUnits
auto.ts: 3,256 → 3,099 lines (-157 lines)
Break up the 3,975-line auto.ts into focused, single-concern modules:
- auto-budget.ts: Pure budget alert level and enforcement functions
- auto-tool-tracking.ts: In-flight tool call tracking for idle detection
- auto-observability.ts: Pre-dispatch observability validation and repair
- auto-unit-closeout.ts: Consolidated metrics/activity/memory closeout helper
- auto-direct-dispatch.ts: Manual /gsd dispatch phase command handling
- auto-timeout-recovery.ts: Idle and hard timeout recovery with escalation
- auto-model-selection.ts: Model routing, complexity classification, fallback chains
auto.ts retains orchestration (start/stop/pause, handleAgentEnd, dispatchNextUnit)
and drops from 3,975 to 3,256 lines (-719 lines, -18%).
All extractions are pure moves with re-exports — no behavior changes.
All 1,092 unit tests and 30 integration tests pass.
On Windows, process.cwd() returns backslash paths (C:\Users\name\...).
When these paths are injected into system prompts, worktree context
blocks, or tool results, the model copies them into bash commands.
Bash interprets backslashes as escape characters, silently stripping
them — producing invalid paths like 'C:Usersnamedevelopmentapp-name'.
This is not a regex hack — it's a proper cross-platform boundary:
- Filesystem operations (fs, path.join, spawn cwd) use native paths
unchanged. Node handles both separators correctly for I/O.
- LLM-visible text (prompts, tool results, extension messages) uses
toPosixPath() to normalize to forward slashes. C:/Users/name/...
is valid in Git Bash, WSL bash, PowerShell, and Node.js.
Changes:
- utils/path-display.ts: New toPosixPath() utility in pi-coding-agent
package (for system prompt) and shared extension module (for
extensions that can't import from the compiled package at dev time)
- system-prompt.ts: Normalize resolvedCwd before injecting into the
'Current working directory' line
- gsd/index.ts: Normalize all process.cwd() and originalBase paths in
worktree context blocks injected into the system prompt
- bg-shell/index.ts: Normalize cwd in tool result text (start, env
actions) that the model reads and may reference in commands
- path-display.test.ts: 9 regression tests covering toPosixPath
behavior and system prompt output verification. Includes a scanner
that fails if any Windows absolute paths with backslashes appear in
buildSystemPrompt() output.
Audit scope: Checked all process.cwd() usage across pi-coding-agent
and all bundled extensions. Filesystem-only paths (join, readFile,
spawn cwd, existsSync) are correct and left unchanged. Only paths
entering LLM text are normalized.
When the LLM writes freeform prose roadmaps with `## Slice S01: Title`
headers instead of the machine-readable `## Slices` checklist,
parseRoadmapSlices() returned zero slices, causing deriveState() to
permanently block with 'No slice eligible'.
Add a fallback parser that detects prose-style `## Slice SXX:` headers
(and variants like `## S01:`, `## S01 —`) and extracts slice IDs,
titles, and dependencies from the prose. Also parses `Depends on:`
text patterns. All fallback slices default to risk:medium and done:false.
When merging a milestone back to main, `git checkout main` fails if
untracked .gsd/ state files (STATE.md, completed-units.json, auto.lock)
in the working tree conflict with tracked files on the branch.
Remove these known GSD-managed state files before checkout. They are
runtime artifacts regenerated by doctor/rebuildState and are not
meaningful in the main working tree — the worktree had the real state.
Worktree initialization only copied DECISIONS.md, REQUIREMENTS.md,
PROJECT.md, and QUEUE.md. The missing STATE.md caused the pre-dispatch
health check in doctor-proactive.ts to block dispatch with
'STATE.md missing'.
Add STATE.md, KNOWLEDGE.md, and OVERRIDES.md to the copy list so
worktrees start with complete planning state.
Every new pi session writes a fresh syncedAt timestamp to
managed-resources.json, causing a running auto-mode session to falsely
detect a GSD update and stop. The actual version (gsdVersion) only
changes on real upgrades.
Switch the staleness check from syncedAt (timestamp) to gsdVersion
(semver string) so that launching a second session no longer triggers
a false positive.
When multiple tool calls (e.g. concurrent gsd_save_decision) target the
same markdown file, the deterministic .tmp suffix caused ENOENT on
rename() because one caller consumed the temp file before another could
rename it.
Replace the static `.tmp` suffix with a per-call random suffix so each
concurrent writer gets its own temp file. Also clean up orphaned temp
files on rename failure.
The .planning → .gsd migration creates roadmaps and summaries but not
VALIDATION files. deriveState() requires a terminal validation file
(verdict: pass) to consider a milestone complete. Without it, every
migrated milestone enters validating-milestone phase, blocking progress
to the actual current milestone.
For milestones where all slices are done, write a pass-through
VALIDATION.md (verdict: pass, migrated: true) and SUMMARY.md so
deriveState() skips them correctly.
Updated integration test to verify VALIDATION/SUMMARY files are written
and deriveState returns 'complete' phase with activeMilestone pointing
to the last completed entry (expected behavior).
When unique_milestone_ids is enabled, the LLM cannot generate random
suffixes itself. Previously only the first milestone got a correct ID
(pre-generated in TS), while subsequent milestones in multi-milestone
projects got bare M002/M003 without suffixes.
Added a gsd_generate_milestone_id tool that the LLM calls to get each
milestone ID. The tool scans disk for existing milestones and respects
the unique_milestone_ids preference, making it impossible to produce
wrong-format IDs.
Updated discuss, discuss-headless, and queue prompts to instruct the
LLM to use the tool instead of inventing milestone IDs.
* feat: worker NDJSON monitoring, budget enforcement, PID-based stop fallback
Closes three gaps in parallel orchestration:
1. **Worker stdout monitoring** — Workers now run with `--mode json` so
they emit NDJSON events. The coordinator parses stdout line-by-line,
extracting cost/token data from `message_end` events. This keeps
per-worker cost tracking in sync with actual API spend and updates
session status files for live dashboard visibility.
2. **Budget enforcement before spawn** — `startParallel()` now checks
`isBudgetExceeded()` before each worker spawn. When the aggregate
cost across all workers reaches the configured ceiling, no new
workers are started.
3. **PID-based stop fallback** — `stopParallel()` now falls back to
`process.kill(pid, "SIGTERM")` when the ChildProcess handle is null
(e.g., after coordinator restart when handles aren't available).
Previously, orphaned workers could not be stopped.
Includes 11 new tests covering NDJSON format validation, cost
aggregation, budget ceiling comparison, and PID-based kill patterns.
All 54 existing parallel-orchestration tests still pass.
Relates to #672
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: currentUnit type must match SessionStatus interface (object | null, not string)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: render native web search tool calls in TUI
The Anthropic streaming parser silently dropped server_tool_use and
web_search_tool_result content blocks, making native web search
invisible. Add ServerToolUseContent and WebSearchResultContent types,
handle both block types in the streaming parser and conversation replay,
and render them as ToolExecutionComponent in the interactive TUI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add PREFER_BRAVE_SEARCH env var to bypass native web search
Set PREFER_BRAVE_SEARCH=1 to keep Brave/custom search tools active
on Anthropic models instead of injecting native server-side web search.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: skip non-toolCall blocks in Mistral provider conversation replay
The ServerToolUseContent and WebSearchResultContent types added for
native web search don't have id/name/arguments properties, causing
TypeScript errors when the Mistral provider tried to push them as
tool calls.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: meaningful commit messages from task summaries (#785, #784)
Post-task commits now derive messages from the task summary one-liner,
inferred type, and key files. Planning prompts respect commit_docs: false.
Commit type inference expanded with perf type and oneLiner parameter.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: replace invalidateStateCache with invalidateAllCaches in crash recovery
PR #799 reintroduced invalidateStateCache() calls in the phantom skip
loop crash recovery paths. These should use invalidateAllCaches() which
is the renamed function.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve CI failures from main merge conflicts
- Replace invalidateStateCache() → invalidateAllCaches() in crash
recovery paths (reintroduced by PR #799 merge)
- Expand smart-entry-draft test chunk window from 3000 to 4000 chars
to accommodate commitInstruction additions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picked from #797 — adds clearArtifacts() to gsd-db.ts and wires
it into invalidateAllCaches() so the DB artifact table is cleared
alongside state/path/parse caches.
Co-Authored-By: 0xLeathery (PR #797)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>