Comprehensive VS Code extension redesign with sidebar reorganization,
new features, and enhanced agent integration:
- Redesign sidebar UI: reduce 6 panels to 3, declutter layout
- SCM provider for tracking agent-modified files
- Checkpoint system for saving/restoring agent state
- Diagnostic integration for surfacing errors in editor
- Line-level editor decorations for agent-modified lines
- Git integration for visualizing agent changes
- Execution plan viewer for live agent step visualization
- Approval/permissions mode system
- Auto-inject editor selection and diagnostics in chat
- Route workflow buttons through Chat panel
- Handle extension UI requests from agent (select, confirm, input)
- Session persistence, ISO timestamp support, descriptive checkpoints
- Bump to v0.3.0
Previously, headless --verbose mode accumulated text_delta events into a
buffer and displayed a single truncated 120-char [thinking] line before
tool calls. The model's actual text responses between tool calls were
effectively invisible.
Changes:
- Stream text_delta and thinking_delta events directly to stderr in
verbose mode with [text] and [thinking] block markers
- No truncation — full model output is visible
- Fix non-verbose fallback: read from ame.delta (correct field) instead
of ame.text (always undefined for text_delta events)
- Track inTextBlock/inThinkingBlock state to properly close streaming
blocks before tool calls
- Expand summarizeToolArgs with support for async_bash, await_job,
cancel_job, find, ls, lsp, hashline_edit, subagent, browser_navigate,
and gsd_* tools
- Add streaming formatter functions: formatTextStart, formatTextEnd,
formatThinkingStart, formatThinkingEnd
- Update tests for new tool arg summarization and path field handling
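The block tracking described above can be sketched as a small state machine. This is an illustration only, not the actual formatter: the event shape, the `renderVerbose` name, and the `[tool]` marker are assumptions, and the real runner streams to stderr rather than building a string.

```typescript
// Hypothetical event shape: only the event names and the `delta` field
// come from the commit text.
type StreamEvent =
  | { type: "text_delta"; delta: string }
  | { type: "thinking_delta"; delta: string }
  | { type: "tool_call"; name: string };

type BlockKind = "text" | "thinking";

// Renders events into one string so the behavior is easy to inspect.
function renderVerbose(events: StreamEvent[]): string {
  let out = "";
  let open: BlockKind | null = null; // which streaming block is currently open

  const close = (): void => {
    if (open !== null) {
      out += "\n"; // end the streamed block on its own line
      open = null;
    }
  };

  for (const ev of events) {
    if (ev.type === "tool_call") {
      close(); // close any open block before the tool line
      out += `[tool] ${ev.name}\n`;
      continue;
    }
    const kind: BlockKind = ev.type === "text_delta" ? "text" : "thinking";
    if (open !== kind) {
      close();
      out += `[${kind}] `; // block marker, written once per block
      open = kind;
    }
    out += ev.delta; // full delta, no truncation
  }
  close();
  return out;
}
```

Consecutive deltas of the same kind share one marker, and a tool call always starts on a fresh line.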
Saves in-progress daemon work from M005-m138xe that was sitting uncommitted.
Includes orchestrator expansion, event bridge/formatter enhancements,
message batcher tweaks, and discord bot additions.
checkAutoStartAfterDiscuss() fire-and-forgets startAuto() when a
milestone is ready. The headless runner then chains `/gsd auto`,
calling startAuto() a second time. Two concurrent auto-loops on the
same AutoSession singleton corrupt shared state (counters, dispatch
maps), causing planning/execution to never run after research.
Add an early `s.active` check at the top of startAuto() so the second
call no-ops. Add source-scanning test to enforce the guard exists.
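A minimal sketch of the re-entry guard, assuming a simplified session object. Only `s.active` and `startAuto()` are named in this commit; `loopsStarted`, `runLoop`, and the boolean return value are hypothetical and exist only to make the effect observable.

```typescript
// Hypothetical stand-in for the AutoSession singleton.
interface AutoSessionSketch {
  active: boolean;
  loopsStarted: number; // illustrative counter, not part of the real session
}

const s: AutoSessionSketch = { active: false, loopsStarted: 0 };

// Placeholder for the real auto-loop.
async function runLoop(): Promise<void> {
  await Promise.resolve();
}

async function startAuto(): Promise<boolean> {
  // Early guard: a second caller returns before touching shared state.
  if (s.active) return false;
  s.active = true; // set synchronously, before the first await
  s.loopsStarted += 1;
  try {
    await runLoop();
  } finally {
    s.active = false; // allow a later, non-overlapping start
  }
  return true;
}
```

Because the flag is set before the first `await`, a fire-and-forget call followed immediately by a second call starts only one loop.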
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three defects in the completing-milestone dispatch guard caused
false-positive blocks on valid validation output:
1. Single-line constraint: [^\n]* stopped at newlines, missing verdicts
on subsequent lines. Fixed with [\s\S]{0,500}? (bounded lazy match).
2. Missing keywords: 'satisfied' and 'partially' were absent from the
alternation. LLMs commonly write 'PARTIALLY SATISFIED' or 'FULLY
SATISFIED'. Added both.
3. Markdown bold delimiters: **Operational** blocked [\s:] after the
word. The new [\s\S] class handles any character including *.
Also adds SATISFIED to the structuredMatch includes check, and ✅ to
the prose regex (overlaps with #2862).
Includes 8 regression test cases covering multi-line formats, satisfied
keyword variants, markdown bold tables, and checkmark emoji.
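The difference between the two match strategies can be seen in a reduced example. The patterns below are hypothetical; only the `[^\n]*` and `[\s\S]{0,500}?` fragments and the 'satisfied' / 'partially' keywords come from this commit, and the real guard's regex is not reproduced here.

```typescript
// Old style: requires whitespace or ':' after the word and stays on one line.
const singleLine = /Operational[\s:][^\n]*(pass|fail|met)/i;

// New style: bounded lazy match across newlines, with the added keywords.
const boundedLazy =
  /Operational[\s\S]{0,500}?(pass|fail|met|satisfied|partially)/i;

// Markdown bold label with the verdict on the following line.
const report = "| **Operational** |\n| PARTIALLY SATISFIED |";

console.log(singleLine.test(report)); // false
console.log(boundedLazy.test(report)); // true
```

The `{0,500}` bound keeps the lazy match from scanning an entire document when no verdict keyword follows the label.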
Bug 1 — Workers exit immediately (#2792):
spawnWorker() used `--print "/gsd auto"`, which calls session.prompt();
that call returns immediately when ctx.newSession() resets the session inside
the auto-loop. Changed to `headless --json auto` which uses an RPC
client that keeps the process alive until auto-mode completes.
Bug 2 — Dispatch guard blocks parallel workers (#2797):
getPriorSliceCompletionBlocker() checked ALL milestones in queue order,
blocking M012 when M011 had incomplete slices. When GSD_MILESTONE_LOCK
is set, the guard now only checks intra-milestone slice dependencies.
Added test covering cross-milestone bypass + intra-milestone preservation.
Bug 3 — Orphaned RPC children on stop (#2798):
stopParallel() gave only 750ms for SIGTERM before SIGKILL. The headless
parent needs ~1500ms to cascade shutdown to its RPC child via
client.stop(). Increased to 3000ms to prevent orphaned processes holding
auto.lock.
Updated tests:
- dispatch-guard.test.ts: new test for GSD_MILESTONE_LOCK bypass
- parallel-worker-monitoring.test.ts: updated spawn args assertion
The dashboard reads elapsed time, total cost, and tokens used
exclusively from AutoDashboardData. When auto-mode is not active
(e.g. manual /gsd next), auto is null and all three metrics show 0
— even though the status bar displays real values via /api/visualizer.
Add the same projectTotals polling pattern (30s interval via
/api/visualizer) that status-bar.tsx already uses, and wire it into
the fallback chain: projectTotals ?? auto ?? 0.
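A sketch of the fallback chain for a single metric. The `Totals` shape and the `pickMetric` helper are assumptions for illustration; the commit specifies only the `projectTotals ?? auto ?? 0` ordering.

```typescript
// Hypothetical shape shared by both data sources.
interface Totals {
  elapsedMs: number;
  cost: number;
  tokens: number;
}

function pickMetric(
  key: keyof Totals,
  projectTotals: Totals | null,
  auto: Totals | null,
): number {
  // Prefer polled project totals, then live auto-mode data, then zero.
  return projectTotals?.[key] ?? auto?.[key] ?? 0;
}
```

Since `??` only falls through on null or undefined, a genuine 0 in projectTotals is kept rather than replaced by the auto value.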
Closes #2709
When worktrees use shared-WAL mode (R012), the worktree DB path resolves
to the same physical file as the project root DB via symlink. Calling
reconcileWorktreeDb() ATTACHes this WAL-mode file to itself, corrupting
the database with 'database disk image is malformed'.
Fix 1 — auto-worktree.ts mergeMilestoneToMain(): skip reconciliation
when isSamePath() confirms both DB paths resolve to the same file.
Fix 2 — gsd-db.ts reconcileWorktreeDb(): defence-in-depth realpathSync
guard inside the function itself, before the ATTACH statement.
Fix 3 — auto/infra-errors.ts: classify 'database disk image is
malformed' as SQLITE_CORRUPT infrastructure error so the auto-loop
stops immediately instead of burning 3 retries on a guaranteed failure.
Regression tests verify:
1. Same-file via symlink returns zero (no ATTACH)
2. Identical string paths return zero
3. Genuinely different DBs still reconcile normally
4. Malformed DB message classified as infra error
5. Transient SQLITE_BUSY is not falsely classified
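The classification in Fix 3 can be illustrated with a small pure function. This is a sketch under assumptions: `classifyInfraError` and its return type are hypothetical, and only the message text and the SQLITE_CORRUPT label come from this commit.

```typescript
type InfraErrorCode = "SQLITE_CORRUPT";

// Returns an infra error code for failures that retrying cannot fix,
// or null when the error should go through the normal retry path.
function classifyInfraError(message: string): InfraErrorCode | null {
  if (message.toLowerCase().includes("database disk image is malformed")) {
    return "SQLITE_CORRUPT";
  }
  return null; // e.g. SQLITE_BUSY is transient and stays retryable
}
```

A caller would stop the loop when the result is non-null instead of spending its retry budget.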
Closes #2823
When GSD_WEB_DAEMON_MODE=1 is set, scheduleShutdown() becomes a no-op.
The /api/shutdown endpoint still returns { ok: true } so the client
beacon fires without a network error, but process.exit() is never
called. This allows gsd --web to run as a persistent daemon behind a
reverse proxy without exiting on every browser tab close or refresh.
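A rough sketch of the daemon-mode behavior. The signatures are assumptions: the environment and the exit callback are injected here so the example runs without touching the real process, and any shutdown delay in the real scheduleShutdown() is omitted.

```typescript
type Env = Record<string, string | undefined>;

function scheduleShutdown(env: Env, exit: () => void): void {
  if (env.GSD_WEB_DAEMON_MODE === "1") return; // daemon mode: no-op
  exit();
}

// Hypothetical handler for /api/shutdown.
function handleShutdownRequest(env: Env, exit: () => void): { ok: true } {
  scheduleShutdown(env, exit);
  return { ok: true }; // same response either way, so the beacon succeeds
}
```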
Closes #2835
Multi-turn commands (auto, next) have their own completion signals via
isTerminalNotification ("Auto-mode stopped..."/"Step-mode stopped...").
The execution_complete event fires after command setup but before any real
work begins, causing these commands to exit immediately with zero work done.
Closes #2917
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mapStatusToExitCode only handled "complete", but RPC v2 emits "completed",
causing all headless sessions to time out falsely and restart.
Also emits a milestone-ready notification in checkAutoStartAfterDiscuss so
the headless parent can detect it and chain into auto-mode.
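A minimal sketch of a status mapping that accepts both spellings. The exit code values (0 for success, 1 otherwise) are assumptions, as is the handling of every status other than the two named in this commit.

```typescript
function mapStatusToExitCode(status: string): number {
  switch (status) {
    case "complete":
    case "completed": // spelling emitted by RPC v2
      return 0;
    default:
      return 1; // assumed catch-all for any other status
  }
}
```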
Closes #2914
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests cover: provider registration, base URL + API type, reasoning +
context window specs, and non-collision with generated zai models.
Required by CI lint gate (require-tests.sh).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add mount point detection for /media, /mnt, /run/media
- Display mount points as quick-access entries when browsing home dir
- Allow navigation to mount points while maintaining security scope
Fixes #2908
Open the project database before the first auto bootstrap derive so cold-start resume uses DB-backed slice state instead of stale markdown fallback state.
Also recognize glyph completion markers in roadmap tables and lock the new bootstrap ordering with regression coverage.
Closes #2841
Auto-mode selected the correct unit model in runUnitPhase, but a fresh session could drop that selection before the first prompt was sent.
Persist the applied unit model on AutoSession, restore it immediately after newSession(), and cover the seam with a regression test that proves the model is re-applied before dispatch.
Closes #2853
`gsd headless new-milestone --auto --verbose` now works — flags are
parsed regardless of position relative to the command word.
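Position-independent parsing can be sketched as a single pass that separates flags from positionals. `parseHeadlessArgs` and `ParsedArgs` are hypothetical names, and the sketch handles only boolean `--flag` arguments, not flags that take values.

```typescript
interface ParsedArgs {
  command: string | null;
  rest: string[];
  flags: Set<string>;
}

function parseHeadlessArgs(argv: string[]): ParsedArgs {
  const flags = new Set<string>();
  const positionals: string[] = [];
  for (const arg of argv) {
    if (arg.startsWith("--")) {
      flags.add(arg.slice(2)); // collected wherever it appears
    } else {
      positionals.push(arg);
    }
  }
  // The first non-flag argument is treated as the command word.
  const command = positionals.length > 0 ? positionals[0] : null;
  return { command, rest: positionals.slice(1), flags };
}
```

With this shape, `new-milestone --auto --verbose` and `--verbose new-milestone --auto` produce the same command and flag set.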
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restructure from flat documentation reference into proper agent-oriented
skill with XML structure, mental model, routing to workflows, and restored
reference content (KNOWLEDGE.md, flags, event streaming, answer injection,
command table).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The xterm-theme test reads shell-terminal.tsx and main-session-terminal.tsx
via readFileSync relative to import.meta.dirname. When compiled tests run
from dist-test/, this resolves to dist-test/web/components/gsd/ — but only
web/lib/ was being copied by compile-tests.mjs, causing the test to fail.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep auto-worktree sync and initial seeding aligned with the repo's canonical preferences filename while retaining the lowercase legacy fallback for older repos and case-sensitive filesystems.
* fix(test): wire src/resources/extensions/shared/tests/ into test:unit runner
The test:unit glob excluded src/resources/extensions/shared/tests/ entirely,
leaving format-utils.test.ts (and any future tests there) silently unfired.
- Add shared/tests/*.test.ts to the test:unit glob in package.json
- Export newestSrcMtime from ensure-workspace-builds.cjs (require.main guard
prevents side-effects on require) so the staleness logic can be tested
- Add src/tests/ensure-workspace-builds.test.ts covering newestSrcMtime:
non-existent dir, no .ts files, single file, max of multiple, recursion,
node_modules skip
Closes #2808
* perf(test): compile unit tests with esbuild and fix dist-test/node_modules
Replace per-file --experimental-strip-types with a single esbuild compilation
step (scripts/compile-tests.mjs) that compiles all src/ TypeScript to dist-test/
in ~3s, then runs the pre-compiled JS. Eliminates ~1.7s Node startup overhead
per test file.
- scripts/compile-tests.mjs: esbuild compilation, asset copy, .ts→.js rewrite,
stale file cleanup; creates dist-test/node_modules symlink so resource-loader.ts
resolves gsdNodeModules to a real path (fixes node-modules-symlink test failure)
- scripts/dist-test-resolve.mjs: ESM loader hook for @gsd/* bare specifiers and
.ts→.js fallback rewriting at runtime
- .gitignore: exclude dist-test/ from version control
- package.json: add test:compile script; update test:unit to compile-then-run;
update test:integration globs to cover new integration/ subdirectories
- worker-registry.ts: unref() cleanup timer so it does not keep the Node process
alive after tests complete
Closes #2858
* fix(test): update relative imports in tests/integration/ after directory move
When tests were moved from tests/ to tests/integration/ in the previous
commit, relative imports weren't updated. ../foo now resolves one level
too shallow.
Fix all 117 import paths across 43 test files:
- ../foo → ../../foo (source files at gsd/ level)
- ../../get-secrets-from-user.ts → ../../../ (at extensions/ level)
- ../../subagent/worker-registry.ts → ../../../ (at extensions/ level)
- ./marketplace-test-fixtures.js → ../marketplace-test-fixtures.ts
- ./test-helpers.ts → ../test-helpers.ts
typecheck:extensions now passes with zero errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(integration): set 10-minute timeout for integration test runner
The build job takes ~7min on main. Without a global timeout, hanging tests
block the suite indefinitely. --test-timeout=600000 caps each test at 10min.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Revert "test(integration): set 10-minute timeout for integration test runner"
This reverts commit be77ead77d369ad8569292ae6b69ba56435f5433.
* fix(test): correct formatDuration(0) edge case and docker test root path
- formatDuration(0) now returns '0s' instead of '0ms' by guarding the
sub-second branch with ms > 0
- docker-template.test.ts root path goes ../../.. from dist-test/src/tests/
to reach project root instead of landing in dist-test/
- replace require() calls in skill-health.ts and visualizer-overlay.ts
with proper ES module imports
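The formatDuration(0) guard from the first bullet can be sketched as follows. Only the `ms > 0` guard and the '0s' result are from this commit; the seconds and minutes formats are assumptions and may differ from the real helper.

```typescript
function formatDuration(ms: number): string {
  // Sub-second branch, guarded so that 0 falls through to the seconds branch.
  if (ms > 0 && ms < 1000) return `${ms}ms`;
  const totalSeconds = Math.floor(ms / 1000);
  if (totalSeconds < 60) return `${totalSeconds}s`;
  const minutes = Math.floor(totalSeconds / 60);
  return `${minutes}m ${totalSeconds % 60}s`; // assumed format for longer spans
}
```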
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): correct relative import paths in integration tests
All affected tests were one directory level off — importing from ../web/
and ../resources/ when the correct paths are ../../web/ and ../../resources/.
Tests live at src/tests/integration/, not src/tests/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): add esbuild to root devDeps and wire dist-test-resolve hook
P1: esbuild was only in web/package.json — compile-tests.mjs requires it
at the root node_modules path, so CI failed on clean installs.
P2: dist-test-resolve.mjs existed but was never loaded; @gsd/* imports in
compiled tests resolved to installed workspace packages instead of freshly
compiled dist-test output. Add --import to test:unit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(deps): align esbuild version with lock file (0.25.12)
^0.27.4 didn't satisfy the existing lock file entry. Use the version
already present so npm ci passes without regenerating the lock file.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): correct all relative import depths in src/tests/integration/
Tests in src/tests/integration/ need 3 levels up (../../..) to reach
project-root dirs (web/, packages/) and 2 levels up (../..) to reach
src-level dirs (src/web/, src/cli-web-branch.ts).
Fixes:
- ../../web/lib/ → ../../../web/lib/ (Next.js app, not src/web/)
- ../../web/app/ → ../../../web/app/
- ../../packages/ → ../../../packages/
- ../cli-web-branch.ts → ../../cli-web-branch.ts
- ../web-mode.ts → ../../web-mode.ts
- ../resources/extensions/ → ../../resources/extensions/
- ci_monitor ROOT path: 2 levels up → 3 levels up
- web-responsive WEB_ROOT: 2 levels up → 3 levels up
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): use dot reporter for test:unit to reduce noise
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): switch test:unit reporter to tap
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): compact test reporter — silent on pass, failures + summary only
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): include shared/tests in test:coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): correct path depths in tests moved to integration/
Tests moved from tests/ to tests/integration/ need one extra ../
to reach the same source files. Also fix web component paths — those
files live at web/ not src/web/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): fix web component paths in web-session-parity-contract
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): use process.cwd() for project root in docker-template test
Resolving relative to __dirname breaks under test:coverage which runs
source files directly from src/tests/ — needs ../.. not ../../..
(the extra level only exists in the compiled dist-test/ output).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ci: retrigger CI
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(parallel): add /gsd parallel watch command and Ctrl+Alt+P overlay
Integrates the parallel worker monitor as a native pi-tui overlay that
renders inside the GSD session, matching the existing dashboard overlay
pattern (GSDDashboardOverlay / Ctrl+Alt+G).
Three integration points:
- /gsd parallel watch — opens the live monitor overlay
- Ctrl+Alt+P — keyboard shortcut (same pattern as Ctrl+Alt+G for status)
- Tab completion: 'watch' added to parallel subcommand completions
The overlay (ParallelMonitorOverlay) provides:
- Per-worker panels: health dot, phase label, slice/task progress bars
- Event feed: recent task completions from worktree SQLite DBs
- Cost tracking: status.json with NDJSON fallback for respawned workers
- Heartbeat: orchestrator timestamp or file mtime proxy
- Scrollable: arrow keys / j/k, ESC/q to close
- 5s auto-refresh via setInterval
Reuses all data-reading logic from the standalone scripts/parallel-monitor.mjs
(merged in #2799) but renders through the pi-tui theme system instead of
raw ANSI codes. Follows the same overlay registration pattern as the
GSD dashboard (register-shortcuts.ts + handlers/core.ts).
* fix(parallel): align overlay with Component interface, add tests
- Add invalidate() method required by Component interface
- Fix handleInput signature: void return, not boolean
- Fix Key usage: Key.escape/Key.down/Key.up (constants, not functions)
- Fix render signature: single width arg, not (width, height)
- Add resize listener cleanup in dispose()
- Add parallel-monitor-overlay.test.ts (satisfies require-tests CI gate)
* fix(parallel): use spawnSync for cross-platform path safety
Replace execSync template literals with spawnSync array args for sqlite3
calls. Paths with spaces or special chars broke on Windows because
execSync interpolates into a shell string. spawnSync passes args directly
to the process, bypassing shell interpretation.
Fixes cross-platform-filesystem-safety.test.ts assertion.
Transient provider recovery previously sent a hidden continue message after the backoff timer elapsed, but the auto loop had already exited. Resume the paused session through startAuto() instead so the timer actually restarts auto-mode, and cover the resumed, duplicate-resume, and missing-base-path cases with regression tests.
Closes #2813
GLM-5.1 is the latest Zhipu AI model, with a 204K context window and
131K max output tokens. It uses the Z.AI Coding Plan endpoint
(OpenAI-compatible) and supports reasoning via enable_thinking.
Not yet tracked by models.dev, so added to models.custom.ts alongside
existing alibaba-coding-plan entries. Merges additively with the
generated Z.AI provider (glm-5, glm-5-turbo, etc.).
Specs from https://docs.z.ai/devpack/using5.1
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Per-milestone lock isolation prevents workers from contending on shared
.gsd/auto.lock. Budget ceiling scoped to current session for parallel
workers. Symlink sync skip prevents ERR_FS_CP_EINVAL. Planning artifacts
copied to worktree so workers can find their roadmap.
Closes #2184