chore(merge): resolve conflict with upstream/main for PR #3204

Keep catch-all STREAM_RE from PR; upstream's 5-variant whack-a-mole is
superseded by the /in JSON at position \d+/ pattern. Also drop the now-
stale comment about checking stream before server/connection (no longer
needed since catch-all avoids those false-positive overlaps).
This commit is contained in:
Jeremy 2026-04-01 14:05:28 -05:00
commit f7cb3ec07b
316 changed files with 24307 additions and 1333 deletions

View file

@ -0,0 +1,138 @@
# Extension Loading: Dependency Sort + Unified Enable/Disable
## Context
GSD-2 has a well-structured extension system with three discovery paths (bundled, global/community, project-local) that are **already wired up** through pi's `DefaultPackageManager.addAutoDiscoveredResources()`. However, two critical gaps remain:
1. `sortExtensionPaths()` (topological dependency sort) is implemented but **never called**; `dependencies.extensions` in manifests is decorative
2. The GSD extension registry (enable/disable) only applies to **bundled** extensions — community extensions bypass it entirely
### Architecture (Current Flow)
```
GSD loader.ts
→ discoverExtensionEntryPaths(bundledExtDir)
→ filter by GSD registry (isExtensionEnabled)
→ set GSD_BUNDLED_EXTENSION_PATHS env var
DefaultResourceLoader.reload()
→ packageManager.resolve()
→ addAutoDiscoveredResources()
→ project: cwd/.gsd/extensions/ (CONFIG_DIR_NAME = ".gsd")
→ global: ~/.gsd/agent/extensions/ (includes synced bundled)
→ loadExtensions(mergedPaths) ← NO sort, NO registry check on community
```
### Key Files
| File | Role |
|------|------|
| `src/loader.ts` (lines 146-161) | GSD startup — bundled discovery + registry filter |
| `src/extension-sort.ts` | Topological sort (Kahn's BFS) — EXISTS but NEVER CALLED |
| `src/extension-registry.ts` | Registry I/O, enable/disable, tier checks |
| `src/resource-loader.ts` (lines 589-607) | `buildResourceLoader()` — constructs DefaultResourceLoader |
| `packages/pi-coding-agent/src/core/resource-loader.ts` (lines 311-395) | `reload()` — merges paths, calls `loadExtensions()` |
| `packages/pi-coding-agent/src/core/package-manager.ts` (lines 1585-1700) | `addAutoDiscoveredResources()` — auto-discovers from .gsd/ dirs |
| `packages/pi-coding-agent/src/core/extensions/loader.ts` (lines 945-1002) | `discoverAndLoadExtensions()` — DEAD CODE, never invoked |
---
## Plan
### Task 1: Wire topological sort into extension loading
**What:** Call `sortExtensionPaths()` on the merged extension paths before passing them to `loadExtensions()`.
**Where:** `packages/pi-coding-agent/src/core/resource-loader.ts` ~lines 381-385
**Before:**
```typescript
const extensionsResult = await loadExtensions(extensionPaths, this.cwd, this.eventBus);
```
**After:**
```typescript
// import added at the top of resource-loader.ts:
import { sortExtensionPaths } from '../../../src/extension-sort.js';

// at the former loadExtensions() call site:
const { sortedPaths, warnings } = sortExtensionPaths(extensionPaths);
for (const w of warnings) {
// surface each warning as a diagnostic, not a hard error (see Task 3)
}
const extensionsResult = await loadExtensions(sortedPaths, this.cwd, this.eventBus);
```
**Consideration:** `sortExtensionPaths` lives in `src/` (GSD side), not in `packages/pi-coding-agent/`. Need to either:
- (a) Move it into pi-coding-agent as a shared utility, OR
- (b) Import it cross-package (already done for other GSD→pi imports), OR
- (c) Call it on the GSD side before paths reach pi — harder since auto-discovered paths are added inside pi's package manager
Option (a) is cleanest — the sort logic only depends on `readManifestFromEntryPath` which is also in `src/extension-registry.ts` but could be duplicated or shared.
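Wherever it ends up living, the shape of the sort is simple. A minimal sketch of a Kahn's-BFS dependency sort over extension manifests, purely illustrative — the real implementation is `src/extension-sort.ts`, and the `ManifestInfo` shape and warning strings here are assumptions:

```typescript
interface ManifestInfo {
  id: string;
  path: string;
  dependencies: string[]; // from the manifest's dependencies.extensions
}

function sortExtensionPathsSketch(
  entries: ManifestInfo[],
): { sortedPaths: string[]; warnings: string[] } {
  const warnings: string[] = [];
  const byId = new Map<string, ManifestInfo>();
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const e of entries) {
    byId.set(e.id, e);
    inDegree.set(e.id, 0);
  }
  for (const e of entries) {
    for (const dep of e.dependencies) {
      if (!byId.has(dep)) {
        // missing dep is a warning, not a hard error (see Task 3)
        warnings.push(`'${e.id}' depends on '${dep}' which is not installed`);
        continue;
      }
      inDegree.set(e.id, (inDegree.get(e.id) ?? 0) + 1);
      dependents.set(dep, [...(dependents.get(dep) ?? []), e.id]);
    }
  }
  // Kahn's BFS: start from nodes with no unmet dependencies;
  // sort queues alphabetically so the order is deterministic.
  const queue = entries
    .filter((e) => inDegree.get(e.id) === 0)
    .map((e) => e.id)
    .sort();
  const order: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift()!;
    order.push(id);
    for (const next of (dependents.get(id) ?? []).sort()) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  if (order.length < entries.length) {
    // anything left participates in a cycle — fall back to alphabetical
    const leftover = entries
      .map((e) => e.id)
      .filter((id) => !order.includes(id))
      .sort();
    warnings.push(`cycle detected among: ${leftover.join(", ")}`);
    order.push(...leftover);
  }
  return { sortedPaths: order.map((id) => byId.get(id)!.path), warnings };
}
```

Note the two failure modes map directly onto the warnings Task 3 wants to surface: missing deps load anyway, cycles fall back to alphabetical order.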
### Task 2: Apply GSD registry to community extensions
**What:** When `buildResourceLoader()` in `src/resource-loader.ts` constructs the DefaultResourceLoader, also discover and filter community extensions from `~/.gsd/agent/extensions/` through the GSD registry — same as it already does for `~/.pi/agent/extensions/` paths.
**Where:** `src/resource-loader.ts` → `buildResourceLoader()` (lines 589-607)
**Current code already filters pi extensions:**
```typescript
const piExtensionPaths = discoverExtensionEntryPaths(piExtensionsDir)
.filter((entryPath) => !bundledKeys.has(getExtensionKey(entryPath, piExtensionsDir)))
.filter((entryPath) => {
const manifest = readManifestFromEntryPath(entryPath)
if (!manifest) return true
return isExtensionEnabled(registry, manifest.id)
})
```
**Add similar filtering for community extensions in agentDir:**
- Discover extensions in `~/.gsd/agent/extensions/` that are NOT bundled
- Filter through `isExtensionEnabled(registry, manifest.id)`
- Pass the disabled set to the resource loader (via override patterns) or pre-filter the paths before they are merged
**Alternative approach:** Hook into `addAutoDiscoveredResources` or the `addResource` call to check the GSD registry. This might be cleaner since the auto-discovery already happens inside pi's package manager.
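Either way, the filter itself mirrors the pi-extension filtering shown above. A sketch with simplified stand-in signatures — the real `readManifestFromEntryPath` / `isExtensionEnabled` live in `src/extension-registry.ts` and the `Registry` shape here is an assumption:

```typescript
// Simplified stand-in for the GSD registry — the real type is richer.
type Registry = { disabled: Set<string> };

function isExtensionEnabledSketch(registry: Registry, id: string): boolean {
  return !registry.disabled.has(id);
}

function filterCommunityPaths(
  entryPaths: string[],
  bundledKeys: Set<string>,
  registry: Registry,
  readManifest: (p: string) => { id: string } | undefined,
): string[] {
  return (
    entryPaths
      // skip anything already provided as a bundled extension
      .filter((p) => !bundledKeys.has(p))
      .filter((p) => {
        const manifest = readManifest(p);
        // no manifest → nothing to key the registry on; keep it
        // (matches the existing pi-extension filter's behavior)
        if (!manifest) return true;
        return isExtensionEnabledSketch(registry, manifest.id);
      })
  );
}
```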
### Task 3: Emit sort warnings as diagnostics
**What:** Surface dependency warnings (missing deps, cycles) through GSD's diagnostic system so users see them.
**Where:** Wherever the sort is invoked from Task 1.
**Format:**
```
⚠ Extension 'gsd-watch' declares dependency 'gsd' which is not installed — loading anyway
⚠ Extensions 'foo' and 'bar' form a dependency cycle — loading in alphabetical order
```
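A small formatter can produce exactly those lines from structured warnings. The `SortWarning` shape below is an assumption — the real `sortExtensionPaths` may already return formatted strings, in which case this step collapses to a pass-through:

```typescript
// Assumed structured warning shape; illustrative only.
type SortWarning =
  | { kind: "missing-dep"; extension: string; dependency: string }
  | { kind: "cycle"; extensions: string[] };

function formatSortWarning(w: SortWarning): string {
  switch (w.kind) {
    case "missing-dep":
      return `⚠ Extension '${w.extension}' declares dependency '${w.dependency}' which is not installed — loading anyway`;
    case "cycle":
      return `⚠ Extensions ${w.extensions.map((e) => `'${e}'`).join(" and ")} form a dependency cycle — loading in alphabetical order`;
  }
}
```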
### Task 4: Clean up dead code
**What:** The `discoverAndLoadExtensions()` function in `packages/pi-coding-agent/src/core/extensions/loader.ts` (lines 945-1002) is exported but never invoked. The project-local trust model inside it (`getUntrustedExtensionPaths`) also never runs.
**Options:**
- (a) Remove it entirely — it's dead
- (b) Mark deprecated — in case upstream pi uses it
- (c) Leave it — lowest risk
Recommend (b) for now — add `@deprecated` JSDoc so it doesn't grow new callers.
### Task 5: Tests
- **Sort integration test:** Create two extensions where A depends on B. Verify B loads before A after sort.
- **Registry community test:** Drop a community extension in `~/.gsd/agent/extensions/`, run `gsd extensions disable <id>`, verify it doesn't load.
- **Conflict test:** Same extension ID in project-local and global — verify project-local wins.
- **Missing dep test:** Extension declares dependency on non-existent extension — verify warning emitted, extension still loads.
- **Cycle test:** Two extensions that depend on each other — verify warning, both load.
---
## Follow-up PR (separate)
**Subagent extension forwarding:** Update `src/resources/extensions/subagent/index.ts` to forward ALL extension paths (not just bundled) to child processes. May need a second env var like `GSD_COMMUNITY_EXTENSION_PATHS` or consolidate into `GSD_EXTENSION_PATHS`.
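If the consolidation route is taken, the env var handling is just a join/split with the platform path delimiter. A sketch — `GSD_EXTENSION_PATHS` is the name floated above, not an existing variable:

```typescript
import { delimiter } from "node:path";

// Parent side: serialize all extension paths into one env var.
function encodeExtensionPaths(paths: string[]): string {
  return paths.join(delimiter);
}

// Child side: recover the list, tolerating an unset variable.
function decodeExtensionPaths(value: string | undefined): string[] {
  return value ? value.split(delimiter).filter(Boolean) : [];
}
```

Using `path.delimiter` (`:` on POSIX, `;` on Windows) keeps the variable safe for paths containing either character on the other platform.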
---
## Open Questions
1. **Where should `sortExtensionPaths` live?** Currently in `src/` (GSD side). Needs to be callable from pi's resource-loader. Options: move to pi, keep and import cross-package, or duplicate.
2. **Should community extensions respect the same registry as bundled?** Or should they have their own enable/disable mechanism? Current plan unifies them.
3. **Project-local trust:** The TOFU model in the dead `discoverAndLoadExtensions()` never runs. Should `addAutoDiscoveredResources` also gate project-local extensions behind trust? Or is `.gsd/extensions/` in your own project always trusted?

View file

@ -0,0 +1,241 @@
# Ollama Extension — First-Class Local LLM Support
## Status: DRAFT — Awaiting approval
## Problem
Ollama support in GSD2 currently requires manual `models.json` configuration. Users must:
1. Know the OpenAI-compatibility endpoint (`localhost:11434/v1`)
2. Manually list every model they want to use
3. Set compat flags (`supportsDeveloperRole: false`, etc.)
4. Use a dummy API key
There's an `ollama-cloud` provider for hosted Ollama, and a discovery adapter that can list models, but no first-class **local Ollama** extension that "just works."
## Goal
Make Ollama the easiest way to use GSD2 — zero config when Ollama is running locally. All Ollama functionality lives in a single extension: `src/resources/extensions/ollama/`.
## Architecture
Everything is a self-contained extension under `src/resources/extensions/ollama/`. The extension:
- Auto-detects Ollama on startup via health check
- Discovers and registers local models with the model registry
- Provides native Ollama API streaming (not OpenAI shim)
- Exposes `/ollama` slash commands for model management
- Registers an LLM-callable tool for model pull/status
Minimal core changes — only `KnownProvider` and `KnownApi` type additions in `pi-ai`, and `env-api-keys.ts` for key resolution. Everything else is in the extension.
## File Structure
```
src/resources/extensions/ollama/
├── index.ts # Extension entry — wires everything on session_start
├── ollama-client.ts # HTTP client for Ollama REST API (/api/*)
├── ollama-discovery.ts # Model discovery + capability detection
├── ollama-provider.ts # Native /api/chat streaming provider (registers with pi-ai)
├── ollama-commands.ts # /ollama slash commands (status, pull, list, remove, ps)
├── ollama-tool.ts # LLM-callable tool for model management
├── model-capabilities.ts # Known model capability table (context window, vision, reasoning)
└── types.ts # Shared types for Ollama API responses
```
## Scope
### Phase 1: Auto-Discovery + OpenAI-Compat Routing
**What:** Extension that auto-detects Ollama, discovers models, registers them using the existing `openai-completions` API provider. Zero config needed.
**Extension files:**
- `ollama/index.ts` — Main entry. On `session_start`:
1. Probe `localhost:11434` (or `OLLAMA_HOST`) with 1.5s timeout
2. If reachable, discover models via `/api/tags`
3. Register discovered models with `ctx.modelRegistry` using correct defaults
4. Show status widget if Ollama is detected
- `ollama/ollama-client.ts` — Low-level HTTP client:
- `isRunning()` → `GET /` health check
- `getVersion()` → `GET /api/version`
- `listModels()` → `GET /api/tags`
- `showModel(name)` → `POST /api/show` (details, template, parameters, size)
- `getRunningModels()` → `GET /api/ps` (loaded models, VRAM usage)
- `pullModel(name, onProgress)` → `POST /api/pull` (streaming progress)
- `deleteModel(name)` → `DELETE /api/delete`
- `copyModel(source, dest)` → `POST /api/copy`
- Respects `OLLAMA_HOST` env var for non-default endpoints
- `ollama/ollama-discovery.ts` — Enhanced model discovery:
- Calls `/api/tags` to get model list
- Calls `/api/show` per model (batch, cached) to get:
- `details.parameter_size` → estimate context window
- `details.families` → detect vision (clip), reasoning (deepseek-r1)
- `modelfile` → extract default parameters
- Returns enriched `DiscoveredModel[]` with proper capabilities
- `ollama/model-capabilities.ts` — Known model lookup table:
- Maps well-known model families to capabilities
- e.g., `llama3.1` → `{ contextWindow: 131072, input: ["text"] }`
- e.g., `llava` → `{ contextWindow: 4096, input: ["text", "image"] }`
- e.g., `deepseek-r1` → `{ reasoning: true, contextWindow: 131072 }`
- e.g., `qwen2.5-coder` → `{ contextWindow: 131072, input: ["text"] }`
- Fallback: estimate from parameter count if not in table
- `ollama/types.ts` — Ollama API response types
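The probe-and-discover path above can be sketched as a thin client. The `OLLAMA_HOST` fallback and 1.5s timeout match the plan; the class shape and method names are assumptions about what `ollama-client.ts` will look like:

```typescript
interface OllamaTag {
  name: string;
  size: number;
}

class OllamaClientSketch {
  constructor(
    private baseUrl: string = process.env.OLLAMA_HOST ?? "http://localhost:11434",
  ) {}

  /** GET / with a short timeout — false on any failure, never throws. */
  async isRunning(timeoutMs = 1500): Promise<boolean> {
    try {
      const res = await fetch(this.baseUrl, {
        signal: AbortSignal.timeout(timeoutMs),
      });
      return res.ok;
    } catch {
      // unreachable, timed out, refused — all mean "not running"
      return false;
    }
  }

  /** GET /api/tags — list of locally pulled models. */
  async listModels(): Promise<OllamaTag[]> {
    const res = await fetch(`${this.baseUrl}/api/tags`);
    if (!res.ok) throw new Error(`Ollama /api/tags failed: ${res.status}`);
    const body = (await res.json()) as { models: OllamaTag[] };
    return body.models;
  }
}
```

Swallowing every probe error is deliberate: the "Ollama isn't running → extension is silent" behavior falls out of `isRunning()` never throwing.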
**Core changes (minimal):**
- `packages/pi-ai/src/types.ts` — Add `"ollama"` to `KnownProvider`
- `packages/pi-ai/src/env-api-keys.ts` — Add `"ollama"` key resolution (returns `"ollama"` placeholder — no real key needed)
- `src/onboarding.ts` — Add `"ollama"` to provider selection list
- `src/wizard.ts` — Add `ollama` entry (no key required)
**Model registration details:**
Each discovered model registers as:
```typescript
{
id: "llama3.1:8b", // from /api/tags
name: "Llama 3.1 8B", // humanized
api: "openai-completions", // uses existing provider
provider: "ollama",
baseUrl: "http://localhost:11434/v1",
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
reasoning: false, // from capabilities table
input: ["text"], // from capabilities table
contextWindow: 131072, // from capabilities table or /api/show
maxTokens: 16384, // conservative default
compat: {
supportsDeveloperRole: false,
supportsReasoningEffort: false,
supportsUsageInStreaming: false,
maxTokensField: "max_tokens",
},
}
```
**Behavior:**
- `gsd --list-models` shows all locally-pulled Ollama models automatically
- `/model ollama/llama3.1:8b` works without any config file
- If Ollama isn't running, extension is silent — no errors, no models listed
- `models.json` overrides still work (user config wins over auto-discovery)
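The capability lookup with its parameter-count fallback can be sketched as below. The table entries mirror the examples given earlier; the estimation thresholds in the fallback are assumptions, not measured values:

```typescript
interface ModelCaps {
  contextWindow: number;
  input: string[];
  reasoning?: boolean;
}

// Known model families, keyed by the part before the tag colon.
const KNOWN_CAPS: Record<string, ModelCaps> = {
  "llama3.1": { contextWindow: 131072, input: ["text"] },
  "llava": { contextWindow: 4096, input: ["text", "image"] },
  "deepseek-r1": { contextWindow: 131072, input: ["text"], reasoning: true },
  "qwen2.5-coder": { contextWindow: 131072, input: ["text"] },
};

function lookupCaps(modelId: string, parameterSize?: string): ModelCaps {
  // "llama3.1:8b" → family "llama3.1"
  const family = modelId.split(":")[0];
  const known = KNOWN_CAPS[family];
  if (known) return known;
  // Fallback: rough estimate from /api/show's details.parameter_size,
  // e.g. "7B" → 7. Thresholds here are illustrative guesses.
  const params = parameterSize ? parseFloat(parameterSize) : 0;
  return { contextWindow: params >= 7 ? 32768 : 8192, input: ["text"] };
}
```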
### Phase 2: Native Ollama API Provider (`/api/chat`)
**What:** A dedicated streaming provider that talks Ollama's native protocol instead of the OpenAI compatibility shim.
**Extension files:**
- `ollama/ollama-provider.ts` — Native `/api/chat` streaming:
- Registers `"ollama-chat"` API with `registerApiProvider()`
- Implements `stream()` and `streamSimple()`:
- Maps GSD `Context` → Ollama messages format
- Maps GSD `Tool[]` → Ollama tool format
- Streams NDJSON responses, maps back to `AssistantMessage` events
- Extracts `<think>` blocks for reasoning models (deepseek-r1, qwq)
- Ollama-specific options:
- `keep_alive` — control model memory retention (default: "5m")
- `num_ctx` — pass through model's context window
- `num_predict` — max output tokens
- Temperature, top_p, top_k
- Response metadata:
- `eval_count` / `eval_duration` → tokens/sec in usage stats
- `total_duration`, `load_duration` → performance visibility
- Vision support: converts image content to base64 for multimodal models
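The two protocol-level pieces above — NDJSON chunk handling and `<think>` extraction — can be sketched in isolation. The chunk shape is a trimmed-down `/api/chat` response; real streaming would consume a `ReadableStream` incrementally rather than a complete string:

```typescript
// Trimmed-down /api/chat NDJSON chunk — the real response has more fields.
interface ChatChunk {
  message?: { content: string };
  done: boolean;
  eval_count?: number;
}

// Each NDJSON line is an independent JSON document.
function parseNdjson(raw: string): ChatChunk[] {
  return raw
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as ChatChunk);
}

/** Split accumulated output into reasoning (<think>…</think>) and answer text. */
function splitThinking(text: string): { thinking: string; answer: string } {
  const m = text.match(/<think>([\s\S]*?)<\/think>/);
  if (!m) return { thinking: "", answer: text };
  return { thinking: m[1].trim(), answer: text.replace(m[0], "").trim() };
}
```

`eval_count` / `eval_duration` from the final `done: true` chunk are what feed the tokens/sec usage stats mentioned above.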
**Core changes:**
- `packages/pi-ai/src/types.ts` — Add `"ollama-chat"` to `KnownApi`
**Phase 1 models switch to `api: "ollama-chat"` by default.** Users can force OpenAI-compat via `models.json` override if needed.
**Why native over OpenAI-compat:**
- Full `keep_alive` / `num_ctx` control
- Better error messages (Ollama-native vs generic OpenAI)
- More reliable tool calling on Ollama's native format
- Performance metrics in response (tokens/sec)
- Foundation for model management commands
### Phase 3: Local LLM Management UX
**What:** `/ollama` slash commands and an LLM tool for model management.
**Extension files:**
- `ollama/ollama-commands.ts` — Slash commands registered via `pi.registerCommand()`:
- `/ollama` — Status overview:
```
Ollama v0.5.7 — running (localhost:11434)
Loaded:
llama3.1:8b 4.7 GB VRAM idle 3m
Available:
llama3.1:8b (4.7 GB)
qwen2.5-coder:7b (4.4 GB)
deepseek-r1:8b (4.9 GB)
```
- `/ollama pull <model>` — Pull with streaming progress via `ctx.ui.setWidget()`
- `/ollama list` — List all local models with sizes and families
- `/ollama remove <model>` — Delete a model (with confirmation)
- `/ollama ps` — Running models + VRAM usage
- `ollama/ollama-tool.ts` — LLM-callable tool registered via `pi.registerTool()`:
- `ollama_manage` tool — lets the agent pull/list/check models
- Parameters: `{ action: "list" | "pull" | "status" | "ps", model?: string }`
- Use case: agent detects it needs a model, pulls it automatically
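The `ollama_manage` contract above amounts to a small parameter check. The real registration would go through `pi.registerTool()` with a proper schema; this hand-rolled validator just illustrates the contract and is an assumption about its shape:

```typescript
type ManageAction = "list" | "pull" | "status" | "ps";

interface ManageParams {
  action: ManageAction;
  model?: string;
}

function validateManageParams(input: unknown): ManageParams {
  const p = input as Partial<ManageParams> | null;
  const actions: ManageAction[] = ["list", "pull", "status", "ps"];
  if (!p || !actions.includes(p.action as ManageAction)) {
    throw new Error(`action must be one of: ${actions.join(", ")}`);
  }
  // pull is the only action that requires a model name
  if (p.action === "pull" && typeof p.model !== "string") {
    throw new Error("pull requires a model name");
  }
  return { action: p.action as ManageAction, model: p.model };
}
```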
**UX Flow:**
```
$ gsd
> /ollama
Ollama v0.5.7 — running (localhost:11434)
Loaded:
llama3.1:8b — 4.7 GB VRAM, idle 3m
Available:
llama3.1:8b (4.7 GB)
qwen2.5-coder:7b (4.4 GB)
deepseek-r1:8b (4.9 GB)
> /ollama pull codestral:22b
Pulling codestral:22b...
████████████████████████████░░░░ 78% (14.2 GB / 18.1 GB)
✓ codestral:22b ready
> /model ollama/codestral:22b
Switched to codestral:22b (local, Ollama)
```
## Implementation Order
1. **Phase 1** — Auto-discovery with OpenAI-compat routing. Biggest user impact, smallest risk.
2. **Phase 3** — Management UX (`/ollama` commands). Valuable even before native API.
3. **Phase 2** — Native `/api/chat` provider. Optimization over OpenAI-compat; do last.
## Core Changes Summary (minimal)
| File | Change |
|------|--------|
| `packages/pi-ai/src/types.ts` | Add `"ollama"` to `KnownProvider`, `"ollama-chat"` to `KnownApi` (Phase 2) |
| `packages/pi-ai/src/env-api-keys.ts` | Add `"ollama"` → always returns `"ollama"` placeholder |
| `src/onboarding.ts` | Add `"ollama"` to provider picker |
| `src/wizard.ts` | Add `"ollama"` key mapping (no key required) |
Everything else lives in `src/resources/extensions/ollama/`.
## Risks & Mitigations
| Risk | Mitigation |
|------|------------|
| Ollama not running — startup probe latency | 1.5s timeout; cache result; probe async so it doesn't block TUI paint |
| Model capabilities unknown | Known-model table + `/api/show` fallback + parameter_size estimation |
| Tool calling unreliable on small models | Detect param count; warn on <7B models |
| Ollama API changes between versions | Version detect via `/api/version`; stable endpoints only |
| Conflicts with `models.json` Ollama config | User config always wins; auto-discovered models merge beneath manual config |
| Extension disabled — no impact on core | Extension is additive; disabling removes all Ollama features cleanly |
## Testing Strategy
- Unit tests: `ollama-client.ts` with mocked fetch responses
- Unit tests: `ollama-discovery.ts` model capability parsing
- Unit tests: `ollama-provider.ts` message format mapping + NDJSON stream parsing
- Unit tests: `model-capabilities.ts` known model lookups
- Integration test: mock HTTP server simulating Ollama `/api/tags`, `/api/chat`, `/api/pull`
- Manual test: real Ollama instance with llama3.1, qwen2.5-coder, deepseek-r1
## Open Questions
1. **Startup probe** — Probe Ollama on `session_start` (adds ~1.5s if not running) or lazy on first `/model`? **Recommendation: async probe on session_start (non-blocking), eager if `OLLAMA_HOST` is set.**
2. **Auto-start** — Try to launch Ollama if installed but not running? **Recommendation: no — too invasive. Show helpful message in `/ollama` status.**
3. **Vision support** — Support multimodal models (llava, etc.) in Phase 2 native API? **Recommendation: yes, detected via capabilities table.**
4. **Model refresh** — How often to re-probe Ollama for new models? **Recommendation: on `/ollama list`, on `/model` command, and every 5 min (existing TTL).**

View file

@ -7,7 +7,7 @@
[![npm version](https://img.shields.io/npm/v/gsd-pi?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/gsd-pi)
[![npm downloads](https://img.shields.io/npm/dm/gsd-pi?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/gsd-pi)
[![GitHub stars](https://img.shields.io/github/stars/gsd-build/GSD-2?style=for-the-badge&logo=github&color=181717)](https://github.com/gsd-build/GSD-2)
[![Discord](https://img.shields.io/badge/Discord-Join%20us-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/gsd)
[![Discord](https://img.shields.io/badge/Discord-Join%20us-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.com/invite/nKXTsAcmbT)
[![License](https://img.shields.io/badge/license-MIT-blue?style=for-the-badge)](LICENSE)
[![$GSD Token](https://img.shields.io/badge/$GSD-Dexscreener-1C1C1C?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMjQiIGhlaWdodD0iMjQiIHZpZXdCb3g9IjAgMCAyNCAyNCIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj48Y2lyY2xlIGN4PSIxMiIgY3k9IjEyIiByPSIxMCIgZmlsbD0iIzAwRkYwMCIvPjwvc3ZnPg==&logoColor=00FF00)](https://dexscreener.com/solana/dwudwjvan7bzkw9zwlbyv6kspdlvhwzrqy6ebk8xzxkv)

View file

@ -38,6 +38,6 @@ Or just use conventional directory names (`extensions/`, `skills/`, `prompts/`,
- [Package gallery](https://shittycodingagent.ai/packages)
- [npm search](https://www.npmjs.com/search?q=keywords%3Api-package)
- [Discord community](https://discord.com/invite/3cU7Bz4UPx)
- [Discord community](https://discord.com/invite/nKXTsAcmbT)
---

View file

@ -54,7 +54,7 @@
"copy-themes": "node scripts/copy-themes.cjs",
"copy-export-html": "node scripts/copy-export-html.cjs",
"test:compile": "node scripts/compile-tests.mjs",
"test:unit": "npm run test:compile && node --import ./scripts/dist-test-resolve.mjs --experimental-test-isolation=process --test-reporter=./scripts/test-reporter-compact.mjs --test 'dist-test/src/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.mjs' 'dist-test/src/resources/extensions/shared/tests/*.test.js' 'dist-test/src/resources/extensions/claude-code-cli/tests/*.test.js' 'dist-test/src/resources/extensions/github-sync/tests/*.test.js' 'dist-test/src/resources/extensions/universal-config/tests/*.test.js' 'dist-test/src/resources/extensions/voice/tests/*.test.js'",
"test:unit": "npm run test:compile && node --import ./scripts/dist-test-resolve.mjs --experimental-test-isolation=process --test-reporter=./scripts/test-reporter-compact.mjs --test 'dist-test/src/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.mjs' 'dist-test/src/resources/extensions/shared/tests/*.test.js' 'dist-test/src/resources/extensions/claude-code-cli/tests/*.test.js' 'dist-test/src/resources/extensions/github-sync/tests/*.test.js' 'dist-test/src/resources/extensions/universal-config/tests/*.test.js' 'dist-test/src/resources/extensions/voice/tests/*.test.js' 'dist-test/src/resources/extensions/mcp-client/tests/*.test.js'",
"test:packages": "node --test packages/pi-coding-agent/dist/core/*.test.js",
"test:marketplace": "GSD_TEST_CLONE_MARKETPLACES=1 node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/claude-import-tui.test.ts src/resources/extensions/gsd/tests/plugin-importer-live.test.ts src/tests/marketplace-discovery.test.ts",
"test:coverage": "c8 --reporter=text --reporter=lcov --exclude='src/resources/extensions/gsd/tests/**' --exclude='src/tests/**' --exclude='scripts/**' --exclude='native/**' --exclude='node_modules/**' --check-coverage --statements=40 --lines=40 --branches=20 --functions=20 node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --experimental-test-isolation=process --test src/resources/extensions/gsd/tests/*.test.ts src/resources/extensions/gsd/tests/*.test.mjs src/tests/*.test.ts src/resources/extensions/shared/tests/*.test.ts",

View file

@ -2,7 +2,7 @@
"name": "@gsd/native",
"version": "0.1.0",
"description": "Native Rust bindings for GSD \u2014 high-performance native modules via N-API",
"type": "module",
"type": "commonjs",
"main": "./dist/index.js",
"types": "./dist/index.d.ts",
"scripts": {
@ -14,75 +14,75 @@
"exports": {
".": {
"types": "./dist/index.d.ts",
"import": "./dist/index.js"
"default": "./dist/index.js"
},
"./grep": {
"types": "./dist/grep/index.d.ts",
"import": "./dist/grep/index.js"
"default": "./dist/grep/index.js"
},
"./ps": {
"types": "./dist/ps/index.d.ts",
"import": "./dist/ps/index.js"
"default": "./dist/ps/index.js"
},
"./glob": {
"types": "./dist/glob/index.d.ts",
"import": "./dist/glob/index.js"
"default": "./dist/glob/index.js"
},
"./clipboard": {
"types": "./dist/clipboard/index.d.ts",
"import": "./dist/clipboard/index.js"
"default": "./dist/clipboard/index.js"
},
"./ast": {
"types": "./dist/ast/index.d.ts",
"import": "./dist/ast/index.js"
"default": "./dist/ast/index.js"
},
"./html": {
"types": "./dist/html/index.d.ts",
"import": "./dist/html/index.js"
"default": "./dist/html/index.js"
},
"./text": {
"types": "./dist/text/index.d.ts",
"import": "./dist/text/index.js"
"default": "./dist/text/index.js"
},
"./fd": {
"types": "./dist/fd/index.d.ts",
"import": "./dist/fd/index.js"
"default": "./dist/fd/index.js"
},
"./image": {
"types": "./dist/image/index.d.ts",
"import": "./dist/image/index.js"
"default": "./dist/image/index.js"
},
"./xxhash": {
"types": "./dist/xxhash/index.d.ts",
"import": "./dist/xxhash/index.js"
"default": "./dist/xxhash/index.js"
},
"./diff": {
"types": "./dist/diff/index.d.ts",
"import": "./dist/diff/index.js"
"default": "./dist/diff/index.js"
},
"./gsd-parser": {
"types": "./dist/gsd-parser/index.d.ts",
"import": "./dist/gsd-parser/index.js"
"default": "./dist/gsd-parser/index.js"
},
"./highlight": {
"types": "./dist/highlight/index.d.ts",
"import": "./dist/highlight/index.js"
"default": "./dist/highlight/index.js"
},
"./json-parse": {
"types": "./dist/json-parse/index.d.ts",
"import": "./dist/json-parse/index.js"
"default": "./dist/json-parse/index.js"
},
"./stream-process": {
"types": "./dist/stream-process/index.d.ts",
"import": "./dist/stream-process/index.js"
"default": "./dist/stream-process/index.js"
},
"./truncate": {
"types": "./dist/truncate/index.d.ts",
"import": "./dist/truncate/index.js"
"default": "./dist/truncate/index.js"
},
"./ttsr": {
"types": "./dist/ttsr/index.d.ts",
"import": "./dist/ttsr/index.js"
"default": "./dist/ttsr/index.js"
}
},
"files": [

View file

@ -0,0 +1,91 @@
/**
* Tests that the @gsd/native package.json is correctly configured
* for Node.js module resolution (ESM/CJS compatibility).
*
* Regression test for #2861: "type": "module" + "import"-only export
* conditions caused crashes on Node.js v24 when the parent package also
* declared "type": "module" and strict ESM resolution was enforced.
*/
import { test, describe } from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import * as path from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const pkgPath = path.resolve(__dirname, "..", "..", "package.json");
const pkg = JSON.parse(readFileSync(pkgPath, "utf8"));
describe("@gsd/native module compatibility (#2861)", () => {
test("package.json must not declare type: module (compiled output is CJS-compatible)", () => {
// The compiled output uses createRequire() to load .node addons.
// Declaring "type": "module" forces Node.js to treat .js files as ESM,
// but the package needs "type": "commonjs" to override the parent
// package's "type": "module" and ensure correct CJS semantics.
assert.notEqual(
pkg.type,
"module",
'package.json must not set "type": "module" — this causes crashes on Node.js v24 ' +
"when the parent package also declares ESM (see #2861)",
);
});
test("package.json should explicitly declare type: commonjs", () => {
// When installed as a dependency under a parent with "type": "module"
// (e.g. gsd-pi), an absent "type" field would inherit the parent's
// ESM setting. Explicit "commonjs" overrides this.
assert.equal(
pkg.type,
"commonjs",
'package.json must explicitly set "type": "commonjs" to override ' +
"the parent package's ESM declaration",
);
});
test("all export conditions must use 'default' (not 'import'-only)", () => {
// The "import" condition key restricts resolution to ESM import
// statements only. Using "default" ensures the export works for both
// require() and import, which is essential for a CJS package that may
// be consumed from ESM code via Node's CJS interop.
const exportsMap = pkg.exports;
assert.ok(exportsMap, "package.json must have an exports map");
for (const [subpath, conditions] of Object.entries(exportsMap)) {
assert.ok(
!conditions.import || conditions.default,
`exports["${subpath}"] uses "import" condition without "default" — ` +
`this breaks CJS consumers and Node.js v24 strict resolution`,
);
}
});
test("native.ts source must not use bare import.meta.url (parse-time error in CJS)", () => {
// When compiled to CJS, import.meta is a *parse-time* syntax error --
// typeof guards don't help because Node rejects the syntax before
// executing any code. The source must wrap import.meta access in
// an indirect eval so the CJS parser never sees the bare syntax.
const nativeSrc = readFileSync(
path.resolve(__dirname, "..", "native.ts"),
"utf8",
);
// Bare import.meta.url (NOT wrapped) would crash at parse time in CJS.
// These regexes match direct usage like fileURLToPath(import.meta.url)
// and createRequire(import.meta.url), but NOT indirect patterns that
// hide import.meta from the CJS parser.
const hasBareImportMetaDirname = /path\.dirname\(.*fileURLToPath\(import\.meta\.url\)\)/.test(nativeSrc);
const hasBareImportMetaRequire = /createRequire\(import\.meta\.url\)/.test(nativeSrc);
assert.ok(
!hasBareImportMetaDirname,
"native.ts must not use bare import.meta.url in fileURLToPath() -- " +
"this is a parse-time syntax error in CJS; use indirect eval",
);
assert.ok(
!hasBareImportMetaRequire,
"native.ts must not use bare import.meta.url in createRequire() -- " +
"this is a parse-time syntax error in CJS; use indirect eval",
);
});
});

View file

@ -8,14 +8,15 @@
* 3. native/addon/gsd_engine.dev.node (local debug build)
*/
import { createRequire } from "node:module";
import * as path from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const require = createRequire(import.meta.url);
// __dirname and require are available in both execution contexts:
// - CJS (production build via tsc): provided natively by Node
// - ESM (CI test loader): injected by the dist-redirect.mjs preamble
const _dirname = __dirname;
const _require = require;
const addonDir = path.resolve(__dirname, "..", "..", "..", "native", "addon");
const addonDir = path.resolve(_dirname, "..", "..", "..", "native", "addon");
const platformTag = `${process.platform}-${process.arch}`;
/** Map Node.js platform/arch to the npm package suffix */
@ -36,7 +37,7 @@ function loadNative(): Record<string, unknown> {
const packageSuffix = platformPackageMap[platformTag];
if (packageSuffix) {
try {
_loadedSuccessfully = true; return require(`@gsd-build/engine-${packageSuffix}`) as Record<string, unknown>;
_loadedSuccessfully = true; return _require(`@gsd-build/engine-${packageSuffix}`) as Record<string, unknown>;
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
errors.push(`@gsd-build/engine-${packageSuffix}: ${message}`);
@ -46,7 +47,7 @@ function loadNative(): Record<string, unknown> {
// 2. Try local release build (native/addon/gsd_engine.{platform}.node)
const releasePath = path.join(addonDir, `gsd_engine.${platformTag}.node`);
try {
_loadedSuccessfully = true; return require(releasePath) as Record<string, unknown>;
_loadedSuccessfully = true; return _require(releasePath) as Record<string, unknown>;
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
errors.push(`${releasePath}: ${message}`);
@ -55,7 +56,7 @@ function loadNative(): Record<string, unknown> {
// 3. Try local dev build (native/addon/gsd_engine.dev.node)
const devPath = path.join(addonDir, "gsd_engine.dev.node");
try {
_loadedSuccessfully = true; return require(devPath) as Record<string, unknown>;
_loadedSuccessfully = true; return _require(devPath) as Record<string, unknown>;
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
errors.push(`${devPath}: ${message}`);

View file

@ -0,0 +1,45 @@
// agent-loop pauseTurn handling tests
// Verifies that pause_turn / pauseTurn stop reason causes the inner loop
// to continue (re-invoke the LLM) instead of exiting.
// Regression test for https://github.com/gsd-build/gsd-2/issues/2869
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join, dirname } from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = dirname(fileURLToPath(import.meta.url));
describe("agent-loop — pauseTurn handling (#2869)", () => {
it("sets hasMoreToolCalls when stopReason is pauseTurn", () => {
const source = readFileSync(join(__dirname, "agent-loop.ts"), "utf-8");
// The agent loop must treat pauseTurn as a reason to continue the inner
// loop, just like toolUse. This prevents incomplete server_tool_use blocks
// from being saved to history, which would cause a 400 on the next request.
assert.match(
source,
/pauseTurn/,
"agent-loop.ts must handle the pauseTurn stop reason",
);
// Verify it sets hasMoreToolCalls = true for pauseTurn
assert.match(
source,
/stopReason\s*===?\s*["']pauseTurn["']/,
'agent-loop.ts must check for stopReason === "pauseTurn"',
);
});
it("pauseTurn is in the StopReason union type", () => {
// Read the pi-ai types to ensure pauseTurn is a valid StopReason
const typesPath = join(__dirname, "..", "..", "pi-ai", "src", "types.ts");
const typesSource = readFileSync(typesPath, "utf-8");
assert.match(
typesSource,
/["']pauseTurn["']/,
'StopReason type must include "pauseTurn"',
);
});
});


@@ -231,9 +231,10 @@ async function runLoop(
return;
}
- // Check for tool calls
+ // Check for tool calls or paused server turn
const toolCalls = message.content.filter((c) => c.type === "toolCall");
- hasMoreToolCalls = toolCalls.length > 0;
+ hasMoreToolCalls =
+ toolCalls.length > 0 || message.stopReason === "pauseTurn";
const toolResults: ToolResultMessage[] = [];
if (hasMoreToolCalls && config.externalToolExecution) {

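The two-line change above folds `pauseTurn` into the loop's continue condition. As a standalone sketch (the helper name and the simplified content-block shape here are illustrative, not part of the diff):

```typescript
// Mirror of the continue-on-pauseTurn decision in runLoop: keep looping
// when the assistant message contains tool calls OR the server paused
// a long-running turn.
type ContentBlock = { type: string };

function shouldContinueTurn(content: ContentBlock[], stopReason: string): boolean {
  const toolCalls = content.filter((c) => c.type === "toolCall");
  return toolCalls.length > 0 || stopReason === "pauseTurn";
}

console.log(shouldContinueTurn([{ type: "toolCall" }], "toolUse")); // true
console.log(shouldContinueTurn([], "pauseTurn")); // true
console.log(shouldContinueTurn([], "stop")); // false
```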

@@ -47,7 +47,7 @@ export type ProxyAssistantMessageEvent =
| { type: "toolcall_end"; contentIndex: number }
| {
type: "done";
- reason: Extract<StopReason, "stop" | "length" | "toolUse">;
+ reason: Extract<StopReason, "stop" | "length" | "toolUse" | "pauseTurn">;
usage: AssistantMessage["usage"];
}
| {


@@ -137,6 +137,7 @@ export function getEnvApiKey(provider: any): string | undefined {
"opencode-go": "OPENCODE_API_KEY",
"kimi-coding": "KIMI_API_KEY",
"alibaba-coding-plan": "ALIBABA_API_KEY",
+ ollama: "OLLAMA_API_KEY",
"ollama-cloud": "OLLAMA_API_KEY",
"custom-openai": "CUSTOM_OPENAI_API_KEY",
};


@@ -27,4 +27,5 @@ export type {
} from "./utils/oauth/types.js";
export * from "./utils/overflow.js";
export * from "./utils/typebox-helpers.js";
+ export * from "./utils/repair-tool-json.js";
export * from "./utils/validation.js";


@@ -0,0 +1,29 @@
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { mapStopReason } from "./anthropic-shared.js";
describe("mapStopReason", () => {
it("maps end_turn to stop", () => {
assert.equal(mapStopReason("end_turn"), "stop");
});
it("maps max_tokens to length", () => {
assert.equal(mapStopReason("max_tokens"), "length");
});
it("maps tool_use to toolUse", () => {
assert.equal(mapStopReason("tool_use"), "toolUse");
});
it("maps pause_turn to pauseTurn (not stop)", () => {
// pause_turn means the server paused a long-running turn (e.g. native
// web search hit its iteration limit). Mapping it to "stop" causes the
// agent loop to exit, leaving an incomplete server_tool_use block in
// history which triggers a 400 on the next request.
assert.equal(mapStopReason("pause_turn"), "pauseTurn");
});
it("throws on unknown stop reason", () => {
assert.throws(() => mapStopReason("bogus"), /Unhandled stop reason/);
});
});


@@ -31,6 +31,7 @@ import type {
export type AnthropicApi = "anthropic-messages" | "anthropic-vertex";
import type { AssistantMessageEventStream } from "../utils/event-stream.js";
import { parseStreamingJson } from "../utils/json-parse.js";
+ import { repairToolJson } from "../utils/repair-tool-json.js";
import { sanitizeSurrogates } from "../utils/sanitize-unicode.js";
import { transformMessages } from "./transform-messages.js";
@@ -502,7 +503,7 @@ export function mapStopReason(reason: string): StopReason {
case "refusal":
return "error";
case "pause_turn":
- return "stop";
+ return "pauseTurn";
case "stop_sequence":
return "stop";
case "sensitive":
@@ -696,7 +697,21 @@ export function processAnthropicStream(
partial: output,
});
} else if (block.type === "toolCall") {
- block.arguments = parseStreamingJson(block.partialJson);
+ // Try strict parse first; if it fails, attempt YAML bullet
+ // repair (#2660) before falling back to the lenient streaming
+ // parser which silently swallows errors.
+ const raw = block.partialJson ?? "";
+ let parsed: Record<string, any> | undefined;
+ try {
+ parsed = JSON.parse(raw);
+ } catch {
+ try {
+ parsed = JSON.parse(repairToolJson(raw));
+ } catch {
+ // Fall through to streaming parser
+ }
+ }
+ block.arguments = parsed ?? parseStreamingJson(block.partialJson);
delete (block as any).partialJson;
stream.push({
type: "toolcall_end",

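The `toolCall` branch above implements a three-stage parse: strict `JSON.parse`, then repair-and-reparse, then the lenient streaming parser. The pattern in isolation (the function name and injected-callback shape are illustrative, not part of the diff):

```typescript
// Strict parse first; on failure, repair and re-parse; as a last resort,
// hand off to a lenient parser that never throws.
function parseWithFallback(
  raw: string,
  repair: (s: string) => string,
  lenient: (s: string) => Record<string, unknown>,
): Record<string, unknown> {
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch {
    try {
      return JSON.parse(repair(raw)) as Record<string, unknown>;
    } catch {
      return lenient(raw);
    }
  }
}

// Valid JSON takes the fast path; broken input falls through.
console.log(parseWithFallback('{"a":1}', (s) => s, () => ({})));             // { a: 1 }
console.log(parseWithFallback("oops", () => '{"fixed":true}', () => ({})));  // { fixed: true }
console.log(parseWithFallback("oops", (s) => s, () => ({ partial: true }))); // { partial: true }
```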

@@ -43,6 +43,7 @@ export type KnownProvider =
| "opencode-go"
| "kimi-coding"
| "alibaba-coding-plan"
+ | "ollama"
| "ollama-cloud";
export type Provider = KnownProvider | string;
@@ -192,7 +193,7 @@ export interface Usage {
};
}
- export type StopReason = "stop" | "length" | "toolUse" | "error" | "aborted";
+ export type StopReason = "stop" | "length" | "toolUse" | "pauseTurn" | "error" | "aborted";
export interface UserMessage {
role: "user";
@@ -253,7 +254,7 @@ export type AssistantMessageEvent =
| { type: "toolcall_end"; contentIndex: number; toolCall: ToolCall; partial: AssistantMessage; malformedArguments?: boolean }
| { type: "server_tool_use"; contentIndex: number; partial: AssistantMessage }
| { type: "web_search_result"; contentIndex: number; partial: AssistantMessage }
- | { type: "done"; reason: Extract<StopReason, "stop" | "length" | "toolUse">; message: AssistantMessage }
+ | { type: "done"; reason: Extract<StopReason, "stop" | "length" | "toolUse" | "pauseTurn">; message: AssistantMessage }
| { type: "error"; reason: Extract<StopReason, "aborted" | "error">; error: AssistantMessage };
/**


@@ -1,14 +1,41 @@
import { parseStreamingJson as nativeParseStreamingJson } from "@gsd/native";
+ import { hasYamlBulletLists, repairToolJson } from "./repair-tool-json.js";
/**
* Attempts to parse potentially incomplete JSON during streaming.
* Always returns a valid object, even if the JSON is incomplete.
*
* Uses the native Rust streaming JSON parser for performance.
* Falls back to YAML bullet-list repair when the native parser
* returns an empty object from input that contains YAML-style
* bullet lists copied from template formatting (#2660).
*
* @param partialJson The partial JSON string from streaming
* @returns Parsed object or empty object if parsing fails
*/
export function parseStreamingJson<T = any>(partialJson: string | undefined): T {
- return nativeParseStreamingJson<T>(partialJson);
+ if (!partialJson || partialJson.trim() === "") {
+ return {} as T;
+ }
+ // Fast path: try native streaming parser first
+ const result = nativeParseStreamingJson<T>(partialJson);
+ // If the native parser returned a non-empty result, use it.
+ // Only attempt repair when the result is empty AND the input
+ // contains YAML bullet patterns (avoids unnecessary work).
+ if (
+ result &&
+ typeof result === "object" &&
+ Object.keys(result as object).length === 0 &&
+ hasYamlBulletLists(partialJson)
+ ) {
+ try {
+ return JSON.parse(repairToolJson(partialJson)) as T;
+ } catch {
+ // Repair failed — return the empty object from native parser
+ }
+ }
+ return result;
}


@@ -0,0 +1,88 @@
/**
* Repair malformed JSON in LLM tool-call arguments.
*
* LLMs sometimes copy YAML template formatting into JSON tool arguments,
* producing patterns like:
*
* "keyDecisions": - Used Web Notification API...,
* "keyFiles": - src-tauri/src/lib.rs Extended...
*
* instead of valid JSON arrays:
*
* "keyDecisions": ["Used Web Notification API..."],
* "keyFiles": ["src-tauri/src/lib.rs — Extended..."]
*
* This module detects and repairs such patterns before JSON.parse is called.
*
* @see https://github.com/gsd-build/gsd-2/issues/2660
*/
/**
* Detect whether a JSON string contains YAML-style bullet-list values
* (i.e. `"key": - item` instead of `"key": ["item"]`).
*/
export function hasYamlBulletLists(json: string): boolean {
// Match: "key": followed by whitespace then a dash-space pattern (YAML bullet)
// The negative lookahead excludes negative numbers (e.g. "key": -1)
return /"\s*:\s*-\s+(?!\d)/.test(json);
}
/**
* Attempt to repair YAML-style bullet lists embedded in a JSON string.
*
* Converts patterns like:
* "keyDecisions": - Used Web Notification API..., "keyFiles": - file1
*
* Into:
* "keyDecisions": ["Used Web Notification API..."], "keyFiles": ["file1"]
*
* Returns the original string unchanged if no YAML patterns are detected
* or if the repair itself would produce invalid JSON.
*/
export function repairToolJson(json: string): string {
if (!hasYamlBulletLists(json)) {
return json;
}
// Strategy: find each `"key": - item1\n - item2\n - item3` region and
// wrap items in a JSON array.
//
// We work on the raw string because the JSON is not parseable yet.
// The pattern we target:
// "someKey":\s*- item text (possibly multiline)
// optionally followed by more `- item` lines
// terminated by the next `"key":` or `}` or end of string.
let repaired = json;
// Match a key followed by YAML-style bullet list.
// Capture: (1) the key portion including colon, (2) the bullet-list body,
// (3) the separator (comma or empty) before the next key/bracket.
// The bullet list body ends at the next `"key":` or `}` or `]` or end of string.
const keyBulletPattern =
/("(?:[^"\\]|\\.)*"\s*:\s*)(- .+?)(,?\s*)(?="(?:[^"\\]|\\.)*"\s*:|[}\]]|$)/gs;
repaired = repaired.replace(
keyBulletPattern,
(_match, keyPart: string, bulletBody: string, separator: string) => {
// Split the bullet body into individual items on `- ` boundaries.
// Items may contain embedded newlines for multi-line values.
const items = bulletBody
.split(/\n?\s*- /)
.filter((s) => s.trim().length > 0)
.map((s) => s.replace(/,\s*$/, "").trim());
// JSON-encode each item as a string, then wrap in an array.
const jsonArray = "[" + items.map((item) => JSON.stringify(item)).join(", ") + "]";
// Re-emit the separator (comma) so the next key is properly delimited
const sep = separator.trim() ? separator : (/^\s*"/.test(separator + "x") ? ", " : "");
return keyPart + jsonArray + sep;
},
);
// Strip trailing commas before } or ] (common in repaired JSON)
repaired = repaired.replace(/,(\s*[}\]])/g, "$1");
return repaired;
}

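The detection regex in `hasYamlBulletLists` can be exercised on its own. The regex below is copied verbatim from the module above; the standalone wrapper name is illustrative:

```typescript
// Standalone copy of the detection regex from hasYamlBulletLists.
// It matches `"key": - item` (a YAML bullet after a key) but not
// negative numbers: `-1` has no whitespace after the dash, so the
// `-\s+` portion fails, and `- 1` is rejected by the (?!\d) lookahead.
const looksLikeYamlBullet = (json: string): boolean =>
  /"\s*:\s*-\s+(?!\d)/.test(json);

console.log(looksLikeYamlBullet('{"keyDecisions": - Used Web Notification API}')); // true
console.log(looksLikeYamlBullet('{"offset": -1}'));                                // false
console.log(looksLikeYamlBullet('{"keyDecisions": ["a", "b"]}'));                  // false
```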

@@ -0,0 +1,102 @@
import { describe, test } from "node:test";
import assert from "node:assert/strict";
import { repairToolJson, hasYamlBulletLists } from "../repair-tool-json.js";
describe("repairToolJson — YAML bullet list repair (#2660)", () => {
// ── Detection ──────────────────────────────────────────────────────────
test("hasYamlBulletLists detects YAML-style bullets", () => {
assert.equal(
hasYamlBulletLists('"keyDecisions": - Used Web Notification API'),
true,
);
});
test("hasYamlBulletLists ignores negative numbers", () => {
assert.equal(
hasYamlBulletLists('"offset": -1'),
false,
"negative number should not be detected as YAML bullet",
);
});
test("hasYamlBulletLists returns false for valid JSON", () => {
assert.equal(
hasYamlBulletLists('{"keyDecisions": ["item1", "item2"]}'),
false,
);
});
// ── Single bullet item ────────────────────────────────────────────────
test("repairs single YAML bullet to JSON array", () => {
const malformed = '{"keyDecisions": - Used Web Notification API}';
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.deepEqual(parsed.keyDecisions, ["Used Web Notification API"]);
});
// ── Multiple bullet items (newline-separated) ─────────────────────────
test("repairs multiple YAML bullets separated by newlines", () => {
const malformed =
'{"keyDecisions": - Used Web Notification API\n - Chose Tauri over Electron\n - Adopted SQLite for storage, "title": "M005"}';
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.deepEqual(parsed.keyDecisions, [
"Used Web Notification API",
"Chose Tauri over Electron",
"Adopted SQLite for storage",
]);
assert.equal(parsed.title, "M005");
});
// ── Multiple fields with YAML bullets ─────────────────────────────────
test("repairs multiple fields each with YAML bullet lists", () => {
const malformed =
'{"keyDecisions": - decision one\n - decision two, "keyFiles": - src/lib.rs — Extended menu\n - src/main.ts — Entry point, "title": "done"}';
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.deepEqual(parsed.keyDecisions, ["decision one", "decision two"]);
assert.deepEqual(parsed.keyFiles, [
"src/lib.rs \u2014 Extended menu",
"src/main.ts \u2014 Entry point",
]);
assert.equal(parsed.title, "done");
});
// ── Exact reproduction from issue #2660 ───────────────────────────────
test("repairs the exact malformed JSON from issue #2660", () => {
const malformed = `{"milestoneId": "M005", "title": "Native Desktop Polish", "oneLiner": "summary", "narrative": "details", "successCriteriaResults": "all pass", "definitionOfDoneResults": "all done", "requirementOutcomes": "met", "keyDecisions": - Used Web Notification API (new window.Notification()) instead of Tauri sendNotification wrapper, "keyFiles": - src-tauri/src/lib.rs \u2014 Extended menu builder with notification toggle, "lessonsLearned": - Always test notification permissions before sending, "followUps": "none", "deviations": "none", "verificationPassed": true}`;
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.equal(parsed.milestoneId, "M005");
assert.equal(parsed.title, "Native Desktop Polish");
assert.ok(Array.isArray(parsed.keyDecisions), "keyDecisions should be an array");
assert.ok(parsed.keyDecisions[0].includes("Web Notification API"));
assert.ok(Array.isArray(parsed.keyFiles), "keyFiles should be an array");
assert.ok(parsed.keyFiles[0].includes("src-tauri/src/lib.rs"));
assert.ok(Array.isArray(parsed.lessonsLearned), "lessonsLearned should be an array");
assert.equal(parsed.verificationPassed, true);
});
// ── Passthrough for valid JSON ────────────────────────────────────────
test("returns valid JSON unchanged", () => {
const valid = '{"keyDecisions": ["item1", "item2"], "count": -5}';
const result = repairToolJson(valid);
assert.equal(result, valid, "valid JSON should be returned unchanged");
});
// ── Negative numbers are preserved ────────────────────────────────────
test("does not mangle negative numbers", () => {
const valid = '{"offset": -1, "limit": -100}';
const result = repairToolJson(valid);
assert.equal(result, valid);
});
});


@@ -72,6 +72,7 @@ import type { ModelRegistry } from "./model-registry.js";
import { expandPromptTemplate, type PromptTemplate } from "./prompt-templates.js";
import type { ResourceExtensionPaths, ResourceLoader } from "./resource-loader.js";
import { RetryHandler } from "./retry-handler.js";
+ import { isImageDimensionError, downsizeConversationImages } from "./image-overflow-recovery.js";
import type { BranchSummaryEntry, SessionManager } from "./session-manager.js";
import { getLatestCompactionEntry } from "./session-manager.js";
import type { SettingsManager } from "./settings-manager.js";
@@ -136,7 +137,8 @@ export type AgentSessionEvent =
| { type: "auto_retry_end"; success: boolean; attempt: number; finalError?: string }
| { type: "fallback_provider_switch"; from: string; to: string; reason: string }
| { type: "fallback_provider_restored"; provider: string; reason: string }
- | { type: "fallback_chain_exhausted"; reason: string };
+ | { type: "fallback_chain_exhausted"; reason: string }
+ | { type: "image_overflow_recovery"; strippedCount: number; imageCount: number };
/** Listener function for agent session events */
export type AgentSessionEventListener = (event: AgentSessionEvent) => void;
@@ -487,6 +489,36 @@ export class AgentSession {
if (didRetry) return; // Retry was initiated, don't proceed to compaction
}
// Check for image dimension overflow (many-image 400 error).
// When a session accumulates many images, the API rejects requests
// whose images exceed the many-image dimension limit. Strip older
// images from the conversation and auto-retry. (#2874)
if (
msg.stopReason === "error" &&
isImageDimensionError(msg.errorMessage)
) {
const messages = this.agent.state.messages;
const result = downsizeConversationImages(messages as Message[]);
if (result.processed) {
// Remove the trailing error assistant message, then replace
if (messages.length > 0 && messages[messages.length - 1].role === "assistant") {
this.agent.replaceMessages(messages.slice(0, -1));
}
this._emit({
type: "image_overflow_recovery",
strippedCount: result.strippedCount,
imageCount: result.imageCount,
});
// Auto-retry after downsizing
setTimeout(() => {
this.agent.continue().catch(() => {});
}, 0);
return;
}
}
await this._compactionOrchestrator.checkCompaction(msg);
}
}
@@ -1986,6 +2018,11 @@ export class AgentSession {
const messages = this.agent.state.messages;
const last = messages[messages.length - 1];
if (last?.role === "assistant" && (last as AssistantMessage).stopReason === "error") {
+ // If the error was an image dimension overflow, downsize images
+ // before retrying so the retry doesn't hit the same error (#2874)
+ if (isImageDimensionError((last as AssistantMessage).errorMessage)) {
+ downsizeConversationImages(messages as Message[]);
+ }
this.agent.replaceMessages(messages.slice(0, -1));
this.agent.continue().catch((err) => {
runner.emitError({


@@ -0,0 +1,236 @@
/**
* Tests for chunked compaction fallback when messages exceed model context window.
* Regression test for #2932.
*/
import assert from "node:assert/strict";
import { describe, it, mock } from "node:test";
import type { AgentMessage } from "@gsd/pi-agent-core";
import type { Model, AssistantMessage } from "@gsd/pi-ai";
import { generateSummary, estimateTokens, chunkMessages } from "./compaction.js";
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
/** Create a user message with approximately `tokenCount` tokens (chars = tokens * 4). */
function makeUserMessage(tokenCount: number): AgentMessage {
const text = "x".repeat(tokenCount * 4);
return { role: "user", content: text } as unknown as AgentMessage;
}
/** Create a mock model with a given context window. */
function makeModel(contextWindow: number): Model<any> {
return {
id: "test-model",
name: "Test Model",
api: "anthropic-messages",
provider: "anthropic",
baseUrl: "https://api.test",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow,
maxTokens: 4096,
} as Model<any>;
}
function makeFakeResponse(text: string): AssistantMessage {
return {
content: [{ type: "text", text }],
stopReason: "end_turn",
} as unknown as AssistantMessage;
}
// ---------------------------------------------------------------------------
// chunkMessages tests
// ---------------------------------------------------------------------------
describe("chunkMessages", () => {
it("returns a single chunk when messages fit in budget", () => {
const messages: AgentMessage[] = [
makeUserMessage(1_000),
makeUserMessage(1_000),
];
const chunks = chunkMessages(messages, 100_000);
assert.equal(chunks.length, 1);
assert.equal(chunks[0].length, 2);
});
it("splits messages into multiple chunks when they exceed budget", () => {
const messages: AgentMessage[] = [
makeUserMessage(50_000),
makeUserMessage(50_000),
makeUserMessage(50_000),
];
// Budget of 80k tokens means each 50k message gets its own chunk
// (or two fit together if budget allows)
const chunks = chunkMessages(messages, 80_000);
assert.ok(chunks.length > 1, `Expected multiple chunks, got ${chunks.length}`);
// All messages should be present across chunks
const totalMessages = chunks.reduce((sum, c) => sum + c.length, 0);
assert.equal(totalMessages, 3);
});
it("puts a single oversized message in its own chunk", () => {
const messages: AgentMessage[] = [
makeUserMessage(200_000), // Way over any reasonable budget
];
const chunks = chunkMessages(messages, 80_000);
assert.equal(chunks.length, 1);
assert.equal(chunks[0].length, 1);
});
it("preserves message order across chunks", () => {
// Create messages with identifiable sizes
const messages: AgentMessage[] = [
makeUserMessage(30_000), // ~30k tokens
makeUserMessage(30_000),
makeUserMessage(30_000),
makeUserMessage(30_000),
];
const chunks = chunkMessages(messages, 50_000);
// Reconstruct original order
const flat = chunks.flat();
assert.equal(flat.length, 4);
for (let i = 0; i < flat.length; i++) {
assert.strictEqual(flat[i], messages[i], `Message ${i} should be in order`);
}
});
});
// ---------------------------------------------------------------------------
// generateSummary chunked fallback tests
// ---------------------------------------------------------------------------
describe("generateSummary — chunked fallback (#2932)", () => {
it("calls _completeFn multiple times when messages exceed model context window", async () => {
// Arrange: 3 messages of ~80k tokens each = ~240k total, model has 200k window
const messages: AgentMessage[] = [
makeUserMessage(80_000),
makeUserMessage(80_000),
makeUserMessage(80_000),
];
const model = makeModel(200_000);
const reserveTokens = 16_384;
// Verify our test setup: messages really do exceed the model window
let totalTokens = 0;
for (const m of messages) totalTokens += estimateTokens(m);
assert.ok(
totalTokens > model.contextWindow,
`Test setup: ${totalTokens} tokens should exceed ${model.contextWindow} context window`,
);
// Track calls
const calls: string[] = [];
const mockComplete = mock.fn(async (_model: any, context: any, _options: any) => {
const userMsg = context.messages?.[0];
const text =
typeof userMsg?.content === "string"
? userMsg.content
: userMsg?.content?.[0]?.text ?? "";
if (text.includes("<previous-summary>")) {
calls.push("update");
} else {
calls.push("initial");
}
return makeFakeResponse("Summary of chunk");
});
const summary = await generateSummary(
messages,
model,
reserveTokens,
undefined, // apiKey
undefined, // signal
undefined, // customInstructions
undefined, // previousSummary
mockComplete, // _completeFn override for testing
);
// Assert: should have called completeSimple more than once (chunked)
assert.ok(
mockComplete.mock.callCount() > 1,
`Expected multiple calls for chunked summarization, got ${mockComplete.mock.callCount()}`,
);
// First call should be an initial summary, subsequent should be updates
assert.equal(calls[0], "initial", "First chunk should use initial summarization prompt");
for (let i = 1; i < calls.length; i++) {
assert.equal(calls[i], "update", `Chunk ${i + 1} should use update summarization prompt`);
}
// Should return a non-empty summary
assert.ok(summary.length > 0, "Summary should not be empty");
});
it("uses single-pass when messages fit within model context window", async () => {
const messages: AgentMessage[] = [
makeUserMessage(10_000),
makeUserMessage(10_000),
];
const model = makeModel(200_000);
const reserveTokens = 16_384;
// Verify test setup
let totalTokens = 0;
for (const m of messages) totalTokens += estimateTokens(m);
assert.ok(
totalTokens < model.contextWindow,
`Test setup: ${totalTokens} tokens should fit in ${model.contextWindow} context window`,
);
const mockComplete = mock.fn(async () => makeFakeResponse("Single pass summary"));
await generateSummary(messages, model, reserveTokens, undefined, undefined, undefined, undefined, mockComplete);
assert.equal(
mockComplete.mock.callCount(),
1,
"Should use single-pass summarization when messages fit in context window",
);
});
it("passes previousSummary through chunked summarization", async () => {
const messages: AgentMessage[] = [
makeUserMessage(80_000),
makeUserMessage(80_000),
makeUserMessage(80_000),
];
const model = makeModel(200_000);
const reserveTokens = 16_384;
const previousSummary = "Previous session summary content";
const prompts: string[] = [];
const mockComplete = mock.fn(async (_model: any, context: any) => {
const userMsg = context.messages?.[0];
const text =
typeof userMsg?.content === "string"
? userMsg.content
: userMsg?.content?.[0]?.text ?? "";
prompts.push(text);
return makeFakeResponse("Chunk summary");
});
await generateSummary(
messages,
model,
reserveTokens,
undefined,
undefined,
undefined,
previousSummary,
mockComplete,
);
// First chunk should include the previousSummary
assert.ok(
prompts[0].includes(previousSummary),
"First chunk should incorporate the previousSummary",
);
});
});


@@ -489,9 +489,49 @@ Use this EXACT format:
Keep each section concise. Preserve exact file paths, function names, and error messages.`;
/**
* Split messages into chunks where each chunk's estimated token count
* stays within `maxTokensPerChunk`. A single message that exceeds the
* budget is placed alone in its own chunk (never dropped).
*/
export function chunkMessages(messages: AgentMessage[], maxTokensPerChunk: number): AgentMessage[][] {
const chunks: AgentMessage[][] = [];
let currentChunk: AgentMessage[] = [];
let currentTokens = 0;
for (const msg of messages) {
const msgTokens = estimateTokens(msg);
if (currentChunk.length > 0 && currentTokens + msgTokens > maxTokensPerChunk) {
// Current chunk is full — start a new one
chunks.push(currentChunk);
currentChunk = [msg];
currentTokens = msgTokens;
} else {
currentChunk.push(msg);
currentTokens += msgTokens;
}
}
if (currentChunk.length > 0) {
chunks.push(currentChunk);
}
return chunks;
}
/** Type for the completion function, allowing injection for tests. */
type CompleteFn = typeof completeSimple;
/**
* Generate a summary of the conversation using the LLM.
* If previousSummary is provided, uses the update prompt to merge.
*
* When the messages exceed the model's context window, automatically
* falls back to chunked summarization: summarize the first chunk,
* then iteratively merge subsequent chunks using the update prompt.
*
* @param _completeFn - Internal override for testing; defaults to completeSimple.
*/
export async function generateSummary(
currentMessages: AgentMessage[],
@@ -501,6 +541,59 @@ export async function generateSummary(
signal?: AbortSignal,
customInstructions?: string,
previousSummary?: string,
_completeFn?: CompleteFn,
): Promise<string> {
const complete = _completeFn ?? completeSimple;
// Estimate total tokens for the messages to summarize
let totalTokens = 0;
for (const msg of currentMessages) {
totalTokens += estimateTokens(msg);
}
// Overhead for the prompt framing, system prompt, and response budget
const promptOverhead = 4_000;
const maxTokens = Math.floor(0.8 * reserveTokens);
const maxInputTokens = (model.contextWindow || 200_000) - reserveTokens - promptOverhead;
// If messages fit in the context window, use single-pass summarization
if (totalTokens <= maxInputTokens) {
return singlePassSummary(currentMessages, model, reserveTokens, apiKey, signal, customInstructions, previousSummary, complete);
}
// Chunked fallback: split messages and iteratively summarize
const chunks = chunkMessages(currentMessages, maxInputTokens);
let runningSummary = previousSummary;
for (let i = 0; i < chunks.length; i++) {
runningSummary = await singlePassSummary(
chunks[i],
model,
reserveTokens,
apiKey,
signal,
customInstructions,
runningSummary,
complete,
);
}
return runningSummary!;
}
/**
* Single-pass summarization of messages using the LLM.
* If previousSummary is provided, uses the update prompt to merge.
*/
async function singlePassSummary(
currentMessages: AgentMessage[],
model: Model<any>,
reserveTokens: number,
apiKey: string | undefined,
signal?: AbortSignal,
customInstructions?: string,
previousSummary?: string,
complete: CompleteFn = completeSimple,
): Promise<string> {
const maxTokens = Math.floor(0.8 * reserveTokens);
@@ -526,7 +619,7 @@ export async function generateSummary(
? { maxTokens, signal, apiKey, reasoning: "high" as const }
: { maxTokens, signal, apiKey };
- const response = await completeSimple(
+ const response = await complete(
model,
{ systemPrompt: SUMMARIZATION_SYSTEM_PROMPT, messages: createSummarizationMessage(promptText) },
completionOptions,

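The greedy strategy in `chunkMessages` above is independent of the `AgentMessage` shape; the same algorithm over plain token counts (the function name and number-based interface are illustrative, not part of the diff):

```typescript
// Greedy chunking: accumulate items until adding the next one would
// exceed the budget; a single oversized item gets its own chunk
// rather than being dropped.
function chunkByBudget(tokenCounts: number[], budget: number): number[][] {
  const chunks: number[][] = [];
  let current: number[] = [];
  let total = 0;
  for (const t of tokenCounts) {
    if (current.length > 0 && total + t > budget) {
      chunks.push(current);
      current = [t];
      total = t;
    } else {
      current.push(t);
      total += t;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

console.log(chunkByBudget([10, 10, 10], 25)); // [[10, 10], [10]]
console.log(chunkByBudget([200], 80));        // [[200]] (oversized item kept alone)
```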

@@ -39,7 +39,9 @@ export async function execCommand(
return new Promise((resolve) => {
const proc = spawn(command, args, {
cwd,
- shell: false,
+ // On Windows, npm/npx/tsc etc. are .cmd scripts that require shell
+ // resolution. Without this, spawn fails with ENOENT or EINVAL (#2854).
+ shell: process.platform === "win32",
stdio: ["ignore", "pipe", "pipe"],
});


@@ -0,0 +1,77 @@
// GSD-2 — Extension Manifest Tests
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { mkdtempSync, mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { readManifest, readManifestFromEntryPath } from "./extension-manifest.js";
describe("readManifest", () => {
it("returns null for missing directory", () => {
assert.equal(readManifest("/nonexistent/path"), null);
});
it("returns null for directory without manifest", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
assert.equal(readManifest(dir), null);
});
it("returns null for invalid JSON", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
writeFileSync(join(dir, "extension-manifest.json"), "not json{{{", "utf-8");
assert.equal(readManifest(dir), null);
});
it("returns null for manifest missing required fields", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
writeFileSync(
join(dir, "extension-manifest.json"),
JSON.stringify({ id: "test", name: "test" }),
);
assert.equal(readManifest(dir), null);
});
it("returns valid manifest", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
const manifest = {
id: "test-ext",
name: "Test Extension",
version: "1.0.0",
tier: "bundled",
requires: { platform: ">=2.29.0" },
};
writeFileSync(join(dir, "extension-manifest.json"), JSON.stringify(manifest));
const result = readManifest(dir);
assert.equal(result?.id, "test-ext");
assert.equal(result?.tier, "bundled");
});
});
describe("readManifestFromEntryPath", () => {
it("reads manifest from parent of entry path", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
const extDir = join(dir, "my-ext");
mkdirSync(extDir);
writeFileSync(
join(extDir, "extension-manifest.json"),
JSON.stringify({
id: "my-ext",
name: "My Extension",
version: "1.0.0",
tier: "community",
}),
);
writeFileSync(join(extDir, "index.ts"), "");
const result = readManifestFromEntryPath(join(extDir, "index.ts"));
assert.equal(result?.id, "my-ext");
assert.equal(result?.tier, "community");
});
it("returns null when entry path parent has no manifest", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
assert.equal(readManifestFromEntryPath(join(dir, "index.ts")), null);
});
});


@@ -0,0 +1,62 @@
// GSD-2 — Extension Manifest: Types and reading for extension-manifest.json
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { existsSync, readFileSync } from "node:fs";
import { dirname, join } from "node:path";
// ─── Types ──────────────────────────────────────────────────────────────────
export interface ExtensionManifest {
id: string;
name: string;
version: string;
description: string;
tier: "core" | "bundled" | "community";
requires: { platform: string };
provides?: {
tools?: string[];
commands?: string[];
hooks?: string[];
shortcuts?: string[];
};
dependencies?: {
extensions?: string[];
runtime?: string[];
};
}
// ─── Validation ─────────────────────────────────────────────────────────────
function isManifest(data: unknown): data is ExtensionManifest {
if (typeof data !== "object" || data === null) return false;
const obj = data as Record<string, unknown>;
return (
typeof obj.id === "string" &&
typeof obj.name === "string" &&
typeof obj.version === "string" &&
typeof obj.tier === "string"
);
}
// ─── Reading ────────────────────────────────────────────────────────────────
/** Read extension-manifest.json from a directory. Returns null if missing or invalid. */
export function readManifest(extensionDir: string): ExtensionManifest | null {
const manifestPath = join(extensionDir, "extension-manifest.json");
if (!existsSync(manifestPath)) return null;
try {
const raw = JSON.parse(readFileSync(manifestPath, "utf-8"));
return isManifest(raw) ? raw : null;
} catch {
return null;
}
}
/**
* Given an entry path (e.g. `.../extensions/browser-tools/index.ts`),
* resolve the parent directory and read its manifest.
*/
export function readManifestFromEntryPath(entryPath: string): ExtensionManifest | null {
const dir = dirname(entryPath);
return readManifest(dir);
}


@@ -0,0 +1,134 @@
// GSD-2 — Extension Sort Tests
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { mkdtempSync, mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { sortExtensionPaths } from "./extension-sort.js";
function createExtDir(base: string, id: string, deps?: string[]): string {
const dir = join(base, id);
mkdirSync(dir, { recursive: true });
writeFileSync(
join(dir, "extension-manifest.json"),
JSON.stringify({
id,
name: id,
version: "1.0.0",
tier: "bundled",
requires: { platform: ">=2.29.0" },
...(deps ? { dependencies: { extensions: deps } } : {}),
}),
);
writeFileSync(join(dir, "index.ts"), `export default function() {}`);
return join(dir, "index.ts");
}
describe("sortExtensionPaths", () => {
it("returns empty for empty input", () => {
const result = sortExtensionPaths([]);
assert.deepEqual(result.sortedPaths, []);
assert.deepEqual(result.warnings, []);
});
it("sorts independent extensions alphabetically", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathC = createExtDir(base, "charlie");
const pathA = createExtDir(base, "alpha");
const pathB = createExtDir(base, "bravo");
const result = sortExtensionPaths([pathC, pathA, pathB]);
assert.deepEqual(result.sortedPaths, [pathA, pathB, pathC]);
assert.equal(result.warnings.length, 0);
});
it("sorts dependencies before dependents", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathBase = createExtDir(base, "base-ext");
const pathDependent = createExtDir(base, "dependent-ext", ["base-ext"]);
// Pass dependent first — sort should reorder
const result = sortExtensionPaths([pathDependent, pathBase]);
assert.deepEqual(result.sortedPaths, [pathBase, pathDependent]);
assert.equal(result.warnings.length, 0);
});
it("handles deep dependency chains", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathA = createExtDir(base, "a");
const pathB = createExtDir(base, "b", ["a"]);
const pathC = createExtDir(base, "c", ["b"]);
const result = sortExtensionPaths([pathC, pathB, pathA]);
assert.deepEqual(result.sortedPaths, [pathA, pathB, pathC]);
assert.equal(result.warnings.length, 0);
});
it("warns about missing dependencies but still loads", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathExt = createExtDir(base, "my-ext", ["nonexistent"]);
const result = sortExtensionPaths([pathExt]);
assert.equal(result.sortedPaths.length, 1);
assert.equal(result.sortedPaths[0], pathExt);
assert.equal(result.warnings.length, 1);
assert.match(result.warnings[0].message, /nonexistent.*not installed/);
});
it("warns about cycles but still loads both", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathA = createExtDir(base, "cycle-a", ["cycle-b"]);
const pathB = createExtDir(base, "cycle-b", ["cycle-a"]);
const result = sortExtensionPaths([pathA, pathB]);
assert.equal(result.sortedPaths.length, 2);
assert.ok(result.warnings.length > 0);
assert.ok(result.warnings.some((w) => w.message.includes("cycle")));
});
it("silently ignores self-dependencies", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathExt = createExtDir(base, "self-dep", ["self-dep"]);
const result = sortExtensionPaths([pathExt]);
assert.deepEqual(result.sortedPaths, [pathExt]);
assert.equal(result.warnings.length, 0);
});
it("prepends extensions without manifests", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const noManifestDir = join(base, "no-manifest");
mkdirSync(noManifestDir, { recursive: true });
writeFileSync(join(noManifestDir, "index.ts"), `export default function() {}`);
const noManifestPath = join(noManifestDir, "index.ts");
const pathWithManifest = createExtDir(base, "with-manifest");
const result = sortExtensionPaths([pathWithManifest, noManifestPath]);
assert.equal(result.sortedPaths[0], noManifestPath);
assert.equal(result.sortedPaths[1], pathWithManifest);
});
it("handles non-array dependencies gracefully", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const dir = join(base, "bad-deps");
mkdirSync(dir, { recursive: true });
writeFileSync(
join(dir, "extension-manifest.json"),
JSON.stringify({
id: "bad-deps",
name: "bad-deps",
version: "1.0.0",
tier: "bundled",
dependencies: { extensions: "not-an-array" },
}),
);
writeFileSync(join(dir, "index.ts"), `export default function() {}`);
const result = sortExtensionPaths([join(dir, "index.ts")]);
assert.equal(result.sortedPaths.length, 1);
assert.equal(result.warnings.length, 0);
});
});

View file

@ -0,0 +1,137 @@
// GSD-2 — Extension Sort: Topological dependency ordering
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { readManifestFromEntryPath } from "./extension-manifest.js";
export interface SortWarning {
declaringId: string;
missingId: string;
message: string;
}
export interface SortResult {
sortedPaths: string[];
warnings: SortWarning[];
}
/**
* Sort extension entry paths in topological dependency-first order using Kahn's BFS algorithm.
*
* - Extensions without manifests are prepended in input order.
* - Missing dependencies produce a structured warning but do not block loading.
* - Cycles produce warnings; cycle participants are appended alphabetically.
* - Self-dependencies are silently ignored.
*/
export function sortExtensionPaths(paths: string[]): SortResult {
const warnings: SortWarning[] = [];
const pathsWithoutId: string[] = [];
const idToPath = new Map<string, string>();
// Step 1: Build ID map
for (const p of paths) {
const manifest = readManifestFromEntryPath(p);
if (!manifest) {
pathsWithoutId.push(p);
} else {
idToPath.set(manifest.id, p);
}
}
// Step 2: Build graph — inDegree and dependents adjacency
const inDegree = new Map<string, number>();
const dependents = new Map<string, string[]>(); // dep → [ids that depend on dep]
for (const id of idToPath.keys()) {
if (!inDegree.has(id)) inDegree.set(id, 0);
if (!dependents.has(id)) dependents.set(id, []);
}
for (const [id, entryPath] of idToPath) {
const manifest = readManifestFromEntryPath(entryPath);
const rawDeps = manifest?.dependencies?.extensions ?? [];
const deps = Array.isArray(rawDeps) ? rawDeps : [];
for (const depId of deps) {
// Silently ignore self-deps
if (depId === id) continue;
if (!idToPath.has(depId)) {
// Missing dependency — warn and skip edge
warnings.push({
declaringId: id,
missingId: depId,
message: `Extension '${id}' declares dependency '${depId}' which is not installed — loading anyway`,
});
continue;
}
// Valid edge: id depends on depId → increment inDegree[id], add id to dependents[depId]
inDegree.set(id, (inDegree.get(id) ?? 0) + 1);
const depDependents = dependents.get(depId) ?? [];
depDependents.push(id);
dependents.set(depId, depDependents);
}
}
// Step 3: Kahn's algorithm — start with nodes that have inDegree 0
const sorted: string[] = [];
// Ready queue: IDs with inDegree 0, maintained in alphabetical order
const ready: string[] = [...idToPath.keys()]
.filter((id) => inDegree.get(id) === 0)
.sort();
while (ready.length > 0) {
const id = ready.shift()!;
sorted.push(idToPath.get(id)!);
const deps = dependents.get(id) ?? [];
for (const depId of deps) {
const newDegree = (inDegree.get(depId) ?? 0) - 1;
inDegree.set(depId, newDegree);
if (newDegree === 0) {
// Insert into ready queue maintaining alphabetical order
const insertIdx = ready.findIndex((r) => r > depId);
if (insertIdx === -1) {
ready.push(depId);
} else {
ready.splice(insertIdx, 0, depId);
}
}
}
}
// Step 4: Cycle handling — any remaining IDs with inDegree > 0
const cycleIds = [...idToPath.keys()]
.filter((id) => (inDegree.get(id) ?? 0) > 0)
.sort();
if (cycleIds.length > 0) {
const cycleSet = new Set(cycleIds);
for (const id of cycleIds) {
const entryPath = idToPath.get(id)!;
const manifest = readManifestFromEntryPath(entryPath);
const rawDeps = manifest?.dependencies?.extensions ?? [];
const deps = Array.isArray(rawDeps) ? rawDeps : [];
for (const depId of deps) {
if (depId === id) continue;
if (!cycleSet.has(depId)) continue;
// Both id and depId are in cycle — emit warning
warnings.push({
declaringId: id,
missingId: depId,
message: `Extension '${id}' and '${depId}' form a dependency cycle — loading both anyway (alphabetical order)`,
});
}
sorted.push(entryPath);
}
}
return {
sortedPaths: [...pathsWithoutId, ...sorted],
warnings,
};
}

View file

@ -2,6 +2,10 @@
* Extension system for lifecycle events and custom tools.
*/
export type { ExtensionManifest } from "./extension-manifest.js";
export { readManifest, readManifestFromEntryPath } from "./extension-manifest.js";
export type { SortResult, SortWarning } from "./extension-sort.js";
export { sortExtensionPaths } from "./extension-sort.js";
export type { SlashCommandInfo, SlashCommandLocation, SlashCommandSource } from "../slash-commands.js";
export {
createExtensionRuntime,

View file

@ -941,6 +941,11 @@ function discoverExtensionsInDir(dir: string): string[] {
/**
* Discover and load extensions from standard locations.
*
* @deprecated Use DefaultResourceLoader.reload() instead; this function is
* not called in the GSD loading flow. Extension discovery happens through
* DefaultPackageManager.resolve() → addAutoDiscoveredResources(). Kept for
* backwards compatibility with direct pi-coding-agent consumers.
*/
export async function discoverAndLoadExtensions(
configuredPaths: string[],

View file

@ -0,0 +1,228 @@
import assert from "node:assert/strict";
import { describe, it } from "node:test";
import {
isImageDimensionError,
MANY_IMAGE_MAX_DIMENSION,
downsizeConversationImages,
} from "./image-overflow-recovery.js";
import type { Message } from "@gsd/pi-ai";
// ─── isImageDimensionError ────────────────────────────────────────────────────
describe("isImageDimensionError", () => {
it("returns true for Anthropic many-image dimension error", () => {
const errorMessage =
'Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"messages.125.content.38.image.source.base64.data: At least one of the image dimensions exceed max allowed size for many-image requests: 2000 pixels"}}';
assert.equal(isImageDimensionError(errorMessage), true);
});
it("returns true for bare dimension exceed message", () => {
const errorMessage =
"image dimensions exceed max allowed size for many-image requests: 2000 pixels";
assert.equal(isImageDimensionError(errorMessage), true);
});
it("returns false for unrelated 400 error", () => {
const errorMessage =
'Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 4096 > 2048"}}';
assert.equal(isImageDimensionError(errorMessage), false);
});
it("returns false for rate limit error", () => {
assert.equal(isImageDimensionError("429 rate limit exceeded"), false);
});
it("returns false for empty string", () => {
assert.equal(isImageDimensionError(""), false);
});
it("returns false for undefined", () => {
assert.equal(isImageDimensionError(undefined), false);
});
});
// ─── MANY_IMAGE_MAX_DIMENSION ─────────────────────────────────────────────────
describe("MANY_IMAGE_MAX_DIMENSION", () => {
it("is less than 2000 (the API-enforced limit)", () => {
assert.ok(MANY_IMAGE_MAX_DIMENSION < 2000);
});
it("is a positive integer", () => {
assert.ok(MANY_IMAGE_MAX_DIMENSION > 0);
assert.equal(MANY_IMAGE_MAX_DIMENSION, Math.floor(MANY_IMAGE_MAX_DIMENSION));
});
});
// ─── helpers ──────────────────────────────────────────────────────────────────
function makeUserMsg(content: any): Message {
return { role: "user", content, timestamp: Date.now() } as Message;
}
function makeAssistantMsg(text: string): Message {
return {
role: "assistant",
content: [{ type: "text", text }],
api: "anthropic-messages",
provider: "anthropic",
model: "claude-opus-4-6",
usage: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, totalTokens: 0, cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 } },
stopReason: "stop",
timestamp: Date.now(),
} as Message;
}
function makeToolResultMsg(images: number): Message {
const content: any[] = [];
for (let i = 0; i < images; i++) {
content.push({ type: "image", data: `img${i}`, mimeType: "image/png" });
}
return {
role: "toolResult",
toolCallId: `tc${Math.random()}`,
toolName: "screenshot",
content,
isError: false,
timestamp: Date.now(),
} as Message;
}
// ─── downsizeConversationImages ───────────────────────────────────────────────
describe("downsizeConversationImages", () => {
it("counts images in user and toolResult messages", () => {
const messages: Message[] = [
makeUserMsg([
{ type: "image", data: "img1", mimeType: "image/png" },
{ type: "image", data: "img2", mimeType: "image/png" },
]),
makeAssistantMsg("I see them"),
makeToolResultMsg(1),
];
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 3);
});
it("returns processed=false when no images present", () => {
const messages: Message[] = [
makeUserMsg("just text"),
makeAssistantMsg("reply"),
];
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 0);
assert.equal(result.processed, false);
});
it("returns processed=false when image count <= RECENT_IMAGES_TO_KEEP", () => {
const messages: Message[] = [
makeUserMsg([
{ type: "image", data: "img1", mimeType: "image/png" },
]),
makeAssistantMsg("got it"),
];
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 1);
assert.equal(result.processed, false);
});
it("strips older images when many images present, preserves recent ones", () => {
const messages: Message[] = [];
for (let i = 0; i < 25; i++) {
messages.push(
makeUserMsg([
{ type: "text", text: `message ${i}` },
{ type: "image", data: `img${i}`, mimeType: "image/png" },
]),
);
messages.push(makeAssistantMsg(`reply ${i}`));
}
const result = downsizeConversationImages(messages);
assert.ok(result.processed);
assert.equal(result.imageCount, 25);
assert.equal(result.strippedCount, 20); // 25 - 5 recent
// Count remaining images
let remainingImages = 0;
for (const msg of messages) {
if (msg.role === "assistant") continue;
if (typeof msg.content === "string") continue;
const arr = msg.content as any[];
for (const block of arr) {
if (block.type === "image") remainingImages++;
}
}
assert.equal(remainingImages, 5, "Should keep exactly 5 most recent images");
// The 5 most recent user messages (indices 40,42,44,46,48) should have images
for (let i = 20; i < 25; i++) {
const userMsg = messages[i * 2]; // user messages at even indices
const arr = userMsg.content as any[];
const hasImage = arr.some((c: any) => c.type === "image");
assert.ok(hasImage, `Recent message ${i} should retain its image`);
}
});
it("adds text placeholder when stripping an image", () => {
const messages: Message[] = [];
for (let i = 0; i < 10; i++) {
messages.push(
makeUserMsg([
{ type: "image", data: `img${i}`, mimeType: "image/jpeg" },
]),
);
messages.push(makeAssistantMsg(`reply ${i}`));
}
downsizeConversationImages(messages);
// First message's image should have been replaced with text
const firstMsg = messages[0];
const arr = firstMsg.content as any[];
const placeholder = arr.find(
(c: any) => c.type === "text" && c.text.includes("[image removed"),
);
assert.ok(placeholder, "Stripped image should be replaced with text placeholder");
assert.ok(
placeholder.text.includes("image/jpeg"),
"Placeholder should mention original mime type",
);
});
it("handles toolResult messages with images", () => {
const messages: Message[] = [];
for (let i = 0; i < 10; i++) {
messages.push(makeToolResultMsg(1));
messages.push(makeAssistantMsg(`reply ${i}`));
}
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 10);
assert.equal(result.strippedCount, 5);
assert.ok(result.processed);
});
it("handles mixed user and toolResult images", () => {
const messages: Message[] = [];
for (let i = 0; i < 8; i++) {
messages.push(
makeUserMsg([
{ type: "text", text: `check ${i}` },
{ type: "image", data: `uimg${i}`, mimeType: "image/png" },
]),
);
messages.push(makeAssistantMsg(`processing ${i}`));
messages.push(makeToolResultMsg(1));
messages.push(makeAssistantMsg(`done ${i}`));
}
const result = downsizeConversationImages(messages);
// 8 user images + 8 tool result images = 16 total
assert.equal(result.imageCount, 16);
assert.equal(result.strippedCount, 11); // 16 - 5 recent
});
});

View file

@ -0,0 +1,118 @@
/**
* Image overflow recovery for many-image sessions.
*
* When a conversation accumulates many images (screenshots, file reads, etc.),
* the Anthropic API enforces a stricter per-image dimension limit (2000px) for
* "many-image requests." This module detects the resulting 400 error and
* recovers by stripping older images from the conversation history, preserving
* the most recent ones to maintain session continuity.
*
* @see https://github.com/gsd-build/gsd-2/issues/2874
*/
import type { Message, ImageContent, TextContent } from "@gsd/pi-ai";
/**
* Maximum image dimension (px) that the Anthropic API allows in many-image
* requests. Images at or above this size in a large conversation will be
* rejected with a 400 error. We use 1568 as the safe ceiling (Anthropic's
* recommended max for multi-image requests).
*/
export const MANY_IMAGE_MAX_DIMENSION = 1568;
/**
* Number of recent images to preserve when stripping old images.
* Keeps the most recent screenshots/images so the model retains visual context
* for the current task.
*/
const RECENT_IMAGES_TO_KEEP = 5;
/**
* Regex matching the Anthropic API error for oversized images in many-image requests.
*/
const IMAGE_DIMENSION_ERROR_RE =
/image.dimensions?.exceed.*max.*allowed.*size.*many.image/i;
/**
* Detect whether an error message is the Anthropic "image dimensions exceed max
* allowed size for many-image requests" 400 error.
*/
export function isImageDimensionError(errorMessage: string | undefined | null): boolean {
if (!errorMessage) return false;
return IMAGE_DIMENSION_ERROR_RE.test(errorMessage);
}
export interface DownsizeResult {
/** Total number of images found in the conversation */
imageCount: number;
/** Whether any images were stripped */
processed: boolean;
/** Number of images that were stripped */
strippedCount: number;
}
/**
* Strip older images from conversation messages to recover from many-image
* dimension errors. Preserves the N most recent images and replaces older ones
* with a text placeholder.
*
* Mutates messages in place (same pattern as replaceMessages/compaction).
*
* Accepts Message[] (the LLM message union) so it works with both
* agent.state.messages and session entries.
*/
export function downsizeConversationImages(messages: Message[]): DownsizeResult {
// First pass: collect all image locations (message index + content index)
const imageLocations: Array<{ msgIdx: number; contentIdx: number }> = [];
for (let msgIdx = 0; msgIdx < messages.length; msgIdx++) {
const msg = messages[msgIdx];
if (msg.role === "assistant") continue;
// UserMessage can have string content; ToolResultMessage always has array
if (msg.role === "user" && typeof msg.content === "string") continue;
const contentArr = msg.content as (TextContent | ImageContent)[];
if (!Array.isArray(contentArr)) continue;
for (let contentIdx = 0; contentIdx < contentArr.length; contentIdx++) {
if (contentArr[contentIdx].type === "image") {
imageLocations.push({ msgIdx, contentIdx });
}
}
}
const imageCount = imageLocations.length;
if (imageCount === 0) {
return { imageCount: 0, processed: false, strippedCount: 0 };
}
// Determine which images to strip (all except the N most recent)
const stripCount = Math.max(0, imageCount - RECENT_IMAGES_TO_KEEP);
if (stripCount === 0) {
return { imageCount, processed: false, strippedCount: 0 };
}
const toStrip = imageLocations.slice(0, stripCount);
// Second pass: replace stripped images with text placeholder.
// Process in reverse order to maintain content indices.
for (let i = toStrip.length - 1; i >= 0; i--) {
const { msgIdx, contentIdx } = toStrip[i];
const msg = messages[msgIdx];
if (msg.role === "assistant") continue;
if (msg.role === "user" && typeof msg.content === "string") continue;
const contentArr = msg.content as (TextContent | ImageContent)[];
const imageBlock = contentArr[contentIdx] as ImageContent;
const mimeType = imageBlock.mimeType || "image/unknown";
// Replace the image block with a text placeholder
(contentArr as any[])[contentIdx] = {
type: "text",
text: `[image removed to reduce context size — was ${mimeType}]`,
} as TextContent;
}
return { imageCount, processed: true, strippedCount: stripCount };
}

View file

@ -29,6 +29,7 @@ export {
type ExecResult,
type Extension,
type ExtensionAPI,
type ExtensionManifest,
type ExtensionCommandContext,
type ExtensionContext,
type ExtensionError,
@ -53,6 +54,11 @@ export {
type SessionSwitchEvent,
type SessionTreeEvent,
type ToolCallEvent,
readManifest,
readManifestFromEntryPath,
type SortResult,
type SortWarning,
sortExtensionPaths,
type ToolDefinition,
type ToolRenderResultOptions,
type ToolResultEvent,

View file

@ -340,6 +340,9 @@ async function runWorkspaceDiagnostics(
const proc = spawn(cmd, cmdArgs, {
cwd,
stdio: ["ignore", "pipe", "pipe"],
// On Windows, project-type commands (tsc, cargo, etc.) may be .cmd
// wrappers that need shell resolution to avoid ENOENT/EINVAL (#2854).
shell: process.platform === "win32",
});
const abortHandler = () => {
proc.kill();

View file

@ -90,6 +90,9 @@ async function checkServerRunning(binaryPath: string): Promise<boolean> {
try {
const proc = spawn(binaryPath, ["status"], {
stdio: ["ignore", "pipe", "pipe"],
// On Windows, the binary may be a .cmd wrapper requiring shell
// resolution to avoid ENOENT/EINVAL (#2854).
shell: process.platform === "win32",
});
const exited = await Promise.race([

View file

@ -0,0 +1,114 @@
/**
* messages.test.ts: Tests for convertToLlm custom message handling.
*
* Reproduction test for #3026: background job completion notifications
* delivered as custom messages must be clearly distinguishable from
* user-typed input when converted to LLM messages.
*/
import test from "node:test";
import assert from "node:assert/strict";
import { convertToLlm, type CustomMessage } from "./messages.js";
/** Extract the first content block from a message, asserting array content. */
function firstTextBlock(msg: ReturnType<typeof convertToLlm>[number]) {
const { content } = msg;
assert.ok(Array.isArray(content), "Expected content to be an array");
const block = content[0];
assert.ok(typeof block === "object" && block !== null, "Expected first block to be an object");
return block;
}
test("convertToLlm wraps custom messages with system notification prefix", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "async_job_result",
content: "**Background job done: bg_abc123** (sleep 2, 2.1s)\n\ndone",
display: true,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
assert.equal(result.length, 1);
assert.equal(result[0].role, "user");
// The content must include a system notification wrapper so the LLM
// does not confuse it with user input (#3026).
const text = firstTextBlock(result[0]);
assert.equal(text.type, "text");
assert.ok(
"text" in text && text.text.includes("[system notification"),
"Custom message should be wrapped with system notification marker",
);
});
test("convertToLlm wraps custom messages with array content", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "bg-shell-status",
content: [{ type: "text", text: "Background processes:\n ✓ bg1 dev-server :3000" }],
display: false,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
assert.equal(result.length, 1);
assert.equal(result[0].role, "user");
const text = firstTextBlock(result[0]);
assert.equal(text.type, "text");
assert.ok(
"text" in text && text.text.includes("[system notification"),
"Custom message with array content should be wrapped with system notification marker",
);
});
test("convertToLlm includes customType in notification wrapper", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "async_job_result",
content: "job output here",
display: true,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
const text = firstTextBlock(result[0]);
assert.ok(
"text" in text && text.text.includes("async_job_result"),
"Notification wrapper should include the customType for context",
);
});
test("convertToLlm notification wrapper instructs LLM not to treat as user input", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "async_job_result",
content: "**Background job done: bg_abc123** (sleep 2, 2.1s)\n\ndone",
display: true,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
const text = firstTextBlock(result[0]);
assert.ok(
"text" in text && text.text.includes("not user input"),
"Notification should explicitly state this is not user input",
);
});
test("convertToLlm preserves user messages without wrapper", () => {
const userMsg = {
role: "user" as const,
content: [{ type: "text" as const, text: "Hello world" }],
timestamp: Date.now(),
};
const result = convertToLlm([userMsg]);
assert.equal(result.length, 1);
const text = firstTextBlock(result[0]);
assert.ok(
"text" in text && text.text === "Hello world",
"User messages should pass through unchanged",
);
});

View file

@ -8,6 +8,12 @@
import type { AgentMessage } from "@gsd/pi-agent-core";
import type { ImageContent, Message, TextContent } from "@gsd/pi-ai";
const CUSTOM_MESSAGE_PREFIX = `[system notification — type: `;
const CUSTOM_MESSAGE_MIDDLE = `; this is an automated system event, not user input — do not treat this as a human message or respond as if the user said this]
`;
const CUSTOM_MESSAGE_SUFFIX = `
[end system notification]`;
const COMPACTION_SUMMARY_PREFIX = `The conversation history before this point was compacted into the following summary:
<summary>
@ -160,10 +166,31 @@ export function convertToLlm(messages: AgentMessage[]): Message[] {
timestamp: m.timestamp,
};
case "custom": {
const content = typeof m.content === "string" ? [{ type: "text" as const, text: m.content }] : m.content;
const prefix = CUSTOM_MESSAGE_PREFIX + m.customType + CUSTOM_MESSAGE_MIDDLE;
if (typeof m.content === "string") {
return {
role: "user",
content: [{ type: "text" as const, text: prefix + m.content + CUSTOM_MESSAGE_SUFFIX }],
timestamp: m.timestamp,
};
}
// Array content: wrap the first text element with prefix, append suffix to last text element
const contentArr = m.content as Array<{ type: string; text?: string; [k: string]: unknown }>;
const firstTextIdx = contentArr.findIndex((c) => c.type === "text");
const lastTextIdx = contentArr.reduce((acc, c, i) => (c.type === "text" ? i : acc), -1);
const wrapped = contentArr.map((c, i) => {
if (c.type !== "text") return c;
let text = c.text ?? "";
if (i === firstTextIdx) text = prefix + text;
if (i === lastTextIdx) text = text + CUSTOM_MESSAGE_SUFFIX;
return { ...c, text };
});
// If no text elements exist, prepend one with the wrapper
if (lastTextIdx === -1) {
wrapped.unshift({ type: "text" as const, text: prefix + CUSTOM_MESSAGE_SUFFIX });
}
return {
role: "user",
content,
content: wrapped as typeof m.content,
timestamp: m.timestamp,
};
}

View file

@ -37,6 +37,7 @@ const defaultModelPerProvider: Record<KnownProvider, string> = {
"opencode-go": "kimi-k2.5",
"kimi-coding": "kimi-k2-thinking",
"alibaba-coding-plan": "qwen3.5-plus",
ollama: "llama3.1:8b",
"ollama-cloud": "qwen3:32b",
};

View file

@ -129,6 +129,12 @@ export interface DefaultResourceLoaderOptions {
appendSystemPrompt?: string;
/** Names of bundled extensions (used to identify built-in extensions in conflict detection). */
bundledExtensionNames?: Set<string>;
/**
* Transform extension paths before loading. Receives the merged list of all
* discovered extension paths and returns a (possibly reordered/filtered) list.
* Use this to apply dependency sorting or registry-based filtering.
*/
extensionPathsTransform?: (paths: string[]) => { paths: string[]; diagnostics?: string[] };
extensionsOverride?: (base: LoadExtensionsResult) => LoadExtensionsResult;
skillsOverride?: (base: { skills: Skill[]; diagnostics: ResourceDiagnostic[] }) => {
skills: Skill[];
@ -167,6 +173,7 @@ export class DefaultResourceLoader implements ResourceLoader {
private systemPromptSource?: string;
private appendSystemPromptSource?: string;
private bundledExtensionNames: Set<string>;
private extensionPathsTransform?: (paths: string[]) => { paths: string[]; diagnostics?: string[] };
private extensionsOverride?: (base: LoadExtensionsResult) => LoadExtensionsResult;
private skillsOverride?: (base: { skills: Skill[]; diagnostics: ResourceDiagnostic[] }) => {
skills: Skill[];
@ -223,6 +230,7 @@ export class DefaultResourceLoader implements ResourceLoader {
this.systemPromptSource = options.systemPrompt;
this.appendSystemPromptSource = options.appendSystemPrompt;
this.bundledExtensionNames = options.bundledExtensionNames ?? new Set();
this.extensionPathsTransform = options.extensionPathsTransform;
this.extensionsOverride = options.extensionsOverride;
this.skillsOverride = options.skillsOverride;
this.promptsOverride = options.promptsOverride;
@ -378,10 +386,21 @@ export class DefaultResourceLoader implements ResourceLoader {
const cliEnabledPrompts = getEnabledPaths(cliExtensionPaths.prompts);
const cliEnabledThemes = getEnabledPaths(cliExtensionPaths.themes);
const extensionPaths = this.noExtensions
let extensionPaths = this.noExtensions
? cliEnabledExtensions
: this.mergePaths(cliEnabledExtensions, enabledExtensions);
// Apply path transform (dependency sorting, registry filtering) if provided
if (this.extensionPathsTransform) {
const transformed = this.extensionPathsTransform(extensionPaths);
extensionPaths = transformed.paths;
if (transformed.diagnostics?.length) {
for (const msg of transformed.diagnostics) {
process.stderr.write(`[extensions] ${msg}\n`);
}
}
}
const extensionsResult = await loadExtensions(extensionPaths, this.cwd, this.eventBus);
const inlineExtensions = await this.loadExtensionFactories(extensionsResult.runtime);
extensionsResult.extensions.push(...inlineExtensions.extensions);

View file

@ -0,0 +1,255 @@
/**
* RetryHandler tests: long-context entitlement 429 error handling (#2803)
*
* Verifies that "Extra usage is required for long context requests" errors
* are classified as quota_exhausted (not rate_limit) and trigger a model
* downgrade from [1m] to base when no cross-provider fallback exists.
*/
import { describe, it, beforeEach, mock, type Mock } from "node:test";
import assert from "node:assert/strict";
import { RetryHandler, type RetryHandlerDeps } from "./retry-handler.js";
import type { Api, AssistantMessage, Model } from "@gsd/pi-ai";
import type { FallbackResolver } from "./fallback-resolver.js";
import type { ModelRegistry } from "./model-registry.js";
import type { SettingsManager } from "./settings-manager.js";
// ─── Helpers ────────────────────────────────────────────────────────────────
function createMockModel(provider: string, id: string): Model<Api> {
return {
id,
name: id,
api: "anthropic" as Api,
provider,
baseUrl: "https://api.anthropic.com",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 1_000_000,
maxTokens: 16384,
} as Model<Api>;
}
function errorMessage(msg: string): AssistantMessage {
return {
role: "assistant",
content: [],
api: "anthropic-messages",
provider: "anthropic",
model: "claude-opus-4-6[1m]",
usage: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, totalTokens: 0, cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 } },
stopReason: "error",
errorMessage: msg,
timestamp: Date.now(),
} as AssistantMessage;
}
interface MockDeps {
deps: RetryHandlerDeps;
emittedEvents: Array<Record<string, any>>;
continueFn: Mock<() => Promise<void>>;
onModelChangeFn: Mock<(model: Model<any>) => void>;
markUsageLimitReached: Mock<(...args: any[]) => boolean>;
findFallback: Mock<(...args: any[]) => Promise<any>>;
findModel: Mock<(provider: string, modelId: string) => Model<Api> | undefined>;
}
function createMockDeps(overrides?: {
model?: Model<Api>;
retryEnabled?: boolean;
markUsageLimitReachedResult?: boolean;
fallbackResult?: any;
findModelResult?: (provider: string, modelId: string) => Model<Api> | undefined;
}): MockDeps {
const model = overrides?.model ?? createMockModel("anthropic", "claude-opus-4-6[1m]");
const emittedEvents: Array<Record<string, any>> = [];
const continueFn = mock.fn(async () => {});
const onModelChangeFn = mock.fn((_model: Model<any>) => {});
const markUsageLimitReached = mock.fn(
() => overrides?.markUsageLimitReachedResult ?? false,
);
const findFallback = mock.fn(async () => overrides?.fallbackResult ?? null);
const findModel = mock.fn(
overrides?.findModelResult ?? ((_provider: string, _modelId: string) => undefined),
);
const messages: Array<{ role: string } & Record<string, any>> = [];
const deps: RetryHandlerDeps = {
agent: {
continue: continueFn,
state: { messages },
setModel: mock.fn(),
replaceMessages: mock.fn((newMessages: any[]) => {
messages.length = 0;
messages.push(...newMessages);
}),
} as any,
settingsManager: {
getRetryEnabled: () => overrides?.retryEnabled ?? true,
getRetrySettings: () => ({
enabled: overrides?.retryEnabled ?? true,
maxRetries: 5,
baseDelayMs: 1000,
maxDelayMs: 30000,
}),
} as unknown as SettingsManager,
modelRegistry: {
authStorage: {
markUsageLimitReached,
},
find: findModel,
} as unknown as ModelRegistry,
fallbackResolver: {
findFallback,
} as unknown as FallbackResolver,
getModel: () => model,
getSessionId: () => "test-session",
emit: (event: any) => emittedEvents.push(event),
onModelChange: onModelChangeFn,
};
return { deps, emittedEvents, continueFn, onModelChangeFn, markUsageLimitReached, findFallback, findModel };
}
// ─── _classifyErrorType (tested via handleRetryableError behavior) ──────────
describe("RetryHandler — long-context entitlement 429 (#2803)", () => {
describe("error classification", () => {
it("classifies 'Extra usage is required for long context requests' as quota_exhausted, not rate_limit", async () => {
// When the error is classified as quota_exhausted AND no alternate credentials
// AND no fallback, the handler should emit fallback_chain_exhausted and stop.
// If misclassified as rate_limit, it would enter the backoff loop instead.
const { deps, emittedEvents, findModel } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6[1m]"),
markUsageLimitReachedResult: false, // no alternate credentials
fallbackResult: null, // no cross-provider fallback
findModelResult: () => undefined, // no base model either
});
const handler = new RetryHandler(deps);
const msg = errorMessage(
'429 {"type":"error","error":{"type":"rate_limit_error","message":"Extra usage is required for long context requests."}}'
);
const result = await handler.handleRetryableError(msg);
// Should NOT retry (would be true if misclassified as rate_limit entering backoff)
assert.equal(result, false);
// Should emit fallback_chain_exhausted (quota_exhausted path), NOT auto_retry_start (backoff path)
const chainExhausted = emittedEvents.find((e) => e.type === "fallback_chain_exhausted");
assert.ok(chainExhausted, "Expected fallback_chain_exhausted event for entitlement error");
const retryStart = emittedEvents.find((e) => e.type === "auto_retry_start");
assert.equal(retryStart, undefined, "Should NOT emit auto_retry_start for entitlement error");
});
it("still classifies regular 429 rate limits as rate_limit", async () => {
// A normal "rate limit" 429 should still be classified as rate_limit
const { deps, emittedEvents } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6"),
markUsageLimitReachedResult: false,
fallbackResult: null,
});
const handler = new RetryHandler(deps);
const msg = errorMessage("429 Too Many Requests");
const result = await handler.handleRetryableError(msg);
// Should enter the backoff loop (rate_limit path, not quota_exhausted)
assert.equal(result, true);
const retryStart = emittedEvents.find((e) => e.type === "auto_retry_start");
assert.ok(retryStart, "Regular 429 should enter backoff retry");
});
});
describe("long-context model downgrade", () => {
it("downgrades from [1m] to base model when entitlement error and no fallback", async () => {
const baseModel = createMockModel("anthropic", "claude-opus-4-6");
const { deps, emittedEvents, onModelChangeFn, continueFn } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6[1m]"),
markUsageLimitReachedResult: false,
fallbackResult: null,
findModelResult: (provider: string, modelId: string) => {
if (provider === "anthropic" && modelId === "claude-opus-4-6") return baseModel;
return undefined;
},
});
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
const result = await handler.handleRetryableError(msg);
assert.equal(result, true, "Should retry after downgrade");
// Should have called setModel with the base model
const setModelCalls = (deps.agent.setModel as any).mock.calls;
assert.equal(setModelCalls.length, 1);
assert.equal(setModelCalls[0].arguments[0].id, "claude-opus-4-6");
// Should have notified about model change
assert.equal(onModelChangeFn.mock.calls.length, 1);
// Should emit a fallback_provider_switch event indicating downgrade
const switchEvent = emittedEvents.find((e) => e.type === "fallback_provider_switch");
assert.ok(switchEvent, "Expected fallback_provider_switch event for downgrade");
assert.ok(switchEvent!.reason.includes("long context downgrade"), `reason should mention downgrade: ${switchEvent!.reason}`);
});
it("emits fallback_chain_exhausted when base model is also unavailable", async () => {
const { deps, emittedEvents } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6[1m]"),
markUsageLimitReachedResult: false,
fallbackResult: null,
findModelResult: () => undefined, // base model not found
});
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
const result = await handler.handleRetryableError(msg);
assert.equal(result, false);
const chainExhausted = emittedEvents.find((e) => e.type === "fallback_chain_exhausted");
assert.ok(chainExhausted, "Expected fallback_chain_exhausted when base model unavailable");
});
it("does not attempt downgrade for non-[1m] models", async () => {
// When a regular model (no [1m] suffix) gets a quota_exhausted error
// with no fallback, it should just stop — no downgrade attempt.
const { deps, emittedEvents } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6"),
markUsageLimitReachedResult: false,
fallbackResult: null,
});
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
const result = await handler.handleRetryableError(msg);
assert.equal(result, false);
const chainExhausted = emittedEvents.find((e) => e.type === "fallback_chain_exhausted");
assert.ok(chainExhausted);
// No downgrade switch should occur
const switchEvent = emittedEvents.find((e) => e.type === "fallback_provider_switch");
assert.equal(switchEvent, undefined, "Should not switch for non-[1m] models");
});
});
describe("isRetryableError", () => {
it("considers long-context entitlement error as retryable", () => {
const { deps } = createMockDeps();
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
assert.equal(handler.isRetryableError(msg), true);
});
});
});


@@ -107,7 +107,7 @@ export class RetryHandler {
if (isContextOverflow(message, contextWindow)) return false;
const err = message.errorMessage;
return /overloaded|rate.?limit|too many requests|429|500|502|503|504|service.?unavailable|server.?error|internal.?error|connection.?error|connection.?refused|other side closed|fetch failed|upstream.?connect|reset before headers|terminated|retry delay|network.?(?:is\s+)?unavailable|credentials.*expired|temporarily backed off/i.test(
return /overloaded|rate.?limit|too many requests|429|500|502|503|504|service.?unavailable|server.?error|internal.?error|connection.?error|connection.?refused|other side closed|fetch failed|upstream.?connect|reset before headers|terminated|retry delay|network.?(?:is\s+)?unavailable|credentials.*expired|temporarily backed off|extra usage is required/i.test(
err,
);
}
@@ -202,6 +202,10 @@ export class RetryHandler {
// No fallback available either
if (errorType === "quota_exhausted") {
// Try long-context model downgrade ([1m] → base) before giving up
const downgraded = this._tryLongContextDowngrade(message);
if (downgraded) return true;
this._deps.emit({
type: "fallback_chain_exhausted",
reason: `All providers exhausted for ${this._deps.getModel()!.provider}/${this._deps.getModel()!.id}`,
@@ -343,12 +347,59 @@
*/
private _classifyErrorType(errorMessage: string): UsageLimitErrorType {
const err = errorMessage.toLowerCase();
// Long-context entitlement errors are billing gates, not transient rate limits.
// Must be checked before the generic 429/rate_limit regex.
if (/extra usage is required|long context required/i.test(err)) return "quota_exhausted";
if (/quota|billing|exceeded.*limit|usage.*limit/i.test(err)) return "quota_exhausted";
if (/rate.?limit|too many requests|429/i.test(err)) return "rate_limit";
if (/500|502|503|504|server.?error|internal.?error|service.?unavailable/i.test(err)) return "server_error";
return "unknown";
}
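The ordering matters: the entitlement check must precede the generic 429 branch, or the API's `rate_limit_error` envelope would win. A standalone sketch of the same ordering (illustrative names, not the real class; regexes copied from the diff):

```typescript
// Standalone sketch of the classification order in _classifyErrorType.
// ErrorKind mirrors UsageLimitErrorType.
type ErrorKind = "quota_exhausted" | "rate_limit" | "server_error" | "unknown";

function classify(errorMessage: string): ErrorKind {
  const err = errorMessage.toLowerCase();
  // Entitlement gate: checked before the generic 429 branch, because the
  // API wraps it in a rate_limit_error envelope.
  if (/extra usage is required|long context required/.test(err)) return "quota_exhausted";
  if (/quota|billing|exceeded.*limit|usage.*limit/.test(err)) return "quota_exhausted";
  if (/rate.?limit|too many requests|429/.test(err)) return "rate_limit";
  if (/500|502|503|504|server.?error|internal.?error|service.?unavailable/.test(err)) return "server_error";
  return "unknown";
}
```

With this ordering, a 429 carrying "Extra usage is required for long context requests" classifies as `quota_exhausted`, while a bare `429 Too Many Requests` still lands in `rate_limit`.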
/**
* Attempt to downgrade a long-context model (e.g. claude-opus-4-6[1m]) to its
* base model (claude-opus-4-6) when the account lacks the long-context billing
* entitlement. Returns true if the downgrade was initiated.
*/
private _tryLongContextDowngrade(message: AssistantMessage): boolean {
const currentModel = this._deps.getModel();
if (!currentModel) return false;
// Only attempt downgrade for [1m] (or similar long-context) model IDs
const match = currentModel.id.match(/^(.+)\[\d+m\]$/);
if (!match) return false;
const baseModelId = match[1];
const baseModel = this._deps.modelRegistry.find(currentModel.provider, baseModelId);
if (!baseModel) return false;
const previousId = currentModel.id;
this._deps.agent.setModel(baseModel);
this._deps.onModelChange(baseModel);
this._removeLastAssistantError();
this._deps.emit({
type: "fallback_provider_switch",
from: `${currentModel.provider}/${previousId}`,
to: `${baseModel.provider}/${baseModel.id}`,
reason: `long context downgrade: ${previousId} → ${baseModel.id}`,
});
this._deps.emit({
type: "auto_retry_start",
attempt: this._retryAttempt + 1,
maxAttempts: this._deps.settingsManager.getRetrySettings().maxRetries,
delayMs: 0,
errorMessage: `${message.errorMessage} (long context downgrade)`,
});
setTimeout(() => {
this._deps.agent.continue().catch(() => {});
}, 0);
return true;
}
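The `[1m]` suffix handling above can be exercised in isolation. `baseIdFor` is an illustrative helper, not part of the codebase; the regex is the one from the method:

```typescript
// Derive the base model ID from a long-context ID using the same regex
// as _tryLongContextDowngrade. Returns null when there is no [Nm] suffix.
function baseIdFor(modelId: string): string | null {
  const match = modelId.match(/^(.+)\[\d+m\]$/);
  return match ? match[1]! : null;
}
```

`claude-opus-4-6[1m]` maps to `claude-opus-4-6`; IDs without a `[Nm]` suffix return `null`, so the downgrade path is skipped for base models.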
/** Remove the last assistant error message from agent state */
private _removeLastAssistantError(): void {
const messages = this._deps.agent.state.messages;


@@ -123,12 +123,15 @@ export function createHashlineReadTool(cwd: string, options?: HashlineReadToolOp
const allLines = textContent.split("\n");
const totalFileLines = allLines.length;
const startLine = offset ? Math.max(0, offset - 1) : 0;
const startLineDisplay = startLine + 1;
let startLine = offset ? Math.max(0, offset - 1) : 0;
// Clamp offset to file bounds instead of throwing (#3007)
let offsetClamped = false;
if (startLine >= allLines.length) {
throw new Error(`Offset ${offset} is beyond end of file (${allLines.length} lines total)`);
startLine = Math.max(0, allLines.length - 1);
offsetClamped = true;
}
const startLineDisplay = startLine + 1;
let selectedContent: string;
let userLimitedLines: number | undefined;
@@ -172,6 +175,11 @@ export function createHashlineReadTool(cwd: string, options?: HashlineReadToolOp
outputText = formatHashLines(truncation.content, startLineDisplay);
}
// Prepend clamp notice so the agent knows offset was adjusted
if (offsetClamped) {
outputText = `[Offset ${offset} beyond end of file (${totalFileLines} lines). Clamped to line ${startLineDisplay}.]\n\n${outputText}`;
}
content = [{ type: "text", text: outputText }];
}


@@ -133,13 +133,18 @@ export function createReadTool(cwd: string, options?: ReadToolOptions): AgentToo
const totalFileLines = allLines.length;
// Apply offset if specified (1-indexed to 0-indexed)
const startLine = offset ? Math.max(0, offset - 1) : 0;
const startLineDisplay = startLine + 1; // For display (1-indexed)
let startLine = offset ? Math.max(0, offset - 1) : 0;
// Check if offset is out of bounds
// Clamp offset to file bounds instead of throwing (#3007).
// When an agent requests offset:30 on a 13-line file, return
// the last line with a notice rather than an error that
// propagates as invalid JSON downstream.
let offsetClamped = false;
if (startLine >= allLines.length) {
throw new Error(`Offset ${offset} is beyond end of file (${allLines.length} lines total)`);
startLine = Math.max(0, allLines.length - 1);
offsetClamped = true;
}
const startLineDisplay = startLine + 1; // For display (1-indexed)
// If limit is specified by user, use it; otherwise we'll let truncateHead decide
let selectedContent: string;
@@ -187,6 +192,11 @@ export function createReadTool(cwd: string, options?: ReadToolOptions): AgentToo
outputText = truncation.content;
}
// Prepend clamp notice so the agent knows offset was adjusted
if (offsetClamped) {
outputText = `[Offset ${offset} beyond end of file (${totalFileLines} lines). Clamped to line ${startLineDisplay}.]\n\n${outputText}`;
}
content = [{ type: "text", text: outputText }];
}
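The clamp logic shared by both read tools reduces to a small pure function. A sketch under the same 1-indexed offset convention (`clampOffset` is an illustrative name, not part of the tool):

```typescript
// Pure-function sketch of the offset clamp (#3007).
function clampOffset(offset: number, totalLines: number): { startLine: number; clamped: boolean } {
  let startLine = Math.max(0, offset - 1); // 1-indexed offset → 0-indexed line
  let clamped = false;
  if (startLine >= totalLines) {
    // Point at the last line instead of throwing, and flag the adjustment
    // so a notice can be prepended to the tool output.
    startLine = Math.max(0, totalLines - 1);
    clamped = true;
  }
  return { startLine, clamped };
}
```

So `offset: 30` on a 13-line file yields `startLine: 12` with `clamped: true`, matching the notice both tools prepend.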


@@ -0,0 +1,92 @@
/**
 * spawn-shell-windows.test.ts — regression test for Windows spawn ENOENT/EINVAL.
*
* On Windows, npm/npx/tsc and other tools are installed as .cmd batch scripts.
* Node's `spawn()` without `shell: true` cannot execute .cmd files, resulting
* in ENOENT or EINVAL errors. Every spawn site that may invoke a user-installed
* binary (not `node` or a shell like `sh`/`bash`/`cmd`) must include
* `shell: process.platform === "win32"` so the call is resolved through cmd.exe
* on Windows while remaining a direct exec on POSIX.
*
* This test structurally scans all spawn sites and verifies the guard is present.
*
* Fixes: gsd-build/gsd-2#2854
*/
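A spawn site satisfying the guard this test enforces might look like the following sketch. It spawns Node itself so it runs anywhere; real call sites pass npm/npx/tsc/gsd, which are `.cmd` wrappers on Windows:

```typescript
import { spawnSync } from "node:child_process";

// Illustrative guarded spawn site. The guard routes through cmd.exe only on
// Windows (where .cmd wrappers need shell resolution) and remains a direct
// exec on POSIX.
const result = spawnSync(process.execPath, ["--version"], {
  encoding: "utf-8",
  shell: process.platform === "win32",
});
console.log(result.stdout.trim());
```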
import test from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join, dirname, relative } from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = dirname(fileURLToPath(import.meta.url));
const coreDir = join(__dirname, "..");
/**
* Files that call `spawn()` with a user-facing binary (not `node`, `sh`, `bash`,
* or `cmd`) and therefore need the Windows shell guard.
*
* If a file spawns only hardcoded system binaries (like `node` in rpc-client.ts),
* it does not need the guard and should NOT appear here.
*/
const SPAWN_FILES_NEEDING_SHELL_GUARD = [
// Extension's GSD client — spawns the `gsd` binary which is a .cmd on Windows
join(coreDir, "..", "..", "..", "vscode-extension", "src", "gsd-client.ts"),
// exec.ts — used by extensions to run arbitrary commands
join(coreDir, "exec.ts"),
// LSP index — spawns project-type commands (tsc, cargo, etc.)
join(coreDir, "lsp", "index.ts"),
// LSP client — spawns LSP server binaries (npx, etc.)
join(coreDir, "lsp", "client.ts"),
// LSP mux — spawns lspmux binary
join(coreDir, "lsp", "lspmux.ts"),
// Package manager — spawns npm/yarn/pnpm
join(coreDir, "package-manager.ts"),
];
test("all spawn sites that invoke user-facing binaries include shell: process.platform === 'win32'", () => {
const failures: string[] = [];
for (const file of SPAWN_FILES_NEEDING_SHELL_GUARD) {
let content: string;
try {
content = readFileSync(file, "utf-8");
} catch {
// File may not exist in this checkout — skip
continue;
}
const lines = content.split("\n");
// Find all spawn(..., { ... }) call sites and check each one
// for the presence of `shell: process.platform === "win32"` within
// 8 lines after the spawn call.
for (let i = 0; i < lines.length; i++) {
const line = lines[i]!;
// Skip comments
if (line.trim().startsWith("//") || line.trim().startsWith("*")) continue;
// Detect a spawn() call
if (/\bspawn\(/.test(line)) {
// Look ahead up to 8 lines for the shell guard
const lookahead = lines.slice(i, i + 8).join("\n");
const hasShellGuard =
/shell:\s*process\.platform\s*===\s*["']win32["']/.test(lookahead);
if (!hasShellGuard) {
const relPath = relative(join(coreDir, "..", ".."), file);
failures.push(`${relPath}:${i + 1}`);
}
}
}
}
assert.deepEqual(
failures,
[],
`The following spawn sites are missing 'shell: process.platform === "win32"':\n` +
failures.map(f => ` - ${f}`).join("\n") +
`\nOn Windows, .cmd wrapper scripts (npm, npx, tsc, gsd) require shell ` +
`resolution. Without this guard, spawn fails with ENOENT or EINVAL.`,
);
});


@@ -68,6 +68,7 @@ export type {
Extension,
ExtensionActions,
ExtensionAPI,
ExtensionManifest,
ExtensionCommandContext,
ExtensionCommandContextActions,
ExtensionContext,
@@ -119,6 +120,8 @@ export type {
ToolCallEvent,
ToolDefinition,
ToolInfo,
SortResult,
SortWarning,
ToolRenderResultOptions,
ToolResultEvent,
TurnEndEvent,
@@ -137,6 +140,9 @@ export {
importExtensionModule,
isToolCallEventType,
isToolResultEventType,
readManifest,
readManifestFromEntryPath,
sortExtensionPaths,
wrapRegisteredTool,
wrapRegisteredTools,
wrapToolsWithExtensions,


@@ -337,5 +337,12 @@ export async function handleAgentEvent(host: InteractiveModeStateHost & {
host.showError(event.reason);
host.ui.requestRender();
break;
case "image_overflow_recovery":
host.showStatus(
`Removed ${event.strippedCount} older image(s) to comply with API limits. Retrying...`,
);
host.ui.requestRender();
break;
}
}


@@ -49,6 +49,12 @@ export class RemoteTerminal implements Terminal {
return this._rows;
}
get isTTY(): boolean {
// RemoteTerminal renders to a browser-based terminal emulator via
// the RPC bridge — it behaves like a real TTY for rendering purposes.
return true;
}
get kittyProtocolActive(): boolean {
return false;
}


@@ -9,6 +9,9 @@ const cjsRequire = createRequire(import.meta.url);
* Minimal terminal interface for TUI
*/
export interface Terminal {
// Whether stdout is a real TTY (false for pipes, e.g. RPC bridge processes)
readonly isTTY: boolean;
// Start the terminal with input and resize handlers
start(onInput: (data: string) => void, onResize: () => void): void;
@@ -63,11 +66,22 @@ export class ProcessTerminal implements Terminal {
private stdinDataHandler?: (data: string) => void;
private writeLogPath = process.env.PI_TUI_WRITE_LOG || "";
get isTTY(): boolean {
return !!process.stdout.isTTY;
}
get kittyProtocolActive(): boolean {
return this._kittyProtocolActive;
}
start(onInput: (data: string) => void, onResize: () => void): void {
// Non-TTY stdout (pipe) — skip TUI initialization entirely.
// RPC bridge processes communicate via JSON, not terminal escape codes.
// Without this guard, the render loop burns 500%+ CPU. (issue #3095)
if (!this.isTTY) {
return;
}
this.inputHandler = onInput;
this.resizeHandler = onResize;


@@ -399,6 +399,12 @@ export class TUI extends Container {
start(): void {
this.stopped = false;
// Non-TTY stdout (pipe) — skip TUI entirely to avoid burning CPU.
// RPC bridge processes have piped stdio; rendering ANSI escape codes
// to a pipe is pure waste and causes a runaway render loop. (issue #3095)
if (!this.terminal.isTTY) {
return;
}
this.terminal.start(
(data) => this.handleInput(data),
() => this.requestRender(),
@@ -458,6 +464,8 @@ export class TUI extends Container {
}
requestRender(force = false): void {
// Skip rendering on non-TTY stdout to prevent CPU burn (issue #3095)
if (!this.terminal.isTTY) return;
if (force) {
this.previousLines = [];
this.previousWidth = -1; // -1 triggers widthChanged, forcing a full clear


@@ -37,6 +37,48 @@ function newestSrcMtime(dir) {
return newest
}
/**
* Detects workspace packages whose dist/ is missing or stale.
*
* Missing dist/index.js is always reported (the package won't work at all).
*
* Staleness (src/ newer than dist/) is ONLY checked when a .git directory
* exists at root indicating a development clone. In npm tarball installs,
* file timestamps are unreliable (npm sets all files to a canonical date,
* but extraction ordering can cause src/ to appear 1-2 seconds newer than
* dist/). Attempting to rebuild in that scenario is dangerous: devDependencies
* (including TypeScript) are not installed, and any globally-installed tsc
* may produce broken output that overwrites the known-good dist/.
*
* @param {string} root Project root directory
* @param {string[]} packages Package directory names to check
* @returns {string[]} Package names that need rebuilding
*/
function detectStalePackages(root, packages) {
const packagesDir = join(root, 'packages')
const isDevClone = existsSync(join(root, '.git'))
const stale = []
for (const pkg of packages) {
const distIndex = join(packagesDir, pkg, 'dist', 'index.js')
if (!existsSync(distIndex)) {
stale.push(pkg)
continue
}
// Only check src vs dist timestamps in development clones.
// In npm tarball installs, timestamps are unreliable and rebuilding
// without devDependencies can corrupt the pre-built dist/ (#2877).
if (isDevClone) {
const distMtime = statSync(distIndex).mtimeMs
const srcMtime = newestSrcMtime(join(packagesDir, pkg, 'src'))
if (srcMtime > distMtime) {
stale.push(pkg)
}
}
}
return stale
}
if (require.main === module) {
const root = resolve(__dirname, '..')
const packagesDir = join(root, 'packages')
@@ -57,19 +99,7 @@ if (require.main === module) {
'pi-coding-agent',
]
const stale = []
for (const pkg of WORKSPACE_PACKAGES) {
const distIndex = join(packagesDir, pkg, 'dist', 'index.js')
if (!existsSync(distIndex)) {
stale.push(pkg)
continue
}
const distMtime = statSync(distIndex).mtimeMs
const srcMtime = newestSrcMtime(join(packagesDir, pkg, 'src'))
if (srcMtime > distMtime) {
stale.push(pkg)
}
}
const stale = detectStalePackages(root, WORKSPACE_PACKAGES)
if (stale.length === 0) process.exit(0)
@@ -78,6 +108,7 @@ if (require.main === module) {
for (const pkg of stale) {
const pkgDir = join(packagesDir, pkg)
try {
// execSync is safe here: the command is a hardcoded string, not user input
execSync('npm run build', { cwd: pkgDir, stdio: 'pipe' })
process.stderr.write(`${pkg}\n`)
} catch (err) {
@@ -87,4 +118,4 @@ if (require.main === module) {
}
}
module.exports = { newestSrcMtime }
module.exports = { newestSrcMtime, detectStalePackages }


@@ -16,7 +16,8 @@ import { agentDir, sessionsDir, authFilePath } from './app-paths.js'
import { initResources, buildResourceLoader, getNewerManagedResourceVersion } from './resource-loader.js'
import { ensureManagedTools } from './tool-bootstrap.js'
import { loadStoredEnvKeys } from './wizard.js'
import { getPiDefaultModelAndProvider, migratePiCredentials } from './pi-migration.js'
import { migratePiCredentials } from './pi-migration.js'
import { validateConfiguredModel } from './startup-model-validation.js'
import { shouldRunOnboarding, runOnboarding } from './onboarding.js'
import chalk from 'chalk'
import { checkForUpdates } from './update-check.js'
@@ -170,6 +171,7 @@ const hasSubcommand = cliFlags.messages.length > 0
if (!process.stdin.isTTY && !isPrintMode && !hasSubcommand && !cliFlags.listModels && !cliFlags.web) {
process.stderr.write('[gsd] Error: Interactive mode requires a terminal (TTY).\n')
process.stderr.write('[gsd] Non-interactive alternatives:\n')
process.stderr.write('[gsd] gsd auto Auto-mode (pipeable, no TUI)\n')
process.stderr.write('[gsd] gsd --print "your message" Single-shot prompt\n')
process.stderr.write('[gsd] gsd --mode rpc JSON-RPC over stdin/stdout\n')
process.stderr.write('[gsd] gsd --mode mcp MCP server over stdin/stdout\n')
@@ -300,6 +302,23 @@ if (cliFlags.messages[0] === 'headless') {
process.exit(0)
}
// `gsd auto [args...]` — shorthand for `gsd headless auto [args...]` (#2732)
// Without this, `gsd auto` falls through to the interactive TUI which hangs
// when stdin/stdout are piped (non-TTY environments).
if (cliFlags.messages[0] === 'auto') {
await ensureRtkBootstrap()
const { runHeadless, parseHeadlessArgs } = await import('./headless.js')
// Rewrite argv so parseHeadlessArgs sees: [node, gsd, headless, auto, ...rest]
const rewrittenArgv = [
process.argv[0],
process.argv[1],
'headless',
...cliFlags.messages, // ['auto', ...extra args]
]
await runHeadless(parseHeadlessArgs(rewrittenArgv))
process.exit(0)
}
// Pi's tool bootstrap can mis-detect already-installed fd/rg on some systems
// because spawnSync(..., ["--version"]) returns EPERM despite a zero exit code.
// Provision local managed binaries first so Pi sees them without probing PATH.
@@ -391,42 +410,6 @@ if (cliFlags.listModels !== undefined) {
process.exit(0)
}
// Validate configured model on startup — catches stale settings from prior installs
// (e.g. grok-2 which no longer exists) and fresh installs with no settings.
// Only resets the default when the configured model no longer exists in the registry;
// never overwrites a valid user choice.
const configuredProvider = settingsManager.getDefaultProvider()
const configuredModel = settingsManager.getDefaultModel()
const allModels = modelRegistry.getAll()
const availableModels = modelRegistry.getAvailable()
const configuredExists = configuredProvider && configuredModel &&
allModels.some((m) => m.provider === configuredProvider && m.id === configuredModel)
const configuredAvailable = configuredProvider && configuredModel &&
availableModels.some((m) => m.provider === configuredProvider && m.id === configuredModel)
if (!configuredModel || !configuredExists) {
// Model not configured at all, or removed from registry — pick a fallback.
// Only fires when the model is genuinely unknown (not just temporarily unavailable).
const piDefault = getPiDefaultModelAndProvider()
const preferred =
(piDefault
? availableModels.find((m) => m.provider === piDefault.provider && m.id === piDefault.model)
: undefined) ||
availableModels.find((m) => m.provider === 'openai' && m.id === 'gpt-5.4') ||
availableModels.find((m) => m.provider === 'openai') ||
availableModels.find((m) => m.provider === 'anthropic' && m.id === 'claude-opus-4-6') ||
availableModels.find((m) => m.provider === 'anthropic' && m.id.includes('opus')) ||
availableModels.find((m) => m.provider === 'anthropic') ||
availableModels[0]
if (preferred) {
settingsManager.setDefaultModelAndProvider(preferred.provider, preferred.id)
}
}
if (settingsManager.getDefaultThinkingLevel() !== 'off' && !configuredExists) {
settingsManager.setDefaultThinkingLevel('off')
}
// GSD always uses quiet startup — the gsd extension renders its own branded header
if (!settingsManager.getQuietStartup()) {
settingsManager.setQuietStartup(true)
@@ -477,6 +460,11 @@ if (isPrintMode) {
})
markStartup('createAgentSession')
// Validate configured model AFTER extensions have registered their models (#2626).
// Before this, extension-provided models (e.g. claude-code/*) were not yet in the
// registry, causing the user's valid choice to be silently overwritten.
validateConfiguredModel(modelRegistry, settingsManager)
if (extensionsResult.errors.length > 0) {
for (const err of extensionsResult.errors) {
// Downgrade conflicts with built-in tools to warnings (#1347)
@@ -565,6 +553,20 @@ if (!cliFlags.worktree && !isPrintMode) {
} catch { /* non-fatal */ }
}
// ---------------------------------------------------------------------------
// Auto-redirect: `gsd auto` with piped stdout → headless mode (#2732)
// When stdout is not a TTY (e.g. `gsd auto | cat`, `gsd auto > file`),
// the TUI cannot render and the process hangs. Redirect to headless mode
// which handles non-interactive output gracefully.
// ---------------------------------------------------------------------------
if (cliFlags.messages[0] === 'auto' && !process.stdout.isTTY) {
await ensureRtkBootstrap()
const { runHeadless, parseHeadlessArgs } = await import('./headless.js')
process.stderr.write('[gsd] stdout is not a terminal — running auto-mode in headless mode.\n')
await runHeadless(parseHeadlessArgs(['node', 'gsd', 'headless', ...cliFlags.messages.slice(1)]))
process.exit(0)
}
// ---------------------------------------------------------------------------
// Interactive mode — normal TTY session
// ---------------------------------------------------------------------------
@@ -611,6 +613,11 @@ const { session, extensionsResult } = await createAgentSession({
})
markStartup('createAgentSession')
// Validate configured model AFTER extensions have registered their models (#2626).
// Before this, extension-provided models (e.g. claude-code/*) were not yet in the
// registry, causing the user's valid choice to be silently overwritten.
validateConfiguredModel(modelRegistry, settingsManager)
if (extensionsResult.errors.length > 0) {
for (const err of extensionsResult.errors) {
const isSuperseded = err.error.includes("supersedes");
@@ -662,14 +669,21 @@ if (enabledModelPatterns && enabledModelPatterns.length > 0) {
}
}
if (!process.stdin.isTTY) {
process.stderr.write('[gsd] Error: Interactive mode requires a terminal (TTY).\n')
if (!process.stdin.isTTY || !process.stdout.isTTY) {
const missing = !process.stdin.isTTY && !process.stdout.isTTY
? 'stdin and stdout are'
: !process.stdin.isTTY
? 'stdin is'
: 'stdout is'
process.stderr.write(`[gsd] Error: Interactive mode requires a terminal (TTY) but ${missing} not a TTY.\n`)
process.stderr.write('[gsd] Non-interactive alternatives:\n')
process.stderr.write('[gsd] gsd auto Auto-mode (pipeable, no TUI)\n')
process.stderr.write('[gsd] gsd --print "your message" Single-shot prompt\n')
process.stderr.write('[gsd] gsd --web [path] Browser-only web mode\n')
process.stderr.write('[gsd] gsd --mode rpc JSON-RPC over stdin/stdout\n')
process.stderr.write('[gsd] gsd --mode mcp MCP server over stdin/stdout\n')
process.stderr.write('[gsd] gsd --mode text "message" Text output mode\n')
process.stderr.write('[gsd] gsd headless Auto-mode without TUI\n')
process.exit(1)
}


@@ -169,6 +169,7 @@ export function printHelp(version: string): void {
process.stdout.write(' update Update GSD to the latest version\n')
process.stdout.write(' sessions List and resume a past session\n')
process.stdout.write(' worktree <cmd> Manage worktrees (list, merge, clean, remove)\n')
process.stdout.write(' auto [args] Run auto-mode without TUI (pipeable)\n')
process.stdout.write(' headless [cmd] [args] Run /gsd commands without TUI (default: auto)\n')
process.stdout.write('\nRun gsd <subcommand> --help for subcommand-specific help.\n')
}


@@ -74,6 +74,7 @@ const LLM_PROVIDER_IDS = [
'xai',
'openrouter',
'mistral',
'ollama',
'ollama-cloud',
'custom-openai',
]
@@ -90,6 +91,7 @@ const OTHER_PROVIDERS = [
{ value: 'xai', label: 'xAI (Grok)' },
{ value: 'openrouter', label: 'OpenRouter' },
{ value: 'mistral', label: 'Mistral' },
{ value: 'ollama', label: 'Ollama (Local)' },
{ value: 'ollama-cloud', label: 'Ollama Cloud' },
{ value: 'custom-openai', label: 'Custom (OpenAI-compatible)' },
]
@@ -335,6 +337,9 @@ async function runLlmStep(p: ClackModule, pc: PicoModule, authStorage: AuthStora
if (provider === 'custom-openai') {
return await runCustomOpenAIFlow(p, pc, authStorage)
}
if (provider === 'ollama') {
return await runOllamaLocalFlow(p, pc, authStorage)
}
const label = provider === 'anthropic' ? 'Anthropic'
: provider === 'openai' ? 'OpenAI'
: OTHER_PROVIDERS.find(op => op.value === provider)?.label ?? String(provider)
@@ -444,6 +449,54 @@ async function runApiKeyFlow(
return true
}
// ─── Ollama Local Flow ───────────────────────────────────────────────────────
async function runOllamaLocalFlow(
p: ClackModule,
pc: PicoModule,
authStorage: AuthStorage,
): Promise<boolean> {
const host = process.env.OLLAMA_HOST || 'http://localhost:11434'
const s = p.spinner()
s.start(`Checking Ollama at ${host}...`)
try {
const controller = new AbortController()
const timeout = setTimeout(() => controller.abort(), 3000)
const response = await fetch(host, { signal: controller.signal })
clearTimeout(timeout)
if (response.ok) {
s.stop(`Ollama is running at ${pc.green(host)}`)
// Store a placeholder so the provider is recognized as authenticated
authStorage.set('ollama', { type: 'api_key', key: 'ollama' })
p.log.success(`${pc.green('Ollama (Local)')} configured — no API key needed`)
p.log.info(pc.dim('Models are discovered automatically from your local Ollama instance.'))
return true
} else {
s.stop('Ollama check failed')
p.log.warn(`Ollama responded with status ${response.status} at ${host}`)
}
} catch {
s.stop('Ollama not detected')
p.log.warn(`Could not reach Ollama at ${host}`)
p.log.info(pc.dim('Install Ollama from https://ollama.com and run "ollama serve"'))
p.log.info(pc.dim('Set OLLAMA_HOST if using a non-default address.'))
}
// Even if not reachable now, save the config — the extension will detect it at runtime
const proceed = await p.confirm({
message: 'Save Ollama as your provider anyway? (it will auto-detect when running)',
})
if (p.isCancel(proceed) || !proceed) return false
authStorage.set('ollama', { type: 'api_key', key: 'ollama' })
p.log.success(`${pc.green('Ollama (Local)')} saved — models will appear when Ollama is running`)
return true
}
// ─── Custom OpenAI-compatible Flow ────────────────────────────────────────────
async function runCustomOpenAIFlow(

View file

@@ -1,4 +1,4 @@
import { DefaultResourceLoader } from '@gsd/pi-coding-agent'
import { DefaultResourceLoader, sortExtensionPaths } from '@gsd/pi-coding-agent'
import { createHash } from 'node:crypto'
import { homedir } from 'node:os'
import { chmodSync, copyFileSync, cpSync, existsSync, lstatSync, mkdirSync, openSync, closeSync, readFileSync, readlinkSync, readdirSync, rmSync, statSync, symlinkSync, unlinkSync, writeFileSync } from 'node:fs'
@@ -603,5 +603,21 @@ export function buildResourceLoader(agentDir: string): DefaultResourceLoader {
agentDir,
additionalExtensionPaths: piExtensionPaths,
bundledExtensionNames: bundledKeys,
extensionPathsTransform: (paths: string[]) => {
// 1. Filter community extensions through the GSD registry
const filteredPaths = paths.filter((entryPath) => {
const manifest = readManifestFromEntryPath(entryPath)
if (!manifest) return true // no manifest = always load
return isExtensionEnabled(registry, manifest.id)
})
// 2. Sort in topological dependency order
const { sortedPaths, warnings } = sortExtensionPaths(filteredPaths)
return {
paths: sortedPaths,
diagnostics: warnings.map((w) => w.message),
}
},
} as ConstructorParameters<typeof DefaultResourceLoader>[0])
}
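The real `sortExtensionPaths` lives in `@gsd/pi-coding-agent` and is not shown in this diff; a minimal sketch of the Kahn-style BFS it performs, assuming each entry path resolves to a manifest with an `id` and an optional `dependencies.extensions` list (the `ManifestLite` shape and `sortByDependencies` name below are illustrative, not the real GSD types):

```typescript
interface ManifestLite {
  id: string;
  dependencies?: { extensions?: string[] };
}

// Kahn's BFS: emit a load order where every extension's dependencies come
// first. Unknown dependency ids contribute no edge; members of a cycle are
// appended at the end with a warning so loading still proceeds.
function sortByDependencies(
  manifests: ManifestLite[],
): { sorted: ManifestLite[]; warnings: string[] } {
  const byId = new Map(manifests.map((m) => [m.id, m]));
  const indegree = new Map(manifests.map((m) => [m.id, 0]));
  const dependents = new Map<string, string[]>();

  for (const m of manifests) {
    for (const dep of m.dependencies?.extensions ?? []) {
      if (!byId.has(dep)) continue; // unknown dep: not an ordering edge
      indegree.set(m.id, (indegree.get(m.id) ?? 0) + 1);
      dependents.set(dep, [...(dependents.get(dep) ?? []), m.id]);
    }
  }

  const queue = manifests.filter((m) => indegree.get(m.id) === 0).map((m) => m.id);
  const sorted: ManifestLite[] = [];
  while (queue.length > 0) {
    const id = queue.shift()!;
    sorted.push(byId.get(id)!);
    for (const next of dependents.get(id) ?? []) {
      indegree.set(next, indegree.get(next)! - 1);
      if (indegree.get(next) === 0) queue.push(next);
    }
  }

  // Anything not emitted sits on a cycle; keep it loadable but warn.
  const warnings: string[] = [];
  for (const m of manifests) {
    if (!sorted.includes(m)) {
      warnings.push(`dependency cycle involving "${m.id}"`);
      sorted.push(m);
    }
  }
  return { sorted, warnings };
}
```

Because filtering happens before the sort in `extensionPathsTransform`, a disabled extension also drops its outgoing edges, so dependents of a disabled extension still load (just without the ordering guarantee).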

View file

@@ -1,7 +1,7 @@
---
name: researcher
description: Web researcher that finds and synthesizes current information using Brave Search
tools: web_search, bash
tools: search-the-web, bash
---
You are a web researcher. You find current, accurate information using web search and synthesize it into a clear, well-structured report.

View file

@@ -162,9 +162,27 @@ export default function AskUserQuestions(pi: ExtensionAPI) {
if (selected === undefined) {
return errorResult("ask_user_questions was cancelled", params.questions);
}
answers[q.id] = {
answers: Array.isArray(selected) ? selected : [selected],
};
// When the user picks "None of the above" on a single-select
// question, prompt for a free-text explanation so they are not
// trapped in a re-asking loop (bug #2715).
let freeTextNote = "";
const selectedStr = Array.isArray(selected) ? selected[0] : selected;
if (!q.allowMultiple && selectedStr === OTHER_OPTION_LABEL) {
const note = await ctx.ui.input(
`${q.header}: Please explain in your own words`,
"Type your answer here…",
);
if (note) {
freeTextNote = note;
}
}
const answerList = Array.isArray(selected) ? selected : [selected];
if (freeTextNote) {
answerList.push(`user_note: ${freeTextNote}`);
}
answers[q.id] = { answers: answerList };
}
const roundResult: RoundResult = {
endInterview: false,

View file

@@ -8,6 +8,6 @@
"provides": {
"tools": ["async_bash", "await_job", "cancel_job"],
"commands": ["jobs"],
"hooks": ["session_start"]
"hooks": ["session_start", "session_before_switch", "session_shutdown"]
}
}

View file

@@ -8,7 +8,7 @@
"provides": {
"tools": ["bg_shell"],
"commands": ["bg"],
"hooks": ["session_shutdown"],
"hooks": ["session_shutdown", "session_compact", "session_tree", "session_switch", "before_agent_start", "session_start", "turn_end", "agent_end", "tool_execution_end"],
"shortcuts": ["Ctrl+Alt+B"]
}
}

View file

@@ -29,7 +29,7 @@
"browser_visual_diff", "browser_zoom_region",
"browser_generate_test", "browser_action_cache", "browser_check_injection"
],
"hooks": ["session_shutdown"]
"hooks": ["session_start", "session_shutdown"]
},
"dependencies": {
"runtime": ["playwright"]

View file

@@ -16,6 +16,7 @@ import type {
Usage,
WebSearchResultContent,
} from "@gsd/pi-ai";
import { repairToolJson } from "@gsd/pi-ai";
import type { BetaContentBlock, BetaRawMessageStreamEvent, NonNullableUsage } from "./sdk-types.js";
// ---------------------------------------------------------------------------
@@ -244,12 +245,18 @@ export class PartialMessageBuilder {
try {
block.arguments = JSON.parse(jsonStr);
} catch {
// Stream was truncated mid-tool-call — JSON is garbage.
// Preserve the raw string for diagnostics but signal the
// malformation explicitly so downstream consumers can
// distinguish this from a healthy tool completion (#2574).
block.arguments = { _raw: jsonStr };
return { type: "toolcall_end", contentIndex, toolCall: block, partial: this.partial, malformedArguments: true };
// JSON.parse failed — attempt repair for YAML-style bullet
// lists that LLMs copy from template formatting (#2660).
try {
block.arguments = JSON.parse(repairToolJson(jsonStr));
} catch {
// Repair also failed — stream was truncated or garbage.
// Preserve the raw string for diagnostics but signal the
// malformation explicitly so downstream consumers can
// distinguish this from a healthy tool completion (#2574).
block.arguments = { _raw: jsonStr };
return { type: "toolcall_end", contentIndex, toolCall: block, partial: this.partial, malformedArguments: true };
}
}
return { type: "toolcall_end", contentIndex, toolCall: block, partial: this.partial };
}
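`repairToolJson` is imported from `@gsd/pi-ai` and its real implementation is not part of this diff; a rough sketch of the bullet-list repair idea it targets (#2660), rewriting bare `- item` values into JSON string arrays, could look like this (`repairYamlBullets` is a hypothetical name, and the real function may handle more cases):

```typescript
// Hypothetical sketch only: rewrites `"key": - item` runs (YAML-style
// bullets that LLMs sometimes emit inside tool-call JSON) into
// `"key": ["item"]` so a second JSON.parse attempt can succeed.
function repairYamlBullets(json: string): string {
  return json.replace(
    /:\s*((?:-\s[^,}\n]*)(?:\s*\n\s*-\s[^,}\n]*)*)/g,
    (_match: string, run: string) => {
      const items = run
        .split(/\n?\s*-\s+/)      // split the bullet run into items
        .map((s) => s.trim())
        .filter(Boolean)
        .map((s) => JSON.stringify(s));
      return `: [${items.join(", ")}]`;
    },
  );
}
```

The pattern only fires when a value starts with `- `, so quoted values like `"title": "done"` pass through untouched; it does not guard against `: - ` sequences occurring inside quoted strings, which is one reason the real repair stays behind a second try/catch.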

View file

@@ -23,9 +23,6 @@ import type {
SDKMessage,
SDKPartialAssistantMessage,
SDKResultMessage,
SDKSystemMessage,
SDKStatusMessage,
SDKUserMessage,
} from "./sdk-types.js";
// ---------------------------------------------------------------------------
@@ -71,30 +68,49 @@ function getClaudePath(): string {
}
// ---------------------------------------------------------------------------
// Prompt extraction
// Prompt construction
// ---------------------------------------------------------------------------
/**
* Extract the last user prompt text from GSD's context messages.
* The SDK manages its own conversation history; we only send
* the latest user message as the prompt.
* Extract text content from a single message regardless of content shape.
*/
function extractLastUserPrompt(context: Context): string {
for (let i = context.messages.length - 1; i >= 0; i--) {
const msg = context.messages[i];
if (msg.role === "user") {
if (typeof msg.content === "string") return msg.content;
if (Array.isArray(msg.content)) {
const textParts = msg.content
.filter((part: any) => part.type === "text")
.map((part: any) => part.text);
if (textParts.length > 0) return textParts.join("\n");
}
}
function extractMessageText(msg: { role: string; content: unknown }): string {
if (typeof msg.content === "string") return msg.content;
if (Array.isArray(msg.content)) {
const textParts = msg.content
.filter((part: any) => part.type === "text")
.map((part: any) => part.text ?? part.thinking ?? "");
if (textParts.length > 0) return textParts.join("\n");
}
return "";
}
/**
* Build a full conversational prompt from GSD's context messages.
*
* Previous behaviour sent only the last user message, making every SDK
* call effectively stateless. This version serialises the complete
* conversation history (system prompt + all user/assistant turns) so
* Claude Code has full context for multi-turn continuity.
*/
export function buildPromptFromContext(context: Context): string {
const parts: string[] = [];
if (context.systemPrompt) {
parts.push(`[System]\n${context.systemPrompt}`);
}
for (const msg of context.messages) {
const text = extractMessageText(msg);
if (!text) continue;
const label = msg.role === "user" ? "User" : msg.role === "assistant" ? "Assistant" : "System";
parts.push(`[${label}]\n${text}`);
}
return parts.join("\n\n");
}
// ---------------------------------------------------------------------------
// Error helper
// ---------------------------------------------------------------------------
@@ -127,6 +143,31 @@ export function makeStreamExhaustedErrorMessage(model: string, lastTextContent:
return message;
}
// ---------------------------------------------------------------------------
// SDK options builder
// ---------------------------------------------------------------------------
/**
* Build the options object passed to the Claude Agent SDK's `query()` call.
*
* Extracted for testability: callers can verify session persistence,
* beta flags, and other configuration without mocking the full SDK.
*/
export function buildSdkOptions(modelId: string, prompt: string): Record<string, unknown> {
return {
pathToClaudeCodeExecutable: getClaudePath(),
model: modelId,
includePartialMessages: true,
persistSession: true,
cwd: process.cwd(),
permissionMode: "bypassPermissions",
allowDangerouslySkipPermissions: true,
settingSources: ["project"],
systemPrompt: { type: "preset", preset: "claude_code" },
betas: modelId.includes("sonnet") ? ["context-1m-2025-08-07"] : [],
};
}
// ---------------------------------------------------------------------------
// streamSimple implementation
// ---------------------------------------------------------------------------
@@ -180,22 +221,14 @@ async function pumpSdkMessages(
options.signal.addEventListener("abort", () => controller.abort(), { once: true });
}
const prompt = extractLastUserPrompt(context);
const prompt = buildPromptFromContext(context);
const sdkOpts = buildSdkOptions(modelId, prompt);
const queryResult = sdk.query({
prompt,
options: {
pathToClaudeCodeExecutable: getClaudePath(),
model: modelId,
includePartialMessages: true,
persistSession: false,
...sdkOpts,
abortController: controller,
cwd: process.cwd(),
permissionMode: "bypassPermissions",
allowDangerouslySkipPermissions: true,
settingSources: ["project"],
systemPrompt: { type: "preset", preset: "claude_code" },
betas: modelId.includes("sonnet") ? ["context-1m-2025-08-07"] : [],
},
});
@@ -225,7 +258,6 @@ async function pumpSdkMessages(
// -- Streaming partial messages --
case "stream_event": {
const partial = msg as SDKPartialAssistantMessage;
if (partial.parent_tool_use_id !== null) break; // skip subagent
const event = partial.event;
@@ -256,7 +288,6 @@ async function pumpSdkMessages(
// -- Complete assistant message (non-streaming fallback) --
case "assistant": {
const sdkAssistant = msg as SDKAssistantMessage;
if (sdkAssistant.parent_tool_use_id !== null) break;
// Capture text content from complete messages
for (const block of sdkAssistant.message.content) {
@@ -271,9 +302,6 @@ async function pumpSdkMessages(
// -- User message (synthetic tool result — signals turn boundary) --
case "user": {
const userMsg = msg as SDKUserMessage;
if (userMsg.parent_tool_use_id !== null) break;
// Capture content from the completed turn before resetting
if (builder) {
for (const block of builder.message.content) {

View file

@@ -102,4 +102,32 @@ describe("PartialMessageBuilder — malformed tool arguments (#2574)", () => {
"non-JSON content should set malformedArguments: true",
);
});
test("YAML bullet lists repaired to JSON arrays (#2660)", () => {
const builder = new PartialMessageBuilder("claude-sonnet-4-20250514");
const malformedJson =
'{"milestoneId": "M005", "keyDecisions": - Used Web Notification API, "keyFiles": - src/lib.rs, "title": "done"}';
const event = feedToolCall(builder, [malformedJson]);
assert.ok(event, "event should not be null");
assert.equal(event!.type, "toolcall_end");
// Repaired YAML bullets should NOT set malformedArguments
assert.equal(
(event as any).malformedArguments,
undefined,
"repaired YAML bullets should not set malformedArguments",
);
if (event!.type === "toolcall_end") {
assert.equal(event!.toolCall.arguments.milestoneId, "M005");
assert.ok(
Array.isArray(event!.toolCall.arguments.keyDecisions),
"keyDecisions should be repaired to an array",
);
assert.ok(
Array.isArray(event!.toolCall.arguments.keyFiles),
"keyFiles should be repaired to an array",
);
assert.equal(event!.toolCall.arguments.title, "done");
}
});
});

View file

@@ -1,6 +1,15 @@
import { describe, test } from "node:test";
import assert from "node:assert/strict";
import { makeStreamExhaustedErrorMessage } from "../stream-adapter.ts";
import {
makeStreamExhaustedErrorMessage,
buildPromptFromContext,
buildSdkOptions,
} from "../stream-adapter.ts";
import type { Context, Message } from "@gsd/pi-ai";
// ---------------------------------------------------------------------------
// Existing tests — exhausted stream fallback (#2575)
// ---------------------------------------------------------------------------
describe("stream-adapter — exhausted stream fallback (#2575)", () => {
test("generator exhaustion becomes an error message instead of clean completion", () => {
@@ -19,3 +28,101 @@ describe("stream-adapter — exhausted stream fallback (#2575)", () => {
assert.match(String((message.content[0] as any)?.text ?? ""), /Claude Code error: stream_exhausted_without_result/);
});
});
// ---------------------------------------------------------------------------
// Bug #2859 — stateless provider regression tests
// ---------------------------------------------------------------------------
describe("stream-adapter — full context prompt (#2859)", () => {
test("buildPromptFromContext includes all user and assistant messages, not just the last user message", () => {
const context: Context = {
systemPrompt: "You are a helpful assistant.",
messages: [
{ role: "user", content: "What is 2+2?" } as Message,
{
role: "assistant",
content: [{ type: "text", text: "4" }],
api: "anthropic-messages",
provider: "claude-code",
model: "claude-sonnet-4-20250514",
usage: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, totalTokens: 0, cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 } },
stopReason: "stop",
timestamp: Date.now(),
} as Message,
{ role: "user", content: "Now multiply that by 3" } as Message,
],
};
const prompt = buildPromptFromContext(context);
// Must contain content from BOTH user messages, not just the last
assert.ok(prompt.includes("2+2"), "prompt must include first user message");
assert.ok(prompt.includes("multiply"), "prompt must include second user message");
// Must contain assistant response for continuity
assert.ok(prompt.includes("4"), "prompt must include assistant reply for context");
});
test("buildPromptFromContext includes system prompt when present", () => {
const context: Context = {
systemPrompt: "You are a coding assistant.",
messages: [
{ role: "user", content: "Write a function" } as Message,
],
};
const prompt = buildPromptFromContext(context);
assert.ok(prompt.includes("coding assistant"), "prompt must include system prompt");
});
test("buildPromptFromContext handles array content parts in user messages", () => {
const context: Context = {
messages: [
{
role: "user",
content: [
{ type: "text", text: "First part" },
{ type: "text", text: "Second part" },
],
} as Message,
{ role: "user", content: "Follow-up" } as Message,
],
};
const prompt = buildPromptFromContext(context);
assert.ok(prompt.includes("First part"), "prompt must include array content parts");
assert.ok(prompt.includes("Second part"), "prompt must include all text parts");
assert.ok(prompt.includes("Follow-up"), "prompt must include follow-up message");
});
test("buildPromptFromContext returns empty string for empty messages", () => {
const context: Context = { messages: [] };
const prompt = buildPromptFromContext(context);
assert.equal(prompt, "");
});
});
describe("stream-adapter — session persistence (#2859)", () => {
test("buildSdkOptions enables persistSession by default", () => {
const options = buildSdkOptions("claude-sonnet-4-20250514", "test prompt");
assert.equal(options.persistSession, true, "persistSession must default to true");
});
test("buildSdkOptions sets the model correctly", () => {
const options = buildSdkOptions("claude-sonnet-4-20250514", "hello world");
assert.equal(options.model, "claude-sonnet-4-20250514");
});
test("buildSdkOptions enables betas for sonnet models", () => {
const sonnetOpts = buildSdkOptions("claude-sonnet-4-20250514", "test");
assert.ok(
Array.isArray(sonnetOpts.betas) && sonnetOpts.betas.length > 0,
"sonnet models should have betas enabled",
);
const opusOpts = buildSdkOptions("claude-opus-4-20250514", "test");
assert.ok(
Array.isArray(opusOpts.betas) && opusOpts.betas.length === 0,
"non-sonnet models should have empty betas",
);
});
});

View file

@@ -7,6 +7,6 @@
"requires": { "platform": ">=2.29.0" },
"provides": {
"tools": ["resolve_library", "get_library_docs"],
"hooks": ["session_start"]
"hooks": ["session_start", "session_shutdown"]
}
}

View file

@@ -54,6 +54,7 @@ function hydrateProcessEnv(key: string, value: string): void {
}
async function writeEnvKey(filePath: string, key: string, value: string): Promise<void> {
if (typeof value !== "string") {
throw new TypeError(`writeEnvKey expects a string value for key "${key}", got ${typeof value}`);
}
let content = "";
try {
content = await readFile(filePath, "utf8");
@@ -419,7 +422,7 @@ export async function collectSecretsFromManifest(
for (const { key, value } of collected) {
const entry = manifest.entries.find((e) => e.key === key);
if (entry) {
entry.status = value !== null ? "collected" : "skipped";
entry.status = value != null ? "collected" : "skipped";
}
}
@@ -427,14 +430,14 @@
await writeFile(manifestPath, formatSecretsManifest(manifest), "utf8");
// (j) Apply collected values to destination
const provided = collected.filter((c) => c.value !== null) as Array<{ key: string; value: string }>;
const provided = collected.filter((c) => c.value != null) as Array<{ key: string; value: string }>;
const { applied } = await applySecrets(provided, destination, {
envFilePath: resolve(ctx.cwd, ".env"),
});
const skipped = [
...alreadySkipped,
...collected.filter((c) => c.value === null).map((c) => c.key),
...collected.filter((c) => c.value == null).map((c) => c.key),
];
return { applied, skipped, existingSkipped };
@@ -505,8 +508,8 @@ export default function secureEnv(pi: ExtensionAPI) {
collected.push({ key: item.key, value });
}
const provided = collected.filter((c) => c.value !== null) as Array<{ key: string; value: string }>;
const skipped = collected.filter((c) => c.value === null).map((c) => c.key);
const provided = collected.filter((c) => c.value != null) as Array<{ key: string; value: string }>;
const skipped = collected.filter((c) => c.value == null).map((c) => c.key);
// Apply to destination via shared helper
const { applied, errors } = await applySecrets(provided, destination, {
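The `!==` to `!=` changes above lean on JS loose equality: `value != null` is true exactly when `value` is neither `null` nor `undefined`, whereas `value !== null` lets `undefined` slip through into the "provided" set. A quick self-contained check (the `collected` data is illustrative):

```typescript
// `!= null` filters out both nullish values; `!== null` keeps undefined.
const collected: Array<{ key: string; value: string | null | undefined }> = [
  { key: "API_KEY", value: "abc123" },
  { key: "SKIPPED", value: null },
  { key: "UNANSWERED", value: undefined },
];

const strictProvided = collected.filter((c) => c.value !== null);
const looseProvided = collected.filter((c) => c.value != null);

console.log(strictProvided.length); // 2 (undefined wrongly survives)
console.log(looseProvided.length);  // 1 (only the real value)
```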

View file

@@ -7,6 +7,6 @@
"requires": { "platform": ">=2.29.0" },
"provides": {
"tools": ["google_search"],
"hooks": ["session_start"]
"hooks": ["session_start", "session_shutdown"]
}
}

View file

@@ -79,7 +79,7 @@ async function searchWithOAuth(
signal?: AbortSignal,
): Promise<SearchResult> {
const model = process.env.GEMINI_SEARCH_MODEL || "gemini-2.5-flash";
const url = `https://cloudcode-pa.googleapis.com/v1internal:streamGenerateContent`;
const url = `https://cloudcode-pa.googleapis.com/v1internal:streamGenerateContent?alt=sse`;
const GEMINI_CLI_HEADERS = {
ideType: "IDE_UNSPECIFIED",
@@ -104,6 +104,7 @@ async function searchWithOAuth(
contents: [{ parts: [{ text: query }] }],
tools: [{ googleSearch: {} }],
},
userAgent: "pi-coding-agent",
}),
signal,
});

View file

@@ -56,7 +56,7 @@ export function resolveExpectedArtifactPath(
}
case "run-uat": {
const dir = resolveSlicePath(base, mid, sid!);
return dir ? join(dir, buildSliceFileName(sid!, "UAT")) : null;
return dir ? join(dir, buildSliceFileName(sid!, "ASSESSMENT")) : null;
}
case "execute-task": {
const dir = resolveSlicePath(base, mid, sid!);
@@ -124,7 +124,7 @@ export function diagnoseExpectedArtifact(
case "reassess-roadmap":
return `${relSliceFile(base, mid, sid!, "ASSESSMENT")} (roadmap reassessment)`;
case "run-uat":
return `${relSliceFile(base, mid, sid!, "UAT")} (UAT result)`;
return `${relSliceFile(base, mid, sid!, "ASSESSMENT")} (UAT assessment result)`;
case "validate-milestone":
return `${relMilestoneFile(base, mid, "VALIDATION")} (milestone validation report)`;
case "complete-milestone":

View file

@@ -569,6 +569,13 @@ export function updateProgressWidget(
: "";
lines.push(rightAlign(headerLeft, headerRight, width));
// Worktree/branch right-aligned below header
if (worktreeName && cachedBranch) {
lines.push(rightAlign("", theme.fg("dim", `${worktreeName} (${cachedBranch})`), width));
} else if (cachedBranch) {
lines.push(rightAlign("", theme.fg("dim", cachedBranch), width));
}
// Show health signal details when degraded (yellow/red)
if (score.level !== "green" && score.signals.length > 0 && widgetMode !== "min") {
// Show up to 3 most relevant signals in compact form
@@ -682,12 +689,12 @@
const hasContext = !!(mid || (slice && unitType !== "research-milestone" && unitType !== "plan-milestone"));
if (mid) {
const modelTag = modelDisplay ? theme.fg("muted", ` ${modelDisplay}`) : "";
lines.push(truncateToWidth(`${pad}${theme.fg("dim", mid.title)}${modelTag}`, width));
lines.push(truncateToWidth(`${pad}${theme.fg("dim", mid.title)}${modelTag}`, width, "…"));
}
if (slice && unitType !== "research-milestone" && unitType !== "plan-milestone") {
lines.push(truncateToWidth(
`${pad}${theme.fg("text", theme.bold(`${slice.id}: ${slice.title}`))}`,
width,
width, "…",
));
}
if (hasContext) lines.push("");
@@ -733,6 +740,12 @@
const rightLines: string[] = [];
const maxVisibleTasks = 8;
// Max visible chars for task title text (before ANSI theming)
const maxTaskTitleLen = 45;
function truncTitle(s: string): string {
return s.length > maxTaskTitleLen ? s.slice(0, maxTaskTitleLen - 1) + "…" : s;
}
function formatTaskLine(t: { id: string; title: string; done: boolean }, isCurrent: boolean): string {
const glyph = t.done
? theme.fg("success", "*")
@ -744,11 +757,12 @@ export function updateProgressWidget(
: t.done
? theme.fg("muted", t.id)
: theme.fg("dim", t.id);
const short = truncTitle(t.title);
const title = isCurrent
? theme.fg("text", t.title)
? theme.fg("text", short)
: t.done
? theme.fg("muted", t.title)
: theme.fg("text", t.title);
? theme.fg("muted", short)
: theme.fg("text", short);
return `${glyph} ${id}: ${title}`;
}
@@ -771,7 +785,7 @@
if (maxRows > 0) {
lines.push("");
for (let i = 0; i < maxRows; i++) {
const left = padToWidth(truncateToWidth(leftLines[i] ?? "", leftColWidth), leftColWidth);
const left = padToWidth(truncateToWidth(leftLines[i] ?? "", leftColWidth, "…"), leftColWidth);
const right = rightLines[i] ?? "";
lines.push(`${left}${right}`);
}
@@ -779,7 +793,7 @@
} else {
if (leftLines.length > 0) {
lines.push("");
for (const l of leftLines) lines.push(truncateToWidth(l, width));
for (const l of leftLines) lines.push(truncateToWidth(l, width, "…"));
}
}
@@ -808,23 +822,27 @@
lines.push(rightAlign("", theme.fg("dim", cachedRtkLabel), width));
}
}
// PWD line with last commit info right-aligned
// Last commit info
const lastCommit = getLastCommit(accessors.getBasePath());
const commitStr = lastCommit
? theme.fg("dim", `${lastCommit.timeAgo} ago: ${lastCommit.message}`)
const maxCommitLen = 65;
const commitMsg = lastCommit
? lastCommit.message.length > maxCommitLen
? lastCommit.message.slice(0, maxCommitLen - 1) + "…"
: lastCommit.message
: "";
const pwdStr = theme.fg("dim", widgetPwd);
if (commitStr) {
lines.push(rightAlign(`${pad}${pwdStr}`, truncateToWidth(commitStr, Math.floor(width * 0.45)), width));
} else {
lines.push(`${pad}${pwdStr}`);
}
// Hints line
const hintParts: string[] = [];
hintParts.push("esc pause");
hintParts.push(process.platform === "darwin" ? "⌃⌥G dashboard" : "Ctrl+Alt+G dashboard");
const hintStr = theme.fg("dim", hintParts.join(" | "));
lines.push(rightAlign("", hintStr, width));
const commitStr = lastCommit
? theme.fg("dim", `${lastCommit.timeAgo} ago: ${commitMsg}`)
: "";
if (commitStr) {
lines.push(rightAlign(`${pad}${commitStr}`, hintStr, width));
} else {
lines.push(rightAlign("", hintStr, width));
}
lines.push(...ui.bar());
@@ -851,12 +869,12 @@ function rightAlign(left: string, right: string, width: number): string {
const leftVis = visibleWidth(left);
const rightVis = visibleWidth(right);
const gap = Math.max(1, width - leftVis - rightVis);
return truncateToWidth(left + " ".repeat(gap) + right, width);
return truncateToWidth(left + " ".repeat(gap) + right, width, "…");
}
/** Pad a string with trailing spaces to fill exactly `colWidth` (ANSI-aware). */
function padToWidth(s: string, colWidth: number): string {
const vis = visibleWidth(s);
if (vis >= colWidth) return truncateToWidth(s, colWidth);
if (vis >= colWidth) return truncateToWidth(s, colWidth, "…");
return s + " ".repeat(colWidth - vis);
}

View file

@@ -28,6 +28,7 @@ import {
buildSliceFileName,
} from "./paths.js";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { logError } from "./workflow-logger.js";
import { join } from "node:path";
import { hasImplementationArtifacts } from "./auto-recovery.js";
import {
@@ -129,6 +130,21 @@ export function setRewriteCount(basePath: string, count: number): void {
writeFileSync(filePath, JSON.stringify({ count, updatedAt: new Date().toISOString() }) + "\n");
}
// ─── Helpers ─────────────────────────────────────────────────────────────
/**
* Returns true when the verification_operational value indicates that no
* operational verification is needed. Covers common phrasings the planning
* agent may use: "None", "None required", "N/A", "Not applicable", etc.
*
* @see https://github.com/gsd-build/gsd-2/issues/2931
*/
export function isVerificationNotApplicable(value: string): boolean {
const v = (value ?? "").toLowerCase().trim();
if (!v || v === "none") return true;
return /^(?:none[\s._-]*(?:required|needed|planned)?|n\/?a|not[\s._-]+(?:applicable|required|needed)|no[\s._-]+operational[\s\S]*)$/i.test(v);
}
// ─── Rules ────────────────────────────────────────────────────────────────
export const DISPATCH_RULES: DispatchRule[] = [
@@ -511,7 +527,7 @@
};
} catch (err) {
// Non-fatal — fall through to sequential execution
process.stderr.write(`gsd-reactive: graph derivation failed: ${(err as Error).message}\n`);
logError("dispatch", "reactive graph derivation failed", { error: (err as Error).message });
return null;
}
},
@@ -672,7 +688,7 @@
if (isDbAvailable()) {
const milestone = getMilestone(mid);
if (milestone?.verification_operational &&
milestone.verification_operational.toLowerCase() !== "none") {
!isVerificationNotApplicable(milestone.verification_operational)) {
const validationPath = resolveMilestoneFile(basePath, mid, "VALIDATION");
if (validationPath) {
const validationContent = await loadFile(validationPath);
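A few concrete inputs against `isVerificationNotApplicable` from the helper hunk above (function body copied verbatim so the example runs standalone):

```typescript
// Copied from the diff above for a self-contained demonstration.
function isVerificationNotApplicable(value: string): boolean {
  const v = (value ?? "").toLowerCase().trim();
  if (!v || v === "none") return true;
  return /^(?:none[\s._-]*(?:required|needed|planned)?|n\/?a|not[\s._-]+(?:applicable|required|needed)|no[\s._-]+operational[\s\S]*)$/i.test(v);
}

// Phrasings that should skip operational verification...
console.log(isVerificationNotApplicable("None required")); // true
console.log(isVerificationNotApplicable("N/A"));           // true
console.log(isVerificationNotApplicable("no operational verification planned")); // true

// ...and a real verification instruction that should not.
console.log(isVerificationNotApplicable("Run the server and check /health")); // false
```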

View file

@@ -222,9 +222,30 @@ export function resolveModelId<T extends { id: string; provider: string }>(
);
}
// Bare ID — prefer current provider, then first available
const exactProviderMatch = availableModels.find(
m => m.id === modelId && m.provider === currentProvider,
);
return exactProviderMatch ?? availableModels.find(m => m.id === modelId);
// Bare ID — resolve with provider precedence to avoid silent misrouting.
// Extension providers (e.g. claude-code) expose the same model IDs as their
// upstream API providers but route through a subprocess with different
// context, tool visibility, and cost characteristics (#2905). Bare IDs in
// PREFERENCES.md must resolve to the canonical API provider, not to an
// extension wrapper that happens to be the current session provider.
const candidates = availableModels.filter(m => m.id === modelId);
if (candidates.length === 0) return undefined;
if (candidates.length === 1) return candidates[0];
// Extension / CLI-wrapper providers that should never win bare-ID resolution
// when a first-class API provider also offers the same model.
const EXTENSION_PROVIDERS = new Set(["claude-code"]);
// Prefer currentProvider only when it is a first-class API provider
if (currentProvider && !EXTENSION_PROVIDERS.has(currentProvider)) {
const providerMatch = candidates.find(m => m.provider === currentProvider);
if (providerMatch) return providerMatch;
}
// Prefer "anthropic" as the canonical provider for Anthropic models
const anthropicMatch = candidates.find(m => m.provider === "anthropic");
if (anthropicMatch) return anthropicMatch;
// Fall back to first non-extension candidate, or any candidate
return candidates.find(m => !EXTENSION_PROVIDERS.has(m.provider)) ?? candidates[0];
}

View file

@@ -13,6 +13,7 @@
import type { ExtensionContext, ExtensionAPI } from "@gsd/pi-coding-agent";
import { deriveState } from "./state.js";
import { logWarning, logError } from "./workflow-logger.js";
import { loadFile, parseSummary, resolveAllOverrides } from "./files.js";
import { loadPrompt } from "./prompt-loader.js";
import {
@@ -412,10 +413,10 @@ export async function postUnitPreVerification(pctx: PostUnitContext, opts?: PreV
);
}
for (const action of triageResult.actions) {
process.stderr.write(`gsd-triage: ${action}\n`);
logWarning("engine", `triage resolution: ${action}`);
}
} catch (err) {
process.stderr.write(`gsd-triage: resolution execution failed: ${(err as Error).message}\n`);
logError("engine", "triage resolution failed", { error: (err as Error).message });
}
}
@@ -423,7 +424,7 @@
try {
const rogueFiles = detectRogueFileWrites(s.currentUnit.type, s.currentUnit.id, s.basePath);
for (const rogue of rogueFiles) {
process.stderr.write(`gsd-rogue: detected rogue file write: ${rogue.path} (unit: ${rogue.unitId})\n`);
logWarning("engine", "rogue file write detected", { path: rogue.path, unitId: rogue.unitId });
ctx.ui.notify(`Rogue file write detected: ${rogue.path}`, "warning");
}
} catch (e) {
@@ -465,7 +466,20 @@
// When artifact verification fails for a unit type that has a known expected
// artifact, return "retry" so the caller re-dispatches with failure context
// instead of blindly re-dispatching the same unit (#1571).
if (!triggerArtifactVerified) {
//
// HOWEVER, if the DB is unavailable (db_unavailable), the artifact was never
// written because the completion tool failed at the infra level. Retrying
// can never succeed and produces a costly re-dispatch loop (#2517).
if (!triggerArtifactVerified && !isDbAvailable()) {
// DB infra failure — do NOT retry; the completion tool returned
// db_unavailable so the artifact was never written. Retrying would
// produce an infinite re-dispatch loop (#2517).
debugLog("postUnit", { phase: "artifact-verify-skip-db-unavailable", unitType: s.currentUnit.type, unitId: s.currentUnit.id });
ctx.ui.notify(
`Artifact missing for ${s.currentUnit.type} ${s.currentUnit.id} but DB is unavailable — skipping retry to avoid loop (#2517)`,
"error",
);
} else if (!triggerArtifactVerified) {
const hasExpectedArtifact = resolveExpectedArtifactPath(s.currentUnit.type, s.currentUnit.id, s.basePath) !== null;
if (hasExpectedArtifact) {
const retryKey = `${s.currentUnit.type}:${s.currentUnit.id}`;


@ -1568,7 +1568,7 @@ export async function buildRunUatPrompt(
const inlinedContext = capPreamble(`## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`);
const uatResultPath = join(base, relSliceFile(base, mid, sliceId, "UAT"));
const uatResultPath = join(base, relSliceFile(base, mid, sliceId, "ASSESSMENT"));
const uatType = getUatType(uatContent);
return loadPrompt("run-uat", {


@ -14,6 +14,7 @@ import { clearParseCache } from "./files.js";
import { parseRoadmap as parseLegacyRoadmap, parsePlan as parseLegacyPlan } from "./parsers-legacy.js";
import { isDbAvailable, getTask, getSlice, getSliceTasks, updateTaskStatus } from "./gsd-db.js";
import { isValidationTerminal } from "./state.js";
import { getErrorMessage } from "./error-utils.js";
import {
nativeConflictFiles,
nativeCommit,
@ -476,11 +477,17 @@ export function reconcileMergeState(
if (conflictedFiles.length === 0) {
// All conflicts resolved — finalize the merge/squash commit
try {
nativeCommit(basePath, ""); // --no-edit equivalent: use empty message placeholder
const mode = hasMergeHead ? "merge" : "squash commit";
ctx.ui.notify(`Finalized leftover ${mode} from prior session.`, "info");
} catch {
// Commit may already exist; non-fatal
const commitSha = nativeCommit(basePath, ""); // --no-edit equivalent: use empty message placeholder
if (commitSha) {
const mode = hasMergeHead ? "merge" : "squash commit";
ctx.ui.notify(`Finalized leftover ${mode} from prior session.`, "info");
} else {
ctx.ui.notify("No new commit needed for leftover merge/squash state — already committed.", "info");
}
} catch (err) {
const errorMessage = getErrorMessage(err);
ctx.ui.notify(`Failed to finalize leftover merge/squash commit: ${errorMessage}`, "error");
return false;
}
} else {
// Still conflicted — try auto-resolving .gsd/ state file conflicts (#530)


@ -58,9 +58,8 @@ import { initRoutingHistory } from "./routing-history.js";
import { restoreHookState, resetHookState } from "./post-unit-hooks.js";
import { resetProactiveHealing, setLevelChangeCallback } from "./doctor-proactive.js";
import { snapshotSkills } from "./skill-discovery.js";
import { isDbAvailable, getMilestone, openDatabase } from "./gsd-db.js";
import { isDbAvailable, getMilestone } from "./gsd-db.js";
import { hideFooter } from "./auto-dashboard.js";
import { resolveProjectRootDbPath } from "./bootstrap/dynamic-tools.js";
import {
debugLog,
enableDebug,
@ -68,7 +67,6 @@ import {
getDebugLogPath,
} from "./debug-logger.js";
import { parseUnitId } from "./unit-id.js";
import { setLogBasePath } from "./workflow-logger.js";
import type { AutoSession } from "./auto/session.js";
import {
existsSync,
@ -80,6 +78,7 @@ import {
import { join } from "node:path";
import { sep as pathSep } from "node:path";
import { resolveProjectRootDbPath } from "./bootstrap/dynamic-tools.js";
import type { WorktreeResolver } from "./worktree-resolver.js";
export interface BootstrapDeps {
@ -98,26 +97,32 @@ export interface BootstrapDeps {
* concurrent session detected). Returns true when ready to dispatch.
*/
/**
* Open the project-root DB before the first deriveState call (#2841).
* When auto-mode starts cold (no prior DB handle), state derivation that
* touches DB-backed helpers (queue-order, task status) silently falls back
* to markdown-only data, producing stale or incomplete state. Opening the
* DB first ensures deriveState sees the full picture on its very first run.
*/
async function openProjectDbIfPresent(basePath: string): Promise<void> {
const gsdDbPath = resolveProjectRootDbPath(basePath);
if (!existsSync(gsdDbPath)) return;
if (isDbAvailable()) return;
try {
const { openDatabase } = await import("./gsd-db.js");
openDatabase(gsdDbPath);
} catch {
/* non-fatal — DB lifecycle block below will retry */
}
}
/** Guard: tracks consecutive bootstrap attempts that found phase === "complete".
* Prevents the recursive dialog loop described in #1348 where
* bootstrapAutoSession showSmartEntry checkAutoStartAfterDiscuss startAuto
* cycles indefinitely when the discuss workflow doesn't produce a milestone. */
let _consecutiveCompleteBootstraps = 0;
const MAX_CONSECUTIVE_COMPLETE_BOOTSTRAPS = 2;
async function openProjectDbIfPresent(basePath: string): Promise<void> {
const gsdDbPath = resolveProjectRootDbPath(basePath);
if (!existsSync(gsdDbPath) || isDbAvailable()) return;
try {
openDatabase(gsdDbPath);
} catch (err) {
process.stderr.write(
`gsd-db: failed to open existing database: ${(err as Error).message}\n`,
);
}
}
export async function bootstrapAutoSession(
s: AutoSession,
ctx: ExtensionCommandContext,
@ -198,10 +203,13 @@ export async function bootstrapAutoSession(
ensureGitignore(base, { manageGitignore });
if (manageGitignore !== false) untrackRuntimeFiles(base);
// Bootstrap .gsd/ if it doesn't exist
// Bootstrap milestones/ if it doesn't exist.
// Check milestones/ directly — ensureGsdSymlink above already created .gsd/,
// so checking .gsd/ existence would be dead code (#2942).
const gsdDir = join(base, ".gsd");
if (!existsSync(gsdDir)) {
mkdirSync(join(gsdDir, "milestones"), { recursive: true });
const milestonesPath = join(gsdDir, "milestones");
if (!existsSync(milestonesPath)) {
mkdirSync(milestonesPath, { recursive: true });
try {
nativeAddAll(base);
nativeCommit(base, "chore: init gsd");
@ -280,10 +288,6 @@ export async function bootstrapAutoSession(
ctx.ui.notify(`Debug logging enabled → ${getDebugLogPath()}`, "info");
}
// Open the project DB before the first derive so resume uses DB truth
// immediately on cold starts instead of falling back to markdown (#2841).
await openProjectDbIfPresent(base);
// Invalidate caches before initial state derivation
invalidateAllCaches();
@ -293,6 +297,10 @@ export async function bootstrapAutoSession(
(mid) => !!resolveMilestoneFile(base, mid, "SUMMARY"),
);
// Open the project-root DB before deriveState so DB-backed state
// derivation (queue-order, task status) works on a cold start (#2841).
await openProjectDbIfPresent(base);
let state = await deriveState(base);
// Stale worktree state recovery (#654)
@ -490,7 +498,6 @@ export async function bootstrapAutoSession(
s.verbose = verboseMode;
s.cmdCtx = ctx;
s.basePath = base;
setLogBasePath(base);
s.unitDispatchCount.clear();
s.unitRecoveryCount.clear();
s.lastBudgetAlertLevel = 0;
@ -554,14 +561,15 @@ export async function bootstrapAutoSession(
}
// ── DB lifecycle ──
const gsdDbPath = resolveProjectRootDbPath(s.basePath);
const gsdDbPath = join(s.basePath, ".gsd", "gsd.db");
const gsdDirPath = join(s.basePath, ".gsd");
if (existsSync(gsdDirPath) && !existsSync(gsdDbPath)) {
const hasDecisions = existsSync(join(gsdDirPath, "DECISIONS.md"));
const hasRequirements = existsSync(join(gsdDirPath, "REQUIREMENTS.md"));
const hasMilestones = existsSync(join(gsdDirPath, "milestones"));
try {
openDatabase(gsdDbPath);
const { openDatabase: openDb } = await import("./gsd-db.js");
openDb(gsdDbPath);
if (hasDecisions || hasRequirements || hasMilestones) {
const { migrateFromMarkdown } = await import("./md-importer.js");
migrateFromMarkdown(s.basePath);
@ -574,7 +582,8 @@ export async function bootstrapAutoSession(
}
if (existsSync(gsdDbPath) && !isDbAvailable()) {
try {
openDatabase(gsdDbPath);
const { openDatabase: openDb } = await import("./gsd-db.js");
openDb(gsdDbPath);
} catch (err) {
process.stderr.write(
`gsd-db: failed to open existing database: ${(err as Error).message}\n`,


@ -15,6 +15,7 @@ import {
realpathSync,
rmSync,
unlinkSync,
statSync,
lstatSync as lstatSyncFn,
} from "node:fs";
import { isAbsolute, join, sep as pathSep } from "node:path";
@ -62,6 +63,7 @@ import {
nativeDiffNumstat,
nativeUpdateRef,
nativeIsAncestor,
nativeMergeAbort,
} from "./native-git-bridge.js";
const gsdHome = process.env.GSD_HOME || join(homedir(), ".gsd");
@ -84,6 +86,7 @@ const ROOT_STATE_FILES = [
"QUEUE.md",
"completed-units.json",
"metrics.json",
"mcp.json",
// NOTE: project preferences are intentionally NOT in ROOT_STATE_FILES.
// Forward-sync (main → worktree) is handled explicitly in syncGsdStateToWorktree().
// Back-sync (worktree → main) must NEVER overwrite the project root's copy
@ -102,6 +105,67 @@ function isSamePath(a: string, b: string): boolean {
}
}
// ─── ASSESSMENT Force-Sync Helper (#2821) ─────────────────────────────────
/** Regex matching YAML frontmatter `verdict:` field. */
const VERDICT_RE = /verdict:\s*[\w-]+/i;
/**
* Walk a milestone directory and force-overwrite ASSESSMENT files in the
* destination when the source copy contains a `verdict:` field.
*
* This is the targeted fix for the UAT stuck-loop (#2821): the main
* safeCopyRecursive uses force:false to protect worktree-authoritative
* files (#1886), but ASSESSMENT files written by run-uat must be
* forward-synced when the project root has a verdict. Without this,
* the worktree retains a stale FAIL or missing ASSESSMENT and
* checkNeedsRunUat re-dispatches run-uat indefinitely.
*
 * Only overwrites when the source has a verdict; it never clobbers a
 * worktree ASSESSMENT with a verdictless project-root copy.
*/
function forceOverwriteAssessmentsWithVerdict(
srcMilestoneDir: string,
dstMilestoneDir: string,
): void {
if (!existsSync(srcMilestoneDir)) return;
// Walk slices/<SID>/ looking for *-ASSESSMENT.md files
const slicesDir = join(srcMilestoneDir, "slices");
if (!existsSync(slicesDir)) return;
try {
for (const sliceEntry of readdirSync(slicesDir, { withFileTypes: true })) {
if (!sliceEntry.isDirectory()) continue;
const srcSliceDir = join(slicesDir, sliceEntry.name);
const dstSliceDir = join(dstMilestoneDir, "slices", sliceEntry.name);
try {
for (const fileEntry of readdirSync(srcSliceDir, { withFileTypes: true })) {
if (!fileEntry.isFile()) continue;
if (!fileEntry.name.endsWith("-ASSESSMENT.md")) continue;
const srcFile = join(srcSliceDir, fileEntry.name);
try {
const srcContent = readFileSync(srcFile, "utf-8");
if (!VERDICT_RE.test(srcContent)) continue; // no verdict in source — skip
// Source has a verdict — force-copy into worktree
mkdirSync(dstSliceDir, { recursive: true });
safeCopy(srcFile, join(dstSliceDir, fileEntry.name), { force: true });
} catch {
/* non-fatal per file */
}
}
} catch {
/* non-fatal per slice */
}
}
} catch {
/* non-fatal */
}
}
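The force-sync decision above hinges entirely on `VERDICT_RE`. A quick sketch of what that pattern accepts and rejects (the regex is copied from the helper; the sample frontmatter lines are illustrative):

```typescript
// Same pattern as the helper above: a frontmatter `verdict:` field
// followed by a word-ish value, case-insensitive.
const VERDICT_RE = /verdict:\s*[\w-]+/i;

// Accepted: any verdict value, regardless of case or spacing.
console.log(VERDICT_RE.test("verdict: pass"));       // true
console.log(VERDICT_RE.test("Verdict:FAIL"));        // true
console.log(VERDICT_RE.test("verdict: needs-work")); // true

// Rejected: field present but no value yet, so the source has no
// verdict and the worktree copy must not be overwritten.
console.log(VERDICT_RE.test("verdict:"));            // false
console.log(VERDICT_RE.test("status: done"));        // false
```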
// ─── Module State ──────────────────────────────────────────────────────────
/** Original project root before chdir into auto-worktree. */
@ -214,6 +278,19 @@ export function syncProjectRootToWorktree(
{ force: false },
);
// Force-sync ASSESSMENT files that have a verdict from project root (#2821).
// The additive-only copy above preserves worktree-authoritative files, but
// ASSESSMENT files are special: after run-uat writes a verdict and post-unit
// syncs it to the project root, the worktree may retain a stale copy (e.g.
// verdict:fail while the project root has verdict:pass from a retry). On
// session resume the DB is rebuilt from disk, and if the stale ASSESSMENT
// persists, checkNeedsRunUat finds no passing verdict → re-dispatches
// run-uat indefinitely (stuck-loop ×9).
forceOverwriteAssessmentsWithVerdict(
join(prGsd, "milestones", milestoneId),
join(wtGsd, "milestones", milestoneId),
);
// Forward-sync completed-units.json from project root to worktree.
// Project root is authoritative for completion state after crash recovery;
// without this, the worktree re-dispatches already-completed units (#1886).
@ -223,12 +300,18 @@ export function syncProjectRootToWorktree(
{ force: true },
);
// Delete worktree gsd.db so it rebuilds from the freshly synced files.
// Stale DB rows are the root cause of the infinite skip loop (#853).
// Delete worktree gsd.db ONLY if it is empty (0 bytes).
// An empty DB is stale/corrupt and should be rebuilt (#853).
// A non-empty DB was populated by gsd-migrate on respawn and must be
// preserved: deleting it means openDatabase later re-creates an empty
// file, causing "no such table" failures (#2815).
try {
const wtDb = join(wtGsd, "gsd.db");
if (existsSync(wtDb)) {
unlinkSync(wtDb);
const size = statSync(wtDb).size;
if (size === 0) {
unlinkSync(wtDb);
}
}
} catch {
/* non-fatal */
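The size gate above reduces to a tiny predicate; a sketch of the decision, factored out for clarity (the function names are ours, not the codebase's):

```typescript
import { existsSync, statSync, unlinkSync } from "node:fs";

// Decide whether a worktree gsd.db should be rebuilt from scratch.
// Only a zero-byte file counts as stale/corrupt (#853); a non-empty
// DB was populated by migration and must survive (#2815).
function shouldDeleteWorktreeDb(dbPath: string): boolean {
  if (!existsSync(dbPath)) return false; // nothing to delete
  return statSync(dbPath).size === 0;    // empty file: safe to rebuild
}

// Caller sketch: delete only when the predicate says so.
function maybeResetWorktreeDb(dbPath: string): void {
  if (shouldDeleteWorktreeDb(dbPath)) unlinkSync(dbPath);
}
```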
@ -1004,6 +1087,7 @@ function copyPlanningArtifacts(srcBase: string, wtPath: string): void {
"STATE.md",
"KNOWLEDGE.md",
"OVERRIDES.md",
"mcp.json",
]) {
safeCopy(join(srcGsd, file), join(dstGsd, file), { force: true });
}
@ -1414,9 +1498,19 @@ export function mergeMilestoneToMain(
encoding: "utf-8",
}).trim();
if (status) {
// Use --include-untracked to stash untracked files that would block
// the squash merge, but EXCLUDE .gsd/milestones/ (#2505).
// --include-untracked without exclusion sweeps queued milestone
// CONTEXT files into the stash. If stash pop later fails, those files
// are permanently trapped in the stash entry and lost on the next
// stash push or drop.
execFileSync(
"git",
["stash", "push", "--include-untracked", "-m", `gsd: pre-merge stash for ${milestoneId}`],
[
"stash", "push", "--include-untracked",
"-m", `gsd: pre-merge stash for ${milestoneId}`,
"--", ":(exclude).gsd/milestones",
],
{ cwd: originalBasePath_, stdio: ["ignore", "pipe", "pipe"], encoding: "utf-8" },
);
stashed = true;
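The exclusion pathspec is the load-bearing part of that invocation. A sketch that isolates the argv construction (the helper name is ours; the codebase builds the array inline):

```typescript
// Build the `git stash push` argv used before a squash merge (#2505).
// `--include-untracked` sweeps untracked files out of the way, while
// the trailing pathspec `:(exclude).gsd/milestones` keeps queued
// milestone CONTEXT files out of the stash so they cannot be lost if
// the later stash pop fails.
function buildPreMergeStashArgs(milestoneId: string): string[] {
  return [
    "stash", "push", "--include-untracked",
    "-m", `gsd: pre-merge stash for ${milestoneId}`,
    "--", ":(exclude).gsd/milestones",
  ];
}

// Usage sketch: execFileSync("git", buildPreMergeStashArgs("M001"), { cwd });
```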
@ -1426,6 +1520,65 @@ export function mergeMilestoneToMain(
// report the dirty tree if it fails.
}
// 7a. Shelter queued milestone directories before the squash merge (#2505).
// The milestone branch may contain copies of queued milestone dirs (via
// copyPlanningArtifacts), so `git merge --squash` rejects when those same
// files exist as untracked in the working tree. Temporarily move them to
// a backup location, then restore after the merge+commit.
const milestonesDir = join(gsdRoot(originalBasePath_), "milestones");
const shelterDir = join(gsdRoot(originalBasePath_), ".milestone-shelter");
const shelteredDirs: string[] = [];
// Helper: restore sheltered milestone directories (#2505).
// Called on both success and error paths to ensure queued CONTEXT files
// are never permanently lost.
const restoreShelter = (): void => {
if (shelteredDirs.length === 0) return;
for (const dirName of shelteredDirs) {
try {
mkdirSync(milestonesDir, { recursive: true });
cpSync(join(shelterDir, dirName), join(milestonesDir, dirName), { recursive: true, force: true });
} catch { /* best-effort */ }
}
try { rmSync(shelterDir, { recursive: true, force: true }); } catch { /* best-effort */ }
};
try {
if (existsSync(milestonesDir)) {
const entries = readdirSync(milestonesDir, { withFileTypes: true });
for (const entry of entries) {
if (!entry.isDirectory()) continue;
// Only shelter directories that do NOT belong to the milestone being merged
if (entry.name === milestoneId) continue;
const srcDir = join(milestonesDir, entry.name);
const dstDir = join(shelterDir, entry.name);
try {
mkdirSync(shelterDir, { recursive: true });
cpSync(srcDir, dstDir, { recursive: true, force: true });
rmSync(srcDir, { recursive: true, force: true });
shelteredDirs.push(entry.name);
} catch {
// Non-fatal — if shelter fails, the merge may still succeed
}
}
}
} catch {
// Non-fatal — proceed with merge; untracked files may block it
}
// 7b. Clean up stale merge state before attempting squash merge (#2912).
// A leftover MERGE_HEAD (from a previous failed merge, libgit2 native path,
// or interrupted operation) causes `git merge --squash` to refuse with
// "fatal: You have not concluded your merge (MERGE_HEAD exists)".
// Defensively remove merge artifacts before starting.
try {
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
// 8. Squash merge — auto-resolve .gsd/ state file conflicts (#530)
const mergeResult = nativeMergeSquash(originalBasePath_, milestoneBranch);
@ -1434,6 +1587,16 @@ export function mergeMilestoneToMain(
// untracked .gsd/ files left by syncStateToProjectRoot). Preserve the
// milestone branch so commits are not lost.
if (mergeResult.conflicts.includes("__dirty_working_tree__")) {
// Defensively clean merge state — the native path may leave MERGE_HEAD
// even when the merge is rejected (#2912).
try {
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
// Pop stash before throwing so local work is not lost.
if (stashed) {
try {
@ -1444,6 +1607,7 @@ export function mergeMilestoneToMain(
});
} catch { /* stash pop conflict is non-fatal */ }
}
restoreShelter();
// Restore cwd so the caller is not stranded on the integration branch
process.chdir(previousCwd);
// Surface the actual dirty filenames from git stderr instead of
@ -1490,6 +1654,18 @@ export function mergeMilestoneToMain(
// If there are still real code conflicts, escalate
if (codeConflicts.length > 0) {
// Abort merge state so MERGE_HEAD is not left on disk (#2912).
// libgit2's merge creates MERGE_HEAD even for squash merges; if left
// dangling, subsequent merges fail and doctor reports corrupt state.
try { nativeMergeAbort(originalBasePath_); } catch { /* best-effort */ }
try {
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
// Pop stash before throwing so local work is not lost (#2151).
if (stashed) {
try {
@ -1500,6 +1676,7 @@ export function mergeMilestoneToMain(
});
} catch { /* stash pop conflict is non-fatal */ }
}
restoreShelter();
throw new MergeConflictError(
codeConflicts,
"squash",
@ -1515,14 +1692,18 @@ export function mergeMilestoneToMain(
const commitResult = nativeCommit(originalBasePath_, commitMessage);
const nothingToCommit = commitResult === null;
// 9a. Clean up SQUASH_MSG left by git merge --squash (#1853).
// 9a. Clean up merge state files left by git merge --squash (#1853, #2912).
// git only removes SQUASH_MSG when the commit reads it directly (plain
// `git commit`). nativeCommit uses `-F -` (stdin) or libgit2, neither
// of which trigger git's SQUASH_MSG cleanup. If left on disk, doctor
// reports `corrupt_merge_state` on every subsequent run.
// of which triggers git's SQUASH_MSG cleanup. MERGE_HEAD is created by
// libgit2's merge even in squash mode and is not removed by nativeCommit.
// If left on disk, doctor reports `corrupt_merge_state` on every subsequent run.
try {
const squashMsgPath = join(resolveGitDir(originalBasePath_), "SQUASH_MSG");
if (existsSync(squashMsgPath)) unlinkSync(squashMsgPath);
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
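Since the same three files are now scrubbed on several paths, the cleanup can be factored into a single helper; a sketch under that assumption (the diff inlines the loop at each site instead):

```typescript
import { existsSync, unlinkSync } from "node:fs";
import { join } from "node:path";

// Remove merge-state files that git/libgit2 leave behind around a
// squash merge (#1853, #2912). Best-effort: failure to delete one
// file must not block the others or the caller.
function cleanupMergeState(gitDir: string): void {
  for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
    const p = join(gitDir, f);
    try {
      if (existsSync(p)) unlinkSync(p);
    } catch {
      /* best-effort */
    }
  }
}
```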
// 9a-ii. Restore stashed files now that the merge+commit is complete (#2151).
@ -1581,6 +1762,9 @@ export function mergeMilestoneToMain(
}
}
// 9a-iii. Restore sheltered queued milestone directories (#2505).
restoreShelter();
// 9b. Safety check (#1792): if nothing was committed, verify the milestone
// work is already on the integration branch before allowing teardown.
// Compare only non-.gsd/ paths — .gsd/ state files diverge normally and


@ -93,6 +93,7 @@ export interface LoopDeps {
body: string,
kind: string,
category: string,
projectName?: string,
) => void;
setActiveMilestoneId: (basePath: string, mid: string) => void;
pruneQueueOrder: (basePath: string, pendingIds: string[]) => void;


@ -26,7 +26,7 @@ import { runUnit } from "./run-unit.js";
import { debugLog } from "../debug-logger.js";
import { PROJECT_FILES } from "../detection.js";
import { MergeConflictError } from "../git-service.js";
import { join } from "node:path";
import { join, basename } from "node:path";
import { existsSync, cpSync } from "node:fs";
import { logWarning, logError } from "../workflow-logger.js";
import { gsdRoot } from "../paths.js";
@ -230,6 +230,7 @@ export async function runPreDispatch(
`Milestone ${s.currentMilestoneId} complete!`,
"success",
"milestone",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(
prefs,
@ -388,6 +389,7 @@ export async function runPreDispatch(
"All milestones complete!",
"success",
"milestone",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(
prefs,
@ -411,7 +413,7 @@ export async function runPreDispatch(
const blockerMsg = `Blocked: ${state.blockers.join(", ")}`;
await deps.stopAuto(ctx, pi, blockerMsg);
ctx.ui.notify(`${blockerMsg}. Fix and run /gsd auto.`, "warning");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, blockerMsg, "error");
} else {
const ids = incomplete.map((m: { id: string }) => m.id).join(", ");
@ -492,6 +494,7 @@ export async function runPreDispatch(
`Milestone ${mid} complete!`,
"success",
"milestone",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(
prefs,
@ -509,7 +512,7 @@ export async function runPreDispatch(
const blockerMsg = `Blocked: ${state.blockers.join(", ")}`;
await closeoutAndStop(ctx, pi, s, deps, blockerMsg);
ctx.ui.notify(`${blockerMsg}. Fix and run /gsd auto.`, "warning");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, blockerMsg, "error");
debugLog("autoLoop", { phase: "exit", reason: "blocked" });
deps.emitJournalEvent({ ts: new Date().toISOString(), flowId: ic.flowId, seq: ic.nextSeq(), eventType: "terminal", data: { reason: "blocked", blockers: state.blockers } });
@ -755,7 +758,7 @@ export async function runGuards(
// 100% — special enforcement logic (halt/pause/warn)
const msg = `Budget ceiling ${deps.formatCost(budgetCeiling)} reached (spent ${deps.formatCost(totalCost)}).`;
if (budgetEnforcementAction === "halt") {
deps.sendDesktopNotification("GSD", msg, "error", "budget");
deps.sendDesktopNotification("GSD", msg, "error", "budget", basename(s.originalBasePath || s.basePath));
await deps.stopAuto(ctx, pi, "Budget ceiling reached");
debugLog("autoLoop", { phase: "exit", reason: "budget-halt" });
return { action: "break", reason: "budget-halt" };
@ -765,14 +768,14 @@ export async function runGuards(
`${msg} Pausing auto-mode — /gsd auto to override and continue.`,
"warning",
);
deps.sendDesktopNotification("GSD", msg, "warning", "budget");
deps.sendDesktopNotification("GSD", msg, "warning", "budget", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, msg, "warning");
await deps.pauseAuto(ctx, pi);
debugLog("autoLoop", { phase: "exit", reason: "budget-pause" });
return { action: "break", reason: "budget-pause" };
}
ctx.ui.notify(`${msg} Continuing (enforcement: warn).`, "warning");
deps.sendDesktopNotification("GSD", msg, "warning", "budget");
deps.sendDesktopNotification("GSD", msg, "warning", "budget", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, msg, "warning");
} else if (threshold.pct < 100) {
// Sub-100% — simple notification
@ -783,6 +786,7 @@ export async function runGuards(
msg,
threshold.notifyLevel,
"budget",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(prefs, msg, threshold.cmuxLevel);
}
@ -812,6 +816,7 @@ export async function runGuards(
`Context ${contextUsage.percent}% — paused`,
"warning",
"attention",
basename(s.originalBasePath || s.basePath),
);
await deps.pauseAuto(ctx, pi);
debugLog("autoLoop", { phase: "exit", reason: "context-window" });
@ -929,6 +934,23 @@ export async function runUnitPhase(
},
);
// Select and apply model (with tier escalation on retry — normal units only)
const modelResult = await deps.selectAndApplyModel(
ctx,
pi,
unitType,
unitId,
s.basePath,
prefs,
s.verbose,
s.autoModeStartModel,
sidecarItem ? undefined : { isRetry, previousTier },
);
s.currentUnitRouting =
modelResult.routing as AutoSession["currentUnitRouting"];
s.currentUnitModel =
modelResult.appliedModel as AutoSession["currentUnitModel"];
// Status bar + progress widget
ctx.ui.setStatus("gsd-auto", "auto");
if (mid)
@ -1001,23 +1023,6 @@ export async function runUnitPhase(
logWarning("engine", "Prompt reorder failed", { error: msg });
}
// Select and apply model (with tier escalation on retry — normal units only)
const modelResult = await deps.selectAndApplyModel(
ctx,
pi,
unitType,
unitId,
s.basePath,
prefs,
s.verbose,
s.autoModeStartModel,
sidecarItem ? undefined : { isRetry, previousTier },
);
s.currentUnitRouting =
modelResult.routing as AutoSession["currentUnitRouting"];
s.currentUnitModel =
modelResult.appliedModel as AutoSession["currentUnitModel"];
// Apply sidecar/pre-dispatch hook model override (takes priority over standard model selection)
const hookModelOverride = sidecarItem?.model ?? iterData.hookModelOverride;
if (hookModelOverride) {
@ -1142,14 +1147,18 @@ export async function runUnitPhase(
// ── Immediate unit closeout (metrics, activity log, memory) ────────
// Run right after runUnit() returns so telemetry is never lost to a
// crash between iterations.
await deps.closeoutUnit(
ctx,
s.basePath,
unitType,
unitId,
s.currentUnit.startedAt,
deps.buildSnapshotOpts(unitType, unitId),
);
// Guard: stopAuto() may have nulled s.currentUnit via s.reset() while
// this coroutine was suspended at `await runUnit(...)` (#2939).
if (s.currentUnit) {
await deps.closeoutUnit(
ctx,
s.basePath,
unitType,
unitId,
s.currentUnit.startedAt,
deps.buildSnapshotOpts(unitType, unitId),
);
}
// ── Zero tool-call guard (#1833) ──────────────────────────────────
// An execute-task agent that completes with 0 tool calls made no
@ -1159,7 +1168,7 @@ export async function runUnitPhase(
const currentLedger = deps.getLedger() as { units: Array<{ type: string; id: string; startedAt: number; toolCalls: number }> } | null;
if (currentLedger?.units) {
const lastUnit = [...currentLedger.units].reverse().find(
(u: { type: string; id: string; startedAt: number; toolCalls: number }) => u.type === unitType && u.id === unitId && u.startedAt === s.currentUnit!.startedAt,
(u: { type: string; id: string; startedAt: number; toolCalls: number }) => u.type === unitType && u.id === unitId && u.startedAt === s.currentUnit?.startedAt,
);
if (lastUnit && lastUnit.toolCalls === 0) {
debugLog("runUnitPhase", {
@ -1174,7 +1183,7 @@ export async function runUnitPhase(
);
// Fall through to next iteration where dispatch will re-derive
// and re-dispatch this task.
return { action: "next", data: { unitStartedAt: s.currentUnit.startedAt } };
return { action: "next", data: { unitStartedAt: s.currentUnit?.startedAt } };
}
}
}
@ -1198,7 +1207,7 @@ export async function runUnitPhase(
deps.emitJournalEvent({ ts: new Date().toISOString(), flowId: ic.flowId, seq: ic.nextSeq(), eventType: "unit-end", data: { unitType, unitId, status: unitResult.status, artifactVerified, ...(unitResult.errorContext ? { errorContext: unitResult.errorContext } : {}) }, causedBy: { flowId: ic.flowId, seq: unitStartSeq } });
return { action: "next", data: { unitStartedAt: s.currentUnit.startedAt } };
return { action: "next", data: { unitStartedAt: s.currentUnit?.startedAt } };
}
// ─── runFinalize ──────────────────────────────────────────────────────────────


@ -68,6 +68,28 @@ export async function handleAgentEnd(
const lastMsg = event.messages[event.messages.length - 1];
if (lastMsg && "stopReason" in lastMsg && lastMsg.stopReason === "aborted") {
// Empty content with aborted stopReason is a non-fatal agent stop (the LLM
// chose to end without producing output). Only pause on genuine fatal aborts
// that carry error context — e.g. errorMessage field or non-empty content
// indicating a mid-stream failure. (#2695)
const content = "content" in lastMsg ? lastMsg.content : undefined;
const hasEmptyContent = Array.isArray(content) && content.length === 0;
const hasErrorMessage = "errorMessage" in lastMsg && !!lastMsg.errorMessage;
if (hasEmptyContent && !hasErrorMessage) {
// Non-fatal: treat as a normal agent end so the loop can continue
// instead of entering a stuck re-dispatch cycle.
try {
resetRetryState(retryState);
resolveAgentEnd(event);
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
ctx.ui.notify(`Auto-mode error after empty-content abort: ${message}. Stopping auto-mode.`, "error");
try { await pauseAuto(ctx, pi); } catch { /* best-effort */ }
}
return;
}
await pauseAuto(ctx, pi);
return;
}
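The empty-content check above can be read as a small classifier over the last message. A sketch with assumed message shapes (the real type comes from the agent framework; the interface and function name here are ours):

```typescript
// Minimal shape of the fields the handler inspects (#2695).
interface LastMessageLike {
  stopReason?: string;
  content?: unknown;
  errorMessage?: string;
}

// True when an aborted stop is benign: the LLM ended without producing
// output and no error context was attached, so auto-mode should
// continue rather than pause.
function isNonFatalAbort(msg: LastMessageLike): boolean {
  if (msg.stopReason !== "aborted") return false;
  const hasEmptyContent = Array.isArray(msg.content) && msg.content.length === 0;
  const hasErrorMessage = !!msg.errorMessage;
  return hasEmptyContent && !hasErrorMessage;
}
```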
@ -79,6 +101,15 @@ export async function handleAgentEnd(
// ── 1. Classify ──────────────────────────────────────────────────────
const cls = classifyError(errorMsg, explicitRetryAfterMs);
// Cap rate-limit backoff for CLI-style providers (openai-codex, google-gemini-cli)
// which use per-user quotas with shorter windows (#2922).
if (cls.kind === "rate-limit") {
const currentProvider = ctx.model?.provider;
if (currentProvider === "openai-codex" || currentProvider === "google-gemini-cli") {
cls.retryAfterMs = Math.min(cls.retryAfterMs, 30_000);
}
}
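The provider-specific cap is a one-line `Math.min`; a sketch that isolates it (provider names are taken from the diff, the 30s ceiling matches the code above, and the helper name is ours):

```typescript
// Cap rate-limit backoff for CLI-style providers whose per-user quotas
// reset on short windows (#2922); other providers keep the delay
// suggested by the error classifier.
const CLI_PROVIDER_BACKOFF_CAP_MS = 30_000;
const CLI_PROVIDERS = new Set(["openai-codex", "google-gemini-cli"]);

function capRetryAfter(provider: string | undefined, retryAfterMs: number): number {
  if (provider && CLI_PROVIDERS.has(provider)) {
    return Math.min(retryAfterMs, CLI_PROVIDER_BACKOFF_CAP_MS);
  }
  return retryAfterMs;
}
```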
// ── 2. Decide & Act ──────────────────────────────────────────────────
// --- Network errors: same-model retry with backoff ---


@ -121,14 +121,6 @@ export function registerDbTools(pi: ExtensionAPI): void {
};
}
try {
const db = await import("../gsd-db.js");
const existing = db.getRequirementById(params.id);
if (!existing) {
return {
content: [{ type: "text" as const, text: `Error: Requirement ${params.id} not found.` }],
details: { operation: "update_requirement", id: params.id, error: "not_found" } as any,
};
}
const { updateRequirementInDb } = await import("../db-writer.js");
const updates: Record<string, string | undefined> = {};
if (params.status !== undefined) updates.status = params.status;
@ -196,6 +188,91 @@ export function registerDbTools(pi: ExtensionAPI): void {
pi.registerTool(requirementUpdateTool);
registerAlias(pi, requirementUpdateTool, "gsd_update_requirement", "gsd_requirement_update");
// ─── gsd_requirement_save ─────────────────────────────────────────────
const requirementSaveExecute = async (_toolCallId: string, params: any, _signal: AbortSignal | undefined, _onUpdate: unknown, _ctx: unknown) => {
const dbAvailable = await ensureDbOpen();
if (!dbAvailable) {
return {
content: [{ type: "text" as const, text: "Error: GSD database is not available. Cannot save requirement." }],
details: { operation: "save_requirement", error: "db_unavailable" } as any,
};
}
try {
const { saveRequirementToDb } = await import("../db-writer.js");
const result = await saveRequirementToDb(
{
class: params.class,
status: params.status,
description: params.description,
why: params.why,
source: params.source,
primary_owner: params.primary_owner,
supporting_slices: params.supporting_slices,
validation: params.validation,
notes: params.notes,
},
process.cwd(),
);
return {
content: [{ type: "text" as const, text: `Saved requirement ${result.id}` }],
details: { operation: "save_requirement", id: result.id } as any,
};
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
logError("tool", `gsd_requirement_save tool failed: ${msg}`, { tool: "gsd_requirement_save", error: String(err) });
return {
content: [{ type: "text" as const, text: `Error saving requirement: ${msg}` }],
details: { operation: "save_requirement", error: msg } as any,
};
}
};
const requirementSaveTool = {
name: "gsd_requirement_save",
label: "Save Requirement",
description:
"Record a new requirement to the GSD database and regenerate REQUIREMENTS.md. " +
"Requirement IDs are auto-assigned — never provide an ID manually.",
promptSnippet: "Record a new GSD requirement to the database (auto-assigns ID, regenerates REQUIREMENTS.md)",
promptGuidelines: [
"Use gsd_requirement_save when recording a new functional, non-functional, or operational requirement.",
"Requirement IDs are auto-assigned (R001, R002, ...) — never guess or provide an ID.",
"class, description, why, and source are required. All other fields are optional.",
"The tool writes to the DB and regenerates .gsd/REQUIREMENTS.md automatically.",
],
parameters: Type.Object({
class: Type.String({ description: "Requirement class (e.g. 'functional', 'non-functional', 'operational')" }),
description: Type.String({ description: "Short description of the requirement" }),
why: Type.String({ description: "Why this requirement matters" }),
source: Type.String({ description: "Origin of the requirement (e.g. 'user-research', 'design', 'M001')" }),
status: Type.Optional(Type.String({ description: "Status (default: 'active')" })),
primary_owner: Type.Optional(Type.String({ description: "Primary owning slice" })),
supporting_slices: Type.Optional(Type.String({ description: "Supporting slices" })),
validation: Type.Optional(Type.String({ description: "Validation criteria" })),
notes: Type.Optional(Type.String({ description: "Additional notes" })),
}),
execute: requirementSaveExecute,
renderCall(args: any, theme: any) {
let text = theme.fg("toolTitle", theme.bold("requirement_save "));
if (args.class) text += theme.fg("accent", `[${args.class}] `);
if (args.description) text += theme.fg("muted", args.description);
return new Text(text, 0, 0);
},
renderResult(result: any, _options: any, theme: any) {
const d = result.details;
if (result.isError || d?.error) {
return new Text(theme.fg("error", `Error: ${d?.error ?? "unknown"}`), 0, 0);
}
let text = theme.fg("success", `Requirement ${d?.id ?? ""} saved`);
text += theme.fg("dim", ` → REQUIREMENTS.md`);
return new Text(text, 0, 0);
},
};
pi.registerTool(requirementSaveTool);
registerAlias(pi, requirementSaveTool, "gsd_save_requirement", "gsd_requirement_save");
// ─── gsd_summary_save (formerly gsd_save_summary) ──────────────────────
const summarySaveExecute = async (_toolCallId: string, params: any, _signal: AbortSignal | undefined, _onUpdate: unknown, _ctx: unknown) => {

View file

@@ -32,6 +32,31 @@ export function resolveProjectRootDbPath(basePath: string): string {
return join(projectRoot, ".gsd", "gsd.db");
}
// Symlink-resolved layout: /.gsd/projects/<hash>/worktrees/M001/...
// The project root is everything before /.gsd/projects/ (#2517)
const symlinkMarker = `${sep}.gsd${sep}projects${sep}`;
const symlinkIdx = basePath.indexOf(symlinkMarker);
if (symlinkIdx !== -1) {
const afterProjects = basePath.slice(symlinkIdx + symlinkMarker.length);
// Expect: <hash>/worktrees/...
const worktreeSeg = `${sep}worktrees${sep}`;
if (afterProjects.includes(worktreeSeg)) {
const projectRoot = basePath.slice(0, symlinkIdx);
return join(projectRoot, ".gsd", "gsd.db");
}
}
// Forward-slash variant for symlink-resolved layout
const fwdSymlinkMarker = "/.gsd/projects/";
const fwdSymlinkIdx = basePath.indexOf(fwdSymlinkMarker);
if (fwdSymlinkIdx !== -1) {
const afterProjects = basePath.slice(fwdSymlinkIdx + fwdSymlinkMarker.length);
if (afterProjects.includes("/worktrees/")) {
const projectRoot = basePath.slice(0, fwdSymlinkIdx);
return join(projectRoot, ".gsd", "gsd.db");
}
}
return join(basePath, ".gsd", "gsd.db");
}
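The marker-slicing above can be sketched as a standalone helper; `projectRootFromSymlinkPath` is an illustrative name, not a module export, and it hard-codes `/` where the real code also checks the platform separator:

```typescript
// Illustrative sketch: recover the project root from the symlink-resolved
// layout /.gsd/projects/<hash>/worktrees/..., or null if the path does not
// match that layout. Hard-codes "/" for simplicity.
function projectRootFromSymlinkPath(basePath: string): string | null {
  const marker = "/.gsd/projects/";
  const idx = basePath.indexOf(marker);
  if (idx === -1) return null;
  // Only treat it as the symlink layout if <hash>/worktrees/ follows.
  const afterProjects = basePath.slice(idx + marker.length);
  if (!afterProjects.includes("/worktrees/")) return null;
  return basePath.slice(0, idx);
}
```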
@@ -81,8 +106,20 @@ export async function ensureDbOpen(): Promise<boolean> {
return opened;
}
process.stderr.write(
`gsd-db: ensureDbOpen failed — no .gsd directory found (resolvedPath=${resolveProjectRootDbPath(basePath)}, cwd=${basePath})\n`,
);
return false;
} catch {
} catch (err) {
const basePath = process.cwd();
const diagnostic = {
resolvedPath: resolveProjectRootDbPath(basePath),
cwd: basePath,
error: (err as Error).message ?? String(err),
};
process.stderr.write(
`gsd-db: ensureDbOpen failed — ${JSON.stringify(diagnostic)}\n`,
);
return false;
}
}

View file

@@ -6,8 +6,9 @@ import { isToolCallEventType } from "@gsd/pi-coding-agent";
import { buildMilestoneFileName, resolveMilestonePath, resolveSliceFile, resolveSlicePath } from "../paths.js";
import { buildBeforeAgentStartResult } from "./system-context.js";
import { handleAgentEnd } from "./agent-end-recovery.js";
import { clearDiscussionFlowState, isDepthVerified, isQueuePhaseActive, markDepthVerified, resetWriteGateState, shouldBlockContextWrite } from "./write-gate.js";
import { clearDiscussionFlowState, isDepthVerified, isQueuePhaseActive, markDepthVerified, resetWriteGateState, shouldBlockContextWrite, shouldBlockQueueExecution } from "./write-gate.js";
import { isBlockedStateFile, isBashWriteToStateFile, BLOCKED_WRITE_ERROR } from "../write-intercept.js";
import { cleanupQuickBranch } from "../quick.js";
import { getDiscussionMilestoneId } from "../guided-flow.js";
import { loadToolApiKeys } from "../commands-config.js";
import { loadFile, saveFile, formatContinue } from "../files.js";
@@ -16,8 +17,6 @@ import { getAutoDashboardData, isAutoActive, isAutoPaused, markToolEnd, markTool
import { isParallelActive, shutdownParallel } from "../parallel-orchestrator.js";
import { checkToolCallLoop, resetToolCallLoopGuard } from "./tool-call-loop-guard.js";
import { saveActivityLog } from "../activity-log.js";
import { startRtkStatusUpdates, stopRtkStatusUpdates } from "../rtk-status.js";
import { rewriteCommandWithRtk } from "../../shared/rtk.js";
// Skip the welcome screen on the very first session_start — cli.ts already
// printed it before the TUI launched. Only re-print on /clear (subsequent sessions).
@@ -29,19 +28,10 @@ async function syncServiceTierStatus(ctx: ExtensionContext): Promise<void> {
}
export function registerHooks(pi: ExtensionAPI): void {
// Route all agent bash tool commands through RTK rewrite when opted in.
// This is a no-op when RTK is disabled or not installed.
pi.on("bash_transform", async (event) => {
const rewritten = rewriteCommandWithRtk(event.command);
if (rewritten === event.command) return undefined;
return { command: rewritten };
});
pi.on("session_start", async (_event, ctx) => {
resetWriteGateState();
resetToolCallLoopGuard();
await syncServiceTierStatus(ctx);
startRtkStatusUpdates(ctx);
// Apply show_token_cost preference (#1515)
try {
@@ -86,11 +76,6 @@ export function registerHooks(pi: ExtensionAPI): void {
clearDiscussionFlowState();
await syncServiceTierStatus(ctx);
loadToolApiKeys();
startRtkStatusUpdates(ctx);
});
pi.on("session_fork", async (_event, ctx) => {
startRtkStatusUpdates(ctx);
});
pi.on("before_agent_start", async (event, ctx: ExtensionContext) => {
@@ -102,6 +87,17 @@ export function registerHooks(pi: ExtensionAPI): void {
await handleAgentEnd(pi, event, ctx);
});
// Squash-merge quick-task branch back to the original branch after the
// agent turn completes (#2668). cleanupQuickBranch is a no-op when no
// quick-return state is pending, so this is safe to call on every turn.
pi.on("turn_end", async () => {
try {
cleanupQuickBranch();
} catch {
// Best-effort: don't break the turn lifecycle if cleanup fails.
}
});
pi.on("session_before_compact", async () => {
if (isAutoActive() || isAutoPaused()) {
return { cancel: true };
@@ -139,7 +135,6 @@ export function registerHooks(pi: ExtensionAPI): void {
});
pi.on("session_shutdown", async (_event, ctx: ExtensionContext) => {
stopRtkStatusUpdates(ctx);
if (isParallelActive()) {
try {
await shutdownParallel(process.cwd());
@@ -161,6 +156,23 @@ export function registerHooks(pi: ExtensionAPI): void {
return { block: true, reason: loopCheck.reason };
}
// ── Queue-mode execution guard (#2545): block source-code mutations ──
// When /gsd queue is active, the agent should only create milestones,
// not execute work. Block write/edit to non-.gsd/ paths and bash commands
// that would modify files.
if (isQueuePhaseActive()) {
let queueInput = "";
if (isToolCallEventType("write", event)) {
queueInput = event.input.path;
} else if (isToolCallEventType("edit", event)) {
queueInput = event.input.path;
} else if (isToolCallEventType("bash", event)) {
queueInput = event.input.command;
}
const queueGuard = shouldBlockQueueExecution(event.toolName, queueInput, true);
if (queueGuard.block) return queueGuard;
}
// ── Single-writer engine: block direct writes to STATE.md ──────────
// Covers write, edit, and bash tools to prevent bypass vectors.
if (isToolCallEventType("write", event)) {
@@ -245,7 +257,7 @@ export function registerHooks(pi: ExtensionAPI): void {
pi.on("tool_execution_start", async (event) => {
if (!isAutoActive()) return;
markToolStart(event.toolCallId, event.toolName);
markToolStart(event.toolCallId);
});
pi.on("tool_execution_end", async (event) => {

View file

@@ -1,4 +1,4 @@
import { existsSync, readFileSync } from "node:fs";
import { existsSync, readFileSync, unlinkSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";
@@ -6,6 +6,7 @@ import type { ExtensionContext } from "@gsd/pi-coding-agent";
import { debugTime } from "../debug-logger.js";
import { loadPrompt } from "../prompt-loader.js";
import { readForensicsMarker } from "../forensics.js";
import { resolveAllSkillReferences, renderPreferencesForSystemPrompt, loadEffectiveGSDPreferences } from "../preferences.js";
import { resolveGsdRootFile, resolveSliceFile, resolveSlicePath, resolveTaskFile, resolveTaskFiles, resolveTasksDir, relSliceFile, relSlicePath, relTaskFile } from "../paths.js";
import { hasSkillSnapshot, detectNewSkills, formatSkillsXml } from "../skill-discovery.js";
@@ -94,30 +95,54 @@ export async function buildBeforeAgentStartResult(
}
}
let codebaseBlock = "";
const codebasePath = resolveGsdRootFile(process.cwd(), "CODEBASE");
if (existsSync(codebasePath)) {
try {
const rawContent = readFileSync(codebasePath, "utf-8").trim();
if (rawContent) {
// Cap injection size to ~2,000 tokens to avoid bloating every request.
// Full map is always available at .gsd/CODEBASE.md.
const MAX_CODEBASE_CHARS = 8_000;
const generatedMatch = rawContent.match(/Generated: (\S+)/);
const generatedAt = generatedMatch?.[1] ?? "unknown";
const content = rawContent.length > MAX_CODEBASE_CHARS
? rawContent.slice(0, MAX_CODEBASE_CHARS) + "\n\n*(truncated — see .gsd/CODEBASE.md for full map)*"
: rawContent;
codebaseBlock = `\n\n[PROJECT CODEBASE — File structure and descriptions (generated ${generatedAt}, may be stale — run /gsd codebase update to refresh)]\n\n${content}`;
}
} catch {
// skip
}
}
warnDeprecatedAgentInstructions();
const injection = await buildGuidedExecuteContextInjection(event.prompt, process.cwd());
// Re-inject forensics context on follow-up turns (#2941)
const forensicsInjection = !injection ? buildForensicsContextInjection(process.cwd()) : null;
const worktreeBlock = buildWorktreeContextBlock();
const fullSystem = `${event.systemPrompt}\n\n[SYSTEM CONTEXT — GSD]\n\n${systemContent}${preferenceBlock}${knowledgeBlock}${memoryBlock}${newSkillsBlock}${worktreeBlock}`;
const fullSystem = `${event.systemPrompt}\n\n[SYSTEM CONTEXT — GSD]\n\n${systemContent}${preferenceBlock}${knowledgeBlock}${codebaseBlock}${memoryBlock}${newSkillsBlock}${worktreeBlock}`;
stopContextTimer({
systemPromptSize: fullSystem.length,
injectionSize: injection?.length ?? 0,
injectionSize: injection?.length ?? forensicsInjection?.length ?? 0,
hasPreferences: preferenceBlock.length > 0,
hasNewSkills: newSkillsBlock.length > 0,
});
// Determine which context message to inject (guided execute takes priority)
const contextMessage = injection
? { customType: "gsd-guided-context", content: injection, display: false as const }
: forensicsInjection
? { customType: "gsd-forensics", content: forensicsInjection, display: false as const }
: null;
return {
systemPrompt: fullSystem,
...(injection
? {
message: {
customType: "gsd-guided-context",
content: injection,
display: false as const,
},
}
: {}),
...(contextMessage ? { message: contextMessage } : {}),
};
}
@@ -375,3 +400,38 @@ function oneLine(text: string): string {
return text.replace(/\s+/g, " ").trim();
}
// ─── Forensics Context Re-injection (#2941) ──────────────────────────────────
/**
* Check for an active forensics session and return the prompt content
* so it can be re-injected on follow-up turns.
*/
function buildForensicsContextInjection(basePath: string): string | null {
const marker = readForensicsMarker(basePath);
if (!marker) return null;
// Expire markers older than 2 hours to avoid stale context
const age = Date.now() - new Date(marker.createdAt).getTime();
if (age > 2 * 60 * 60 * 1000) {
clearForensicsMarker(basePath);
return null;
}
return marker.promptContent;
}
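The expiry arithmetic above can be isolated into a pure helper for clarity; `isMarkerExpired` and `MARKER_TTL_MS` are illustrative names, not module exports:

```typescript
// Sketch of the two-hour forensics-marker expiry check, taking "now" as a
// parameter so the behavior is deterministic.
const MARKER_TTL_MS = 2 * 60 * 60 * 1000;

function isMarkerExpired(createdAt: string, now: number = Date.now()): boolean {
  return now - new Date(createdAt).getTime() > MARKER_TTL_MS;
}
```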
/**
* Remove the active forensics marker file, e.g. when the investigation
* is complete or the session expires.
*/
export function clearForensicsMarker(basePath: string): void {
const markerPath = join(basePath, ".gsd", "runtime", "active-forensics.json");
if (existsSync(markerPath)) {
try {
unlinkSync(markerPath);
} catch {
// non-fatal
}
}
}

View file

@@ -1,5 +1,31 @@
const MILESTONE_CONTEXT_RE = /M\d+(?:-[a-z0-9]{6})?-CONTEXT\.md$/;
/**
* Path segment that identifies .gsd/ planning artifacts.
* Writes to these paths are allowed during queue mode.
*/
const GSD_DIR_RE = /(^|[/\\])\.gsd([/\\]|$)/;
/**
* Read-only tool names that are always safe during queue mode.
*/
const QUEUE_SAFE_TOOLS = new Set([
"read", "grep", "find", "ls", "glob",
// Discussion & planning tools
"ask_user_questions",
"gsd_milestone_generate_id",
"gsd_summary_save",
// Web research tools used during queue discussion
"search-the-web", "resolve_library", "get_library_docs", "fetch_page",
"search_and_read",
]);
/**
* Bash commands that are read-only / investigative safe during queue mode.
* Matches the leading command in a bash invocation.
*/
const BASH_READ_ONLY_RE = /^\s*(cat|head|tail|less|more|wc|file|stat|du|df|which|type|echo|printf|ls|find|grep|rg|awk|sed\b(?!.*-i)|sort|uniq|diff|comm|tr|cut|tee\s+-a\s+\/dev\/null|git\s+(log|show|diff|status|branch|tag|remote|rev-parse|ls-files|blame|shortlog|describe|stash\s+list|config\s+--get|cat-file)|gh\s+(issue|pr|api|repo|release)\s+(view|list|diff|status|checks)|mkdir\s+-p\s+\.gsd|rtk\s)/;
let depthVerificationDone = false;
let activeQueuePhase = false;
@@ -49,3 +75,52 @@ export function shouldBlockContextWrite(
};
}
/**
* Queue-mode execution guard (#2545).
*
* When the queue phase is active, the agent should only create planning
 * artifacts (milestones, CONTEXT.md, QUEUE.md, etc.), never execute work.
* This function blocks write/edit/bash tool calls that would modify source
* code outside of .gsd/.
*
* @param toolName The tool being called (write, edit, bash, etc.)
* @param input For write/edit: the file path. For bash: the command string.
* @param queuePhaseActive Whether the queue phase is currently active.
* @returns { block, reason } block=true if the call should be rejected.
*/
export function shouldBlockQueueExecution(
toolName: string,
input: string,
queuePhaseActive: boolean,
): { block: boolean; reason?: string } {
if (!queuePhaseActive) return { block: false };
// Always-safe tools (read-only, discussion, planning)
if (QUEUE_SAFE_TOOLS.has(toolName)) return { block: false };
// write/edit — allow if targeting .gsd/ planning artifacts
if (toolName === "write" || toolName === "edit") {
if (GSD_DIR_RE.test(input)) return { block: false };
return {
block: true,
reason: `Blocked: /gsd queue is a planning tool — it creates milestones, not executes work. ` +
`Cannot ${toolName} to "${input}" during queue mode. ` +
`Write CONTEXT.md files and update PROJECT.md/QUEUE.md instead.`,
};
}
// bash — allow read-only/investigative commands, block everything else
if (toolName === "bash") {
if (BASH_READ_ONLY_RE.test(input)) return { block: false };
return {
block: true,
reason: `Blocked: /gsd queue is a planning tool — it creates milestones, not executes work. ` +
`Cannot run "${input.slice(0, 80)}${input.length > 80 ? "…" : ""}" during queue mode. ` +
`Use read-only commands (cat, grep, git log, etc.) to investigate, then write planning artifacts.`,
};
}
// Unknown tools — allow by default (custom extension tools, etc.)
return { block: false };
}
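The decision logic above reduces to two regex checks plus a pass-through default. This standalone sketch uses the same `GSD_DIR_RE` and a trimmed-down read-only command pattern; `guardBlocks` is a hypothetical helper, not the exported `shouldBlockQueueExecution`:

```typescript
// Illustrative queue-mode guard: true means the tool call would be blocked.
const SKETCH_GSD_DIR_RE = /(^|[/\\])\.gsd([/\\]|$)/;
const SKETCH_READ_ONLY_RE = /^\s*(cat|grep|rg|ls|git\s+(log|show|diff|status))\b/;

function guardBlocks(toolName: string, input: string): boolean {
  if (toolName === "write" || toolName === "edit") {
    return !SKETCH_GSD_DIR_RE.test(input); // .gsd/ planning writes stay allowed
  }
  if (toolName === "bash") {
    return !SKETCH_READ_ONLY_RE.test(input); // investigative commands stay allowed
  }
  return false; // unknown tools pass through, matching the real guard's default
}
```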

View file

@@ -26,6 +26,7 @@ export interface CaptureEntry {
resolution?: string;
rationale?: string;
resolvedAt?: string;
resolvedInMilestone?: string;
executed?: boolean;
}
@@ -176,6 +177,7 @@ export function markCaptureResolved(
classification: Classification,
resolution: string,
rationale: string,
milestoneId?: string,
): void {
const filePath = resolveCapturesPath(basePath);
if (!existsSync(filePath)) return;
@@ -206,13 +208,17 @@ export function markCaptureResolved(
`**Rationale:** ${rationale}`,
`**Resolved:** ${resolvedAt}`,
];
if (milestoneId) {
newFields.push(`**Milestone:** ${milestoneId}`);
}
// Remove any existing classification/resolution/rationale/resolved fields
// Remove any existing classification/resolution/rationale/resolved/milestone fields
// (in case of re-triage)
section = section.replace(/\*\*Classification:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Resolution:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Rationale:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Resolved:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Milestone:\*\*\s*.+\n?/g, "");
// Add new fields after Status line
section = section.trimEnd() + "\n" + newFields.join("\n") + "\n";
@@ -255,18 +261,70 @@ export function markCaptureExecuted(basePath: string, captureId: string): void {
* Load resolved captures that have actionable classifications (inject, replan,
* quick-task) but have NOT yet been executed.
* These are captures whose resolutions need to be carried out.
*
* When `currentMilestoneId` is provided, captures resolved in a *different*
* milestone are treated as stale and excluded. This prevents quick-task
* captures from a prior milestone re-executing after the underlying issues
* were already fixed by planned milestone work (#2872).
*
* Captures that have no `resolvedInMilestone` (legacy captures resolved before
* this field was introduced) are always included for backward compatibility.
*/
export function loadActionableCaptures(basePath: string): CaptureEntry[] {
export function loadActionableCaptures(basePath: string, currentMilestoneId?: string): CaptureEntry[] {
return loadAllCaptures(basePath).filter(
c =>
c.status === "resolved" &&
!c.executed &&
(c.classification === "inject" ||
c.classification === "replan" ||
c.classification === "quick-task"),
c.classification === "quick-task") &&
// Staleness gate: exclude captures resolved in a different milestone (#2872)
(!currentMilestoneId ||
!c.resolvedInMilestone ||
c.resolvedInMilestone === currentMilestoneId),
);
}
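The #2872 staleness predicate embedded in that filter can be read on its own, over a pared-down capture shape; `isStaleCapture` is illustrative, not a module export:

```typescript
// Sketch of the staleness gate: a capture is stale only when it was resolved
// in a *different* milestone than the current one. Legacy captures (no
// resolvedInMilestone field) and calls without milestone context are included.
interface CaptureSketch {
  resolvedInMilestone?: string;
}

function isStaleCapture(c: CaptureSketch, currentMilestoneId?: string): boolean {
  if (!currentMilestoneId) return false; // no milestone context: include everything
  if (!c.resolvedInMilestone) return false; // legacy capture: include for compat
  return c.resolvedInMilestone !== currentMilestoneId;
}
```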
/**
* Retroactively stamp a capture with a milestone ID.
*
* Used by executeTriageResolutions() as a safety net when the triage LLM
* resolves a capture without writing the **Milestone:** field. This ensures
* the staleness gate in loadActionableCaptures() works correctly even for
* captures resolved before the prompt was updated (#2872).
*/
export function stampCaptureMilestone(basePath: string, captureId: string, milestoneId: string): void {
const filePath = resolveCapturesPath(basePath);
if (!existsSync(filePath)) return;
const content = readFileSync(filePath, "utf-8");
const sectionRegex = new RegExp(
`(### ${escapeRegex(captureId)}\\n(?:(?!### ).)*?)(?=### |$)`,
"s",
);
const match = sectionRegex.exec(content);
if (!match) return;
let section = match[1];
// Only stamp if not already present
if (/\*\*Milestone:\*\*/.test(section)) return;
// Insert after the Resolved field (or at end of section)
const resolvedFieldEnd = section.search(/\*\*Resolved:\*\*\s*.+\n?/);
if (resolvedFieldEnd !== -1) {
const resolvedMatch = section.match(/\*\*Resolved:\*\*\s*.+\n?/);
const insertPos = resolvedFieldEnd + (resolvedMatch?.[0]?.length ?? 0);
section = section.slice(0, insertPos) + `**Milestone:** ${milestoneId}\n` + section.slice(insertPos);
} else {
section = section.trimEnd() + "\n" + `**Milestone:** ${milestoneId}` + "\n";
}
const updated = content.replace(sectionRegex, section);
writeFileSync(filePath, updated, "utf-8");
}
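The splice performed above can be condensed into a small pure function over the section text; `insertMilestoneField` is an illustrative helper, not part of the module:

```typescript
// Sketch of the stamp: insert "**Milestone:** ..." right after the
// "**Resolved:** ..." line, append at the end otherwise, and no-op if a
// Milestone field is already present (idempotent).
function insertMilestoneField(section: string, milestoneId: string): string {
  if (/\*\*Milestone:\*\*/.test(section)) return section;
  const resolved = section.match(/\*\*Resolved:\*\*\s*.+\n?/);
  if (resolved && resolved.index !== undefined) {
    const pos = resolved.index + resolved[0].length;
    return section.slice(0, pos) + `**Milestone:** ${milestoneId}\n` + section.slice(pos);
  }
  return section.trimEnd() + `\n**Milestone:** ${milestoneId}\n`;
}
```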
// ─── Parser ───────────────────────────────────────────────────────────────────
/**
@@ -291,6 +349,7 @@ function parseCapturesContent(content: string): CaptureEntry[] {
const resolution = extractBoldField(body, "Resolution");
const rationale = extractBoldField(body, "Rationale");
const resolvedAt = extractBoldField(body, "Resolved");
const milestoneId = extractBoldField(body, "Milestone");
const executedAt = extractBoldField(body, "Executed");
if (!text || !timestamp) continue;
@@ -308,6 +367,7 @@ function parseCapturesContent(content: string): CaptureEntry[] {
...(resolution ? { resolution } : {}),
...(rationale ? { rationale } : {}),
...(resolvedAt ? { resolvedAt } : {}),
...(milestoneId ? { resolvedInMilestone: milestoneId } : {}),
...(executedAt ? { executed: true } : {}),
});
}

View file

@@ -0,0 +1,351 @@
/**
* GSD Codebase Map Generator
*
 * Produces .gsd/CODEBASE.md, a structural table of contents for the project.
* Gives fresh agent contexts instant orientation without filesystem exploration.
*
* Generation: walk `git ls-files`, group by directory, output with descriptions.
* Maintenance: agent updates descriptions as it works; incremental update preserves them.
*/
import { existsSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
import { join, dirname, extname } from "node:path";
import { execSync } from "node:child_process";
import { gsdRoot } from "./paths.js";
// ─── Types ───────────────────────────────────────────────────────────────────
export interface CodebaseMapOptions {
excludePatterns?: string[];
maxFiles?: number;
collapseThreshold?: number;
}
interface FileEntry {
path: string;
description: string;
}
interface DirectoryGroup {
path: string;
files: FileEntry[];
collapsed: boolean;
}
// ─── Defaults ────────────────────────────────────────────────────────────────
const DEFAULT_EXCLUDES = [
".gsd/",
".planning/",
".git/",
"node_modules/",
"dist/",
"build/",
".next/",
"coverage/",
"__pycache__/",
".venv/",
"vendor/",
];
const DEFAULT_MAX_FILES = 500;
const DEFAULT_COLLAPSE_THRESHOLD = 20;
// ─── Parsing ─────────────────────────────────────────────────────────────────
/**
* Parse an existing CODEBASE.md to extract file description mappings.
* Also scans <!-- gsd:collapsed-descriptions --> comment blocks to preserve
* descriptions for files in collapsed directories across incremental updates.
*/
export function parseCodebaseMap(content: string): Map<string, string> {
const descriptions = new Map<string, string>();
let inCollapsedBlock = false;
for (const line of content.split("\n")) {
// Track collapsed-description comment blocks
if (line.trimStart().startsWith("<!-- gsd:collapsed-descriptions")) {
inCollapsedBlock = true;
continue;
}
if (inCollapsedBlock && line.trimStart().startsWith("-->")) {
inCollapsedBlock = false;
continue;
}
// Match: - `path/to/file.ts` — Description here
const match = line.match(/^- `(.+?)` — (.+)$/);
if (match) {
descriptions.set(match[1], match[2]);
continue;
}
// Match: - `path/to/file.ts` (no description) — only outside collapsed blocks
if (!inCollapsedBlock) {
const bareMatch = line.match(/^- `(.+?)`\s*$/);
if (bareMatch) {
descriptions.set(bareMatch[1], "");
}
}
}
return descriptions;
}
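The per-line format this parser recovers can be shown as a tiny round trip; `renderLine` and `parseLine` are illustrative helpers, not module exports:

```typescript
// Illustrative round trip for the "- `path` — description" line format
// recovered by parseCodebaseMap. Described lines carry an em-dash separator;
// bare lines are just the backticked path.
function renderLine(path: string, description: string): string {
  return description ? `- \`${path}\` — ${description}` : `- \`${path}\``;
}

function parseLine(line: string): [path: string, description: string] | null {
  const described = line.match(/^- `(.+?)` — (.+)$/);
  if (described) return [described[1], described[2]];
  const bare = line.match(/^- `(.+?)`\s*$/);
  return bare ? [bare[1], ""] : null;
}
```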
// ─── File Enumeration ────────────────────────────────────────────────────────
function shouldExclude(filePath: string, excludes: string[]): boolean {
for (const pattern of excludes) {
if (pattern.endsWith("/")) {
if (filePath.startsWith(pattern) || filePath.includes(`/${pattern}`)) return true;
} else if (filePath === pattern || filePath.endsWith(`/${pattern}`)) {
return true;
}
}
// Skip binary/lock files
const ext = extname(filePath).toLowerCase();
if ([".lock", ".png", ".jpg", ".jpeg", ".gif", ".ico", ".woff", ".woff2", ".ttf", ".eot", ".svg"].includes(ext)) {
return true;
}
return false;
}
function lsFiles(basePath: string): string[] {
try {
const result = execSync("git ls-files", { cwd: basePath, encoding: "utf-8", timeout: 10000 });
return result.split("\n").filter(Boolean);
} catch {
return [];
}
}
/**
* Enumerate tracked files, applying exclusions and the maxFiles cap.
* Returns both the file list and whether truncation occurred.
*/
function enumerateFiles(basePath: string, excludes: string[], maxFiles: number): { files: string[]; truncated: boolean } {
const allFiles = lsFiles(basePath);
const filtered = allFiles.filter((f) => !shouldExclude(f, excludes));
const truncated = filtered.length > maxFiles;
return { files: truncated ? filtered.slice(0, maxFiles) : filtered, truncated };
}
// ─── Grouping ────────────────────────────────────────────────────────────────
function groupByDirectory(
files: string[],
descriptions: Map<string, string>,
collapseThreshold: number,
): DirectoryGroup[] {
const dirMap = new Map<string, FileEntry[]>();
for (const file of files) {
const dir = dirname(file);
const dirKey = dir === "." ? "" : dir;
if (!dirMap.has(dirKey)) {
dirMap.set(dirKey, []);
}
dirMap.get(dirKey)!.push({
path: file,
description: descriptions.get(file) ?? "",
});
}
const groups: DirectoryGroup[] = [];
const sortedDirs = [...dirMap.keys()].sort();
for (const dir of sortedDirs) {
const dirFiles = dirMap.get(dir)!;
dirFiles.sort((a, b) => a.path.localeCompare(b.path));
groups.push({
path: dir,
files: dirFiles,
collapsed: dirFiles.length > collapseThreshold,
});
}
return groups;
}
// ─── Rendering ───────────────────────────────────────────────────────────────
function renderCodebaseMap(groups: DirectoryGroup[], totalFiles: number, truncated: boolean): string {
const lines: string[] = [];
const now = new Date().toISOString().split(".")[0] + "Z";
const described = groups.reduce((sum, g) => sum + g.files.filter((f) => f.description).length, 0);
lines.push("# Codebase Map");
lines.push("");
lines.push(`Generated: ${now} | Files: ${totalFiles} | Described: ${described}/${totalFiles}`);
if (truncated) {
lines.push(`Note: Truncated to first ${totalFiles} files. Run with higher --max-files to include all.`);
}
lines.push("");
for (const group of groups) {
const heading = group.path || "(root)";
lines.push(`### ${heading}/`);
if (group.collapsed) {
// Summarize collapsed directories
const extensions = new Map<string, number>();
for (const f of group.files) {
const ext = extname(f.path) || "(no ext)";
extensions.set(ext, (extensions.get(ext) ?? 0) + 1);
}
const extSummary = [...extensions.entries()]
.sort((a, b) => b[1] - a[1])
.map(([ext, count]) => `${count} ${ext}`)
.join(", ");
lines.push(`- *(${group.files.length} files: ${extSummary})*`);
// Preserve any existing descriptions in a hidden comment block so
// incremental updates can recover them via parseCodebaseMap.
const descLines = group.files
.filter((f) => f.description)
.map((f) => `- \`${f.path}\` — ${f.description}`);
if (descLines.length > 0) {
lines.push("<!-- gsd:collapsed-descriptions");
lines.push(...descLines);
lines.push("-->");
}
} else {
for (const file of group.files) {
if (file.description) {
lines.push(`- \`${file.path}\` — ${file.description}`);
} else {
lines.push(`- \`${file.path}\``);
}
}
}
lines.push("");
}
return lines.join("\n");
}
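The summary line for a collapsed directory boils down to an extension histogram rendered in descending count order. This standalone sketch mirrors that ordering; `summarizeExtensions` is illustrative and operates on basenames rather than using `extname`:

```typescript
// Count files per extension and render "N .ext" entries, most common first.
function summarizeExtensions(fileNames: string[]): string {
  const counts = new Map<string, number>();
  for (const name of fileNames) {
    const dot = name.lastIndexOf(".");
    const ext = dot > 0 ? name.slice(dot) : "(no ext)";
    counts.set(ext, (counts.get(ext) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([ext, count]) => `${count} ${ext}`)
    .join(", ");
}
```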
// ─── Public API ──────────────────────────────────────────────────────────────
/**
* Generate a fresh CODEBASE.md from scratch.
* Preserves existing descriptions if `existingDescriptions` is provided.
*/
export function generateCodebaseMap(
basePath: string,
options?: CodebaseMapOptions,
existingDescriptions?: Map<string, string>,
): { content: string; fileCount: number; truncated: boolean; files: string[] } {
const excludes = [...DEFAULT_EXCLUDES, ...(options?.excludePatterns ?? [])];
const maxFiles = options?.maxFiles ?? DEFAULT_MAX_FILES;
const collapseThreshold = options?.collapseThreshold ?? DEFAULT_COLLAPSE_THRESHOLD;
const { files, truncated } = enumerateFiles(basePath, excludes, maxFiles);
const descriptions = existingDescriptions ?? new Map<string, string>();
const groups = groupByDirectory(files, descriptions, collapseThreshold);
const content = renderCodebaseMap(groups, files.length, truncated);
return { content, fileCount: files.length, truncated, files };
}
/**
* Incremental update: re-scan files, preserve existing descriptions,
* add new files, remove deleted files.
*/
export function updateCodebaseMap(
basePath: string,
options?: CodebaseMapOptions,
): { content: string; added: number; removed: number; unchanged: number; fileCount: number; truncated: boolean } {
const codebasePath = join(gsdRoot(basePath), "CODEBASE.md");
// Load existing descriptions
let existingDescriptions = new Map<string, string>();
if (existsSync(codebasePath)) {
const existing = readFileSync(codebasePath, "utf-8");
existingDescriptions = parseCodebaseMap(existing);
}
const existingFiles = new Set(existingDescriptions.keys());
// Generate new map preserving descriptions — reuse the returned file list
// to avoid a second enumeration (prevents race between content and stats).
const result = generateCodebaseMap(basePath, options, existingDescriptions);
const currentSet = new Set(result.files);
// Count changes
let added = 0;
let removed = 0;
for (const f of result.files) {
if (!existingFiles.has(f)) added++;
}
for (const f of existingFiles) {
if (!currentSet.has(f)) removed++;
}
return {
content: result.content,
added,
removed,
unchanged: result.files.length - added,
fileCount: result.fileCount,
truncated: result.truncated,
};
}
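The added/removed bookkeeping above is a plain set diff, shown here in isolation (`diffCounts` is an illustrative helper, not a module export):

```typescript
// Count additions, removals, and unchanged entries between two file sets.
// "unchanged" is everything in the new set that was also in the old one.
function diffCounts(before: Set<string>, after: Set<string>): {
  added: number;
  removed: number;
  unchanged: number;
} {
  let added = 0;
  let removed = 0;
  for (const f of after) if (!before.has(f)) added++;
  for (const f of before) if (!after.has(f)) removed++;
  return { added, removed, unchanged: after.size - added };
}
```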
/**
* Write CODEBASE.md to .gsd/ directory.
*/
export function writeCodebaseMap(basePath: string, content: string): string {
const root = gsdRoot(basePath);
mkdirSync(root, { recursive: true });
const outPath = join(root, "CODEBASE.md");
writeFileSync(outPath, content, "utf-8");
return outPath;
}
/**
* Read existing CODEBASE.md, or return null if it doesn't exist.
*/
export function readCodebaseMap(basePath: string): string | null {
const codebasePath = join(gsdRoot(basePath), "CODEBASE.md");
if (!existsSync(codebasePath)) return null;
try {
return readFileSync(codebasePath, "utf-8");
} catch {
return null;
}
}
/**
* Get stats about the codebase map.
*/
export function getCodebaseMapStats(basePath: string): {
exists: boolean;
fileCount: number;
describedCount: number;
undescribedCount: number;
generatedAt: string | null;
} {
const content = readCodebaseMap(basePath);
if (!content) {
return { exists: false, fileCount: 0, describedCount: 0, undescribedCount: 0, generatedAt: null };
}
// Parse total file count from the header line (accurate even for collapsed dirs)
const fileCountMatch = content.match(/Files:\s*(\d+)/);
const totalFiles = fileCountMatch ? parseInt(fileCountMatch[1], 10) : 0;
// Use parseCodebaseMap to count described files (includes collapsed-description blocks)
const descriptions = parseCodebaseMap(content);
const described = [...descriptions.values()].filter((d) => d.length > 0).length;
const dateMatch = content.match(/Generated: (\S+)/);
return {
exists: true,
fileCount: totalFiles,
describedCount: described,
undescribedCount: totalFiles - described,
generatedAt: dateMatch?.[1] ?? null,
};
}

View file

@@ -0,0 +1,164 @@
/**
* GSD Command /gsd codebase
*
* Generate and manage the codebase map (.gsd/CODEBASE.md).
* Subcommands: generate, update, stats, help
*/
import type { ExtensionAPI, ExtensionCommandContext } from "@gsd/pi-coding-agent";
import {
generateCodebaseMap,
updateCodebaseMap,
writeCodebaseMap,
getCodebaseMapStats,
readCodebaseMap,
} from "./codebase-generator.js";
const USAGE =
"Usage: /gsd codebase [generate|update|stats]\n\n" +
" generate [--max-files N] — Generate or regenerate CODEBASE.md\n" +
" update — Incremental update (preserves descriptions)\n" +
" stats — Show file count, coverage, and generation time\n" +
" help — Show this help\n\n" +
"With no subcommand, shows stats if a map exists or help if not.";
export async function handleCodebase(
args: string,
ctx: ExtensionCommandContext,
_pi: ExtensionAPI,
): Promise<void> {
const basePath = process.cwd();
const parts = args.trim().split(/\s+/);
const sub = parts[0] ?? "";
switch (sub) {
case "generate": {
const maxFiles = parseMaxFiles(args, ctx);
if (maxFiles === false) return; // validation failed, message already shown
const existing = readCodebaseMap(basePath);
const existingDescriptions = existing
? (await import("./codebase-generator.js")).parseCodebaseMap(existing)
: undefined;
const result = generateCodebaseMap(basePath, { maxFiles: maxFiles ?? undefined }, existingDescriptions);
if (result.fileCount === 0) {
ctx.ui.notify(
"Codebase map generated with 0 files.\n" +
"Is this a git repository? Run 'git ls-files' to verify.",
"warning",
);
return;
}
const outPath = writeCodebaseMap(basePath, result.content);
ctx.ui.notify(
`Codebase map generated: ${result.fileCount} files\n` +
`Written to: ${outPath}` +
(result.truncated ? `\n⚠ Truncated — increase --max-files to include all files` : ""),
"success",
);
return;
}
case "update": {
const existing = readCodebaseMap(basePath);
if (!existing) {
ctx.ui.notify(
"No codebase map found. Run /gsd codebase generate to create one.",
"warning",
);
return;
}
const maxFiles = parseMaxFiles(args, ctx);
if (maxFiles === false) return;
const result = updateCodebaseMap(basePath, { maxFiles: maxFiles ?? undefined });
writeCodebaseMap(basePath, result.content);
ctx.ui.notify(
`Codebase map updated: ${result.fileCount} files\n` +
` Added: ${result.added} | Removed: ${result.removed} | Unchanged: ${result.unchanged}` +
(result.truncated ? `\n⚠ Truncated — increase --max-files to include all files` : ""),
"success",
);
return;
}
case "stats": {
showStats(basePath, ctx);
return;
}
case "help":
ctx.ui.notify(USAGE, "info");
return;
case "": {
// Safe default: show stats if map exists, help if not
const existing = readCodebaseMap(basePath);
if (existing) {
showStats(basePath, ctx);
} else {
ctx.ui.notify(USAGE, "info");
}
return;
}
default:
ctx.ui.notify(
`Unknown subcommand "${sub}".\n\n${USAGE}`,
"warning",
);
}
}
function showStats(basePath: string, ctx: ExtensionCommandContext): void {
const stats = getCodebaseMapStats(basePath);
if (!stats.exists) {
ctx.ui.notify("No codebase map found. Run /gsd codebase generate to create one.", "info");
return;
}
const coverage = stats.fileCount > 0
? Math.round((stats.describedCount / stats.fileCount) * 100)
: 0;
ctx.ui.notify(
`Codebase Map Stats:\n` +
` Files: ${stats.fileCount}\n` +
` Described: ${stats.describedCount} (${coverage}%)\n` +
` Undescribed: ${stats.undescribedCount}\n` +
` Generated: ${stats.generatedAt ?? "unknown"}\n\n` +
(stats.undescribedCount > 0
? `Tip: Run /gsd codebase update to refresh after file changes.`
: `Coverage is complete.`),
"info",
);
}
/**
* Parse and validate --max-files flag.
* Returns the parsed number, undefined if flag not present, or false if invalid.
*/
function parseMaxFiles(args: string, ctx: ExtensionCommandContext): number | undefined | false {
const maxFilesStr = extractFlag(args, "--max-files");
if (!maxFilesStr) return undefined;
const maxFiles = parseInt(maxFilesStr, 10);
if (isNaN(maxFiles) || maxFiles < 1) {
ctx.ui.notify("--max-files must be a positive integer (e.g. --max-files 200).", "warning");
return false;
}
return maxFiles;
}
function extractFlag(args: string, flag: string): string | undefined {
const escaped = flag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
const regex = new RegExp(`${escaped}[=\\s]+(\\S+)`);
const match = args.match(regex);
return match?.[1];
}
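The flag grammar accepted above can be sketched standalone (the helper name is illustrative; it mirrors `extractFlag()`):

```typescript
// Standalone sketch mirroring extractFlag() above (name is illustrative).
function extractFlagSketch(args: string, flag: string): string | undefined {
  // Escape regex metacharacters in the flag name.
  const escaped = flag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Accepts both "--max-files 200" and "--max-files=200".
  return args.match(new RegExp(`${escaped}[=\\s]+(\\S+)`))?.[1];
}

extractFlagSketch("generate --max-files 200", "--max-files"); // "200"
extractFlagSketch("generate --max-files=200", "--max-files"); // "200"
extractFlagSketch("generate", "--max-files"); // undefined
```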

View file

@ -15,7 +15,7 @@ export interface GsdCommandDefinition {
type CompletionMap = Record<string, readonly GsdCommandDefinition[]>;
export const GSD_COMMAND_DESCRIPTION =
"GSD — Get Shit Done: /gsd help|start|templates|next|auto|stop|pause|status|widget|visualize|queue|quick|discuss|capture|triage|dispatch|history|undo|undo-task|reset-slice|rate|skip|export|cleanup|mode|prefs|config|keys|hooks|run-hook|skill-health|doctor|logs|forensics|changelog|migrate|remote|steer|knowledge|new-milestone|parallel|cmux|park|unpark|init|setup|inspect|extensions|update|fast|mcp|rethink";
"GSD — Get Shit Done: /gsd help|start|templates|next|auto|stop|pause|status|widget|visualize|queue|quick|discuss|capture|triage|dispatch|history|undo|undo-task|reset-slice|rate|skip|export|cleanup|mode|prefs|config|keys|hooks|run-hook|skill-health|doctor|logs|forensics|changelog|migrate|remote|steer|knowledge|new-milestone|parallel|cmux|park|unpark|init|setup|inspect|extensions|update|fast|mcp|rethink|codebase";
export const TOP_LEVEL_SUBCOMMANDS: readonly GsdCommandDefinition[] = [
{ cmd: "help", desc: "Categorized command reference with descriptions" },
@ -71,6 +71,7 @@ export const TOP_LEVEL_SUBCOMMANDS: readonly GsdCommandDefinition[] = [
{ cmd: "mcp", desc: "MCP server status and connectivity check (status, check <server>)" },
{ cmd: "rethink", desc: "Conversational project reorganization — reorder, park, discard, add milestones" },
{ cmd: "workflow", desc: "Custom workflow lifecycle (new, run, list, validate, pause, resume)" },
{ cmd: "codebase", desc: "Generate and manage codebase map (.gsd/CODEBASE.md)" },
];
const NESTED_COMPLETIONS: CompletionMap = {
@ -225,6 +226,14 @@ const NESTED_COMPLETIONS: CompletionMap = {
{ cmd: "pause", desc: "Pause custom workflow auto-mode" },
{ cmd: "resume", desc: "Resume paused custom workflow auto-mode" },
],
codebase: [
{ cmd: "generate", desc: "Generate or regenerate CODEBASE.md" },
{ cmd: "generate --max-files", desc: "Generate with custom file limit (default: 500)" },
{ cmd: "update", desc: "Incremental update (preserves descriptions)" },
{ cmd: "update --max-files", desc: "Update with custom file limit" },
{ cmd: "stats", desc: "Show file count, description coverage, and generation time" },
{ cmd: "help", desc: "Show usage and available subcommands" },
],
};
function filterOptions(

View file

@ -206,5 +206,10 @@ Examples:
await handleRethink(trimmed, ctx, pi);
return true;
}
if (trimmed === "codebase" || trimmed.startsWith("codebase ")) {
const { handleCodebase } = await import("../../commands-codebase.js");
await handleCodebase(trimmed.replace(/^codebase\s*/, "").trim(), ctx, pi);
return true;
}
return false;
}

View file

@ -35,15 +35,17 @@ const UNIT_TYPE_TIERS: Record<string, ComplexityTier> = {
"complete-slice": "light",
"run-uat": "light",
// Tier 2 — Standard: research, routine planning, discussion
// Tier 2 — Standard: research, routine discussion
"discuss-milestone": "standard",
"discuss-slice": "standard",
"research-milestone": "standard",
"research-slice": "standard",
"plan-milestone": "standard",
"plan-slice": "standard",
// Tier 3 — Heavy: execution, replanning (requires deep reasoning)
// Tier 3 — Heavy: planning, execution, replanning (requires deep reasoning)
// Planning is heavy so it uses the best configured model (e.g. Opus) and is
// not downgraded by dynamic routing when a capable model is configured.
"plan-milestone": "heavy",
"plan-slice": "heavy",
"execute-task": "standard", // default standard, upgraded by metadata
"replan-slice": "heavy",
"reassess-roadmap": "heavy",
@ -185,8 +187,8 @@ function analyzePlanComplexity(
// Check if this is a milestone-level plan (more complex) vs single slice
const { milestone: mid, slice: sid } = parseUnitId(unitId);
if (!sid) {
// Milestone-level planning is always at least standard
return { tier: "standard", reason: "milestone-level planning" };
// Milestone-level planning is always heavy — requires full context and best model
return { tier: "heavy", reason: "milestone-level planning" };
}
// For slice planning, try to read the context/research to gauge complexity

View file

@ -227,6 +227,122 @@ export async function nextDecisionId(): Promise<string> {
}
}
// ─── Next Requirement ID ─────────────────────────────────────────────────
/**
* Compute the next requirement ID from the current DB state.
* Queries MAX(CAST(SUBSTR(id, 2) AS INTEGER)) from requirements table.
* Returns R001 if no requirements exist. Zero-pads to 3 digits.
*/
export async function nextRequirementId(): Promise<string> {
try {
const db = await import('./gsd-db.js');
const adapter = db._getAdapter();
if (!adapter) return 'R001';
const row = adapter
.prepare('SELECT MAX(CAST(SUBSTR(id, 2) AS INTEGER)) as max_num FROM requirements')
.get();
const maxNum = row ? (row['max_num'] as number | null) : null;
if (maxNum == null || isNaN(maxNum)) return 'R001';
const next = maxNum + 1;
return `R${String(next).padStart(3, '0')}`;
} catch (err) {
logError('manifest', 'nextRequirementId failed', { fn: 'nextRequirementId', error: String((err as Error).message) });
return 'R001';
}
}
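The ID arithmetic reduces to a small pure function. A sketch (the helper name is illustrative):

```typescript
// Sketch of the ID computation in nextRequirementId() (name is illustrative):
// null max → R001, otherwise increment and zero-pad to 3 digits.
const nextReqId = (maxNum: number | null): string =>
  maxNum == null ? "R001" : `R${String(maxNum + 1).padStart(3, "0")}`;

nextReqId(null); // "R001"
nextReqId(7);    // "R008"
nextReqId(999);  // "R1000" (padStart does not truncate past 3 digits)
```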
// ─── Save Requirement to DB + Regenerate Markdown ────────────────────────
export interface SaveRequirementFields {
class: string;
status?: string;
description: string;
why: string;
source: string;
primary_owner?: string;
supporting_slices?: string;
validation?: string;
notes?: string;
}
/**
* Save a new requirement to DB and regenerate REQUIREMENTS.md.
* Auto-assigns the next ID via nextRequirementId().
* Returns the assigned ID.
*/
export async function saveRequirementToDb(
fields: SaveRequirementFields,
basePath: string,
): Promise<{ id: string }> {
try {
const db = await import('./gsd-db.js');
const id = await nextRequirementId();
const requirement: Requirement = {
id,
class: fields.class,
status: fields.status ?? 'active',
description: fields.description,
why: fields.why,
source: fields.source,
primary_owner: fields.primary_owner ?? '',
supporting_slices: fields.supporting_slices ?? '',
validation: fields.validation ?? '',
notes: fields.notes ?? '',
full_content: '',
superseded_by: null,
};
db.upsertRequirement(requirement);
// Fetch all requirements for full file regeneration
const adapter = db._getAdapter();
let allRequirements: Requirement[] = [];
if (adapter) {
const rows = adapter.prepare('SELECT * FROM requirements ORDER BY id').all();
allRequirements = rows.map(row => ({
id: row['id'] as string,
class: row['class'] as string,
status: row['status'] as string,
description: row['description'] as string,
why: row['why'] as string,
source: row['source'] as string,
primary_owner: row['primary_owner'] as string,
supporting_slices: row['supporting_slices'] as string,
validation: row['validation'] as string,
notes: row['notes'] as string,
full_content: row['full_content'] as string,
superseded_by: (row['superseded_by'] as string) ?? null,
}));
}
const nonSuperseded = allRequirements.filter(r => r.superseded_by == null);
const md = generateRequirementsMd(nonSuperseded);
const filePath = resolveGsdRootFile(basePath, 'REQUIREMENTS');
try {
await saveFile(filePath, md);
} catch (diskErr) {
logError('manifest', 'disk write failed, rolling back DB row', { fn: 'saveRequirementToDb', error: String((diskErr as Error).message) });
const rollbackAdapter = db._getAdapter();
rollbackAdapter?.prepare('DELETE FROM requirements WHERE id = :id').run({ ':id': id });
throw diskErr;
}
invalidateStateCache();
clearPathCache();
clearParseCache();
return { id };
} catch (err) {
logError('manifest', 'saveRequirementToDb failed', { fn: 'saveRequirementToDb', error: String((err as Error).message) });
throw err;
}
}
// ─── Save Decision to DB + Regenerate Markdown ────────────────────────────
export interface SaveDecisionFields {
@ -344,15 +460,30 @@ export async function updateRequirementInDb(
const db = await import('./gsd-db.js');
const existing = db.getRequirementById(id);
if (!existing) {
throw new GSDError(GSD_STALE_STATE, `Requirement ${id} not found`);
}
// Merge updates into existing
// If requirement doesn't exist in DB, create a skeleton and merge updates.
// This handles the case where requirements were written to REQUIREMENTS.md
// but never imported into the database (see #2919).
const base: Requirement = existing ?? {
id,
class: '',
status: 'active',
description: '',
why: '',
source: '',
primary_owner: '',
supporting_slices: '',
validation: '',
notes: '',
full_content: '',
superseded_by: null,
};
// Merge updates into existing (or skeleton)
const merged: Requirement = {
...existing,
...base,
...updates,
id: existing.id, // ID cannot be changed
id: base.id, // ID cannot be changed
};
db.upsertRequirement(merged);
@ -388,7 +519,9 @@ export async function updateRequirementInDb(
await saveFile(filePath, md);
} catch (diskErr) {
logError('manifest', 'disk write failed, reverting DB row', { fn: 'updateRequirementInDb', error: String((diskErr as Error).message) });
db.upsertRequirement(existing);
if (existing) {
db.upsertRequirement(existing);
}
throw diskErr;
}
// Invalidate file-read caches so deriveState() sees the updated markdown.

View file

@ -14,6 +14,28 @@ import { nativeIsRepo, nativeWorktreeList, nativeWorktreeRemove, nativeBranchLis
import { getAllWorktreeHealth } from "./worktree-health.js";
import { loadEffectiveGSDPreferences } from "./preferences.js";
/**
* Returns true if the directory contains only doctor artifacts
* (e.g. `.gsd/doctor-history.jsonl`). These dirs are created by
* appendDoctorHistory() writing to worktree-scoped paths during the audit
* and should not be flagged as orphaned worktrees (#3105).
*/
function isDoctorArtifactOnly(dirPath: string): boolean {
try {
const entries = readdirSync(dirPath);
// Empty dir — not a doctor artifact, still orphaned
if (entries.length === 0) return false;
// Only a .gsd subdirectory
if (entries.length === 1 && entries[0] === ".gsd") {
const gsdEntries = readdirSync(join(dirPath, ".gsd"));
return gsdEntries.length <= 1 && gsdEntries.every(e => e === "doctor-history.jsonl");
}
return false;
} catch {
return false;
}
}
export async function checkGitHealth(
basePath: string,
issues: DoctorIssue[],
@ -314,6 +336,10 @@ export async function checkGitHealth(
} catch { continue; }
const normalizedFullPath = normalizePath(fullPath);
if (!registeredPaths.has(normalizedFullPath)) {
// Skip directories that only contain doctor artifacts (.gsd/doctor-history.jsonl).
// appendDoctorHistory() can recreate these dirs during the audit itself,
// causing a circular false positive (#3105 Bug 1).
if (isDoctorArtifactOnly(fullPath)) continue;
issues.push({
severity: "warning",
code: "worktree_directory_orphaned",

View file

@ -181,7 +181,8 @@ function resolveKey(providerId: string): KeyLookup {
*/
const PROVIDER_ROUTES: Record<string, string[]> = {
anthropic: ["github-copilot"],
openai: ["github-copilot"],
openai: ["github-copilot", "openai-codex"],
google: ["google-gemini-cli"],
};
function checkLlmProviders(): ProviderCheckResult[] {

View file

@ -119,10 +119,11 @@ export async function checkRuntimeHealth(
for (const key of keys) {
// Key format: "unitType/unitId" e.g. "execute-task/M001/S01/T01"
const slashIdx = key.indexOf("/");
if (slashIdx === -1) continue;
const unitType = key.slice(0, slashIdx);
const unitId = key.slice(slashIdx + 1);
// Hook units have compound types: "hook/<hookName>/unitId"
const { splitCompletedKey } = await import("./forensics.js");
const parsed = splitCompletedKey(key);
if (!parsed) continue;
const { unitType, unitId } = parsed;
// Only validate artifact-producing unit types
const { verifyExpectedArtifact } = await import("./auto-recovery.js");

View file

@ -729,8 +729,10 @@ export async function runGSDDoctor(basePath: string, options?: { fix?: boolean;
}
// Blocker-without-replan detection
// Skip when all tasks are done — the blocker was implicitly resolved
// within the task and the slice is not stuck (#3105 Bug 2).
const replanPath = resolveSliceFile(basePath, milestoneId, slice.id, "REPLAN");
if (!replanPath) {
if (!replanPath && !allTasksDone) {
for (const task of plan.tasks) {
if (!task.done) continue;
const summaryPath = resolveTaskFile(basePath, milestoneId, slice.id, task.id, "SUMMARY");

View file

@ -60,9 +60,9 @@ const RESET_DELAY_RE = /reset in (\d+)s/i;
* 1. Permanent (auth/billing/quota) unless also rate-limited
* 2. Rate limit (429, rate.?limit, too many requests)
* 3. Network (ECONNRESET, ETIMEDOUT, socket hang up, fetch failed, dns)
* 4. Server (500/502/503, overloaded, server_error)
* 5. Connection (terminated, ECONNREFUSED, EPIPE, other side closed)
* 6. Stream truncation (malformed JSON from mid-stream cut)
* 4. Stream truncation (malformed JSON from mid-stream cut)
* 5. Server (500/502/503, overloaded, server_error)
* 6. Connection (terminated, ECONNREFUSED, EPIPE, other side closed)
* 7. Unknown
*/
export function classifyError(errorMsg: string, retryAfterMs?: number): ErrorClass {
@ -92,21 +92,21 @@ export function classifyError(errorMsg: string, retryAfterMs?: number): ErrorCla
return { kind: "network", retryAfterMs: retryAfterMs ?? 3_000 };
}
// 4. Server errors — try fallback model
// 4. Stream truncation — downstream symptom of connection drop
if (STREAM_RE.test(errorMsg)) {
return { kind: "stream", retryAfterMs: retryAfterMs ?? 15_000 };
}
// 5. Server errors — try fallback model
if (SERVER_RE.test(errorMsg)) {
return { kind: "server", retryAfterMs: retryAfterMs ?? 30_000 };
}
// 5. Connection errors — try fallback model
// 6. Connection errors — try fallback model
if (CONNECTION_RE.test(errorMsg)) {
return { kind: "connection", retryAfterMs: retryAfterMs ?? 15_000 };
}
// 6. Stream truncation — downstream symptom of connection drop
if (STREAM_RE.test(errorMsg)) {
return { kind: "stream", retryAfterMs: retryAfterMs ?? 15_000 };
}
// 7. Unknown
return { kind: "unknown" };
}
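The reorder matters because a truncated stream often carries 5xx text in the same error message. A precedence sketch with illustrative stand-in patterns (the real `STREAM_RE` and `SERVER_RE` are assumed to be broader):

```typescript
// Precedence sketch (patterns are illustrative stand-ins, not the real ones).
const STREAM_RE = /in JSON at position \d+/i;
const SERVER_RE = /\b50[023]\b|overloaded|server_error/i;

function classifyKindSketch(msg: string): "stream" | "server" | "unknown" {
  // Stream truncation is checked before server so a "502 ... in JSON at
  // position N" message takes the stream retry path, not model fallback.
  if (STREAM_RE.test(msg)) return "stream";
  if (SERVER_RE.test(msg)) return "server";
  return "unknown";
}

classifyKindSketch("502 upstream: Unexpected end of data in JSON at position 8192"); // "stream"
classifyKindSketch("503 Service Unavailable"); // "server"
```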

View file

@ -12,7 +12,22 @@
"gsd_requirement_update", "gsd_milestone_generate_id"
],
"commands": ["gsd", "kill", "worktree", "exit"],
"hooks": ["session_start", "session_switch"],
"hooks": [
"session_start",
"session_switch",
"bash_transform",
"session_fork",
"before_agent_start",
"agent_end",
"session_before_compact",
"session_shutdown",
"tool_call",
"tool_result",
"tool_execution_start",
"tool_execution_end",
"model_select",
"before_provider_request"
],
"shortcuts": ["Ctrl+Alt+G"]
}
}

View file

@ -28,6 +28,8 @@ import { deriveState } from "./state.js";
import { isAutoActive } from "./auto.js";
import { loadPrompt } from "./prompt-loader.js";
import { gsdRoot } from "./paths.js";
import { isDbAvailable, getAllMilestones, getMilestoneSlices, getSliceTasks } from "./gsd-db.js";
import { isClosedStatus } from "./status-guards.js";
import { formatDuration } from "../shared/format-utils.js";
import { getAutoWorktreePath } from "./auto-worktree.js";
import { loadEffectiveGSDPreferences, loadGlobalGSDPreferences, getGlobalGSDPreferencesPath } from "./preferences.js";
@ -85,6 +87,15 @@ interface JournalSummary {
fileCount: number;
}
interface DbCompletionCounts {
milestones: number;
milestonesTotal: number;
slices: number;
slicesTotal: number;
tasks: number;
tasksTotal: number;
}
interface ForensicReport {
gsdVersion: string;
timestamp: string;
@ -95,6 +106,7 @@ interface ForensicReport {
unitTraces: UnitTrace[];
metrics: MetricsLedger | null;
completedKeys: string[];
dbCompletionCounts: DbCompletionCounts | null;
crashLock: LockData | null;
doctorIssues: DoctorIssue[];
anomalies: ForensicAnomaly[];
@ -106,13 +118,15 @@ interface ForensicReport {
// ─── Duplicate Detection ──────────────────────────────────────────────────────
const DEDUP_PROMPT_SECTION = `
## Duplicate Detection (REQUIRED before issue creation)
## Pre-Investigation: Duplicate Check (REQUIRED)
Before offering to create a GitHub issue, you MUST search for existing issues and PRs that may already address this bug. This step uses the user's AI tokens for analysis.
Before reading GSD source code or performing deep analysis, you MUST search for existing issues and PRs that may already address this bug. This avoids wasting tokens on already-fixed bugs.
### Search Steps
1. **Search closed issues** for similar keywords from your diagnosis:
Use keywords from the user's problem description and the anomaly summaries in the forensic report above.
1. **Search closed issues** for similar keywords:
\`\`\`
gh issue list --repo gsd-build/gsd-2 --state closed --search "<keywords from root cause>" --limit 20
\`\`\`
@ -129,20 +143,16 @@ Before offering to create a GitHub issue, you MUST search for existing issues an
### Analysis
For each result, compare it against your root-cause diagnosis:
For each result, compare it against the user's reported symptoms and the forensic anomalies:
- Does the issue describe the same code path or file?
- Does the PR modify the same file:line you identified?
- Does the PR modify the area related to the reported symptoms?
- Is the symptom description semantically similar even if keywords differ?
### Present Findings
### Decision Gate
If you find potential matches, present them to the user:
1. **"Already fixed by PR #X — skip issue creation"** when a merged PR or closed issue clearly addresses the same root cause. Explain why you believe it matches.
2. **"Add my findings to existing issue #Y"** when an open issue exists for the same bug. Use \`gh issue comment #Y --repo gsd-build/gsd-2\` to add forensic evidence.
3. **"Create new issue anyway"** when existing results do not cover this specific failure.
Only proceed to issue creation if no matches were found OR the user explicitly chooses "Create new issue anyway".
- **Merged PR clearly fixes the described symptom** Report "Already fixed by PR #X" with brief explanation. Skip full investigation.
- **Open issue matches** Report "Existing issue #Y covers this." Offer to add forensic evidence. Skip full investigation unless user asks for deeper analysis.
- **No matches** Proceed to full investigation below.
`;
async function writeForensicsDedupPref(ctx: ExtensionCommandContext, enabled: boolean): Promise<void> {
@ -250,6 +260,9 @@ export async function handleForensics(
{ customType: "gsd-forensics", content, display: false },
{ triggerTurn: true },
);
// Persist forensics context so follow-up turns can re-inject it (#2941)
writeForensicsMarker(basePath, savedPath, content);
}
// ─── Report Builder ───────────────────────────────────────────────────────────
@ -275,8 +288,9 @@ export async function buildForensicReport(basePath: string): Promise<ForensicRep
// 3. Load metrics
const metrics = loadLedgerFromDisk(basePath);
// 4. Load completed keys
// 4. Load completed keys (legacy) and DB completion counts
const completedKeys = loadCompletedKeys(basePath);
const dbCompletionCounts = getDbCompletionCounts();
// 5. Check crash lock
const crashLock = readCrashLock(basePath);
@ -335,6 +349,7 @@ export async function buildForensicReport(basePath: string): Promise<ForensicRep
unitTraces,
metrics,
completedKeys,
dbCompletionCounts,
crashLock,
doctorIssues,
anomalies,
@ -585,6 +600,44 @@ function loadCompletedKeys(basePath: string): string[] {
return [];
}
// ─── DB Completion Counts ────────────────────────────────────────────────────
function getDbCompletionCounts(): DbCompletionCounts | null {
if (!isDbAvailable()) return null;
const milestones = getAllMilestones();
let completedMilestones = 0;
let totalSlices = 0;
let completedSlices = 0;
let totalTasks = 0;
let completedTasks = 0;
for (const m of milestones) {
if (isClosedStatus(m.status)) completedMilestones++;
const slices = getMilestoneSlices(m.id);
for (const s of slices) {
totalSlices++;
if (isClosedStatus(s.status)) completedSlices++;
const tasks = getSliceTasks(m.id, s.id);
for (const t of tasks) {
totalTasks++;
if (isClosedStatus(t.status)) completedTasks++;
}
}
}
return {
milestones: completedMilestones,
milestonesTotal: milestones.length,
slices: completedSlices,
slicesTotal: totalSlices,
tasks: completedTasks,
tasksTotal: totalTasks,
};
}
// ─── Anomaly Detectors ───────────────────────────────────────────────────────
function detectStuckLoops(units: UnitMetrics[], anomalies: ForensicAnomaly[]): void {
@ -649,15 +702,42 @@ function detectTimeouts(traces: UnitTrace[], anomalies: ForensicAnomaly[]): void
}
}
/**
* Parse a completed-unit key into its unitType and unitId.
*
* Hook units use a compound slash-delimited type ("hook/<hookName>"), so a
* naive `key.indexOf("/")` would split "hook/telegram-progress/M007/S01" into
* unitType="hook" (wrong) instead of "hook/telegram-progress".
*
* Returns `null` for malformed keys that cannot be split.
*/
export function splitCompletedKey(key: string): { unitType: string; unitId: string } | null {
if (key.startsWith("hook/")) {
// Hook unit types are two segments: "hook/<hookName>/<unitId...>"
const secondSlash = key.indexOf("/", 5); // skip past "hook/"
if (secondSlash === -1) return null; // malformed — no unitId after hook name
return {
unitType: key.slice(0, secondSlash),
unitId: key.slice(secondSlash + 1),
};
}
const slashIdx = key.indexOf("/");
if (slashIdx === -1) return null;
return {
unitType: key.slice(0, slashIdx),
unitId: key.slice(slashIdx + 1),
};
}
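Concretely, the split behaves as follows. A runnable copy of the function above, with illustrative keys:

```typescript
// Runnable copy of splitCompletedKey() for illustration; keys are examples.
function splitKey(key: string): { unitType: string; unitId: string } | null {
  if (key.startsWith("hook/")) {
    // Hook unit types are two segments: "hook/<hookName>/<unitId...>"
    const secondSlash = key.indexOf("/", 5); // skip past "hook/"
    if (secondSlash === -1) return null; // malformed — no unitId after hook name
    return { unitType: key.slice(0, secondSlash), unitId: key.slice(secondSlash + 1) };
  }
  const slashIdx = key.indexOf("/");
  if (slashIdx === -1) return null;
  return { unitType: key.slice(0, slashIdx), unitId: key.slice(slashIdx + 1) };
}

splitKey("execute-task/M001/S01/T01");
// { unitType: "execute-task", unitId: "M001/S01/T01" }
splitKey("hook/telegram-progress/M007/S01");
// { unitType: "hook/telegram-progress", unitId: "M007/S01" }
splitKey("hook/orphan"); // null — no unitId after the hook name
```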
function detectMissingArtifacts(completedKeys: string[], basePath: string, activeMilestone: string | null, anomalies: ForensicAnomaly[]): void {
// Also check the worktree path for artifacts — they may exist there but not at root
const wtBasePath = activeMilestone ? getAutoWorktreePath(basePath, activeMilestone) : null;
for (const key of completedKeys) {
const slashIdx = key.indexOf("/");
if (slashIdx === -1) continue;
const unitType = key.slice(0, slashIdx);
const unitId = key.slice(slashIdx + 1);
const parsed = splitCompletedKey(key);
if (!parsed) continue;
const { unitType, unitId } = parsed;
const rootHasArtifact = verifyExpectedArtifact(unitType, unitId, basePath);
const wtHasArtifact = wtBasePath ? verifyExpectedArtifact(unitType, unitId, wtBasePath) : false;
@ -896,6 +976,42 @@ function saveForensicReport(basePath: string, report: ForensicReport, problemDes
return filePath;
}
// ─── Forensics Session Marker ────────────────────────────────────────────────
export interface ForensicsMarker {
reportPath: string;
promptContent: string;
createdAt: string;
}
/**
* Write a marker file so that buildBeforeAgentStartResult() can re-inject
* the forensics prompt on follow-up turns. (#2941)
*/
export function writeForensicsMarker(basePath: string, reportPath: string, promptContent: string): void {
const dir = join(gsdRoot(basePath), "runtime");
mkdirSync(dir, { recursive: true });
const marker: ForensicsMarker = {
reportPath,
promptContent,
createdAt: new Date().toISOString(),
};
writeFileSync(join(dir, "active-forensics.json"), JSON.stringify(marker), "utf-8");
}
/**
* Read the active forensics marker, or null if none exists.
*/
export function readForensicsMarker(basePath: string): ForensicsMarker | null {
const markerPath = join(gsdRoot(basePath), "runtime", "active-forensics.json");
if (!existsSync(markerPath)) return null;
try {
return JSON.parse(readFileSync(markerPath, "utf-8")) as ForensicsMarker;
} catch {
return null;
}
}
// ─── Prompt Formatter ─────────────────────────────────────────────────────────
function formatReportForPrompt(report: ForensicReport): string {
@ -1008,8 +1124,16 @@ function formatReportForPrompt(report: ForensicReport): string {
sections.push("");
}
// Completed keys count
sections.push(`### Completed Keys: ${report.completedKeys.length}`);
// Completion status — prefer DB counts, fall back to legacy completed-units.json
if (report.dbCompletionCounts) {
const c = report.dbCompletionCounts;
sections.push(`### Completion Status (from DB)`);
sections.push(`- ${c.milestones}/${c.milestonesTotal} milestones complete`);
sections.push(`- ${c.slices}/${c.slicesTotal} slices complete`);
sections.push(`- ${c.tasks}/${c.tasksTotal} tasks complete`);
} else {
sections.push(`### Completed Keys: ${report.completedKeys.length}`);
}
sections.push(`### GSD Version: ${report.gsdVersion}`);
sections.push(`### Active Milestone: ${report.activeMilestone ?? "none"}`);
sections.push(`### Active Slice: ${report.activeSlice ?? "none"}`);

View file

@ -9,7 +9,7 @@
*/
import { execFileSync, execSync } from "node:child_process";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { existsSync, mkdirSync, readFileSync, readdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { gsdRoot } from "./paths.js";
import { GIT_NO_PROMPT_ENV } from "./git-constants.js";
@ -50,9 +50,9 @@ export interface GitPreferences {
main_branch?: string;
merge_strategy?: "squash" | "merge";
/** Controls auto-mode git isolation strategy.
* - "worktree": (default) creates a milestone worktree for isolated work
* - "worktree": creates a milestone worktree for isolated work
* - "branch": works directly in the project root (for submodule-heavy repos)
* - "none": no git isolation; commits land on the user's current branch directly
* - "none": (default) no git isolation; commits land on the user's current branch directly
*/
isolation?: "worktree" | "branch" | "none";
/** When false, GSD will not modify .gitignore at all; no baseline patterns
@ -488,6 +488,29 @@ export class GitServiceImpl {
// If .gsd/ IS in .gitignore (the default for external state projects),
// git add -A already skips it and the exclusions are harmless no-ops.
const allExclusions = [...RUNTIME_EXCLUSION_PATHS, ...extraExclusions];
// ── Parallel worker milestone scope (#1991) ──
// When GSD_MILESTONE_LOCK is set, this process is a parallel worker that
// must only commit files belonging to its own milestone. Exclude all other
// milestone directories from staging to prevent cross-milestone pollution
// (e.g., an M033 worker fabricating M032 artifacts in the same commit).
const milestoneLock = process.env.GSD_MILESTONE_LOCK;
if (milestoneLock) {
const msDir = join(gsdRoot(this.basePath), "milestones");
if (existsSync(msDir)) {
try {
const entries = readdirSync(msDir, { withFileTypes: true });
for (const entry of entries) {
if (entry.isDirectory() && entry.name !== milestoneLock) {
allExclusions.push(`.gsd/milestones/${entry.name}/`);
}
}
} catch {
// Best-effort — if we can't read the milestones dir, proceed without scoping
}
}
}
nativeAddAllWithExclusions(this.basePath, allExclusions);
}
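The scoping rule reduces to: every milestone directory except the locked one becomes a staging exclusion. A standalone sketch (names are illustrative):

```typescript
// Sketch of the exclusion list built under GSD_MILESTONE_LOCK (illustrative).
function milestoneExclusions(milestoneDirs: string[], lock: string): string[] {
  return milestoneDirs
    .filter((name) => name !== lock)   // keep only OTHER milestones
    .map((name) => `.gsd/milestones/${name}/`);
}

milestoneExclusions(["M032", "M033", "M034"], "M033");
// [".gsd/milestones/M032/", ".gsd/milestones/M034/"]
```

An M033 worker thus stages nothing under M032 or M034, which is what prevents the cross-milestone pollution described in the comment.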

View file

@ -41,6 +41,7 @@ const GSD_RUNTIME_PATTERNS = [
const BASELINE_PATTERNS = [
// ── GSD state directory (symlink to external storage) ──
".gsd",
".gsd-id",
// ── OS junk ──
".DS_Store",
@ -84,6 +85,38 @@ const BASELINE_PATTERNS = [
"tmp/",
];
/**
* Check whether `.gsd` is covered by the project's `.gitignore`.
*
 * Uses `git check-ignore` for accurate evaluation; this respects nested
* .gitignore files, global gitignore, and negation patterns. Returns true
* only when git would actually ignore `.gsd/`.
*
* Returns false (not ignored) if:
* - No `.gitignore` exists
* - `.gsd` is not listed in any active ignore rule
* - Not a git repo or git is unavailable
*/
export function isGsdGitignored(basePath: string): boolean {
// Check both `.gsd` and `.gsd/` because `.gsd/` in .gitignore (trailing
// slash = directory-only pattern) only matches the directory form. Using
// both paths covers all gitignore pattern variants.
for (const path of [".gsd", ".gsd/"]) {
try {
// git check-ignore exits 0 when the path IS ignored, 1 when it is NOT.
execFileSync("git", ["check-ignore", "-q", path], {
cwd: basePath,
stdio: "pipe",
env: GIT_NO_PROMPT_ENV,
});
return true; // exit 0 → .gsd is ignored
} catch {
// exit 1 → this form is NOT ignored, try the other
}
}
return false; // neither form is ignored (or git unavailable)
}
/**
* Check whether `.gsd/` contains files tracked by git.
* If so, the project intentionally keeps `.gsd/` in version control

View file

@ -10,6 +10,7 @@ import { existsSync, copyFileSync, mkdirSync, realpathSync } from "node:fs";
import { dirname } from "node:path";
import type { Decision, Requirement, GateRow, GateId, GateScope, GateStatus, GateVerdict } from "./types.js";
import { GSDError, GSD_STALE_STATE } from "./errors.js";
import { logError } from "./workflow-logger.js";
const _require = createRequire(import.meta.url);
@@ -778,8 +779,21 @@ export function openDatabase(path: string): boolean {
   try {
     initSchema(adapter, fileBacked);
   } catch (err) {
-    try { adapter.close(); } catch { /* swallow */ }
-    throw err;
+    // Corrupt freelist: DDL fails with "malformed" but VACUUM can rebuild.
+    // Attempt VACUUM recovery before giving up (see #2519).
+    if (fileBacked && err instanceof Error && err.message?.includes("malformed")) {
+      try {
+        adapter.exec("VACUUM");
+        initSchema(adapter, fileBacked);
+        process.stderr.write("gsd-db: recovered corrupt database via VACUUM\n");
+      } catch (retryErr) {
+        try { adapter.close(); } catch { /* swallow */ }
+        throw retryErr;
+      }
+    } else {
+      try { adapter.close(); } catch { /* swallow */ }
+      throw err;
+    }
   }
   currentDb = adapter;
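The recovery added in this hunk follows a retry-after-repair shape: run the operation, and if the failure looks recoverable, perform one repair (here, `VACUUM`) and retry before rethrowing. A generic sketch of that control flow — `withRepairRetry` is a hypothetical helper, not in the codebase:

```typescript
// Run `op`; on a recoverable error, run `repair` once and retry `op`.
// A second failure, or a non-recoverable one, propagates unchanged.
function withRepairRetry<T>(
  op: () => T,
  isRecoverable: (err: unknown) => boolean,
  repair: () => void,
): T {
  try {
    return op();
  } catch (err) {
    if (!isRecoverable(err)) throw err;
    repair();
    return op();
  }
}
```

In `openDatabase` terms, `op` is `initSchema`, `isRecoverable` checks for SQLite's "malformed" message, and `repair` is the `VACUUM`.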
@@ -1124,10 +1138,11 @@ export function insertMilestone(m: {
   });
 }

-export function upsertMilestonePlanning(milestoneId: string, planning: Partial<MilestonePlanningRecord>): void {
+export function upsertMilestonePlanning(milestoneId: string, planning: Partial<MilestonePlanningRecord>, title?: string): void {
   if (!currentDb) throw new GSDError(GSD_STALE_STATE, "gsd-db: No database open");
   currentDb.prepare(
     `UPDATE milestones SET
+      title = COALESCE(:title, title),
       vision = COALESCE(:vision, vision),
       success_criteria = COALESCE(:success_criteria, success_criteria),
       key_risks = COALESCE(:key_risks, key_risks),
@@ -1142,6 +1157,7 @@ export function upsertMilestonePlanning(milestoneId: string, planning: Partial<M
     WHERE id = :id`,
   ).run({
     ":id": milestoneId,
+    ":title": title ?? null,
     ":vision": planning.vision ?? null,
     ":success_criteria": planning.successCriteria ? JSON.stringify(planning.successCriteria) : null,
     ":key_risks": planning.keyRisks ? JSON.stringify(planning.keyRisks) : null,
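The `COALESCE(:param, column)` binding pattern used throughout this statement makes every field optional: binding `null` leaves the stored value untouched, so callers can update any subset of fields. An in-memory illustration of the same merge rule — `coalescePatch` is a hypothetical helper, not the real DB layer:

```typescript
// Null/undefined patch values keep the existing field, mirroring
// `SET col = COALESCE(:param, col)` with a null bind.
function coalescePatch(
  row: Record<string, unknown>,
  patch: Record<string, unknown>,
): Record<string, unknown> {
  const next = { ...row };
  for (const key of Object.keys(row)) {
    const v = patch[key];
    if (v !== null && v !== undefined) next[key] = v;
  }
  return next;
}
```

This is why `upsertMilestonePlanning` binds `title ?? null` and `planning.vision ?? null`: an absent argument becomes a null bind, which the SQL treats as "keep the current value".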
@@ -1519,6 +1535,26 @@ export function insertVerificationEvidence(e: {
   });
 }

+export interface VerificationEvidenceRow {
+  id: number;
+  task_id: string;
+  slice_id: string;
+  milestone_id: string;
+  command: string;
+  exit_code: number;
+  verdict: string;
+  duration_ms: number;
+  created_at: string;
+}
+
+export function getVerificationEvidence(milestoneId: string, sliceId: string, taskId: string): VerificationEvidenceRow[] {
+  if (!currentDb) return [];
+  const rows = currentDb.prepare(
+    "SELECT * FROM verification_evidence WHERE milestone_id = :mid AND slice_id = :sid AND task_id = :tid ORDER BY id",
+  ).all({ ":mid": milestoneId, ":sid": sliceId, ":tid": taskId });
+  return rows as unknown as VerificationEvidenceRow[];
+}
+
 export interface MilestoneRow {
   id: string;
   title: string;
@@ -1738,7 +1774,7 @@ export function copyWorktreeDb(srcDbPath: string, destDbPath: string): boolean {
     copyFileSync(srcDbPath, destDbPath);
     return true;
   } catch (err) {
-    process.stderr.write(`gsd-db: failed to copy DB to worktree: ${(err as Error).message}\n`);
+    logError("db", "failed to copy DB to worktree", { error: (err as Error).message });
     return false;
   }
 }
@@ -1770,13 +1806,13 @@ export function reconcileWorktreeDb(
   // ATTACH DATABASE doesn't support parameterized paths in all providers,
   // so we use strict allowlist validation instead.
   if (/['";\x00]/.test(worktreeDbPath)) {
-    process.stderr.write("gsd-db: worktree DB reconciliation failed: path contains unsafe characters\n");
+    logError("db", "worktree DB reconciliation failed: path contains unsafe characters");
     return zero;
   }
   if (!currentDb) {
     const opened = openDatabase(mainDbPath);
     if (!opened) {
-      process.stderr.write("gsd-db: worktree DB reconciliation failed: cannot open main DB\n");
+      logError("db", "worktree DB reconciliation failed: cannot open main DB");
       return zero;
     }
   }
@@ -1910,7 +1946,7 @@ export function reconcileWorktreeDb(
       try { adapter.exec("DETACH DATABASE wt"); } catch { /* best effort */ }
     }
   } catch (err) {
-    process.stderr.write(`gsd-db: worktree DB reconciliation failed: ${(err as Error).message}\n`);
+    logError("db", "worktree DB reconciliation failed", { error: (err as Error).message });
     return { ...zero, conflicts };
   }
 }
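Because `ATTACH DATABASE` cannot take a bound parameter for the path in every provider, `reconcileWorktreeDb` instead rejects any path containing characters that could escape a quoted SQL literal. The check, isolated as a sketch — `isSafeAttachPath` is a hypothetical name; the real code inlines the regex:

```typescript
// Reject quote characters, semicolons, and NUL — anything that could break
// out of `ATTACH DATABASE '<path>' AS wt` when the path is interpolated.
function isSafeAttachPath(path: string): boolean {
  return !/['";\x00]/.test(path);
}
```

This is a denylist rather than parameter binding, so it is only as safe as the character set it rejects; the surrounding code keeps it viable by failing closed (returning `zero`) on any match.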

Some files were not shown because too many files have changed in this diff.