fix: resolve merge conflicts with main for PR #594
Merge main into feat/context-window-budget, combining:

- Budget fields (contextWindowTokens, truncationSections, continueHereFired) from the PR with routing fields (tier, modelDowngraded) from main in UnitMetrics interface
- Unified opts parameter pattern in snapshotUnitMetrics
- KNOWLEDGE.md step from main with template path references from the PR in execute-task.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Commit daf36d2b7a: 109 changed files with 12473 additions and 235 deletions
.plans/dynamic-model-discovery.md (new file, 27 lines)
@@ -0,0 +1,27 @@
# Dynamic Model Discovery

## Overview

Runtime model discovery from provider APIs with caching, TUI management, and CLI flags.

## Components

1. **model-discovery.ts** — Provider adapters (OpenAI, Ollama, OpenRouter, Google) + static adapters
2. **discovery-cache.ts** — Disk cache at `{agentDir}/discovery-cache.json` with per-provider TTLs
3. **models-json-writer.ts** — Safe read-modify-write for `models.json` with file locking
4. **provider-manager.ts** — TUI component for provider management (`/provider` command)
5. **model-registry.ts** — Extended with `discoverModels()`, `getAllWithDiscovered()`, cache integration
6. **settings-manager.ts** — `modelDiscovery` settings (enabled, providers, ttlMinutes, autoRefreshOnModelSelect)
7. **args.ts** — `--discover`, `--add-provider`, `--base-url`, `--discover-models` CLI flags
8. **list-models.ts** — Rewritten with `[discovered]` badge support
9. **main.ts** — CLI handlers for new flags
10. **interactive-mode.ts** — `/provider` command handler
11. **preferences.ts** — `updatePreferencesModels()` and `validateModelId()` helpers

## TTL Strategy

- Ollama: 5 min (local, models change often)
- OpenAI / Google / OpenRouter: 1 hour
- Default: 24 hours

## Merge Rules

- Discovered models never override existing built-in or custom models
- Discovered models are appended to the registry with `[discovered]` badge
- Background discovery is opt-in via `modelDiscovery.enabled` setting
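The per-provider TTL strategy above can be sketched as a simple freshness check. The cache-entry shape, helper name, and exact TTL table below are illustrative assumptions, not the shipped `discovery-cache.ts` code:

```typescript
// Hypothetical sketch of the per-provider TTL check; shapes and names
// are assumptions for illustration only.
type CacheEntry = { fetchedAt: number; models: string[] };
type DiscoveryCache = Record<string, CacheEntry>;

// Per-provider TTLs in minutes, mirroring the "TTL Strategy" section.
const TTL_MINUTES: Record<string, number> = {
  ollama: 5,
  openai: 60,
  google: 60,
  openrouter: 60,
};
const DEFAULT_TTL_MINUTES = 24 * 60;

function isFresh(cache: DiscoveryCache, provider: string, now = Date.now()): boolean {
  const entry = cache[provider];
  if (!entry) return false; // never discovered: must fetch
  const ttlMs = (TTL_MINUTES[provider] ?? DEFAULT_TTL_MINUTES) * 60_000;
  return now - entry.fetchedAt < ttlMs;
}
```

A stale or missing entry triggers a refetch from the provider API; a fresh one serves models straight from disk.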
.plans/issue-575-dynamic-model-routing.md (new file, 364 lines)
@@ -0,0 +1,364 @@
# Plan: Dynamic Model Routing for Token Optimization

**Issue:** #575 — Token Consumption Optimization through Dynamic Model Selection
**Status:** Draft
**Date:** 2026-03-15

## Problem Statement

Users on capped plans (e.g., Claude Pro) exhaust weekly token limits in 15-20 hours of GSD usage. Currently, GSD uses a single model per phase (research/planning/execution/completion), configured statically in preferences. Simple tasks consume the same tokens as complex ones.
## Current Architecture

### What Exists

- **Phase-based model config:** Users can set different models per phase via `preferences.md` (research, planning, execution, completion)
- **Fallback chains:** Each phase supports `fallbacks: [model1, model2]` for error recovery
- **Pre-dispatch hooks:** `PreDispatchResult` has a `model` field but it's **never applied** in `auto.ts` — this is a ready-made extension point
- **Model registry:** `ModelRegistry.getAvailable()` provides all configured models with metadata
- **Per-unit metrics:** Token counts (input/output/cacheRead/cacheWrite), cost, and model tracked per unit
- **Budget enforcement:** Real-time cost tracking with alerts at 75%/90%/100%

### Key Files

| File | Role |
|------|------|
| `src/resources/extensions/gsd/auto.ts` | Dispatch logic, model switching (lines 1791-1879) |
| `src/resources/extensions/gsd/preferences.ts` | Model resolution, `resolveModelWithFallbacksForUnit()` |
| `src/resources/extensions/gsd/post-unit-hooks.ts` | Pre-dispatch hooks (model field defined but unused) |
| `src/resources/extensions/gsd/types.ts` | Type definitions for hooks and model config |
| `src/resources/extensions/gsd/metrics.ts` | Token tracking, aggregation, cost projection |
| `src/resources/extensions/gsd/auto-prompts.ts` | Prompt builders per unit type |
| `packages/pi-coding-agent/src/core/model-registry.ts` | Model availability and metadata |
## Proposed Design

### Core Concept: Task Complexity Classification

Before each unit dispatch, classify the task into a complexity tier and route to an appropriate model. This sits between preference resolution and model dispatch — it can **downgrade** but never **upgrade** beyond the user's configured model.

### Complexity Tiers

| Tier | Complexity | Example Tasks | Default Model |
|------|-----------|---------------|---------------|
| **Tier 1 — Light** | Low cognitive load, structured output | File reads, search aggregation, simple summaries, completion/summary units | Haiku / cheapest available |
| **Tier 2 — Standard** | Moderate reasoning, some creativity | Research synthesis, plan formatting, routine code generation, UAT checks | Sonnet / mid-tier |
| **Tier 3 — Heavy** | Complex reasoning, architecture, novel code | Complex execution tasks, replanning, multi-file refactors, debugging | Opus / user's configured model |
### Classification Signals

The classifier uses **heuristic signals** available before dispatch (no LLM call needed):

1. **Unit type** (strongest signal):
   - `complete-slice`, `run-uat` → Tier 1 (structured summarization)
   - `research-milestone`, `research-slice` → Tier 2 (synthesis)
   - `plan-milestone`, `plan-slice` → Tier 2-3 (depends on scope)
   - `execute-task` → Tier 2-3 (depends on task complexity)
   - `replan-slice` → Tier 3 (requires understanding of failure)

2. **Task metadata** (for execution units):
   - Lines of code estimated to change (from task plan)
   - Number of files involved
   - Dependency count
   - Whether task involves new file creation vs. modification
   - Tags/labels if present (e.g., "refactor", "test", "docs")

3. **Historical performance** (adaptive, Phase 3):
   - If a Tier 2 model failed and escalated on similar tasks before, default to Tier 3
   - Track success rate per tier per unit-type pattern
### Architecture

```
User Preferences (phase → model)
              │
              ▼
resolveModelWithFallbacksForUnit()   ← existing
              │
              ▼
classifyUnitComplexity()             ← NEW: returns Tier 1/2/3
              │
              ▼
resolveModelForTier()                ← NEW: maps tier → model from available set
              │
              ▼
maybeDowngradeModel()                ← NEW: only downgrades from user's configured model
              │
              ▼
Model dispatch (existing auto.ts logic)
```
### Key Design Decisions

1. **Downgrade-only:** The classifier can select a cheaper model than configured, never a more expensive one. The user's preference is the ceiling.

2. **Opt-in with easy override:** New preference key `dynamic_routing.enabled: true|false` (default: `false`). Users who want token savings enable it explicitly.

3. **Escalation on failure:** If a lower-tier model fails (tool errors, incomplete output, exceeds retries), automatically escalate to the next tier and retry the unit.

4. **No LLM call for classification:** Uses heuristics only — adding an LLM call to save tokens would be counterproductive.

5. **Respects existing fallback chains:** Dynamic routing integrates with existing `fallbacks` — if the dynamically selected model fails, it tries the fallback chain before escalating tiers.

6. **Transparent to user:** Dashboard shows which model was selected and why (tier badge in progress widget).
## Implementation Phases

### Phase 1: Foundation — Complexity Classifier & Routing (Core)

**Goal:** Build the classification and routing system, wire it into dispatch.

#### 1a. Define types and configuration

**File:** `src/resources/extensions/gsd/types.ts`

- Add `ComplexityTier` type: `'light' | 'standard' | 'heavy'`
- Add `DynamicRoutingConfig` interface:

```typescript
interface DynamicRoutingConfig {
  enabled: boolean;
  tier_models?: {
    light?: string;    // model ID for light tasks
    standard?: string; // model ID for standard tasks
    heavy?: string;    // model ID for heavy tasks (default: user's configured model)
  };
  escalate_on_failure?: boolean; // default: true
}
```

**File:** `src/resources/extensions/gsd/preferences.ts`

- Add `dynamic_routing` to preference schema
- Add validation for the new config
- Add `loadDynamicRoutingConfig()` function
#### 1b. Build complexity classifier

**New file:** `src/resources/extensions/gsd/complexity-classifier.ts`

- `classifyUnitComplexity(unitType, unitId, metadata?)` → `ComplexityTier`
- Heuristic rules:
  - Unit type mapping (see Tiers table above)
  - Task plan analysis: parse task plan file for file count, estimated scope
  - Dependency analysis: tasks with 3+ dependencies → bump to heavy
- Export `getClassificationReason()` for dashboard display
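A minimal sketch of the heuristic mapping described above. The tier table, the 3-dependency threshold, and the signature are assumptions drawn from the rules in this plan, not the final implementation:

```typescript
// Sketch of the heuristic classifier from 1b; names and thresholds are
// assumptions for illustration.
type ComplexityTier = 'light' | 'standard' | 'heavy';

interface UnitMetadata {
  fileCount?: number;
  dependencyCount?: number;
}

// Unit-type mapping from the Classification Signals section.
const UNIT_TYPE_TIER: Record<string, ComplexityTier> = {
  'complete-slice': 'light',
  'run-uat': 'light',
  'research-milestone': 'standard',
  'research-slice': 'standard',
  'plan-milestone': 'standard',
  'plan-slice': 'standard',
  'execute-task': 'standard',
  'replan-slice': 'heavy',
};

function classifyUnitComplexity(unitType: string, metadata?: UnitMetadata): ComplexityTier {
  // Unknown unit types default to heavy: over-spending beats breaking.
  let tier = UNIT_TYPE_TIER[unitType] ?? 'heavy';
  // Dependency analysis: 3+ dependencies bumps the task to heavy.
  if ((metadata?.dependencyCount ?? 0) >= 3) tier = 'heavy';
  return tier;
}
```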
#### 1c. Build model router

**New file:** `src/resources/extensions/gsd/model-router.ts`

- `resolveModelForComplexity(tier, phaseConfig, availableModels)` → `ResolvedModelConfig`
- Logic:
  1. Get user's configured model for phase (ceiling)
  2. If `tier_models` configured, use tier-specific model
  3. If not configured, use smart defaults from available models (cheapest for light, mid for standard, configured for heavy)
  4. Validate selected model is available
  5. Return with fallback chain: `[tier_model, ...configured_fallbacks, configured_primary]`
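The downgrade-only guarantee in step 1 can be sketched as follows, assuming a simple hardcoded cost ranking in place of real registry metadata:

```typescript
// Sketch of the downgrade-only rule from 1c. The cost ranking is an
// illustrative assumption; the real router would consult the registry.
type ComplexityTier = 'light' | 'standard' | 'heavy';

// Lower index = cheaper. Illustrative ordering only.
const COST_RANK = ['claude-haiku-4-5', 'claude-sonnet-4-6', 'claude-opus-4-6'];
const rank = (m: string) => COST_RANK.indexOf(m);

function resolveModelForTier(
  tier: ComplexityTier,
  configuredModel: string, // the user's phase model: the ceiling
  tierModels: Partial<Record<ComplexityTier, string>>,
): string {
  const candidate = tierModels[tier] ?? configuredModel;
  // Downgrade-only: never pick a model that ranks above the ceiling.
  return rank(candidate) > rank(configuredModel) ? configuredModel : candidate;
}
```

Note the ceiling holds even if a misconfigured `tier_models` entry names a pricier model than the phase config.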
#### 1d. Wire into dispatch

**File:** `src/resources/extensions/gsd/auto.ts`

- In the model resolution block (lines 1791-1879):
  1. After `resolveModelWithFallbacksForUnit()`, call classifier
  2. If dynamic routing enabled, call router to potentially downgrade
  3. Log tier and model selection to metrics
  4. On unit failure: if using downgraded model, escalate tier and retry
#### 1e. Wire the unused pre-dispatch hook model field

**File:** `src/resources/extensions/gsd/auto.ts`

- Apply `preDispatchResult.model` when returned — this is already defined but unused
- Allows hooks to override dynamic routing decisions
#### Tests

**New file:** `src/resources/extensions/gsd/tests/complexity-classifier.test.ts`

- Test tier assignment for each unit type
- Test metadata-based adjustments (file count, dependency count)
- Test edge cases (missing metadata, unknown unit types)

**New file:** `src/resources/extensions/gsd/tests/model-router.test.ts`

- Test downgrade-only behavior (never exceeds configured model)
- Test tier-to-model mapping with various available model sets
- Test fallback chain construction
- Test when dynamic routing is disabled (passthrough)

**New file:** `src/resources/extensions/gsd/tests/dynamic-routing-integration.test.ts`

- Test full flow: unit → classify → route → dispatch
- Test escalation on failure
- Test preference loading and validation
---

### Phase 2: Observability & Dashboard

**Goal:** Make routing decisions visible to users.

#### 2a. Metrics tracking

**File:** `src/resources/extensions/gsd/metrics.ts`

- Add `tier` field to `UnitMetrics`
- Add `model_downgraded: boolean` field
- Add `escalation_count` field
- Add `aggregateByTier()` function
- Add `formatTierSavings()` — show estimated savings from downgrades

#### 2b. Dashboard integration

**File:** `src/resources/extensions/gsd/auto-dashboard.ts`

- Add tier badge to unit progress display (e.g., `[L]`, `[S]`, `[H]`)
- Add savings summary to completion stats: "Dynamic routing saved ~$X.XX (N units downgraded)"
- Color-code tier in token widget

#### Tests

- Test metrics aggregation by tier
- Test savings calculation
- Test dashboard formatting
---

### Phase 3: Adaptive Learning (Future)

**Goal:** Improve classification accuracy over time based on outcomes.

#### 3a. Outcome tracking

**File:** `src/resources/extensions/gsd/complexity-classifier.ts`

- Track success/failure per tier per unit-type pattern
- Store in `.gsd/routing-history.json` (project-level)
- Simple structure: `{ "execute-task:docs": { light: { success: 12, fail: 1 }, ... } }`

#### 3b. Adaptive thresholds

- If a tier has >20% failure rate for a pattern, auto-bump default tier
- Decay old data (rolling window of last 50 units)
- User can reset learning: `dynamic_routing_reset: true` in preferences

#### Tests

- Test learning updates on success/failure
- Test threshold bumping
- Test decay logic
- Test reset behavior
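The >20% bump rule from 3b might look like the sketch below; the helper name is hypothetical, while the record shape mirrors the routing-history example in 3a:

```typescript
// Sketch of the Phase 3 threshold bump; helper name is an assumption.
type ComplexityTier = 'light' | 'standard' | 'heavy';
type Outcomes = { success: number; fail: number };

// One-step escalation ladder; heavy is already the ceiling.
const NEXT_TIER: Record<ComplexityTier, ComplexityTier> = {
  light: 'standard',
  standard: 'heavy',
  heavy: 'heavy',
};

function maybeBumpTier(tier: ComplexityTier, outcomes?: Outcomes): ComplexityTier {
  if (!outcomes) return tier; // no history yet for this pattern
  const total = outcomes.success + outcomes.fail;
  if (total === 0) return tier;
  // >20% failure rate for this pattern: auto-bump the default tier.
  return outcomes.fail / total > 0.2 ? NEXT_TIER[tier] : tier;
}
```

The rolling-window decay would trim the `Outcomes` counters to the last 50 units before this check runs.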
---

### Phase 4: Task Plan Introspection (Future)

**Goal:** Deeper classification using task plan content analysis.

- Parse task plan markdown for complexity signals:
  - "Create new file" vs. "modify existing"
  - Number of code blocks in plan
  - Presence of keywords: "refactor", "migration", "architecture", "test", "docs", "config"
  - Estimated lines of change (if specified)
- Weight these signals alongside unit-type heuristics

---
## Preference Configuration (User-Facing)

```yaml
---
version: 1
models:
  research: claude-sonnet-4-6
  planning: claude-opus-4-6
  execution: claude-sonnet-4-6
  completion: claude-sonnet-4-6
dynamic_routing:
  enabled: true
  tier_models:
    light: claude-haiku-4-5
    standard: claude-sonnet-4-6
    # heavy: inherits from phase config (ceiling)
  escalate_on_failure: true
---
```
## Risk Mitigation

| Risk | Mitigation |
|------|-----------|
| Cheaper model produces low-quality output | Downgrade-only design; escalation on failure; user can disable |
| Classification overhead adds latency | Heuristics-only, no LLM call; <1ms classification time |
| Complex preferences confuse users | Disabled by default; works with zero config if enabled (uses smart defaults) |
| Model not available in user's provider | Validation at preference load; falls back to configured model |
| Escalation loops | Max 1 escalation per unit; after that, use configured model |
## Estimated Token Savings

Based on typical GSD session patterns:

- ~30% of units are completion/summary (Tier 1 candidates)
- ~40% are research/standard planning (Tier 2 candidates)
- ~30% are complex execution (Tier 3, no downgrade)

If Haiku is ~10x cheaper than Opus and Sonnet is ~5x cheaper:

- **Conservative estimate:** 20-30% cost reduction with dynamic routing enabled
- **Aggressive estimate:** 40-50% for projects with many small tasks
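A back-of-envelope check of these figures, assuming every unit would otherwise run on the most expensive model (relative cost 1.0), which is the most favorable baseline:

```typescript
// Back-of-envelope check of the savings estimates above.
const share   = { light: 0.30, standard: 0.40, heavy: 0.30 }; // unit mix from above
const relCost = { light: 0.10, standard: 0.20, heavy: 1.00 }; // Haiku ~10x, Sonnet ~5x cheaper

const routedCost =
  share.light * relCost.light +
  share.standard * relCost.standard +
  share.heavy * relCost.heavy;     // 0.03 + 0.08 + 0.30 = 0.41

const maxSavings = 1 - routedCost; // ≈ 0.59, the theoretical ceiling
```

The conservative 20-30% figure sits well under this ~59% ceiling because real sessions already mix cheaper models into the baseline.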
## Resolved Design Decisions

All four open questions resolved as **yes** — folded into the plan as additional scope:

### 1. Post-unit hook classification — YES

Hooks get their own complexity classification. Most hooks are lightweight (validation, file checks) and should default to Tier 1. The existing `model` field on `PostUnitHookConfig` becomes the ceiling, same as phase models for units.

**Implementation:** Add to Phase 1d — extend `classifyUnitComplexity()` to accept hook metadata. Wire into hook dispatch at `auto.ts` lines 936-946.
### 2. Budget-pressure-aware routing — YES

As budget usage increases, the classifier becomes more aggressive about downgrading:

- **<50% budget used:** Normal classification
- **50-75% budget used:** Bump Tier 2 candidates down to Tier 1 where possible
- **75-90% budget used:** Only Tier 3 tasks get the configured model; everything else goes to cheapest available
- **>90% budget used:** Everything except `replan-slice` gets downgraded to cheapest

**Implementation:** Add to Phase 1b — `classifyUnitComplexity()` takes `budgetPct` parameter from existing `getBudgetAlertLevel()` logic. New function `applyBudgetPressure(tier, budgetPct)` adjusts the tier.
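Those thresholds could be encoded as shown below. The `unitType` parameter goes beyond the `(tier, budgetPct)` shape named above and is an assumption, added so the `replan-slice` exemption is visible:

```typescript
// Sketch of applyBudgetPressure() from Decision 2; signature extended
// with unitType as an illustrative assumption.
type ComplexityTier = 'light' | 'standard' | 'heavy';

function applyBudgetPressure(
  tier: ComplexityTier,
  unitType: string,
  budgetPct: number,
): ComplexityTier {
  if (budgetPct > 90) {
    // Everything except replan-slice goes to the cheapest tier.
    return unitType === 'replan-slice' ? tier : 'light';
  }
  if (budgetPct > 75) {
    // Only heavy tasks keep the configured model.
    return tier === 'heavy' ? 'heavy' : 'light';
  }
  if (budgetPct > 50) {
    // Bump standard candidates down where possible.
    return tier === 'standard' ? 'light' : tier;
  }
  return tier; // <50%: normal classification
}
```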
### 3. Multi-provider cost routing — YES

When multiple providers are configured, the router should consider cost differences. If a user has both Anthropic and OpenRouter, pick the cheapest option for the resolved tier.

**Implementation:**

- Add `cost_per_1k_tokens` metadata to model registry (or maintain a lookup table for known models)
- New file: `src/resources/extensions/gsd/model-cost-table.ts` — static cost table for known models, updatable via preferences
- `resolveModelForComplexity()` ranks available models by cost within a tier's capability range
- Preference key: `dynamic_routing.cross_provider: true|false` (default: true when enabled)

**Risk:** Cost data goes stale. Mitigate with a bundled cost table that gets updated with GSD releases + user override capability.
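Cost-ranked selection within a tier can be as simple as the sketch below; the types and cost figures are placeholders, not the bundled cost table:

```typescript
// Sketch of cost-ranked selection from Decision 3; shapes and numbers
// are illustrative assumptions.
interface ModelCost {
  id: string;
  provider: string;
  costPer1kTokens: number;
}

// Pick the cheapest among candidates already filtered to the tier's
// capability range; returns undefined when no candidate is available.
function cheapestCandidate(candidates: ModelCost[]): ModelCost | undefined {
  return candidates.slice().sort((a, b) => a.costPer1kTokens - b.costPer1kTokens)[0];
}
```

With both Anthropic and OpenRouter offering the same model at different prices, the cheaper provider wins.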
### 4. User feedback loop — YES

After each unit completes, users can flag the output quality to improve future classification.

**Implementation (Phase 3 — Adaptive Learning):**

- Post-unit prompt option: user can react with `/gsd:rate-unit [over|under|ok]`
  - `over` = "this could have used a simpler model" → records downgrade signal
  - `under` = "this needed a better model" → records upgrade signal
  - `ok` = confirms current tier was appropriate
- Feedback stored alongside outcome data in `.gsd/routing-history.json`
- Classifier weights feedback signals 2x vs. automatic success/failure detection
- Skill: `gsd:rate-unit` — simple command that tags the last completed unit
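The 2x weighting could be folded into the outcome counters like this; the record shape and helper are illustrative assumptions, and only a subset of the signals is shown:

```typescript
// Sketch of the 2x feedback weighting from Decision 4: explicit
// /gsd:rate-unit signals count double vs. automatic outcome detection.
interface TierEvidence { success: number; fail: number }

function recordSignal(
  e: TierEvidence,
  kind: 'auto-success' | 'auto-fail' | 'ok' | 'under',
): TierEvidence {
  switch (kind) {
    case 'auto-success': return { ...e, success: e.success + 1 };
    case 'auto-fail':    return { ...e, fail: e.fail + 1 };
    case 'ok':           return { ...e, success: e.success + 2 }; // user confirmation, 2x weight
    case 'under':        return { ...e, fail: e.fail + 2 };       // "needed a better model", 2x weight
  }
}
```

An `over` rating would feed the downgrade side of the ledger in the same way.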
### Updated Preference Configuration

```yaml
---
version: 1
models:
  research: claude-sonnet-4-6
  planning: claude-opus-4-6
  execution: claude-sonnet-4-6
  completion: claude-sonnet-4-6
dynamic_routing:
  enabled: true
  tier_models:
    light: claude-haiku-4-5
    standard: claude-sonnet-4-6
    # heavy: inherits from phase config (ceiling)
  escalate_on_failure: true
  budget_pressure: true # more aggressive downgrading as budget fills
  cross_provider: true  # consider cost across providers
  hooks: true           # classify hooks too
---
```
### Updated Phase Summary

| Phase | Scope | Includes |
|-------|-------|----------|
| **1 — Foundation** | Classifier, router, dispatch, hook classification, budget pressure | Decisions 1 & 2 |
| **2 — Observability** | Dashboard, tier badges, savings tracking, cost table | Decision 3 |
| **3 — Adaptive Learning** | Outcome tracking, user feedback (`/gsd:rate-unit`), adaptive thresholds | Decision 4 |
| **4 — Task Introspection** | Parse task plans for deeper complexity signals | — |
.plans/preferences-wizard-completeness.md (new file, 49 lines)
@@ -0,0 +1,49 @@
# Preferences Wizard Completeness

## Problem

The `/gsd prefs wizard` currently only configures 6 of 18+ preference fields. Users must hand-edit YAML for the rest.

## Current Wizard Coverage

1. Models (per phase) ✓
2. Auto-supervisor timeouts ✓
3. Git main_branch ✓
4. Skill discovery mode ✓
5. Unique milestone IDs ✓

## Missing Fields to Add

### Group 1: Git Settings (expand existing section)

- `auto_push` (boolean) — auto-push commits ✓
- `push_branches` (boolean) — push milestone branches ✓
- `remote` (string) — git remote name ✓
- `snapshots` (boolean) — WIP snapshot commits ✓
- `pre_merge_check` (boolean | "auto") — pre-merge validation ✓
- `commit_type` (select) — conventional commit prefix ✓
- `merge_strategy` (select) — squash vs merge ✓
- `isolation` (select) — worktree vs branch ✓

### Group 2: Budget & Cost Control ✓

- `budget_ceiling` (number) — dollar limit
- `budget_enforcement` (select: warn/pause/halt)
- `context_pause_threshold` (number 0-100)

### Group 3: Notifications ✓

- `notifications.enabled` (boolean)
- `notifications.on_complete` (boolean)
- `notifications.on_error` (boolean)
- `notifications.on_budget` (boolean)
- `notifications.on_milestone` (boolean)
- `notifications.on_attention` (boolean)

### Group 4: Behavior Toggles ✓

- `uat_dispatch` (boolean)

### Group 5: Update Serialization Order ✓

- Added missing keys to `orderedKeys` in `serializePreferencesToFrontmatter()`

### Group 6: Update Template & Docs ✓

- Updated `templates/preferences.md` with new fields
- Updated `docs/preferences-reference.md` with budget, notifications, git, hooks

### Group 7: Tests ✓

- Added `preferences-wizard-fields.test.ts` covering all new fields
CHANGELOG.md (43 lines changed)
@@ -6,6 +6,45 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased]

## [2.19.0] - 2026-03-16

### Added

- **Workflow visualizer** — `/gsd visualize` opens a full-screen TUI overlay with four tabs: Progress (milestone/slice/task tree), Dependencies (ASCII dep graph), Metrics (cost/token bar charts), and Timeline (chronological execution history). Supports Tab/1-4 switching, per-tab scrolling, auto-refresh every 2s, and optional auto-trigger after milestone completion via `auto_visualize` preference (#626)
- **Mid-execution capture & triage** — `/gsd capture` lets you fire-and-forget thoughts during auto-mode. The system triages accumulated captures at natural seams between tasks, classifies impact into five types (quick-task, inject, defer, replan, note), and proposes action with user confirmation. Dashboard shows pending capture count badge. Capture context injected into replan and reassess prompts (#512)
- **Dynamic model routing** — complexity-based model routing classifies units into light/standard/heavy tiers and routes to cheaper models when appropriate, reducing token consumption 20-50% on capped plans. Includes budget-pressure-aware routing, cross-provider cost comparison, escalation on failure, adaptive learning from routing history (rolling 50-entry window with user feedback support), and task plan introspection (code block counting, complexity keyword detection) (#579)
- **Feature-branch lifecycle integration test** — proves milestone worktrees branch from and merge back to feature branches, never touching main (#624)
- **Discord integration parity with Slack** — plus new remote-questions documentation (#620)

### Fixed

- **Absolute paths in auto-mode prompts** — write-target variables now passed as absolute paths, eliminating LLM path confusion in worktree contexts that caused artifacts written to the wrong location and loop detection (#627)
- **Worktree lifecycle on mid-session milestone transitions** (#616, #618)
- **Eager template cache warming** — prevents version-skew crash in long auto-mode sessions (#621)

## [2.18.0] - 2026-03-16

### Added

- **Milestone queue reorder** — `/gsd queue` supports reordering milestone execution priority with dependency-aware validation, persistent ordering via `.gsd/QUEUE-ORDER.json` (#460)
- **`.gsd/KNOWLEDGE.md`** — persistent project-specific context file loaded into agent prompts. New `/gsd knowledge` command with `rule`, `pattern`, and `lesson` subcommands for adding entries (#585)
- **Dynamic model discovery** — runtime model enumeration from provider APIs (Ollama, OpenAI, Google, OpenRouter) with per-provider TTL caching and discovery adapters. New `ProviderManagerComponent` TUI for managing providers with auth status and model counts (#581)
- **Expanded preferences wizard** — all configurable fields now exposed in the setup wizard, model ID validation, and `updatePreferencesModels()` for safe read-modify-write of model config (#580)
- **Comprehensive documentation** — 12 new docs covering getting started, auto-mode, commands, configuration, token optimization, cost management, git strategy, team workflows, skills, migration, troubleshooting, and architecture (#605)
- **`resolveProjectRoot()`** — all GSD commands resolve the effective project root from worktree paths instead of using raw `process.cwd()`, preventing path confusion across worktree boundaries (#602)
- **1,813 lines of new tests** — 13 new test files covering discovery cache, model discovery, model registry, models-json-writer, auto-worktree, derive-state-deps, in-flight tool tracking, knowledge, memory leak guards, preferences wizard fields, queue order, queue reorder E2E, and stale worktree cwd

### Fixed

- **Heap OOM during long-running auto-mode sessions** — four sources of unbounded memory growth: activity log serialized all entries for SHA1 dedup (now streaming writes with lightweight fingerprint), uncleaned `activityLogState` Map between sessions, unbounded `completedUnits` array (now capped at 200), and `dirEntryCache`/`dirListCache` growing without bounds (now evicted at 200 entries) (#611)
- **Stale worktree cwd after milestone completion** — three-layer fix: `escapeStaleWorktree()` at auto-mode entry, unconditional cwd restore in `stopAuto()`, and cwd restore on partial merge failure (#608)
- **Worktree created from integration branch instead of main** — `createAutoWorktree` reads integration branch from META.json, merge targets integration branch not hardcoded main (#606)
- **Milestone merge skipped in branch isolation mode** — branch-mode fallback detects `milestone/*` branch and performs squash-merge (#603)
- **`parseContextDependsOn()` destroys unique milestone ID case** — was lowercasing IDs, breaking dependency resolution (#604)
- **Tool-aware idle detection** — prevents false interruption of long-running tasks in auto-mode (#596)
- **Remote questions onboarding crash** — extracted `saveRemoteQuestionsConfig` into compiled src/ helper to avoid cross-boundary .ts import (#592)
- **`showNextAction` crash** — falls back to `select()` when `custom()` returns undefined (#447, #615)

### Changed

- Comprehensive update to preferences reference and configuration guide (#614)
- Auto-mode artifact writes scoped to active milestone worktree, preventing cross-milestone pollution (#590)

## [2.17.0] - 2026-03-15

### Added
@@ -738,7 +777,9 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
### Changed

- License updated to MIT

[Unreleased]: https://github.com/gsd-build/gsd-2/compare/v2.17.0...HEAD
[Unreleased]: https://github.com/gsd-build/gsd-2/compare/v2.19.0...HEAD
[2.19.0]: https://github.com/gsd-build/gsd-2/compare/v2.18.0...v2.19.0
[2.18.0]: https://github.com/gsd-build/gsd-2/compare/v2.17.0...v2.18.0
[2.17.0]: https://github.com/gsd-build/gsd-2/compare/v2.16.0...v2.17.0
[2.16.0]: https://github.com/gsd-build/gsd-2/compare/v2.15.1...v2.16.0
[2.15.1]: https://github.com/gsd-build/gsd-2/releases/tag/v2.15.1
README.md (39 lines changed)
@@ -21,6 +21,25 @@ One command. Walk away. Come back to a built project with clean git history.
---

## Documentation

Full documentation is available in the [`docs/`](./docs/) directory:

- **[Getting Started](./docs/getting-started.md)** — install, first run, basic usage
- **[Auto Mode](./docs/auto-mode.md)** — autonomous execution deep-dive
- **[Configuration](./docs/configuration.md)** — all preferences, models, git, and hooks
- **[Token Optimization](./docs/token-optimization.md)** — profiles, context compression, complexity routing (v2.17)
- **[Cost Management](./docs/cost-management.md)** — budgets, tracking, projections
- **[Git Strategy](./docs/git-strategy.md)** — worktree isolation, branching, merge behavior
- **[Working in Teams](./docs/working-in-teams.md)** — unique IDs, shared artifacts
- **[Skills](./docs/skills.md)** — bundled skills, discovery, custom authoring
- **[Commands Reference](./docs/commands.md)** — all commands and keyboard shortcuts
- **[Architecture](./docs/architecture.md)** — system design and dispatch pipeline
- **[Troubleshooting](./docs/troubleshooting.md)** — common issues, doctor, recovery
- **[Migration from v1](./docs/migration.md)** — `.planning` → `.gsd` migration

---

## What Changed From v1

The original GSD was a collection of markdown prompts installed into `~/.claude/commands/`. It relied entirely on the LLM reading those prompts and doing the right thing. That worked surprisingly well — but it had hard limits:
@@ -334,6 +353,26 @@ unique_milestone_ids: true
| `skill_rules` | Situational rules for skill routing |
| `unique_milestone_ids` | Uses unique milestone names to avoid clashes when working in teams |

### Token Optimization (v2.17)

GSD 2.17 introduced a coordinated token optimization system that reduces usage by 40-60% on cost-sensitive workloads. Set a single preference to coordinate model selection, phase skipping, and context compression:

```yaml
token_profile: budget # or balanced (default), quality
```

| Profile | Savings | What It Does |
|---------|---------|-------------|
| `budget` | 40-60% | Cheap models, skip research/reassess, minimal context inlining |
| `balanced` | 10-20% | Default models, skip slice research, standard context |
| `quality` | 0% | All phases, all context, full model power |

**Complexity-based routing** automatically classifies tasks as simple/standard/complex and routes them to appropriate models. Simple docs tasks get Haiku; complex architectural work gets Opus. The classification is heuristic (sub-millisecond, no LLM calls) and learns from outcomes via a persistent routing history.

**Budget pressure** graduates model downgrading as you approach your budget ceiling — 50%, 75%, and 90% thresholds progressively shift work to cheaper tiers.

See the full [Token Optimization Guide](./docs/token-optimization.md) for details.

### Bundled Tools

GSD ships with 14 extensions, all loaded automatically:
|
|
45
docs/README.md
Normal file

@ -0,0 +1,45 @@
# GSD Documentation

Welcome to the GSD documentation. This covers everything from getting started to advanced configuration, auto-mode internals, and extending GSD with the Pi SDK.

## User Documentation

| Guide | Description |
|-------|-------------|
| [Getting Started](./getting-started.md) | Installation, first run, and basic usage |
| [Auto Mode](./auto-mode.md) | How autonomous execution works — the state machine, crash recovery, and steering |
| [Commands Reference](./commands.md) | All commands, keyboard shortcuts, and CLI flags |
| [Remote Questions](./remote-questions.md) | Discord and Slack integration for headless auto-mode |
| [Configuration](./configuration.md) | Preferences, model selection, git settings, and token profiles |
| [Token Optimization](./token-optimization.md) | Token profiles, context compression, complexity routing, and adaptive learning (v2.17) |
| [Cost Management](./cost-management.md) | Budget ceilings, cost tracking, projections, and enforcement modes |
| [Git Strategy](./git-strategy.md) | Worktree isolation, branching model, and merge behavior |
| [Working in Teams](./working-in-teams.md) | Unique milestone IDs, `.gitignore` setup, and shared planning artifacts |
| [Skills](./skills.md) | Bundled skills, skill discovery, and custom skill authoring |
| [Migration from v1](./migration.md) | Migrating `.planning` directories from the original GSD |
| [Troubleshooting](./troubleshooting.md) | Common issues, `/gsd doctor`, and recovery procedures |

## Architecture & Internals

| Guide | Description |
|-------|-------------|
| [Architecture Overview](./architecture.md) | System design, extension model, state-on-disk, and dispatch pipeline |
| [Native Engine](../native/README.md) | Rust N-API modules for performance-critical operations |
| [ADR-001: Branchless Worktree Architecture](./ADR-001-branchless-worktree-architecture.md) | Decision record for the v2.14 git architecture |

## Pi SDK Documentation

These guides cover the underlying Pi SDK that GSD is built on. Useful if you want to extend GSD or build your own agent application.

| Guide | Description |
|-------|-------------|
| [What is Pi](./what-is-pi/README.md) | Core concepts — modes, agent loop, sessions, tools, providers |
| [Extending Pi](./extending-pi/README.md) | Building extensions — tools, commands, UI, events, state |
| [Context & Hooks](./context-and-hooks/README.md) | Context pipeline, hook reference, inter-extension communication |
| [Pi UI / TUI](./pi-ui-tui/README.md) | Terminal UI components, theming, keyboard input, rendering |

## Research

| Guide | Description |
|-------|-------------|
| [Building Coding Agents](./building-coding-agents/README.md) | Research notes on agent design — decomposition, context engineering, cost/quality tradeoffs |
108
docs/architecture.md
Normal file

@ -0,0 +1,108 @@
# Architecture Overview

GSD is a TypeScript application built on the [Pi SDK](https://github.com/badlogic/pi-mono). It embeds the Pi coding agent and extends it with the GSD workflow engine, auto mode state machine, and project management primitives.

## System Structure

```
gsd (CLI binary)
└─ loader.ts                Sets PI_PACKAGE_DIR, GSD env vars, dynamic-imports cli.ts
   └─ cli.ts                Wires SDK managers, loads extensions, starts InteractiveMode
      ├─ onboarding.ts        First-run setup wizard (LLM provider + tool keys)
      ├─ wizard.ts            Env hydration from stored auth.json credentials
      ├─ app-paths.ts         ~/.gsd/agent/, ~/.gsd/sessions/, auth.json
      ├─ resource-loader.ts   Syncs bundled extensions + agents to ~/.gsd/agent/
      └─ src/resources/
         ├─ extensions/gsd/     Core GSD extension
         ├─ extensions/...      12 supporting extensions
         ├─ agents/             scout, researcher, worker
         ├─ AGENTS.md           Agent routing instructions
         └─ GSD-WORKFLOW.md     Manual bootstrap protocol
```

## Key Design Decisions

### State Lives on Disk

`.gsd/` is the sole source of truth. Auto mode reads it, writes it, and advances based on what it finds. No in-memory state survives across sessions. This enables crash recovery, multi-terminal steering, and session resumption.

### Two-File Loader Pattern

`loader.ts` sets all environment variables with zero SDK imports, then dynamically imports `cli.ts` which does static SDK imports. This ensures `PI_PACKAGE_DIR` is set before any SDK code evaluates.
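The pattern can be sketched as follows; `bootstrap` and the injected import callback are illustrative, only the `PI_PACKAGE_DIR` variable comes from the text above:

```typescript
// Sketch of the two-file loader pattern (illustrative, not GSD's code):
// set env vars with zero SDK imports, then dynamically import the module
// whose static SDK imports depend on those vars already being set.
import * as path from "node:path";
import * as os from "node:os";

export async function bootstrap(
  importCli: () => Promise<{ run(): void }>,
): Promise<void> {
  // Step 1: environment, before any SDK code is evaluated.
  process.env.PI_PACKAGE_DIR = path.join(os.homedir(), ".gsd", "pkg");
  // Step 2: only now load cli.ts; its module graph sees the env var.
  const cli = await importCli();
  cli.run();
}
```

The dynamic `import()` is the crux: a static import of `cli.ts` would evaluate SDK modules before the env vars were set.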

### `pkg/` Shim Directory

`PI_PACKAGE_DIR` points to `pkg/` (not project root) to avoid Pi's theme resolution colliding with GSD's `src/` directory. Contains only `piConfig` and theme assets.

### Always-Overwrite Sync

Bundled extensions and agents are synced to `~/.gsd/agent/` on every launch, not just first run. This means `npm update -g` takes effect immediately.

### Fresh Session Per Unit

Every dispatch creates a new agent session. The LLM starts with a clean context window containing only the pre-inlined artifacts it needs. This prevents quality degradation from context accumulation.

## Bundled Extensions

| Extension | What It Provides |
|-----------|-----------------|
| **GSD** | Core workflow engine — auto mode, state machine, commands, dashboard |
| **Browser Tools** | Playwright-based browser with form intelligence and semantic actions |
| **Search the Web** | Brave Search, Tavily, or Jina page extraction |
| **Google Search** | Gemini-powered web search with AI-synthesized answers |
| **Context7** | Up-to-date library/framework documentation |
| **Background Shell** | Long-running process management with readiness detection |
| **Subagent** | Delegated tasks with isolated context windows |
| **Mac Tools** | macOS native app automation via Accessibility APIs |
| **MCPorter** | Lazy on-demand MCP server integration |
| **Voice** | Real-time speech-to-text (macOS, Linux) |
| **Slash Commands** | Custom command creation |
| **LSP** | Language Server Protocol — diagnostics, definitions, references, hover, rename |
| **Ask User Questions** | Structured user input with single/multi-select |
| **Secure Env Collect** | Masked secret collection |

## Bundled Agents

| Agent | Role |
|-------|------|
| **Scout** | Fast codebase recon — compressed context for handoff |
| **Researcher** | Web research — finds and synthesizes current information |
| **Worker** | General-purpose execution in an isolated context window |

## Native Engine

Performance-critical operations use a Rust N-API engine:

- **grep** — ripgrep-backed content search
- **glob** — gitignore-aware file discovery
- **ps** — cross-platform process tree management
- **highlight** — syntect-based syntax highlighting
- **ast** — structural code search via ast-grep
- **diff** — fuzzy text matching and unified diff generation
- **text** — ANSI-aware text measurement and wrapping
- **html** — HTML-to-Markdown conversion
- **image** — decode, encode, resize images
- **fd** — fuzzy file path discovery
- **clipboard** — native clipboard access
- **git** — libgit2-backed git read operations (v2.16+)
- **parser** — GSD file parsing and frontmatter extraction

## Dispatch Pipeline

The auto mode dispatch pipeline:

```
1. Read disk state (STATE.md, roadmap, plans)
2. Determine next unit type and ID
3. Classify complexity → select model tier
4. Apply budget pressure adjustments
5. Check routing history for adaptive adjustments
6. Resolve effective model (with fallbacks)
7. Build dispatch prompt (applying inline level compression)
8. Create fresh agent session
9. Inject prompt and let LLM execute
10. On completion: snapshot metrics, verify artifacts, persist state
11. Loop to step 1
```

Phase skipping (from token profile) gates steps 2-3: if a phase is skipped, the corresponding unit type is never dispatched.
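The loop above can be sketched as follows; the helper functions are hypothetical stand-ins for the pipeline steps, not GSD's real API:

```typescript
// Sketch of the dispatch loop. Step numbers refer to the pipeline above;
// every helper is a hypothetical stand-in, not GSD's actual interface.
type Unit = { type: string; id: string; model: string; prompt: string };

async function autoLoop(deps: {
  readState(): Promise<Unit | null>;  // steps 1-2: disk state to next unit
  selectModel(u: Unit): string;       // steps 3-6: complexity, budget, fallbacks
  buildPrompt(u: Unit): string;       // step 7: inline artifacts, compress
  runSession(u: Unit): Promise<void>; // steps 8-9: fresh session, execute
  persist(u: Unit): Promise<void>;    // step 10: metrics, verify, state
}): Promise<void> {
  for (;;) {
    const unit = await deps.readState();
    if (!unit) return;                // nothing left to dispatch
    unit.model = deps.selectModel(unit);
    unit.prompt = deps.buildPrompt(unit);
    await deps.runSession(unit);
    await deps.persist(unit);         // step 11: loop back to step 1
  }
}
```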
143
docs/auto-mode.md
Normal file

@ -0,0 +1,143 @@
# Auto Mode

Auto mode is GSD's autonomous execution engine. Run `/gsd auto`, walk away, come back to built software with clean git history.

## How It Works

Auto mode is a **state machine driven by files on disk**. It reads `.gsd/STATE.md`, determines the next unit of work, creates a fresh agent session, injects a focused prompt with all relevant context pre-inlined, and lets the LLM execute. When the LLM finishes, auto mode reads disk state again and dispatches the next unit.

### The Loop

Each slice flows through phases automatically:

```
Research → Plan → Execute (per task) → Complete → Reassess Roadmap → Next Slice
```

- **Research** — scouts the codebase and relevant docs
- **Plan** — decomposes the slice into tasks with must-haves
- **Execute** — runs each task in a fresh context window
- **Complete** — writes summary, UAT script, marks roadmap, commits
- **Reassess** — checks if the roadmap still makes sense

## Key Properties

### Fresh Session Per Unit

Every task, research phase, and planning step gets a clean context window. No accumulated garbage. No degraded quality from context bloat. The dispatch prompt includes everything needed — task plans, prior summaries, dependency context, decisions register — so the LLM starts oriented instead of spending tool calls reading files.

### Context Pre-Loading

The dispatch prompt is carefully constructed with:

| Inlined Artifact | Purpose |
|------------------|---------|
| Task plan | What to build |
| Slice plan | Where this task fits |
| Prior task summaries | What's already done |
| Dependency summaries | Cross-slice context |
| Roadmap excerpt | Overall direction |
| Decisions register | Architectural context |

The amount of context inlined is controlled by your [token profile](./token-optimization.md). Budget mode inlines minimal context; quality mode inlines everything.
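An assembly step over these artifacts might look like the sketch below; the artifact names mirror the table, but the function and section headings are illustrative:

```typescript
// Illustrative assembly of a dispatch prompt from pre-inlined artifacts.
// The artifact names mirror the table above; this is not GSD's actual code.
interface DispatchArtifacts {
  taskPlan: string;
  slicePlan: string;
  priorSummaries: string[];
  dependencySummaries: string[];
  roadmapExcerpt: string;
  decisions: string;
}

function buildDispatchPrompt(a: DispatchArtifacts): string {
  return [
    "## Task Plan\n" + a.taskPlan,
    "## Slice Plan\n" + a.slicePlan,
    "## Prior Task Summaries\n" + a.priorSummaries.join("\n"),
    "## Dependency Summaries\n" + a.dependencySummaries.join("\n"),
    "## Roadmap Excerpt\n" + a.roadmapExcerpt,
    "## Decisions Register\n" + a.decisions,
  ].join("\n\n");
}
```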

### Git Worktree Isolation

Each milestone runs in its own git worktree with a `milestone/<MID>` branch. All slice work commits sequentially — no branch switching, no merge conflicts mid-milestone. When the milestone completes, it's squash-merged to main as one clean commit.

See [Git Strategy](./git-strategy.md) for details.

### Crash Recovery

A lock file tracks the current unit. If the session dies, the next `/gsd auto` reads the surviving session file, synthesizes a recovery briefing from every tool call that made it to disk, and resumes with full context.

### Stuck Detection

If the same unit dispatches twice (the LLM didn't produce the expected artifact), GSD retries once with a deep diagnostic prompt. If it fails again, auto mode stops and reports the exact file it expected, so you can intervene.

### Timeout Supervision

Three timeout tiers prevent runaway sessions:

| Timeout | Default | Behavior |
|---------|---------|----------|
| Soft | 20 min | Warns the LLM to wrap up |
| Idle | 10 min | Detects stalls, intervenes |
| Hard | 30 min | Pauses auto mode |

Recovery steering nudges the LLM to finish durable output before timing out. Configure in preferences:

```yaml
auto_supervisor:
  soft_timeout_minutes: 20
  idle_timeout_minutes: 10
  hard_timeout_minutes: 30
```

### Cost Tracking

Every unit's token usage and cost is captured, broken down by phase, slice, and model. The dashboard shows running totals and projections. Budget ceilings can pause auto mode before overspending.

See [Cost Management](./cost-management.md).

### Adaptive Replanning

After each slice completes, the roadmap is reassessed. If the work revealed new information that changes the plan, slices are reordered, added, or removed before continuing. This phase is skipped under the `budget` token profile.

## Controlling Auto Mode

### Start

```
/gsd auto
```

### Pause

Press **Escape**. The conversation is preserved. You can interact with the agent, inspect state, or resume.

### Resume

```
/gsd auto
```

Auto mode reads disk state and picks up where it left off.

### Stop

```
/gsd stop
```

Stops auto mode gracefully. Can be run from a different terminal.

### Steer

```
/gsd steer
```

Hard-steer plan documents during execution without stopping the pipeline. Changes are picked up at the next phase boundary.

## Dashboard

`Ctrl+Alt+G` or `/gsd status` shows real-time progress:

- Current milestone, slice, and task
- Auto mode elapsed time and phase
- Per-unit cost and token breakdown
- Cost projections
- Completed and in-progress units

## Phase Skipping

Token profiles can skip certain phases to reduce cost:

| Phase | `budget` | `balanced` | `quality` |
|-------|----------|------------|-----------|
| Milestone Research | Skipped | Runs | Runs |
| Slice Research | Skipped | Skipped | Runs |
| Reassess Roadmap | Skipped | Runs | Runs |

See [Token Optimization](./token-optimization.md) for details.
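The table reads directly as a lookup; the type and function names below are illustrative, with the skip values transcribed from the table:

```typescript
// Phase-skipping table as a lookup. Values come from the table above;
// the names are illustrative, not GSD's actual configuration shape.
type Profile = "budget" | "balanced" | "quality";
type Phase = "milestone-research" | "slice-research" | "reassess-roadmap";

const SKIPPED: Record<Profile, Phase[]> = {
  budget: ["milestone-research", "slice-research", "reassess-roadmap"],
  balanced: ["slice-research"],
  quality: [],
};

function phaseRuns(profile: Profile, phase: Phase): boolean {
  return !SKIPPED[profile].includes(phase);
}
```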
54
docs/commands.md
Normal file

@ -0,0 +1,54 @@
# Commands Reference

## Session Commands

| Command | Description |
|---------|-------------|
| `/gsd` | Step mode — execute one unit at a time, pause between each |
| `/gsd next` | Explicit step mode (same as `/gsd`) |
| `/gsd auto` | Autonomous mode — research, plan, execute, commit, repeat |
| `/gsd stop` | Stop auto mode gracefully |
| `/gsd steer` | Hard-steer plan documents during execution |
| `/gsd discuss` | Discuss architecture and decisions (works alongside auto mode) |
| `/gsd status` | Progress dashboard |
| `/gsd queue` | Queue future milestones (safe during auto mode) |
| `/gsd prefs` | Model selection, timeouts, budget ceiling |
| `/gsd migrate` | Migrate a v1 `.planning` directory to `.gsd` format |
| `/gsd doctor` | Validate `.gsd/` integrity, find and fix issues |

## Git Commands

| Command | Description |
|---------|-------------|
| `/worktree` (`/wt`) | Git worktree lifecycle — create, switch, merge, remove |

## Session Management

| Command | Description |
|---------|-------------|
| `/clear` | Start a new session (alias for `/new`) |
| `/exit` | Graceful shutdown — saves session state before exiting |
| `/kill` | Kill GSD process immediately |
| `/model` | Switch the active model |
| `/login` | Log in to an LLM provider |
| `/thinking` | Toggle thinking level during sessions |
| `/voice` | Toggle real-time speech-to-text (macOS, Linux) |

## Keyboard Shortcuts

| Shortcut | Action |
|----------|--------|
| `Ctrl+Alt+G` | Toggle dashboard overlay |
| `Ctrl+Alt+V` | Toggle voice transcription |
| `Ctrl+Alt+B` | Show background shell processes |
| `Escape` | Pause auto mode (preserves conversation) |

> **Note:** In terminals without Kitty keyboard protocol support (macOS Terminal.app, JetBrains IDEs), slash-command fallbacks are shown instead of `Ctrl+Alt` shortcuts.

## CLI Flags

| Flag | Description |
|------|-------------|
| `gsd` | Start a new interactive session |
| `gsd --continue` (`-c`) | Resume the most recent session for the current directory |
| `gsd config` | Re-run the setup wizard (LLM provider + tool keys) |
397
docs/configuration.md
Normal file

@ -0,0 +1,397 @@
# Configuration

GSD preferences live in `~/.gsd/preferences.md` (global) or `.gsd/preferences.md` (project-local). Manage interactively with `/gsd prefs`.

## `/gsd prefs` Commands

| Command | Description |
|---------|-------------|
| `/gsd prefs` | Open the global preferences wizard (default) |
| `/gsd prefs global` | Interactive wizard for global preferences (`~/.gsd/preferences.md`) |
| `/gsd prefs project` | Interactive wizard for project preferences (`.gsd/preferences.md`) |
| `/gsd prefs status` | Show current preference files, merged values, and skill resolution status |
| `/gsd prefs wizard` | Alias for `/gsd prefs global` |
| `/gsd prefs setup` | Alias for `/gsd prefs wizard` — creates preferences file if missing |

## Preferences File Format

Preferences use YAML frontmatter in a markdown file:

```yaml
---
version: 1
models:
  research: claude-sonnet-4-6
  planning: claude-opus-4-6
  execution: claude-sonnet-4-6
  completion: claude-sonnet-4-6
skill_discovery: suggest
auto_supervisor:
  soft_timeout_minutes: 20
  idle_timeout_minutes: 10
  hard_timeout_minutes: 30
budget_ceiling: 50.00
token_profile: balanced
---
```

## Global vs Project Preferences

| Scope | Path | Applies to |
|-------|------|-----------|
| Global | `~/.gsd/preferences.md` | All projects |
| Project | `.gsd/preferences.md` | Current project only |

**Merge behavior:**
- **Scalar fields** (`skill_discovery`, `budget_ceiling`): project wins if defined
- **Array fields** (`always_use_skills`, etc.): concatenated (global first, then project)
- **Object fields** (`models`, `git`, `auto_supervisor`): shallow-merged, project overrides per-key
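The three merge rules can be sketched as one function; `mergePrefs` is an illustrative name, not GSD's API:

```typescript
// Sketch of the scalar/array/object merge rules described above.
// The field categories match the list; the function itself is illustrative.
type Prefs = Record<string, unknown>;

function mergePrefs(global: Prefs, project: Prefs): Prefs {
  const out: Prefs = { ...global };
  for (const [key, value] of Object.entries(project)) {
    const base = out[key];
    if (Array.isArray(value) && Array.isArray(base)) {
      out[key] = [...base, ...value]; // arrays: concatenated, global first
    } else if (
      value !== null && typeof value === "object" && !Array.isArray(value) &&
      base !== null && typeof base === "object" && !Array.isArray(base)
    ) {
      // objects: shallow merge, project overrides per-key
      out[key] = { ...(base as Prefs), ...(value as Prefs) };
    } else {
      out[key] = value; // scalars: project wins
    }
  }
  return out;
}
```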

## All Settings

### `models`

Per-phase model selection. Each key accepts a model string or an object with fallbacks.

```yaml
models:
  research: claude-sonnet-4-6
  planning:
    model: claude-opus-4-6
    fallbacks:
      - openrouter/z-ai/glm-5
  execution: claude-sonnet-4-6
  execution_simple: claude-haiku-4-5-20250414
  completion: claude-sonnet-4-6
  subagent: claude-sonnet-4-6
```

**Phases:** `research`, `planning`, `execution`, `execution_simple`, `completion`, `subagent`

- `execution_simple` — used for tasks classified as "simple" by the [complexity router](./token-optimization.md#complexity-based-task-routing)
- `subagent` — model for delegated subagent tasks (scout, researcher, worker)
- Provider targeting: use `provider/model` format (e.g., `bedrock/claude-sonnet-4-6`) or the `provider` field in object format
- Omit a key to use whatever model is currently active

**With fallbacks:**

```yaml
models:
  planning:
    model: claude-opus-4-6
    fallbacks:
      - openrouter/z-ai/glm-5
      - openrouter/moonshotai/kimi-k2.5
    provider: bedrock # optional: target a specific provider
```

When a model fails to switch (provider unavailable, rate limited, credits exhausted), GSD automatically tries the next model in the `fallbacks` list.
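The fallback behavior amounts to trying candidates in order; `resolveModel` and `switchModel` below are illustrative stand-ins, not GSD's real functions:

```typescript
// Sketch of fallback resolution: try the primary model, then each entry
// in `fallbacks`, in order. `switchModel` is a hypothetical stand-in for
// whatever actually attempts the provider switch.
interface ModelSpec {
  model: string;
  fallbacks?: string[];
}

async function resolveModel(
  spec: ModelSpec,
  switchModel: (id: string) => Promise<boolean>, // false: unavailable, rate limited, etc.
): Promise<string> {
  for (const candidate of [spec.model, ...(spec.fallbacks ?? [])]) {
    if (await switchModel(candidate)) return candidate;
  }
  throw new Error(`no usable model for ${spec.model}`);
}
```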

### `token_profile`

Coordinates model selection, phase skipping, and context compression. See [Token Optimization](./token-optimization.md).

Values: `budget`, `balanced` (default), `quality`

| Profile | Behavior |
|---------|----------|
| `budget` | Skips research + reassessment phases, uses cheaper models |
| `balanced` | Default behavior — skips per-slice research, standard model selection |
| `quality` | All phases run, prefers higher-quality models |

### `phases`

Fine-grained control over which phases run in auto mode:

```yaml
phases:
  skip_research: false # skip milestone-level research
  skip_reassess: false # skip roadmap reassessment after each slice
  skip_slice_research: true # skip per-slice research
```

These are usually set automatically by `token_profile`, but can be overridden explicitly.

### `skill_discovery`

Controls how GSD finds and applies skills during auto mode.

| Value | Behavior |
|-------|----------|
| `auto` | Skills found and applied automatically |
| `suggest` | Skills identified during research but not auto-installed (default) |
| `off` | Skill discovery disabled |

### `auto_supervisor`

Timeout thresholds for auto mode supervision:

```yaml
auto_supervisor:
  model: claude-sonnet-4-6 # optional: model for supervisor (defaults to active model)
  soft_timeout_minutes: 20 # warn LLM to wrap up
  idle_timeout_minutes: 10 # detect stalls
  hard_timeout_minutes: 30 # pause auto mode
```

### `budget_ceiling`

Maximum USD to spend during auto mode. No `$` sign — just the number.

```yaml
budget_ceiling: 50.00
```

### `budget_enforcement`

How the budget ceiling is enforced:

| Value | Behavior |
|-------|----------|
| `warn` | Log a warning but continue |
| `pause` | Pause auto mode (default when ceiling is set) |
| `halt` | Stop auto mode entirely |

### `context_pause_threshold`

Context window usage percentage (0-100) at which auto mode pauses for checkpointing. Set to `0` to disable.

```yaml
context_pause_threshold: 80 # pause at 80% context usage
```

Default: `0` (disabled)

### `uat_dispatch`

Enable automatic UAT (User Acceptance Test) runs after slice completion:

```yaml
uat_dispatch: true
```

### `unique_milestone_ids`

Generate milestone IDs with a random suffix to avoid collisions in team workflows:

```yaml
unique_milestone_ids: true
# Produces: M001-eh88as instead of M001
```
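A generator in this style might look like the sketch below; the suffix alphabet and length are assumptions inferred from the example, not a documented format:

```typescript
// Illustrative generator for suffixed milestone IDs in the M001-eh88as
// style. Alphabet and suffix length are assumptions, not GSD's spec.
function uniqueMilestoneId(base: string): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let suffix = "";
  for (let i = 0; i < 6; i++) {
    suffix += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return `${base}-${suffix}`;
}
```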

### `git`

Git behavior configuration. All fields optional:

```yaml
git:
  auto_push: false # push commits to remote after committing
  push_branches: false # push milestone branch to remote
  remote: origin # git remote name
  snapshots: false # WIP snapshot commits during long tasks
  pre_merge_check: false # run checks before worktree merge (true/false/"auto")
  commit_type: feat # override conventional commit prefix
  main_branch: main # primary branch name
  merge_strategy: squash # how worktree branches merge: "squash" or "merge"
  isolation: worktree # git isolation: "worktree" or "branch"
  commit_docs: true # commit .gsd/ artifacts to git (set false to keep local)
```

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `auto_push` | boolean | `false` | Push commits to remote after committing |
| `push_branches` | boolean | `false` | Push milestone branch to remote |
| `remote` | string | `"origin"` | Git remote name |
| `snapshots` | boolean | `false` | WIP snapshot commits during long tasks |
| `pre_merge_check` | bool/string | `false` | Run checks before merge (`true`/`false`/`"auto"`) |
| `commit_type` | string | (inferred) | Override conventional commit prefix (`feat`, `fix`, `refactor`, `docs`, `test`, `chore`, `perf`, `ci`, `build`, `style`) |
| `main_branch` | string | `"main"` | Primary branch name |
| `merge_strategy` | string | `"squash"` | How worktree branches merge: `"squash"` (combine all commits) or `"merge"` (preserve individual commits) |
| `isolation` | string | `"worktree"` | Auto-mode isolation: `"worktree"` (separate directory) or `"branch"` (work in project root — useful for submodule-heavy repos) |
| `commit_docs` | boolean | `true` | Commit `.gsd/` planning artifacts to git. Set `false` to keep local-only |

### `notifications`

Control what notifications GSD sends during auto mode:

```yaml
notifications:
  enabled: true
  on_complete: true # notify on unit completion
  on_error: true # notify on errors
  on_budget: true # notify on budget thresholds
  on_milestone: true # notify when milestone finishes
  on_attention: true # notify when manual attention needed
```

### `remote_questions`

Route interactive questions to Slack or Discord for headless auto mode:

```yaml
remote_questions:
  channel: slack # or discord
  channel_id: "C1234567890"
  timeout_minutes: 15 # question timeout (1-30 minutes)
  poll_interval_seconds: 10 # poll interval (2-30 seconds)
```

### `post_unit_hooks`

Custom hooks that fire after specific unit types complete:

```yaml
post_unit_hooks:
  - name: code-review
    after: [execute-task]
    prompt: "Review the code changes for quality and security issues."
    model: claude-opus-4-6 # optional: model override
    max_cycles: 1 # max fires per trigger (1-10, default: 1)
    artifact: REVIEW.md # optional: skip if this file exists
    retry_on: NEEDS-REWORK.md # optional: re-run trigger unit if this file appears
    agent: review-agent # optional: agent definition to use
    enabled: true # optional: disable without removing
```

**Known unit types for `after`:** `research-milestone`, `plan-milestone`, `research-slice`, `plan-slice`, `execute-task`, `complete-slice`, `replan-slice`, `reassess-roadmap`, `run-uat`

**Prompt substitutions:** `{milestoneId}`, `{sliceId}`, `{taskId}` are replaced with current context values.

### `pre_dispatch_hooks`

Hooks that intercept units before dispatch. Three actions available:

**Modify** — prepend/append text to the unit prompt:

```yaml
pre_dispatch_hooks:
  - name: add-standards
    before: [execute-task]
    action: modify
    prepend: "Follow our coding standards document."
    append: "Run linting after changes."
```

**Skip** — skip the unit entirely:

```yaml
pre_dispatch_hooks:
  - name: skip-research
    before: [research-slice]
    action: skip
    skip_if: RESEARCH.md # optional: only skip if this file exists
```

**Replace** — replace the unit prompt entirely:

```yaml
pre_dispatch_hooks:
  - name: custom-execute
    before: [execute-task]
    action: replace
    prompt: "Execute the task using TDD methodology."
    unit_type: execute-task-tdd # optional: override unit type label
    model: claude-opus-4-6 # optional: model override
```

All pre-dispatch hooks support `enabled: true/false` to toggle without removing.

### `always_use_skills` / `prefer_skills` / `avoid_skills`

Skill routing preferences:

```yaml
always_use_skills:
  - debug-like-expert
prefer_skills:
  - frontend-design
avoid_skills: []
```

Skills can be bare names (looked up in `~/.gsd/agent/skills/`) or absolute paths.

### `skill_rules`

Situational skill routing with human-readable triggers:

```yaml
skill_rules:
  - when: task involves authentication
    use: [clerk]
  - when: frontend styling work
    prefer: [frontend-design]
  - when: working with legacy code
    avoid: [aggressive-refactor]
```

### `custom_instructions`

Durable instructions appended to every session:

```yaml
custom_instructions:
  - "Always use TypeScript strict mode"
  - "Prefer functional patterns over classes"
```

For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically.

## Full Example

```yaml
---
version: 1

# Model selection
models:
  research: openrouter/deepseek/deepseek-r1
  planning:
    model: claude-opus-4-6
    fallbacks:
      - openrouter/z-ai/glm-5
  execution: claude-sonnet-4-6
  execution_simple: claude-haiku-4-5-20250414
  completion: claude-sonnet-4-6

# Token optimization
token_profile: balanced

# Budget
budget_ceiling: 25.00
budget_enforcement: pause
context_pause_threshold: 80

# Supervision
auto_supervisor:
  soft_timeout_minutes: 15
  hard_timeout_minutes: 25

# Git
git:
  auto_push: true
  merge_strategy: squash
  isolation: worktree
  commit_docs: true

# Skills
skill_discovery: suggest
always_use_skills:
  - debug-like-expert
skill_rules:
  - when: task involves authentication
    use: [clerk]

# Notifications
notifications:
  on_complete: false
  on_milestone: true
  on_attention: true

# Hooks
post_unit_hooks:
  - name: code-review
    after: [execute-task]
    prompt: "Review {sliceId}/{taskId} for quality and security."
    artifact: REVIEW.md
---
```
91
docs/cost-management.md
Normal file

@ -0,0 +1,91 @@
# Cost Management

GSD tracks token usage and cost for every unit of work dispatched during auto mode. This data powers the dashboard, budget enforcement, and cost projections.

## Cost Tracking

Every unit's metrics are captured automatically:

- **Token counts** — input, output, cache read, cache write, total
- **Cost** — USD cost per unit
- **Duration** — wall-clock time
- **Tool calls** — number of tool invocations
- **Message counts** — assistant and user messages

Data is stored in `.gsd/metrics.json` and survives across sessions.
|
||||
### Viewing Costs
|
||||
|
||||
**Dashboard:** `Ctrl+Alt+G` or `/gsd status` shows real-time cost breakdown.
|
||||
|
||||
**Aggregations available:**
|
||||
- By phase (research, planning, execution, completion, reassessment)
|
||||
- By slice (M001/S01, M001/S02, ...)
|
||||
- By model (which models consumed the most budget)
|
||||
- Project totals
|
||||
|
||||
## Budget Ceiling
|
||||
|
||||
Set a maximum spend for a project:
|
||||
|
||||
```yaml
|
||||
---
|
||||
version: 1
|
||||
budget_ceiling: 50.00
|
||||
---
|
||||
```
|
||||
|
||||
### Enforcement Modes
|
||||
|
||||
Control what happens when the ceiling is reached:
|
||||
|
||||
```yaml
|
||||
budget_enforcement: pause # default when ceiling is set
|
||||
```
|
||||
|
||||
| Mode | Behavior |
|
||||
|------|----------|
|
||||
| `warn` | Log a warning, continue executing |
|
||||
| `pause` | Pause auto mode, wait for user action |
|
||||
| `halt` | Stop auto mode entirely |
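
In pseudo-implementation terms, the three modes reduce to a small dispatch on the configured value. A minimal TypeScript sketch; the function name and return tags are illustrative, not GSD's internal API:

```typescript
type Enforcement = "warn" | "pause" | "halt";

// Decide what auto mode should do once spend reaches the ceiling.
function onCeilingReached(mode: Enforcement): "continue" | "pause" | "stop" {
  switch (mode) {
    case "warn":
      console.warn("Budget ceiling reached; continuing.");
      return "continue"; // log and keep executing
    case "pause":
      return "pause";    // wait for user action
    case "halt":
      return "stop";     // end auto mode entirely
  }
}
```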

## Cost Projections

Once at least two slices have completed, GSD projects the remaining cost:

```
Projected remaining: $12.40 ($6.20/slice avg × 2 remaining)
```

Projections use per-slice averages from completed work. If the budget ceiling has been reached, a warning is appended.
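
The projection is a plain per-slice average extrapolation. A sketch, assuming a simple list of completed slice costs (the function name is hypothetical):

```typescript
// Project remaining cost from per-slice averages, as described above.
// Illustrative sketch; GSD's actual metrics code may differ.
function projectRemaining(completedCosts: number[], remainingSlices: number): number | null {
  if (completedCosts.length < 2) return null; // need at least two completed slices
  const avg = completedCosts.reduce((a, b) => a + b, 0) / completedCosts.length;
  return avg * remainingSlices;
}
```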

## Budget Pressure & Model Downgrading

When approaching the budget ceiling, the [complexity router](./token-optimization.md#budget-pressure) automatically downgrades model assignments to cheaper tiers. The downgrading is graduated:

- **< 50% used** — no adjustment
- **50-75% used** — standard tasks downgrade to light
- **75-90% used** — same mapping (standard → light), applied as the ceiling nears
- **> 90% used** — nearly everything downgrades; only heavy tasks stay at standard

This ensures the budget is spread across remaining work instead of being exhausted early on complex tasks.

## Token Profiles & Cost

The `token_profile` preference directly affects cost:

| Profile | Typical Savings | How |
|---------|----------------|-----|
| `budget` | 40-60% | Cheaper models, phase skipping, minimal context |
| `balanced` | 10-20% | Default models, skip slice research, standard context |
| `quality` | 0% (baseline) | Full models, all phases, full context |

See [Token Optimization](./token-optimization.md) for details.

## Tips

- Start with the `balanced` profile and a generous `budget_ceiling` to establish baseline costs
- Check `/gsd status` after a few slices to see per-slice cost averages
- Switch to the `budget` profile for well-understood, repetitive work
- Use `quality` only when architectural decisions are being made
- Per-phase model selection lets you use Opus only for planning while keeping execution on Sonnet

`docs/getting-started.md` (new file, 133 lines)

# Getting Started

## Install

```bash
npm install -g gsd-pi
```

Requires Node.js ≥ 20.6.0 (22+ recommended) and Git.

## First Launch

Run `gsd` in any directory:

```bash
gsd
```

On first launch, GSD runs a setup wizard:

1. **LLM Provider** — select from 20+ providers (Anthropic, OpenAI, Google, OpenRouter, GitHub Copilot, Amazon Bedrock, Azure, and more). OAuth flows handle Claude Max and Copilot subscriptions automatically; otherwise paste an API key.
2. **Tool API Keys** (optional) — Brave Search, Context7, Jina, Slack, Discord. Press Enter to skip any.

If you have an existing Pi installation, provider credentials are imported automatically.

Re-run the wizard anytime with:

```bash
gsd config
```

## Choose a Model

GSD auto-selects a default model after login. Switch later with:

```
/model
```

Or configure per-phase models in preferences — see [Configuration](./configuration.md).

## Two Ways to Work

### Step Mode — `/gsd`

Type `/gsd` inside a session. GSD executes one unit of work at a time, pausing between each with a wizard showing what completed and what's next.

- **No `.gsd/` directory** → starts a discussion flow to capture your project vision
- **Milestone exists, no roadmap** → discuss or research the milestone
- **Roadmap exists, slices pending** → plan the next slice or execute a task
- **Mid-task** → resume where you left off

Step mode is the on-ramp. You stay in the loop, reviewing output between each step.

### Auto Mode — `/gsd auto`

Type `/gsd auto` and walk away. GSD autonomously researches, plans, executes, verifies, commits, and advances through every slice until the milestone is complete.

```
/gsd auto
```

See [Auto Mode](./auto-mode.md) for full details.

## Two Terminals, One Project

The recommended workflow: auto mode in one terminal, steering from another.

**Terminal 1 — let it build:**

```bash
gsd
/gsd auto
```

**Terminal 2 — steer while it works:**

```bash
gsd
/gsd discuss # talk through architecture decisions
/gsd status  # check progress
/gsd queue   # queue the next milestone
```

Both terminals read and write the same `.gsd/` files. Decisions in terminal 2 are picked up at the next phase boundary automatically.

## Project Structure

GSD organizes work into a hierarchy:

```
Milestone → a shippable version (4-10 slices)
  Slice   → one demoable vertical capability (1-7 tasks)
    Task  → one context-window-sized unit of work
```

The iron rule: **a task must fit in one context window.** If it can't, it's two tasks.

All state lives on disk in `.gsd/`:

```
.gsd/
  PROJECT.md      — what the project is right now
  REQUIREMENTS.md — requirement contract (active/validated/deferred)
  DECISIONS.md    — append-only architectural decisions
  STATE.md        — quick-glance status
  milestones/
    M001/
      M001-ROADMAP.md — slice plan with risk levels and dependencies
      M001-CONTEXT.md — scope and goals from discussion
      slices/
        S01/
          S01-PLAN.md    — task decomposition
          S01-SUMMARY.md — what happened
          S01-UAT.md     — human test script
          tasks/
            T01-PLAN.md
            T01-SUMMARY.md
```

## Resume a Session

```bash
gsd --continue # or gsd -c
```

Resumes the most recent session for the current directory.

## Next Steps

- [Auto Mode](./auto-mode.md) — deep dive into autonomous execution
- [Configuration](./configuration.md) — model selection, timeouts, budgets
- [Commands Reference](./commands.md) — all commands and shortcuts

`docs/git-strategy.md` (new file, 92 lines)

# Git Strategy

GSD uses git worktrees for milestone isolation and sequential commits within each milestone. The strategy is fully automated — you don't need to manage branches manually.

## Branching Model

```
main ──────────────────────────────────────────────────────
  │                                                    ↑
  └── milestone/M001 (worktree) ───────────────────────┘
        commit: feat(S01/T01): core types
        commit: feat(S01/T02): markdown parser
        commit: feat(S01/T03): file writer
        commit: docs(M001/S01): workflow docs
        ...
        → squash-merged to main as single commit
```

### Key Properties

- **One worktree per milestone** — all work happens in `.gsd/worktrees/<MID>/`
- **Sequential commits on one branch** — no per-slice branches, no merge conflicts within a milestone
- **Squash merge to main** — when the milestone completes, all commits are squashed into one clean commit on main
- **Worktree teardown** — after merge, the worktree and branch are cleaned up

### Commit Format

Commits use conventional commit format with scope:

```
feat(S01/T01): core type definitions
feat(S01/T02): markdown parser for plan files
fix(M001/S03): bug fixes and doc corrections
docs(M001/S04): workflow documentation
```

## Worktree Management

### Automatic (Auto Mode)

Auto mode creates and manages worktrees automatically:

1. When a milestone starts, a worktree is created at `.gsd/worktrees/<MID>/` on branch `milestone/<MID>`
2. Planning artifacts from `.gsd/milestones/` are copied into the worktree
3. All execution happens inside the worktree
4. On milestone completion, the worktree is squash-merged to the integration branch
5. The worktree and branch are removed

### Manual

Use the `/worktree` (or `/wt`) command for manual worktree management:

```
/worktree create
/worktree switch
/worktree merge
/worktree remove
```

## Git Preferences

Configure git behavior in preferences:

```yaml
git:
  auto_push: false       # push after commits
  push_branches: false   # push milestone branch
  remote: origin
  snapshots: false       # WIP snapshot commits
  pre_merge_check: false # pre-merge validation
  commit_type: feat      # override commit type prefix
  main_branch: main      # primary branch name
  commit_docs: true      # commit .gsd/ to git
```

### `commit_docs: false`

When set to `false`, GSD adds `.gsd/` to `.gitignore` and keeps all planning artifacts local-only. Useful for teams where only some members use GSD, or when company policy requires a clean repository.

## Self-Healing

GSD includes automatic recovery for common git issues:

- **Detached HEAD** — automatically reattaches to the correct branch
- **Stale lock files** — removes `index.lock` files from crashed processes
- **Orphaned worktrees** — detects and offers to clean up abandoned worktrees

Run `/gsd doctor` to check git health manually.

## Native Git Operations

Since v2.16, GSD uses libgit2 via native bindings for read-heavy operations in the dispatch hot path. This eliminates ~70 process spawns per dispatch cycle, improving auto-mode throughput.

`docs/migration.md` (new file, 48 lines)

# Migration from v1

If you have projects with `.planning` directories from the original Get Shit Done (v1), you can migrate them to GSD-2's `.gsd` format.

## Running the Migration

```bash
# From within the project directory
/gsd migrate

# Or specify a path
/gsd migrate ~/projects/my-old-project
```

## What Gets Migrated

The migration tool:

- Parses your old `PROJECT.md`, `ROADMAP.md`, `REQUIREMENTS.md`, phase directories, plans, summaries, and research
- Maps phases → slices, plans → tasks, milestones → milestones
- Preserves completion state (`[x]` phases stay done, summaries carry over)
- Consolidates research files into the new structure
- Shows a preview before writing anything
- Optionally runs an agent-driven review of the output for quality assurance

## Supported Formats

The migration handles various v1 format variations:

- Milestone-sectioned roadmaps with `<details>` blocks
- Bold phase entries
- Bullet-format requirements
- Decimal phase numbering
- Duplicate phase numbers across milestones

## Requirements

Migration works best with a `ROADMAP.md` file for milestone structure. Without one, milestones are inferred from the `phases/` directory.

## Post-Migration

After migrating, verify the output with:

```
/gsd doctor
```

This checks `.gsd/` integrity and flags any structural issues.

`docs/remote-questions.md` (new file, 131 lines)

# Remote Questions

Remote questions allow GSD to ask for user input via Slack or Discord when running in headless auto mode. When GSD encounters a decision point that needs human input, it posts the question to your configured channel and polls for a response.

## Setup

### Discord

```
/gsd remote discord
```

The setup wizard:
1. Prompts for your Discord bot token
2. Validates the token against the Discord API
3. Lists servers the bot belongs to (or lets you pick)
4. Lists text channels in the selected server
5. Sends a test message to confirm permissions
6. Saves the configuration to `~/.gsd/preferences.md`

**Bot requirements:**
- A Discord bot application with a token (from the [Discord Developer Portal](https://discord.com/developers/applications))
- The bot must be invited to the target server with these permissions:
  - Send Messages
  - Read Message History
  - Add Reactions
  - View Channel
- The `DISCORD_BOT_TOKEN` environment variable must be set (the setup wizard handles this)

### Slack

```
/gsd remote slack
```

The setup wizard:
1. Prompts for your Slack bot token (`xoxb-...`)
2. Validates the token
3. Prompts for a channel ID
4. Sends a test message to confirm permissions
5. Saves the configuration

**Bot requirements:**
- A Slack app with a bot token (from the [Slack API](https://api.slack.com/apps))
- The bot must be invited to the target channel
- Required scopes: `chat:write`, `reactions:read`, `channels:history`

## Configuration

Remote questions are configured in `~/.gsd/preferences.md`:

```yaml
remote_questions:
  channel: discord # or slack
  channel_id: "1234567890123456789"
  timeout_minutes: 5 # 1-30, default 5
  poll_interval_seconds: 5 # 2-30, default 5
```

## How It Works

1. GSD encounters a decision point during auto mode
2. The question is posted to your configured channel as a rich embed (Discord) or Block Kit message (Slack)
3. GSD polls for a response at the configured interval
4. You respond by:
   - **Reacting** with a number emoji (1️⃣, 2️⃣, etc.) for single-question prompts
   - **Replying** to the message with a number (`1`), comma-separated numbers (`1,3`), or free text
5. GSD picks up the response and continues execution
6. On Discord, a ✅ reaction is added to the prompt message to confirm receipt

### Response Formats

**Single question:**
- React with a number emoji (Discord only, single-question prompts)
- Reply with a number: `2`
- Reply with free text (captured as a user note)

**Multiple questions:**
- Reply with semicolons: `1;2;custom text`
- Reply with newlines (one answer per line)
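
Parsing a multi-question reply can be sketched as splitting on semicolons (or newlines) and treating each piece as either an option number or free text. A hypothetical helper, not GSD's actual parser:

```typescript
type Answer = { kind: "option"; index: number } | { kind: "text"; text: string };

// Split a reply into one answer per question: semicolons or newlines delimit.
function parseReply(reply: string): Answer[] {
  const parts = reply.includes(";") ? reply.split(";") : reply.split(/\r?\n/);
  return parts
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((p) =>
      /^\d+$/.test(p)
        ? { kind: "option" as const, index: Number(p) }
        : { kind: "text" as const, text: p }
    );
}
```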

### Timeouts

If no response is received within `timeout_minutes`, the prompt times out and GSD continues with a timeout result. The LLM handles timeouts according to the task context — typically by making a conservative default choice or pausing auto mode.

## Commands

| Command | Description |
|---------|-------------|
| `/gsd remote` | Show remote questions menu and current status |
| `/gsd remote slack` | Set up Slack integration |
| `/gsd remote discord` | Set up Discord integration |
| `/gsd remote status` | Show current configuration and last prompt status |
| `/gsd remote disconnect` | Remove remote questions configuration |

## Discord vs Slack Feature Comparison

| Feature | Discord | Slack |
|---------|---------|-------|
| Rich message format | Embeds with fields | Block Kit |
| Reaction-based answers | ✅ (single-question) | ❌ |
| Thread-based replies | Message replies | Thread replies |
| Message URL in logs | ✅ | ✅ |
| Answer acknowledgement | ✅ reaction on receipt | Thread context |
| Multi-question support | Text replies (semicolons/newlines) | Text replies (semicolons/newlines) |
| Context source in prompt | ✅ (footer) | ❌ |
| Server/channel picker | ✅ (interactive) | Manual channel ID |
| Token validation | ✅ | ✅ |
| Test message on setup | ✅ | ✅ |

## Troubleshooting

### "Remote auth failed"
- Verify your bot token is correct and not expired
- For Discord: ensure the bot is still in the server
- For Slack: ensure the bot token starts with `xoxb-`

### "Could not send to channel"
- Verify the bot has Send Messages permission in the target channel
- For Discord: check the bot's role permissions in Server Settings
- For Slack: ensure the bot is invited to the channel (`/invite @botname`)

### No response detected
- Ensure you're **replying to** the prompt message (not posting a new message)
- For reactions: only number emojis (1️⃣-5️⃣) on single-question prompts are detected
- Check that `timeout_minutes` is long enough for your response time

### Channel ID format
- **Slack:** 9-12 uppercase alphanumeric characters (e.g., `C0123456789`)
- **Discord:** 17-20 digit numeric snowflake ID (e.g., `1234567890123456789`)
- Enable Developer Mode in Discord (Settings → Advanced) to copy channel IDs
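
These formats can be checked with two simple patterns. An illustrative validator (not GSD's internal code):

```typescript
// Validate channel IDs per the formats above.
const SLACK_ID = /^[A-Z0-9]{9,12}$/; // e.g. C0123456789
const DISCORD_ID = /^\d{17,20}$/;    // numeric snowflake

function isValidChannelId(channel: "slack" | "discord", id: string): boolean {
  return (channel === "slack" ? SLACK_ID : DISCORD_ID).test(id);
}
```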

`docs/skills.md` (new file, 84 lines)

# Skills

Skills are specialized instruction sets that GSD loads when the task matches. They provide domain-specific guidance for the LLM — coding patterns, framework idioms, testing strategies, and tool usage.

## Bundled Skills

GSD ships with these skills, installed to `~/.gsd/agent/skills/`:

| Skill | Trigger | Description |
|-------|---------|-------------|
| `frontend-design` | Web UI work — components, pages, dashboards, styling | Production-grade frontend with high design quality |
| `swiftui` | macOS/iOS apps — SwiftUI, Xcode, App Store | Full lifecycle from creation to shipping |
| `debug-like-expert` | Complex debugging — after standard approaches fail | Methodical investigation with evidence gathering |
| `rust-core` | Rust code — ownership, lifetimes, traits, async | Idiomatic, safe, performant Rust patterns |
| `axum-web-framework` | Axum web apps — routing, middleware, extractors | Complete Axum development guide |
| `axum-tests` | Testing Axum apps — integration tests, mock state | Test patterns for Axum applications |
| `tauri` | Tauri v2 desktop apps — setup, plugins, bundling | Cross-platform desktop app development |
| `tauri-ipc-developer` | Tauri IPC — React-Rust type-safe communication | Command scaffolding and serialization |
| `tauri-devtools` | Tauri debugging — CrabNebula DevTools integration | Profiling and monitoring |
| `github-workflows` | GitHub Actions — CI/CD, workflow debugging | Live syntax, run monitoring, failure diagnosis |
| `security-audit` | Security auditing — dependency scanning, OWASP | Comprehensive security assessment |
| `security-review` | Code security review — injection, XSS, auth flaws | Vulnerability-focused code review |
| `security-docker` | Docker security — Dockerfile, runtime hardening | Container security best practices |

## Skill Discovery

The `skill_discovery` preference controls how GSD finds skills during auto mode:

| Mode | Behavior |
|------|----------|
| `auto` | Skills are found and applied automatically |
| `suggest` | Skills are identified but require confirmation (default) |
| `off` | No skill discovery |

## Skill Preferences

Control which skills are used via preferences:

```yaml
---
version: 1
always_use_skills:
  - debug-like-expert
prefer_skills:
  - frontend-design
avoid_skills:
  - security-docker
skill_rules:
  - when: task involves Clerk authentication
    use: [clerk]
  - when: frontend styling work
    prefer: [frontend-design]
---
```

### Resolution Order

Skills can be referenced by:
1. **Bare name** — e.g., `frontend-design` → scans `~/.gsd/agent/skills/` and project skills
2. **Absolute path** — e.g., `/Users/you/.gsd/agent/skills/my-skill/SKILL.md`
3. **Directory path** — e.g., `~/custom-skills/my-skill` → looks for `SKILL.md` inside

User skills (`~/.gsd/agent/skills/`) take precedence over project skills.
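
The resolution order above can be sketched as a lookup that tries the path forms first, then scans the skill roots in precedence order. The function name and signature are illustrative, not GSD's API:

```typescript
// Sketch of the skill resolution order described above.
function resolveSkill(
  ref: string,
  roots: string[],                   // user skills dir first, then project dirs
  exists: (p: string) => boolean
): string | null {
  if (ref.endsWith("/SKILL.md")) {
    return exists(ref) ? ref : null; // explicit path to a SKILL.md file
  }
  if (ref.includes("/")) {
    const p = `${ref}/SKILL.md`;     // directory path: look inside it
    return exists(p) ? p : null;
  }
  for (const root of roots) {        // bare name: first matching root wins
    const p = `${root}/${ref}/SKILL.md`;
    if (exists(p)) return p;
  }
  return null;
}
```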

## Custom Skills

Create your own skills by adding a directory with a `SKILL.md` file:

```
~/.gsd/agent/skills/my-skill/
  SKILL.md    — instructions for the LLM
  references/ — optional reference files
```

The `SKILL.md` file contains instructions the LLM follows when the skill is active. Reference files can be loaded by the skill instructions as needed.

### Project-Local Skills

Place skills in your project for project-specific guidance:

```
.pi/agent/skills/my-project-skill/
  SKILL.md
```

`docs/token-optimization.md` (new file, 266 lines)

# Token Optimization

*Introduced in v2.17.0*

GSD 2.17 introduces a coordinated token optimization system that can reduce token usage by 40-60% without sacrificing output quality for most workloads. The system has three pillars: **token profiles**, **context compression**, and **complexity-based task routing**.

## Token Profiles

A token profile is a single preference that coordinates model selection, phase skipping, and context compression level. Set it in your preferences:

```yaml
---
version: 1
token_profile: balanced
---
```

Three profiles are available:

### `budget` — Maximum Savings (40-60% reduction)

Optimized for cost-sensitive workflows. Uses cheaper models, skips optional phases, and compresses dispatch context to the minimum needed.

| Dimension | Setting |
|-----------|---------|
| Planning model | Sonnet |
| Execution model | Sonnet |
| Simple task model | Haiku |
| Completion model | Haiku |
| Subagent model | Haiku |
| Milestone research | **Skipped** |
| Slice research | **Skipped** |
| Roadmap reassessment | **Skipped** |
| Context inline level | **Minimal** — drops decisions, requirements, extra templates |

Best for: prototyping, small projects, well-understood codebases, cost-conscious iteration.

### `balanced` — Smart Defaults (default)

The default profile. Keeps the important phases, skips the ones with diminishing returns for most projects, and uses standard context compression.

| Dimension | Setting |
|-----------|---------|
| Planning model | User's default |
| Execution model | User's default |
| Simple task model | User's default |
| Completion model | User's default |
| Subagent model | Sonnet |
| Milestone research | Runs |
| Slice research | **Skipped** |
| Roadmap reassessment | Runs |
| Context inline level | **Standard** — includes key context, drops low-signal extras |

Best for: most projects, day-to-day development.

### `quality` — Full Context (no compression)

Every phase runs. Every context artifact is inlined. No shortcuts.

| Dimension | Setting |
|-----------|---------|
| All models | User's configured defaults |
| All phases | Run |
| Context inline level | **Full** — everything inlined |

Best for: complex architectures, greenfield projects requiring deep research, critical production work.

## Context Compression

Each token profile maps to an **inline level** that controls how much context is pre-loaded into dispatch prompts:

| Profile | Inline Level | What's Included |
|---------|-------------|-----------------|
| `budget` | `minimal` | Task plan, essential prior summaries (truncated). Drops decisions register, requirements, UAT template, secrets manifest. |
| `balanced` | `standard` | Task plan, prior summaries, slice plan, roadmap excerpt. Drops some supplementary templates. |
| `quality` | `full` | Everything — all plans, summaries, decisions, requirements, templates, and root files. |

### How Compression Works

Dispatch prompt builders accept an `inlineLevel` parameter. At each level, specific artifacts are gated:

**Minimal level reductions:**
- `buildExecuteTaskPrompt` — drops the decisions template, truncates prior summaries to the most recent one
- `buildPlanMilestonePrompt` — drops `PROJECT.md`, `REQUIREMENTS.md`, decisions, and supplementary templates like `secrets-manifest`
- `buildCompleteSlicePrompt` — drops requirements and UAT template inlining
- `buildCompleteMilestonePrompt` — drops root GSD file inlining
- `buildReassessRoadmapPrompt` — drops project, requirements, and decisions files

These are cumulative — `standard` drops a subset, `minimal` drops more. The `full` level preserves all context (the pre-2.17 behavior).
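
One way to picture the gating is a per-artifact minimum level consulted by each builder. A schematic sketch; the artifact names follow the tables above, but the code shape is an assumption, not GSD's internals:

```typescript
type InlineLevel = "minimal" | "standard" | "full";

// Each artifact is tagged with the cheapest level that still includes it.
// "full" preserves everything (the pre-2.17 behavior).
const MIN_LEVEL: Record<string, InlineLevel> = {
  taskPlan: "minimal",
  priorSummaries: "minimal",
  slicePlan: "standard",
  roadmapExcerpt: "standard",
  decisions: "full",
  requirements: "full",
  uatTemplate: "full",
};

const ORDER: InlineLevel[] = ["minimal", "standard", "full"];

function includeArtifact(artifact: string, level: InlineLevel): boolean {
  const needed = MIN_LEVEL[artifact] ?? "full"; // unknown artifacts only at full
  return ORDER.indexOf(level) >= ORDER.indexOf(needed);
}
```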

### Overriding Inline Level

The inline level is derived from your `token_profile`. To control phases independently of the profile, use the `phases` preference:

```yaml
---
version: 1
token_profile: budget
phases:
  skip_research: false # override: run research even on budget
---
```

Explicit `phases` settings always override the profile defaults.

## Complexity-Based Task Routing

GSD automatically classifies each task by complexity and routes it to an appropriate model tier. This means simple documentation fixes don't burn expensive Opus tokens, while complex architectural work gets the reasoning power it needs.

### How Classification Works

Tasks are classified by analyzing the task plan:

| Signal | Simple | Standard | Complex |
|--------|--------|----------|---------|
| Step count | ≤ 3 | 4-7 | ≥ 8 |
| File count | ≤ 3 | 4-7 | ≥ 8 |
| Description length | < 500 chars | 500-2000 | > 2000 chars |
| Code blocks | — | — | ≥ 5 |
| Signal words | None | Any present | — |

**Signal words** that prevent simple classification: `research`, `investigate`, `refactor`, `migrate`, `integrate`, `complex`, `architect`, `redesign`, `security`, `performance`, `concurrent`, `parallel`, `distributed`, `backward compat`, `migration`, `architecture`, `concurrency`, `compatibility`.

Empty or malformed plans default to `standard` (conservative).
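
The thresholds in the table translate directly into a rule check. An illustrative classifier (the field and function names are assumptions, not GSD's internal code):

```typescript
type Tier = "simple" | "standard" | "complex";

interface PlanSignals {
  steps: number;
  files: number;
  descriptionChars: number;
  codeBlocks: number;
  hasSignalWords: boolean; // e.g. "refactor", "migrate", "security"
}

function classifyTask(p: PlanSignals | null): Tier {
  if (!p) return "standard"; // empty/malformed plans stay conservative
  if (p.steps >= 8 || p.files >= 8 || p.descriptionChars > 2000 || p.codeBlocks >= 5) {
    return "complex";
  }
  if (p.steps <= 3 && p.files <= 3 && p.descriptionChars < 500 && !p.hasSignalWords) {
    return "simple"; // every signal below the simple threshold
  }
  return "standard";
}
```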

### Unit Type Defaults

Non-task units have built-in tier assignments:

| Unit Type | Default Tier |
|-----------|-------------|
| `complete-slice`, `run-uat` | Light |
| `research-*`, `plan-*`, `execute-task`, `complete-milestone` | Standard |
| `replan-slice`, `reassess-roadmap` | Heavy |
| `hook/*` | Light |

### Model Routing

Each tier maps to a model configuration:

| Tier | Model Phase Key | Typical Model |
|------|----------------|---------------|
| Light | `completion` | Haiku (budget) / user default |
| Standard | `execution` | Sonnet / user default |
| Heavy | `execution` | Opus / user default |

Simple tasks use the `execution_simple` model key when configured. The `budget` profile sets this to Haiku automatically.

### Budget Pressure

When approaching your budget ceiling, the classifier automatically downgrades tiers:

| Budget Used | Effect |
|------------|--------|
| < 50% | No adjustment |
| 50-75% | Standard → Light |
| 75-90% | Standard → Light |
| > 90% | Everything except Heavy → Light; Heavy → Standard |

This graduated approach preserves model quality for the most complex work while progressively reducing cost as the ceiling approaches.
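
The table maps onto a small adjustment function. A sketch, assuming `usedFraction` is spend divided by `budget_ceiling` (names are illustrative):

```typescript
type ModelTier = "light" | "standard" | "heavy";

// Apply the graduated downgrade table to a classified tier.
function applyBudgetPressure(tier: ModelTier, usedFraction: number): ModelTier {
  if (usedFraction < 0.5) return tier;            // no adjustment
  if (usedFraction <= 0.9) {
    return tier === "standard" ? "light" : tier;  // standard -> light
  }
  // > 90%: heavy -> standard, everything else -> light
  return tier === "heavy" ? "standard" : "light";
}
```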

## Adaptive Learning (Routing History)

GSD tracks the success and failure of each tier assignment over time and adjusts future classifications accordingly. No opt-in is required — tracking happens automatically, and the data persists in `.gsd/routing-history.json`.

### How It Works

1. After each unit completes, the outcome (success/failure) is recorded against the unit type and tier used
2. Outcomes are tracked per-pattern (e.g., `execute-task`, `execute-task:docs`) with a rolling window of the last 50 entries
3. If a tier's failure rate exceeds 20% for a given pattern, future classifications for that pattern are bumped up one tier
4. The system also accepts tag-specific patterns (e.g., `execute-task:test` vs `execute-task:frontend`) for more granular routing
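
The rolling window and failure-rate bump can be sketched as two small helpers (window size and threshold come from the text above; everything else is illustrative):

```typescript
const WINDOW = 50;              // rolling window of recent outcomes
const FAILURE_THRESHOLD = 0.2;  // bump the tier above 20% failures

// Append an outcome, keeping only the most recent WINDOW entries.
function recordOutcome(history: boolean[], success: boolean): boolean[] {
  const next = [...history, success];
  return next.length > WINDOW ? next.slice(next.length - WINDOW) : next;
}

type Tier = "light" | "standard" | "heavy";

// Bump the classified tier up one step if the pattern fails too often.
function maybeBumpTier(tier: Tier, history: boolean[]): Tier {
  if (history.length === 0) return tier;
  const failures = history.filter((ok) => !ok).length;
  if (failures / history.length <= FAILURE_THRESHOLD) return tier;
  return tier === "light" ? "standard" : "heavy"; // heavy has no higher tier
}
```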
|
||||
|
||||
### User Feedback
|
||||
|
||||
GSD accepts manual feedback to accelerate learning:
|
||||
|
||||
- **"over"** — the model was overpowered for this task (encourages downgrading)
|
||||
- **"under"** — the model wasn't capable enough (encourages upgrading)
|
||||
- **"ok"** — correct assignment (no adjustment)
|
||||
|
||||
Feedback signals are weighted 2× compared to automatic outcomes.
|
||||
|
||||
### Data Management
|
||||
|
||||
```bash
|
||||
# Routing history is stored per-project
|
||||
.gsd/routing-history.json
|
||||
|
||||
# Clear history to reset adaptive learning
|
||||
# (happens via the routing-history module API)
|
||||
```
|
||||
|
||||
The feedback array is capped at 200 entries. Per-pattern outcome counts use a rolling window of 50 to prevent stale data from dominating.
|
||||
|
||||
## Configuration Examples

### Cost-Optimized Setup

```yaml
---
version: 1
token_profile: budget
budget_ceiling: 25.00
models:
  execution_simple: claude-haiku-4-5-20250414
---
```

### Balanced with Custom Models

```yaml
---
version: 1
token_profile: balanced
models:
  planning:
    model: claude-opus-4-6
    fallbacks:
      - openrouter/z-ai/glm-5
  execution: claude-sonnet-4-6
---
```

### Full Quality for Critical Work

```yaml
---
version: 1
token_profile: quality
models:
  planning: claude-opus-4-6
  execution: claude-opus-4-6
---
```

### Per-Phase Overrides

The `token_profile` sets defaults, but explicit preferences always win:

```yaml
---
version: 1
token_profile: budget
phases:
  skip_research: false # override: keep milestone research
models:
  planning: claude-opus-4-6 # override: use Opus for planning despite budget profile
---
```
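The precedence rule in the last example (explicit preferences beat profile defaults) reduces to a shallow merge where the explicit layer wins. A sketch; the record shape is illustrative, not GSD's actual config type:

```typescript
// Explicit preferences win over profile defaults at every key.
// Illustrative shape only, not GSD's actual preference types.
type ModelConfig = Record<string, string>;

function resolveModels(profileDefaults: ModelConfig, explicit: ModelConfig): ModelConfig {
  return { ...profileDefaults, ...explicit }; // later spread wins
}
```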
## How the Pieces Fit Together

```
preferences.md
└─ token_profile: balanced
   ├─ resolveProfileDefaults() → model defaults + phase skip defaults
   ├─ resolveInlineLevel() → standard
   │  └─ prompt builders gate context inclusion by level
   └─ classifyUnitComplexity() → routes to execution/execution_simple model
      ├─ task plan analysis (steps, files, signals)
      ├─ unit type defaults
      ├─ budget pressure adjustment
      └─ adaptive learning from routing-history.json
```

The profile is resolved once and flows through the entire dispatch pipeline. Explicit preferences override profile defaults at every layer.
114 docs/troubleshooting.md Normal file

@@ -0,0 +1,114 @@
# Troubleshooting

## `/gsd doctor`

The built-in diagnostic tool validates `.gsd/` integrity:

```
/gsd doctor
```

It checks:
- File structure and naming conventions
- Roadmap ↔ slice ↔ task referential integrity
- Completion state consistency
- Git worktree health
- Stale lock files and orphaned runtime records

## Common Issues

### Auto mode loops on the same unit

**Symptoms:** The same unit (e.g., `research-slice` or `plan-slice`) dispatches repeatedly until hitting the dispatch limit.

**Causes:**
- Stale cache after a crash — the in-memory file listing doesn't reflect new artifacts
- The LLM didn't produce the expected artifact file

**Fix:** Run `/gsd doctor` to repair state, then resume with `/gsd auto`. If the issue persists, check that the expected artifact file exists on disk.

### Auto mode stops with "Loop detected"

**Cause:** A unit failed to produce its expected artifact twice in a row.

**Fix:** Check the task plan for clarity. If the plan is ambiguous, refine it manually, then run `/gsd auto` to resume.

### Wrong files in worktree

**Symptoms:** Planning artifacts or code appear in the wrong directory.

**Cause:** The LLM wrote to the main repo instead of the worktree.

**Fix:** This was fixed in v2.14+. If you're on an older version, update. The dispatch prompt now includes explicit working directory instructions.

### `npm install -g gsd-pi` fails

**Common causes:**
- Missing workspace packages — fixed in v2.10.4+
- `postinstall` hangs on Linux (Playwright `--with-deps` triggering sudo) — fixed in v2.3.6+
- Node.js version too old — requires ≥ 20.6.0

### Provider errors during auto mode

**Symptoms:** Auto mode pauses with a provider error (rate limit, auth failure, etc.).

**Fix:** GSD automatically tries fallback models if configured. To add fallbacks:

```yaml
models:
  execution:
    model: claude-sonnet-4-6
    fallbacks:
      - openrouter/minimax/minimax-m2.5
```

### Budget ceiling reached

**Symptoms:** Auto mode pauses with "Budget ceiling reached."

**Fix:** Increase `budget_ceiling` in preferences, or switch to the `budget` token profile to reduce per-unit cost, then resume with `/gsd auto`.

### Stale lock file

**Symptoms:** Auto mode won't start and reports that another session is running.

**Fix:** If no other session is actually running, delete `.gsd/auto.lock` manually. GSD includes stale lock detection (it checks whether the PID is still alive), but edge cases exist.
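The stale-lock detection mentioned above boils down to a PID liveness probe. A sketch using Node's `process.kill(pid, 0)`, where signal `0` performs an existence check without sending a signal; this is an assumed implementation, not necessarily GSD's exact code:

```typescript
// Probe whether the process that wrote the lock file is still alive.
// Signal 0 checks existence/permissions without delivering a signal.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as { code?: string }).code === "EPERM";
  }
}
```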
### Git merge conflicts

**Symptoms:** Worktree merge fails on `.gsd/` files.

**Fix:** GSD auto-resolves conflicts on `.gsd/` runtime files. For content conflicts in code files, the LLM is given an opportunity to resolve them via a fix-merge session. If that fails, manual resolution is needed.

## Recovery Procedures

### Reset auto mode state

```bash
rm .gsd/auto.lock
rm .gsd/completed-units.json
```

Then run `/gsd auto` to restart from the current disk state.

### Reset routing history

If adaptive model routing is producing bad results, clear the routing history:

```bash
rm .gsd/routing-history.json
```

### Full state rebuild

```
/gsd doctor
```

Doctor rebuilds `STATE.md` from plan and roadmap files on disk and fixes detected inconsistencies.

## Getting Help

- **GitHub Issues:** [github.com/gsd-build/GSD-2/issues](https://github.com/gsd-build/GSD-2/issues)
- **Dashboard:** `Ctrl+Alt+G` or `/gsd status` for real-time diagnostics
- **Session logs:** `.gsd/activity/` contains JSONL session dumps for crash forensics
99 docs/working-in-teams.md Normal file

@@ -0,0 +1,99 @@
# Working in Teams

GSD supports multi-user workflows where several developers work on the same repository concurrently.

## Setup

### 1. Enable Unique Milestone IDs

Prevent ID collisions when multiple developers create milestones:

```yaml
# .gsd/preferences.md (project-level, committed to git)
---
version: 1
unique_milestone_ids: true
---
```

This generates milestone IDs like `M001-eh88as` instead of plain `M001`. The random suffix ensures no two developers clash.

### 2. Configure `.gitignore`

Share planning artifacts (milestones, roadmaps, decisions) while keeping runtime files local:

```bash
# ── GSD: Runtime / Ephemeral (per-developer, per-session) ──────
.gsd/auto.lock
.gsd/completed-units.json
.gsd/STATE.md
.gsd/metrics.json
.gsd/activity/
.gsd/runtime/
.gsd/worktrees/
.gsd/milestones/**/continue.md
.gsd/milestones/**/*-CONTINUE.md
```

**What gets shared** (committed to git):
- `.gsd/preferences.md` — project preferences
- `.gsd/PROJECT.md` — living project description
- `.gsd/REQUIREMENTS.md` — requirement contract
- `.gsd/DECISIONS.md` — architectural decisions
- `.gsd/milestones/` — roadmaps, plans, summaries, research

**What stays local** (gitignored):
- Lock files, metrics, state cache, runtime records, worktrees, activity logs

### 3. Commit the Preferences

```bash
git add .gsd/preferences.md
git commit -m "chore: enable GSD team workflow"
```

## `commit_docs: false`

For teams where only some members use GSD, or when company policy requires a clean repo:

```yaml
git:
  commit_docs: false
```

This adds `.gsd/` to `.gitignore` entirely and keeps all artifacts local. The developer gets the benefits of structured planning without affecting teammates who don't use GSD.

## Migrating an Existing Project

If you have an existing project with `.gsd/` blanket-ignored:

1. Ensure no milestones are in progress (clean state)
2. Update `.gitignore` to use the selective pattern above
3. Add `unique_milestone_ids: true` to `.gsd/preferences.md`
4. Optionally rename existing milestones to use unique IDs:
   ```
   I have turned on unique milestone ids, please update all old milestone
   ids to use this new format e.g. M001-abc123 where abc123 is a random
   6 char lowercase alphanumeric string. Update all references in all
   .gsd file contents, file names and directory names. Validate your work
   once done to ensure referential integrity.
   ```
5. Commit

## Parallel Development

Multiple developers can run auto mode simultaneously on different milestones. Each developer:

- Gets their own worktree (`.gsd/worktrees/<MID>/`, gitignored)
- Works on a unique `milestone/<MID>` branch
- Squash-merges to main independently

Milestone dependencies can be declared in `M00X-CONTEXT.md` frontmatter:

```yaml
---
depends_on: [M001-eh88as]
---
```

GSD enforces that dependent milestones complete before starting downstream work.
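The suffix format used by `unique_milestone_ids` (e.g., `M001-eh88as`: a 6-character lowercase alphanumeric string) can be generated as below. The helper name is illustrative, not GSD's actual implementation:

```typescript
// Build an id like "M001-eh88as": base id plus a 6-char lowercase
// alphanumeric suffix. Illustrative only; GSD's generator may differ.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

function uniqueMilestoneId(base: string): string {
  let suffix = "";
  for (let i = 0; i < 6; i++) {
    suffix += ALPHABET[Math.floor(Math.random() * ALPHABET.length)];
  }
  return `${base}-${suffix}`;
}
```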
@@ -1,6 +1,6 @@
 {
   "name": "@gsd-build/engine-darwin-arm64",
-  "version": "2.17.0",
+  "version": "2.19.0",
   "description": "GSD native engine binary for macOS ARM64",
   "os": [
     "darwin"

@@ -1,6 +1,6 @@
 {
   "name": "@gsd-build/engine-darwin-x64",
-  "version": "2.17.0",
+  "version": "2.19.0",
   "description": "GSD native engine binary for macOS Intel",
   "os": [
     "darwin"

@@ -1,6 +1,6 @@
 {
   "name": "@gsd-build/engine-linux-arm64-gnu",
-  "version": "2.17.0",
+  "version": "2.19.0",
   "description": "GSD native engine binary for Linux ARM64 (glibc)",
   "os": [
     "linux"

@@ -1,6 +1,6 @@
 {
   "name": "@gsd-build/engine-linux-x64-gnu",
-  "version": "2.17.0",
+  "version": "2.19.0",
   "description": "GSD native engine binary for Linux x64 (glibc)",
   "os": [
     "linux"

@@ -1,6 +1,6 @@
 {
   "name": "@gsd-build/engine-win32-x64-msvc",
-  "version": "2.17.0",
+  "version": "2.19.0",
   "description": "GSD native engine binary for Windows x64 (MSVC)",
   "os": [
     "win32"

4 package-lock.json generated

@@ -1,12 +1,12 @@
 {
   "name": "gsd-pi",
-  "version": "2.16.0",
+  "version": "2.19.0",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "gsd-pi",
-      "version": "2.16.0",
+      "version": "2.19.0",
       "hasInstallScript": true,
       "license": "MIT",
       "workspaces": [

@@ -1,6 +1,6 @@
 {
   "name": "gsd-pi",
-  "version": "2.17.0",
+  "version": "2.19.0",
   "description": "GSD — Get Shit Done coding agent",
   "license": "MIT",
   "repository": {
@@ -38,6 +38,11 @@ export interface Args {
   themes?: string[];
   noThemes?: boolean;
   listModels?: string | true;
+  discover?: boolean;
+  addProvider?: string;
+  addProviderBaseUrl?: string;
+  addProviderApiKey?: string;
+  discoverModels?: string | true;
   offline?: boolean;
   verbose?: boolean;
   messages: string[];

@@ -150,6 +155,18 @@ export function parseArgs(args: string[], extensionFlags?: Map<string, { type: "
       } else {
         result.listModels = true;
       }
+    } else if (arg === "--discover") {
+      result.discover = true;
+    } else if (arg === "--add-provider" && i + 1 < args.length) {
+      result.addProvider = args[++i];
+    } else if (arg === "--base-url" && i + 1 < args.length) {
+      result.addProviderBaseUrl = args[++i];
+    } else if (arg === "--discover-models") {
+      if (i + 1 < args.length && !args[i + 1].startsWith("-") && !args[i + 1].startsWith("@")) {
+        result.discoverModels = args[++i];
+      } else {
+        result.discoverModels = true;
+      }
     } else if (arg === "--verbose") {
       result.verbose = true;
     } else if (arg === "--offline") {

@@ -219,6 +236,10 @@ ${chalk.bold("Options:")}
   --no-themes                   Disable theme discovery and loading
   --export <file>               Export session file to HTML and exit
   --list-models [search]        List available models (with optional fuzzy search)
+  --discover                    Include discovered models in --list-models output
+  --discover-models [provider]  Discover models from provider APIs (all or specific)
+  --add-provider <name>         Add a provider to models.json (use with --base-url, --api-key)
+  --base-url <url>              Base URL for --add-provider
   --verbose                     Force verbose startup (overrides quietStartup setting)
   --offline                     Disable startup network operations (same as PI_OFFLINE=1)
   --help, -h                    Show this help
@@ -1,11 +1,18 @@
 /**
- * List available models with optional fuzzy search
+ * List available models with optional fuzzy search and discovery support
  */

 import type { Api, Model } from "@gsd/pi-ai";
 import { fuzzyFilter } from "@gsd/pi-tui";
 import type { ModelRegistry } from "../core/model-registry.js";
+
+export interface ListModelsOptions {
+  /** Include discovered models in output */
+  discover?: boolean;
+  /** Search pattern for fuzzy filtering */
+  searchPattern?: string;
+}

 /**
  * Format a number as human-readable (e.g., 200000 -> "200K", 1000000 -> "1M")
  */

@@ -22,10 +29,48 @@ function formatTokenCount(count: number): string {
 }

 /**
- * List available models, optionally filtered by search pattern
+ * Discover models from provider APIs and print results.
  */
-export async function listModels(modelRegistry: ModelRegistry, searchPattern?: string): Promise<void> {
-  const models = modelRegistry.getAvailable();
+export async function discoverAndPrintModels(
+  modelRegistry: ModelRegistry,
+  provider?: string,
+): Promise<void> {
+  const providers = provider ? [provider] : undefined;
+
+  console.log("Discovering models...");
+  const results = await modelRegistry.discoverModels(providers);
+
+  for (const result of results) {
+    if (result.error) {
+      console.log(`  ${result.provider}: error - ${result.error}`);
+    } else {
+      console.log(`  ${result.provider}: ${result.models.length} models found`);
+    }
+  }
+}
+
+/**
+ * List available models, optionally filtered by search pattern.
+ * Accepts either a string (backward compat) or ListModelsOptions.
+ */
+export async function listModels(
+  modelRegistry: ModelRegistry,
+  optionsOrSearch?: string | ListModelsOptions,
+): Promise<void> {
+  const options: ListModelsOptions =
+    typeof optionsOrSearch === "string"
+      ? { searchPattern: optionsOrSearch }
+      : optionsOrSearch ?? {};
+
+  // If discover flag is set, run discovery first
+  if (options.discover) {
+    await modelRegistry.discoverModels();
+  }
+
+  // Get models — include discovered if discovery was run
+  const models = options.discover
+    ? modelRegistry.getAllWithDiscovered()
+    : modelRegistry.getAvailable();

   if (models.length === 0) {
     console.log("No models available. Set API keys in environment variables.");

@@ -34,12 +79,12 @@ export async function listModels(modelRegistry: ModelRegistry, searchPattern?: s

   // Apply fuzzy filter if search pattern provided
   let filteredModels: Model<Api>[] = models;
-  if (searchPattern) {
-    filteredModels = fuzzyFilter(models, searchPattern, (m) => `${m.provider} ${m.id}`);
+  if (options.searchPattern) {
+    filteredModels = fuzzyFilter(models, options.searchPattern, (m) => `${m.provider} ${m.id}`);
   }

   if (filteredModels.length === 0) {
-    console.log(`No models matching "${searchPattern}"`);
+    console.log(`No models matching "${options.searchPattern}"`);
     return;
   }

@@ -53,15 +98,19 @@ export async function listModels(modelRegistry: ModelRegistry, searchPattern?: s
   });

   // Calculate column widths
-  const rows = filteredModels.map((m) => ({
-    provider: m.provider,
-    model: m.id,
-    name: m.name,
-    context: formatTokenCount(m.contextWindow),
-    maxOut: formatTokenCount(m.maxTokens),
-    thinking: m.reasoning ? "yes" : "no",
-    images: m.input.includes("image") ? "yes" : "no",
-  }));
+  const rows = filteredModels.map((m) => {
+    const isDiscovered = options.discover && modelRegistry.isDiscovered(m);
+    return {
+      provider: m.provider,
+      model: m.id,
+      name: m.name,
+      context: formatTokenCount(m.contextWindow),
+      maxOut: formatTokenCount(m.maxTokens),
+      thinking: m.reasoning ? "yes" : "no",
+      images: m.input.includes("image") ? "yes" : "no",
+      badge: isDiscovered ? "[discovered]" : "",
+    };
+  });

   const headers = {
     provider: "provider",

@@ -71,6 +120,7 @@ export async function listModels(modelRegistry: ModelRegistry, searchPattern?: s
     maxOut: "max-out",
     thinking: "thinking",
     images: "images",
+    badge: "",
   };

   const widths = {

@@ -105,7 +155,10 @@ export async function listModels(modelRegistry: ModelRegistry, searchPattern?: s
       row.maxOut.padEnd(widths.maxOut),
       row.thinking.padEnd(widths.thinking),
       row.images.padEnd(widths.images),
-    ].join("  ");
+      row.badge,
+    ]
+      .join("  ")
+      .trimEnd();
     console.log(line);
   }
 }
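The string-or-options signature in the `listModels` change above is a common backward-compatibility pattern: accept the legacy bare string or the new options object, then normalize to one shape. A standalone sketch of just that normalization step, with a simplified type:

```typescript
// Normalize a legacy search string or a new options object into one shape.
// Mirrors the pattern used in the diff above; the type is simplified here.
interface ListOptions {
  discover?: boolean;
  searchPattern?: string;
}

function normalizeOptions(optionsOrSearch?: string | ListOptions): ListOptions {
  return typeof optionsOrSearch === "string"
    ? { searchPattern: optionsOrSearch }
    : optionsOrSearch ?? {};
}
```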
170 packages/pi-coding-agent/src/core/discovery-cache.test.ts Normal file

@@ -0,0 +1,170 @@
import assert from "node:assert/strict";
import { existsSync, mkdirSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, it } from "node:test";
import { ModelDiscoveryCache } from "./discovery-cache.js";

let testDir: string;
let cachePath: string;

beforeEach(() => {
  testDir = join(tmpdir(), `discovery-cache-test-${Date.now()}-${Math.random().toString(36).slice(2)}`);
  mkdirSync(testDir, { recursive: true });
  cachePath = join(testDir, "discovery-cache.json");
});

afterEach(() => {
  try {
    rmSync(testDir, { recursive: true, force: true });
  } catch {
    // Cleanup best-effort
  }
});

// ─── basic operations ────────────────────────────────────────────────────────

describe("ModelDiscoveryCache — basic operations", () => {
  it("starts with no entries", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    assert.equal(cache.get("openai"), undefined);
  });

  it("stores and retrieves models", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    const models = [{ id: "gpt-4o", name: "GPT-4o" }];
    cache.set("openai", models);

    const entry = cache.get("openai");
    assert.ok(entry);
    assert.deepEqual(entry.models, models);
    assert.ok(entry.fetchedAt > 0);
    assert.ok(entry.ttlMs > 0);
  });

  it("persists to disk and reloads", () => {
    const cache1 = new ModelDiscoveryCache(cachePath);
    cache1.set("openai", [{ id: "gpt-4o" }]);

    const cache2 = new ModelDiscoveryCache(cachePath);
    const entry = cache2.get("openai");
    assert.ok(entry);
    assert.equal(entry.models[0].id, "gpt-4o");
  });

  it("clear removes a specific provider", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    cache.set("openai", [{ id: "gpt-4o" }]);
    cache.set("google", [{ id: "gemini-pro" }]);

    cache.clear("openai");
    assert.equal(cache.get("openai"), undefined);
    assert.ok(cache.get("google"));
  });

  it("clear without provider removes all entries", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    cache.set("openai", [{ id: "gpt-4o" }]);
    cache.set("google", [{ id: "gemini-pro" }]);

    cache.clear();
    assert.equal(cache.get("openai"), undefined);
    assert.equal(cache.get("google"), undefined);
  });
});

// ─── staleness ───────────────────────────────────────────────────────────────

describe("ModelDiscoveryCache — staleness", () => {
  it("newly set entries are not stale", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    cache.set("openai", [{ id: "gpt-4o" }]);
    assert.equal(cache.isStale("openai"), false);
  });

  it("missing providers are stale", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    assert.equal(cache.isStale("unknown"), true);
  });

  it("entries with expired TTL are stale", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    cache.set("openai", [{ id: "gpt-4o" }], 1); // 1ms TTL

    // Wait for TTL to expire
    const start = Date.now();
    while (Date.now() - start < 5) {
      // busy wait
    }

    assert.equal(cache.isStale("openai"), true);
  });
});

// ─── getAll ──────────────────────────────────────────────────────────────────

describe("ModelDiscoveryCache — getAll", () => {
  it("returns non-stale entries by default", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    cache.set("openai", [{ id: "gpt-4o" }]);
    cache.set("stale", [{ id: "old" }], 1);

    // Wait for stale TTL
    const start = Date.now();
    while (Date.now() - start < 5) {
      // busy wait
    }

    const all = cache.getAll();
    assert.ok(all.has("openai"));
    assert.ok(!all.has("stale"));
  });

  it("returns all entries when includeStale is true", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    cache.set("openai", [{ id: "gpt-4o" }]);
    cache.set("stale", [{ id: "old" }], 1);

    // Wait for stale TTL
    const start = Date.now();
    while (Date.now() - start < 5) {
      // busy wait
    }

    const all = cache.getAll(true);
    assert.ok(all.has("openai"));
    assert.ok(all.has("stale"));
  });
});

// ─── edge cases ──────────────────────────────────────────────────────────────

describe("ModelDiscoveryCache — edge cases", () => {
  it("handles corrupted cache file gracefully", () => {
    writeFileSync(cachePath, "not valid json", "utf-8");
    const cache = new ModelDiscoveryCache(cachePath);
    assert.equal(cache.get("openai"), undefined);
  });

  it("handles wrong version gracefully", () => {
    writeFileSync(cachePath, JSON.stringify({ version: 99, entries: {} }), "utf-8");
    const cache = new ModelDiscoveryCache(cachePath);
    assert.equal(cache.get("openai"), undefined);
  });

  it("handles missing cache file", () => {
    const cache = new ModelDiscoveryCache(join(testDir, "nonexistent", "cache.json"));
    assert.equal(cache.get("openai"), undefined);
  });

  it("overwrites existing entry for same provider", () => {
    const cache = new ModelDiscoveryCache(cachePath);
    cache.set("openai", [{ id: "gpt-4o" }]);
    cache.set("openai", [{ id: "gpt-4o-mini" }]);

    const entry = cache.get("openai");
    assert.ok(entry);
    assert.equal(entry.models.length, 1);
    assert.equal(entry.models[0].id, "gpt-4o-mini");
  });
});
97 packages/pi-coding-agent/src/core/discovery-cache.ts Normal file

@@ -0,0 +1,97 @@
/**
 * Disk-based cache for discovered models.
 * Stores results at {agentDir}/discovery-cache.json with per-provider TTLs.
 */

import { existsSync, mkdirSync, readFileSync, writeFileSync } from "fs";
import { dirname, join } from "path";
import { getAgentDir } from "../config.js";
import { type DiscoveredModel, getDefaultTTL } from "./model-discovery.js";

export interface DiscoveryCacheEntry {
  models: DiscoveredModel[];
  fetchedAt: number;
  ttlMs: number;
}

export interface DiscoveryCacheData {
  version: 1;
  entries: Record<string, DiscoveryCacheEntry>;
}

export class ModelDiscoveryCache {
  private data: DiscoveryCacheData;
  private cachePath: string;

  constructor(cachePath?: string) {
    this.cachePath = cachePath ?? join(getAgentDir(), "discovery-cache.json");
    this.data = { version: 1, entries: {} };
    this.load();
  }

  get(provider: string): DiscoveryCacheEntry | undefined {
    const entry = this.data.entries[provider];
    return entry;
  }

  set(provider: string, models: DiscoveredModel[], ttlMs?: number): void {
    this.data.entries[provider] = {
      models,
      fetchedAt: Date.now(),
      ttlMs: ttlMs ?? getDefaultTTL(provider),
    };
    this.save();
  }

  isStale(provider: string): boolean {
    const entry = this.data.entries[provider];
    if (!entry) return true;
    return Date.now() - entry.fetchedAt > entry.ttlMs;
  }

  clear(provider?: string): void {
    if (provider) {
      delete this.data.entries[provider];
    } else {
      this.data.entries = {};
    }
    this.save();
  }

  getAll(includeStale = false): Map<string, DiscoveryCacheEntry> {
    const result = new Map<string, DiscoveryCacheEntry>();
    for (const [provider, entry] of Object.entries(this.data.entries)) {
      if (includeStale || !this.isStale(provider)) {
        result.set(provider, entry);
      }
    }
    return result;
  }

  load(): void {
    try {
      if (existsSync(this.cachePath)) {
        const content = readFileSync(this.cachePath, "utf-8");
        const parsed = JSON.parse(content) as DiscoveryCacheData;
        if (parsed.version === 1 && parsed.entries) {
          this.data = parsed;
        }
      }
    } catch {
      // Corrupted or unreadable cache — start fresh
      this.data = { version: 1, entries: {} };
    }
  }

  save(): void {
    try {
      const dir = dirname(this.cachePath);
      if (!existsSync(dir)) {
        mkdirSync(dir, { recursive: true });
      }
      writeFileSync(this.cachePath, JSON.stringify(this.data, null, 2), "utf-8");
    } catch {
      // Silently ignore write failures (read-only FS, permissions, etc.)
    }
  }
}
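The staleness rule in `isStale` above reduces to one predicate over an entry's age and TTL, with a missing entry always treated as stale. A self-contained restatement of that logic, using a simplified entry shape:

```typescript
// Standalone restatement of the isStale logic: an entry is stale once
// its age exceeds its TTL; a missing entry is always stale.
interface CacheEntry {
  fetchedAt: number;
  ttlMs: number;
}

function isEntryStale(entry: CacheEntry | undefined, now: number): boolean {
  if (!entry) return true;
  return now - entry.fetchedAt > entry.ttlMs;
}
```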
125 packages/pi-coding-agent/src/core/model-discovery.test.ts Normal file

@@ -0,0 +1,125 @@
import assert from "node:assert/strict";
import { describe, it } from "node:test";
import {
  DISCOVERY_TTLS,
  getDefaultTTL,
  getDiscoverableProviders,
  getDiscoveryAdapter,
} from "./model-discovery.js";

// ─── getDiscoveryAdapter ─────────────────────────────────────────────────────

describe("getDiscoveryAdapter", () => {
  it("returns an adapter for openai", () => {
    const adapter = getDiscoveryAdapter("openai");
    assert.equal(adapter.provider, "openai");
    assert.equal(adapter.supportsDiscovery, true);
  });

  it("returns an adapter for ollama", () => {
    const adapter = getDiscoveryAdapter("ollama");
    assert.equal(adapter.provider, "ollama");
    assert.equal(adapter.supportsDiscovery, true);
  });

  it("returns an adapter for openrouter", () => {
    const adapter = getDiscoveryAdapter("openrouter");
    assert.equal(adapter.provider, "openrouter");
    assert.equal(adapter.supportsDiscovery, true);
  });

  it("returns an adapter for google", () => {
    const adapter = getDiscoveryAdapter("google");
    assert.equal(adapter.provider, "google");
    assert.equal(adapter.supportsDiscovery, true);
  });

  it("returns a static adapter for anthropic", () => {
    const adapter = getDiscoveryAdapter("anthropic");
    assert.equal(adapter.provider, "anthropic");
    assert.equal(adapter.supportsDiscovery, false);
  });

  it("returns a static adapter for bedrock", () => {
    const adapter = getDiscoveryAdapter("bedrock");
    assert.equal(adapter.provider, "bedrock");
    assert.equal(adapter.supportsDiscovery, false);
  });

  it("returns a static adapter for unknown providers", () => {
    const adapter = getDiscoveryAdapter("unknown-provider");
    assert.equal(adapter.provider, "unknown-provider");
    assert.equal(adapter.supportsDiscovery, false);
  });

  it("static adapter fetchModels returns empty array", async () => {
    const adapter = getDiscoveryAdapter("anthropic");
    const models = await adapter.fetchModels("key");
    assert.deepEqual(models, []);
  });
});

// ─── getDiscoverableProviders ────────────────────────────────────────────────

describe("getDiscoverableProviders", () => {
  it("returns only providers that support discovery", () => {
    const providers = getDiscoverableProviders();
    assert.ok(providers.includes("openai"));
    assert.ok(providers.includes("ollama"));
    assert.ok(providers.includes("openrouter"));
    assert.ok(providers.includes("google"));
    assert.ok(!providers.includes("anthropic"));
    assert.ok(!providers.includes("bedrock"));
  });

  it("returns an array of strings", () => {
    const providers = getDiscoverableProviders();
    assert.ok(Array.isArray(providers));
    for (const p of providers) {
      assert.equal(typeof p, "string");
    }
  });
});

// ─── getDefaultTTL ───────────────────────────────────────────────────────────

describe("getDefaultTTL", () => {
  it("returns 5 minutes for ollama", () => {
    assert.equal(getDefaultTTL("ollama"), 5 * 60 * 1000);
  });

  it("returns 1 hour for openai", () => {
    assert.equal(getDefaultTTL("openai"), 60 * 60 * 1000);
  });

  it("returns 1 hour for google", () => {
    assert.equal(getDefaultTTL("google"), 60 * 60 * 1000);
  });

  it("returns 1 hour for openrouter", () => {
    assert.equal(getDefaultTTL("openrouter"), 60 * 60 * 1000);
  });

  it("returns 24 hours for unknown providers", () => {
    assert.equal(getDefaultTTL("some-custom"), 24 * 60 * 60 * 1000);
  });
});

// ─── DISCOVERY_TTLS ──────────────────────────────────────────────────────────

describe("DISCOVERY_TTLS", () => {
  it("has expected keys", () => {
    assert.ok("ollama" in DISCOVERY_TTLS);
    assert.ok("openai" in DISCOVERY_TTLS);
    assert.ok("google" in DISCOVERY_TTLS);
    assert.ok("openrouter" in DISCOVERY_TTLS);
    assert.ok("default" in DISCOVERY_TTLS);
  });

  it("all values are positive numbers", () => {
    for (const [, value] of Object.entries(DISCOVERY_TTLS)) {
      assert.equal(typeof value, "number");
      assert.ok(value > 0);
    }
  });
});
231  packages/pi-coding-agent/src/core/model-discovery.ts  Normal file
@@ -0,0 +1,231 @@
/**
 * Provider discovery adapters for runtime model enumeration.
 * Each adapter implements ProviderDiscoveryAdapter to fetch models from provider APIs.
 */

export interface DiscoveredModel {
  id: string;
  name?: string;
  contextWindow?: number;
  maxTokens?: number;
  reasoning?: boolean;
  input?: ("text" | "image")[];
  cost?: { input: number; output: number; cacheRead: number; cacheWrite: number };
}

export interface DiscoveryResult {
  provider: string;
  models: DiscoveredModel[];
  fetchedAt: number;
  error?: string;
}

export interface ProviderDiscoveryAdapter {
  provider: string;
  supportsDiscovery: boolean;
  fetchModels(apiKey: string, baseUrl?: string): Promise<DiscoveredModel[]>;
}

/** Per-provider TTLs in milliseconds */
export const DISCOVERY_TTLS: Record<string, number> = {
  ollama: 5 * 60 * 1000, // 5 minutes (local, models change often)
  openai: 60 * 60 * 1000, // 1 hour
  google: 60 * 60 * 1000, // 1 hour
  openrouter: 60 * 60 * 1000, // 1 hour
  default: 24 * 60 * 60 * 1000, // 24 hours
};

export function getDefaultTTL(provider: string): number {
  return DISCOVERY_TTLS[provider] ?? DISCOVERY_TTLS.default;
}

async function fetchWithTimeout(url: string, options: RequestInit = {}, timeoutMs = 5000): Promise<Response> {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { ...options, signal: controller.signal });
  } finally {
    clearTimeout(timeout);
  }
}

// ─── OpenAI Adapter ──────────────────────────────────────────────────────────

const OPENAI_EXCLUDED_PREFIXES = ["embedding", "tts", "dall-e", "whisper", "text-embedding", "davinci", "babbage"];

class OpenAIDiscoveryAdapter implements ProviderDiscoveryAdapter {
  provider = "openai";
  supportsDiscovery = true;

  async fetchModels(apiKey: string, baseUrl?: string): Promise<DiscoveredModel[]> {
    const url = `${baseUrl ?? "https://api.openai.com"}/v1/models`;
    const response = await fetchWithTimeout(url, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });

    if (!response.ok) {
      throw new Error(`OpenAI models API returned ${response.status}: ${response.statusText}`);
    }

    const data = (await response.json()) as { data: Array<{ id: string; owned_by?: string }> };
    return data.data
      .filter((m) => !OPENAI_EXCLUDED_PREFIXES.some((prefix) => m.id.startsWith(prefix)))
      .map((m) => ({
        id: m.id,
        name: m.id,
        input: ["text" as const, "image" as const],
      }));
  }
}

// ─── Ollama Adapter ──────────────────────────────────────────────────────────

class OllamaDiscoveryAdapter implements ProviderDiscoveryAdapter {
  provider = "ollama";
  supportsDiscovery = true;

  async fetchModels(_apiKey: string, baseUrl?: string): Promise<DiscoveredModel[]> {
    const url = `${baseUrl ?? "http://localhost:11434"}/api/tags`;
    const response = await fetchWithTimeout(url);

    if (!response.ok) {
      throw new Error(`Ollama tags API returned ${response.status}: ${response.statusText}`);
    }

    const data = (await response.json()) as {
      models: Array<{ name: string; size: number; details?: { parameter_size?: string } }>;
    };

    return (data.models ?? []).map((m) => ({
      id: m.name,
      name: m.name,
      input: ["text" as const],
    }));
  }
}

// ─── OpenRouter Adapter ──────────────────────────────────────────────────────

class OpenRouterDiscoveryAdapter implements ProviderDiscoveryAdapter {
  provider = "openrouter";
  supportsDiscovery = true;

  async fetchModels(apiKey: string, baseUrl?: string): Promise<DiscoveredModel[]> {
    const url = `${baseUrl ?? "https://openrouter.ai"}/api/v1/models`;
    const response = await fetchWithTimeout(url, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });

    if (!response.ok) {
      throw new Error(`OpenRouter models API returned ${response.status}: ${response.statusText}`);
    }

    const data = (await response.json()) as {
      data: Array<{
        id: string;
        name: string;
        context_length?: number;
        top_provider?: { max_completion_tokens?: number };
        pricing?: { prompt: string; completion: string };
      }>;
    };

    return (data.data ?? []).map((m) => {
      const cost =
        m.pricing?.prompt !== undefined && m.pricing?.completion !== undefined
          ? {
              input: parseFloat(m.pricing.prompt) * 1_000_000,
              output: parseFloat(m.pricing.completion) * 1_000_000,
              cacheRead: 0,
              cacheWrite: 0,
            }
          : undefined;

      return {
        id: m.id,
        name: m.name,
        contextWindow: m.context_length,
        maxTokens: m.top_provider?.max_completion_tokens,
        cost,
        input: ["text" as const, "image" as const],
      };
    });
  }
}

// ─── Google/Gemini Adapter ───────────────────────────────────────────────────

class GoogleDiscoveryAdapter implements ProviderDiscoveryAdapter {
  provider = "google";
  supportsDiscovery = true;

  async fetchModels(apiKey: string, baseUrl?: string): Promise<DiscoveredModel[]> {
    const url = `${baseUrl ?? "https://generativelanguage.googleapis.com"}/v1beta/models?key=${apiKey}`;
    const response = await fetchWithTimeout(url);

    if (!response.ok) {
      throw new Error(`Google models API returned ${response.status}: ${response.statusText}`);
    }

    const data = (await response.json()) as {
      models: Array<{
        name: string;
        displayName: string;
        supportedGenerationMethods?: string[];
        inputTokenLimit?: number;
        outputTokenLimit?: number;
      }>;
    };

    return (data.models ?? [])
      .filter((m) => m.supportedGenerationMethods?.includes("generateContent"))
      .map((m) => ({
        id: m.name.replace("models/", ""),
        name: m.displayName,
        contextWindow: m.inputTokenLimit,
        maxTokens: m.outputTokenLimit,
        input: ["text" as const, "image" as const],
      }));
  }
}

// ─── Static Adapter (no discovery) ───────────────────────────────────────────

class StaticDiscoveryAdapter implements ProviderDiscoveryAdapter {
  provider: string;
  supportsDiscovery = false;

  constructor(provider: string) {
    this.provider = provider;
  }

  async fetchModels(): Promise<DiscoveredModel[]> {
    return [];
  }
}

// ─── Registry ────────────────────────────────────────────────────────────────

const adapters: Record<string, ProviderDiscoveryAdapter> = {
  openai: new OpenAIDiscoveryAdapter(),
  ollama: new OllamaDiscoveryAdapter(),
  openrouter: new OpenRouterDiscoveryAdapter(),
  google: new GoogleDiscoveryAdapter(),
  anthropic: new StaticDiscoveryAdapter("anthropic"),
  bedrock: new StaticDiscoveryAdapter("bedrock"),
  "azure-openai": new StaticDiscoveryAdapter("azure-openai"),
  groq: new StaticDiscoveryAdapter("groq"),
  cerebras: new StaticDiscoveryAdapter("cerebras"),
  xai: new StaticDiscoveryAdapter("xai"),
  mistral: new StaticDiscoveryAdapter("mistral"),
};

export function getDiscoveryAdapter(provider: string): ProviderDiscoveryAdapter {
  return adapters[provider] ?? new StaticDiscoveryAdapter(provider);
}

export function getDiscoverableProviders(): string[] {
  return Object.entries(adapters)
    .filter(([, adapter]) => adapter.supportsDiscovery)
    .map(([name]) => name);
}
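Aside: the `getDefaultTTL` fallback above is the whole cache policy in one lookup — any provider key that is missing falls through to the `default` entry. A standalone sketch of that behavior (local names for illustration, not the module's exports):

```typescript
// Local re-creation of the TTL fallback, for illustration only.
const TTLS: Record<string, number> = {
  ollama: 5 * 60 * 1000, // 5 minutes
  openai: 60 * 60 * 1000, // 1 hour
  default: 24 * 60 * 60 * 1000, // 24 hours
};

function ttlFor(provider: string): number {
  // Nullish coalescing: only a missing key falls through to "default";
  // an explicit 0 would be returned as-is.
  return TTLS[provider] ?? TTLS.default;
}

console.log(ttlFor("ollama")); // 300000
console.log(ttlFor("unknown-provider")); // 86400000
```

Note the choice of `??` over `||`: a provider deliberately configured with a TTL of `0` keeps that value instead of silently getting the 24-hour default.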
@@ -0,0 +1,135 @@
import assert from "node:assert/strict";
import { mkdirSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, it } from "node:test";
import { AuthStorage } from "./auth-storage.js";
import { ModelDiscoveryCache } from "./discovery-cache.js";
import { getDefaultTTL, getDiscoverableProviders, getDiscoveryAdapter } from "./model-discovery.js";

let testDir: string;

beforeEach(() => {
  testDir = join(tmpdir(), `model-registry-discovery-test-${Date.now()}-${Math.random().toString(36).slice(2)}`);
  mkdirSync(testDir, { recursive: true });
});

afterEach(() => {
  try {
    rmSync(testDir, { recursive: true, force: true });
  } catch {
    // Cleanup best-effort
  }
});

// ─── discovery cache integration ─────────────────────────────────────────────

describe("ModelDiscoveryCache — integration with discovery", () => {
  it("cache respects provider-specific TTLs", () => {
    const cachePath = join(testDir, "cache.json");
    const cache = new ModelDiscoveryCache(cachePath);

    cache.set("ollama", [{ id: "llama2" }]);
    const entry = cache.get("ollama");
    assert.ok(entry);
    assert.equal(entry.ttlMs, getDefaultTTL("ollama"));
  });

  it("cache uses custom TTL when provided", () => {
    const cachePath = join(testDir, "cache.json");
    const cache = new ModelDiscoveryCache(cachePath);

    cache.set("openai", [{ id: "gpt-4o" }], 999);
    const entry = cache.get("openai");
    assert.ok(entry);
    assert.equal(entry.ttlMs, 999);
  });
});

// ─── adapter resolution ──────────────────────────────────────────────────────

describe("Discovery adapter resolution", () => {
  it("all discoverable providers have adapters", () => {
    const providers = getDiscoverableProviders();
    for (const provider of providers) {
      const adapter = getDiscoveryAdapter(provider);
      assert.equal(adapter.supportsDiscovery, true, `${provider} should support discovery`);
    }
  });

  it("static adapters return empty model lists", async () => {
    const staticProviders = ["anthropic", "bedrock", "azure-openai", "groq", "cerebras"];
    for (const provider of staticProviders) {
      const adapter = getDiscoveryAdapter(provider);
      assert.equal(adapter.supportsDiscovery, false, `${provider} should not support discovery`);
      const models = await adapter.fetchModels("dummy-key");
      assert.deepEqual(models, [], `${provider} should return empty models`);
    }
  });
});

// ─── AuthStorage hasAuth for discovery ───────────────────────────────────────

describe("AuthStorage — hasAuth for discovery providers", () => {
  it("returns false for providers without auth", () => {
    const storage = AuthStorage.inMemory({});
    assert.equal(storage.hasAuth("openai"), false);
    assert.equal(storage.hasAuth("ollama"), false);
  });

  it("returns true for providers with stored keys", () => {
    const storage = AuthStorage.inMemory({
      openai: { type: "api_key" as const, key: "sk-test" },
    });
    assert.equal(storage.hasAuth("openai"), true);
    assert.equal(storage.hasAuth("ollama"), false);
  });
});

// ─── cache persistence across instances ──────────────────────────────────────

describe("ModelDiscoveryCache — persistence", () => {
  it("data survives across cache instances", () => {
    const cachePath = join(testDir, "persist.json");

    const cache1 = new ModelDiscoveryCache(cachePath);
    cache1.set("openai", [
      { id: "gpt-4o", name: "GPT-4o", contextWindow: 128000 },
      { id: "gpt-4o-mini", name: "GPT-4o Mini" },
    ]);

    const cache2 = new ModelDiscoveryCache(cachePath);
    const entry = cache2.get("openai");
    assert.ok(entry);
    assert.equal(entry.models.length, 2);
    assert.equal(entry.models[0].contextWindow, 128000);
  });

  it("clear persists across instances", () => {
    const cachePath = join(testDir, "clear.json");

    const cache1 = new ModelDiscoveryCache(cachePath);
    cache1.set("openai", [{ id: "gpt-4o" }]);
    cache1.clear("openai");

    const cache2 = new ModelDiscoveryCache(cachePath);
    assert.equal(cache2.get("openai"), undefined);
  });
});

// ─── discovery TTL values ────────────────────────────────────────────────────

describe("Discovery TTL configuration", () => {
  it("ollama has shortest TTL (local models change often)", () => {
    const ollamaTTL = getDefaultTTL("ollama");
    const openaiTTL = getDefaultTTL("openai");
    assert.ok(ollamaTTL < openaiTTL, "ollama TTL should be shorter than openai");
  });

  it("unknown providers get default TTL", () => {
    const customTTL = getDefaultTTL("my-custom-provider");
    const defaultTTL = getDefaultTTL("default");
    // Unknown providers should get the same TTL as the explicit "default" key
    assert.equal(customTTL, defaultTTL);
  });
});
@@ -24,6 +24,9 @@ import { existsSync, readFileSync } from "fs";
import { join } from "path";
import { getAgentDir } from "../config.js";
import type { AuthStorage } from "./auth-storage.js";
import { ModelDiscoveryCache } from "./discovery-cache.js";
import type { DiscoveredModel, DiscoveryResult } from "./model-discovery.js";
import { getDefaultTTL, getDiscoverableProviders, getDiscoveryAdapter } from "./model-discovery.js";
import { clearConfigValueCache, resolveConfigValue, resolveHeaders } from "./resolve-config-value.js";

const Ajv = (AjvModule as any).default || AjvModule;
@@ -221,6 +224,8 @@ export const clearApiKeyCache = clearConfigValueCache;
 */
export class ModelRegistry {
  private models: Model<Api>[] = [];
  private discoveredModels: Model<Api>[] = [];
  private discoveryCache: ModelDiscoveryCache;
  private customProviderApiKeys: Map<string, string> = new Map();
  private registeredProviders: Map<string, ProviderConfigInput> = new Map();
  private loadError: string | undefined = undefined;
@@ -229,6 +234,8 @@
    readonly authStorage: AuthStorage,
    private modelsJsonPath: string | undefined = join(getAgentDir(), "models.json"),
  ) {
    this.discoveryCache = new ModelDiscoveryCache();

    // Set up fallback resolver for custom provider API keys
    this.authStorage.setFallbackResolver((provider) => {
      const keyConfig = this.customProviderApiKeys.get(provider);
@@ -666,6 +673,106 @@ export class ModelRegistry {
      });
    }
  }

  /**
   * Discover models from all providers that support discovery.
   * Results are cached and merged into the registry (never overrides existing models).
   */
  async discoverModels(providers?: string[]): Promise<DiscoveryResult[]> {
    const targetProviders = providers ?? getDiscoverableProviders();
    const results: DiscoveryResult[] = [];

    for (const providerName of targetProviders) {
      const adapter = getDiscoveryAdapter(providerName);
      if (!adapter.supportsDiscovery) continue;

      // Skip if cache is still fresh
      if (!this.discoveryCache.isStale(providerName)) {
        const cached = this.discoveryCache.get(providerName);
        if (cached) {
          results.push({
            provider: providerName,
            models: cached.models,
            fetchedAt: cached.fetchedAt,
          });
          continue;
        }
      }

      try {
        const apiKey = await this.authStorage.getApiKey(providerName);
        if (!apiKey && providerName !== "ollama") continue;

        const models = await adapter.fetchModels(apiKey ?? "", undefined);
        this.discoveryCache.set(providerName, models);
        results.push({
          provider: providerName,
          models,
          fetchedAt: Date.now(),
        });
      } catch (error) {
        results.push({
          provider: providerName,
          models: [],
          fetchedAt: Date.now(),
          error: error instanceof Error ? error.message : String(error),
        });
      }
    }

    // Convert and merge discovered models
    this.discoveredModels = this.convertDiscoveredModels(results);
    return results;
  }

  /**
   * Get all models including discovered ones.
   * Discovered models are appended but never override existing models.
   */
  getAllWithDiscovered(): Model<Api>[] {
    const existingIds = new Set(this.models.map((m) => `${m.provider}/${m.id}`));
    const unique = this.discoveredModels.filter((m) => !existingIds.has(`${m.provider}/${m.id}`));
    return [...this.models, ...unique];
  }

  /**
   * Check if a model was added via discovery (not built-in or custom).
   */
  isDiscovered(model: Model<Api>): boolean {
    return this.discoveredModels.some((m) => m.provider === model.provider && m.id === model.id);
  }

  /**
   * Get the discovery cache instance.
   */
  getDiscoveryCache(): ModelDiscoveryCache {
    return this.discoveryCache;
  }

  /**
   * Convert DiscoveryResult[] into Model<Api>[] with default values.
   */
  private convertDiscoveredModels(results: DiscoveryResult[]): Model<Api>[] {
    const converted: Model<Api>[] = [];
    for (const result of results) {
      if (result.error) continue;
      for (const dm of result.models) {
        converted.push({
          id: dm.id,
          name: dm.name ?? dm.id,
          api: "openai" as Api,
          provider: result.provider,
          baseUrl: "",
          reasoning: dm.reasoning ?? false,
          input: dm.input ?? ["text"],
          cost: dm.cost ?? { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
          contextWindow: dm.contextWindow ?? 128000,
          maxTokens: dm.maxTokens ?? 16384,
        } as Model<Api>);
      }
    }
    return converted;
  }
}

/**
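The `discoverModels` loop above follows a serve-fresh-else-refetch shape: a fresh cache entry is returned without touching the network; a stale or missing entry triggers a fetch and a cache write. A toy in-memory sketch of that flow (simplified names; the real code uses the disk-backed `ModelDiscoveryCache` and per-provider adapters):

```typescript
// In-memory stand-in for the discovery cache; illustration only.
interface Entry {
  models: string[];
  fetchedAt: number;
  ttlMs: number;
}

const cache = new Map<string, Entry>();

function isStale(provider: string, now: number): boolean {
  const entry = cache.get(provider);
  return entry === undefined || now - entry.fetchedAt > entry.ttlMs;
}

async function discover(
  provider: string,
  fetchModels: () => Promise<string[]>,
  ttlMs: number,
): Promise<string[]> {
  const now = Date.now();
  if (!isStale(provider, now)) {
    return cache.get(provider)!.models; // fresh: serve from cache, no network
  }
  const models = await fetchModels(); // stale or missing: refetch
  cache.set(provider, { models, fetchedAt: now, ttlMs });
  return models;
}

async function main() {
  const first = await discover("ollama", async () => ["llama3"], 5 * 60 * 1000);
  // Second call within the TTL never invokes the fetcher.
  const second = await discover(
    "ollama",
    async () => {
      throw new Error("should not refetch");
    },
    5 * 60 * 1000,
  );
  console.log(first[0], second[0]);
}

main();
```

The real method additionally records per-provider errors in the `DiscoveryResult` instead of throwing, so one failing provider cannot abort discovery for the rest.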
145  packages/pi-coding-agent/src/core/models-json-writer.test.ts  Normal file
@@ -0,0 +1,145 @@
import assert from "node:assert/strict";
import { existsSync, mkdirSync, readFileSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, it } from "node:test";
import { ModelsJsonWriter } from "./models-json-writer.js";

let testDir: string;
let modelsJsonPath: string;

beforeEach(() => {
  testDir = join(tmpdir(), `models-json-writer-test-${Date.now()}-${Math.random().toString(36).slice(2)}`);
  mkdirSync(testDir, { recursive: true });
  modelsJsonPath = join(testDir, "models.json");
});

afterEach(() => {
  try {
    rmSync(testDir, { recursive: true, force: true });
  } catch {
    // Cleanup best-effort
  }
});

function readModels(): Record<string, unknown> {
  return JSON.parse(readFileSync(modelsJsonPath, "utf-8"));
}

// ─── addModel ────────────────────────────────────────────────────────────────

describe("ModelsJsonWriter — addModel", () => {
  it("creates file and adds model to new provider", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.addModel("openai", { id: "gpt-4o", name: "GPT-4o" }, { baseUrl: "https://api.openai.com", apiKey: "env:OPENAI_API_KEY", api: "openai" });

    const config = readModels() as any;
    assert.ok(config.providers.openai);
    assert.equal(config.providers.openai.models.length, 1);
    assert.equal(config.providers.openai.models[0].id, "gpt-4o");
  });

  it("appends model to existing provider", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.addModel("openai", { id: "gpt-4o" }, { baseUrl: "https://api.openai.com", apiKey: "env:OPENAI_API_KEY", api: "openai" });
    writer.addModel("openai", { id: "gpt-4o-mini" });

    const config = readModels() as any;
    assert.equal(config.providers.openai.models.length, 2);
  });

  it("replaces model with same id", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.addModel("openai", { id: "gpt-4o", name: "Old" }, { baseUrl: "https://api.openai.com", apiKey: "env:OPENAI_API_KEY", api: "openai" });
    writer.addModel("openai", { id: "gpt-4o", name: "New" });

    const config = readModels() as any;
    assert.equal(config.providers.openai.models.length, 1);
    assert.equal(config.providers.openai.models[0].name, "New");
  });
});

// ─── removeModel ─────────────────────────────────────────────────────────────

describe("ModelsJsonWriter — removeModel", () => {
  it("removes a model from provider", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.addModel("openai", { id: "gpt-4o" }, { baseUrl: "https://api.openai.com", apiKey: "env:OPENAI_API_KEY", api: "openai" });
    writer.addModel("openai", { id: "gpt-4o-mini" });

    writer.removeModel("openai", "gpt-4o");

    const config = readModels() as any;
    assert.equal(config.providers.openai.models.length, 1);
    assert.equal(config.providers.openai.models[0].id, "gpt-4o-mini");
  });

  it("removes provider when last model is removed", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.addModel("openai", { id: "gpt-4o" }, { baseUrl: "https://api.openai.com", apiKey: "env:OPENAI_API_KEY", api: "openai" });

    writer.removeModel("openai", "gpt-4o");

    const config = readModels() as any;
    assert.equal(config.providers.openai, undefined);
  });

  it("handles removing from nonexistent provider", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    // Should not throw
    writer.removeModel("nonexistent", "model-id");
  });
});

// ─── setProvider / removeProvider ────────────────────────────────────────────

describe("ModelsJsonWriter — provider operations", () => {
  it("sets a provider configuration", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.setProvider("custom", {
      baseUrl: "http://localhost:8080",
      apiKey: "test-key",
      api: "openai",
      models: [{ id: "local-model" }],
    });

    const config = readModels() as any;
    assert.ok(config.providers.custom);
    assert.equal(config.providers.custom.baseUrl, "http://localhost:8080");
  });

  it("removes a provider", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.setProvider("custom", { baseUrl: "http://localhost:8080" });
    writer.removeProvider("custom");

    const config = readModels() as any;
    assert.equal(config.providers.custom, undefined);
  });

  it("handles removing nonexistent provider", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.removeProvider("nonexistent");
    // Should not throw
  });
});

// ─── listProviders ───────────────────────────────────────────────────────────

describe("ModelsJsonWriter — listProviders", () => {
  it("returns empty config when file does not exist", () => {
    const writer = new ModelsJsonWriter(join(testDir, "nonexistent.json"));
    const config = writer.listProviders();
    assert.deepEqual(config, { providers: {} });
  });

  it("returns current provider config", () => {
    const writer = new ModelsJsonWriter(modelsJsonPath);
    writer.setProvider("openai", { baseUrl: "https://api.openai.com" });
    writer.setProvider("ollama", { baseUrl: "http://localhost:11434" });

    const config = writer.listProviders();
    assert.ok(config.providers.openai);
    assert.ok(config.providers.ollama);
  });
});
188  packages/pi-coding-agent/src/core/models-json-writer.ts  Normal file
@@ -0,0 +1,188 @@
/**
 * Safe read-modify-write for models.json with file locking.
 * Prevents concurrent writes from corrupting the config file.
 */

import { existsSync, mkdirSync, readFileSync, writeFileSync } from "fs";
import { dirname, join } from "path";
import lockfile from "proper-lockfile";
import { getAgentDir } from "../config.js";

interface ModelDefinition {
  id: string;
  name?: string;
  api?: string;
  baseUrl?: string;
  reasoning?: boolean;
  input?: ("text" | "image")[];
  cost?: { input: number; output: number; cacheRead: number; cacheWrite: number };
  contextWindow?: number;
  maxTokens?: number;
}

interface ProviderConfig {
  baseUrl?: string;
  apiKey?: string;
  api?: string;
  headers?: Record<string, string>;
  authHeader?: boolean;
  models?: ModelDefinition[];
  modelOverrides?: Record<string, Record<string, unknown>>;
}

interface ModelsConfig {
  providers: Record<string, ProviderConfig>;
}

export class ModelsJsonWriter {
  private modelsJsonPath: string;

  constructor(modelsJsonPath?: string) {
    this.modelsJsonPath = modelsJsonPath ?? join(getAgentDir(), "models.json");
  }

  /**
   * Add a model to a provider. Creates the provider if it doesn't exist.
   */
  addModel(provider: string, model: ModelDefinition, providerConfig?: Partial<ProviderConfig>): void {
    this.withLock((config) => {
      if (!config.providers[provider]) {
        config.providers[provider] = {
          ...providerConfig,
          models: [],
        };
      }

      const providerEntry = config.providers[provider];
      if (!providerEntry.models) {
        providerEntry.models = [];
      }

      // Replace existing model with same id, or append
      const existingIndex = providerEntry.models.findIndex((m) => m.id === model.id);
      if (existingIndex >= 0) {
        providerEntry.models[existingIndex] = model;
      } else {
        providerEntry.models.push(model);
      }

      return config;
    });
  }

  /**
   * Remove a model from a provider. Removes the provider if no models remain.
   */
  removeModel(provider: string, modelId: string): void {
    this.withLock((config) => {
      const providerEntry = config.providers[provider];
      if (!providerEntry?.models) return config;

      providerEntry.models = providerEntry.models.filter((m) => m.id !== modelId);

      // Clean up empty provider (no models and no overrides)
      if (providerEntry.models.length === 0 && !providerEntry.modelOverrides) {
        delete config.providers[provider];
      }

      return config;
    });
  }

  /**
   * Set or update an entire provider configuration.
   */
  setProvider(provider: string, providerConfig: ProviderConfig): void {
    this.withLock((config) => {
      config.providers[provider] = providerConfig;
      return config;
    });
  }

  /**
   * Remove a provider and all its models.
   */
  removeProvider(provider: string): void {
    this.withLock((config) => {
      delete config.providers[provider];
      return config;
    });
  }

  /**
   * List all providers and their configurations.
   */
  listProviders(): ModelsConfig {
    return this.readConfig();
  }

  private readConfig(): ModelsConfig {
    if (!existsSync(this.modelsJsonPath)) {
      return { providers: {} };
    }
    try {
      const content = readFileSync(this.modelsJsonPath, "utf-8");
      return JSON.parse(content) as ModelsConfig;
    } catch {
      return { providers: {} };
    }
  }

  private writeConfig(config: ModelsConfig): void {
    const dir = dirname(this.modelsJsonPath);
    if (!existsSync(dir)) {
      mkdirSync(dir, { recursive: true });
    }
    writeFileSync(this.modelsJsonPath, JSON.stringify(config, null, 2), "utf-8");
  }

  private acquireLockWithRetry(): () => void {
    const maxAttempts = 10;
    const delayMs = 20;
    let lastError: unknown;

    // Ensure file exists for locking
    const dir = dirname(this.modelsJsonPath);
    if (!existsSync(dir)) {
      mkdirSync(dir, { recursive: true });
    }
    if (!existsSync(this.modelsJsonPath)) {
      writeFileSync(this.modelsJsonPath, JSON.stringify({ providers: {} }, null, 2), "utf-8");
    }

    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return lockfile.lockSync(this.modelsJsonPath, { realpath: false });
      } catch (error) {
        const code =
          typeof error === "object" && error !== null && "code" in error
            ? String((error as { code?: unknown }).code)
            : undefined;
        if (code !== "ELOCKED" || attempt === maxAttempts) {
          throw error;
        }
        lastError = error;
        const start = Date.now();
        while (Date.now() - start < delayMs) {
          // Busy-wait (same pattern as auth-storage.ts)
        }
      }
    }

    throw (lastError as Error) ?? new Error("Failed to acquire models.json lock");
  }

  private withLock(fn: (config: ModelsConfig) => ModelsConfig): void {
    let release: (() => void) | undefined;
    try {
      release = this.acquireLockWithRetry();
      const config = this.readConfig();
      const updated = fn(config);
      this.writeConfig(updated);
    } finally {
      if (release) {
        release();
      }
    }
  }
}
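The replace-or-append rule inside `addModel` is worth seeing in isolation. A minimal sketch with the file I/O and locking stripped away (the `upsert` helper is hypothetical, not part of the module):

```typescript
interface ModelRef {
  id: string;
  name?: string;
}

// Replace an existing entry with the same id; otherwise append.
function upsert(models: ModelRef[], model: ModelRef): ModelRef[] {
  const index = models.findIndex((m) => m.id === model.id);
  if (index >= 0) {
    models[index] = model;
  } else {
    models.push(model);
  }
  return models;
}

let list: ModelRef[] = [];
list = upsert(list, { id: "gpt-4o", name: "Old" });
list = upsert(list, { id: "gpt-4o", name: "New" }); // same id: replaced, not duplicated
list = upsert(list, { id: "gpt-4o-mini" });

console.log(list.length, list[0].name); // 2 New
```

This is what makes repeated discovery runs idempotent: re-adding a model that already exists updates it in place instead of growing the list.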
@@ -79,6 +79,13 @@ export interface FallbackSettings {
  chains?: Record<string, FallbackChainEntry[]>; // keyed by chain name
}

export interface ModelDiscoverySettings {
  enabled?: boolean; // default: false
  providers?: string[]; // limit discovery to specific providers
  ttlMinutes?: number; // override default TTLs (in minutes)
  autoRefreshOnModelSelect?: boolean; // default: false - refresh discovery when opening model selector
}

export type TransportSetting = Transport;

/**
@@ -134,6 +141,7 @@ export interface Settings {
   bashInterceptor?: BashInterceptorSettings;
   taskIsolation?: TaskIsolationSettings;
   fallback?: FallbackSettings;
+  modelDiscovery?: ModelDiscoverySettings;
 }
 
 /** Deep merge settings: project/overrides take precedence, nested objects merge recursively */

@@ -1076,4 +1084,17 @@ export class SettingsManager {
       chains: this.getFallbackChains(),
     };
   }
+
+  getModelDiscoverySettings(): ModelDiscoverySettings {
+    return this.settings.modelDiscovery ?? {};
+  }
+
+  setModelDiscoveryEnabled(enabled: boolean): void {
+    if (!this.globalSettings.modelDiscovery) {
+      this.globalSettings.modelDiscovery = {};
+    }
+    this.globalSettings.modelDiscovery.enabled = enabled;
+    this.markModified("modelDiscovery", "enabled");
+    this.save();
+  }
 }

@@ -28,6 +28,7 @@ export const BUILTIN_SLASH_COMMANDS: ReadonlyArray<BuiltinSlashCommand> = [
   { name: "hotkeys", description: "Show all keyboard shortcuts" },
   { name: "fork", description: "Create a new fork from a previous message" },
   { name: "tree", description: "Navigate session tree (switch branches)" },
+  { name: "provider", description: "Manage provider configuration" },
   { name: "login", description: "Login with OAuth provider" },
   { name: "logout", description: "Logout from OAuth provider" },
   { name: "new", description: "Start a new session" },

@@ -143,7 +143,11 @@ export {
 // Footer data provider (git branch + extension statuses - data not otherwise available to extensions)
 export type { ReadonlyFooterDataProvider } from "./core/footer-data-provider.js";
 export { convertToLlm } from "./core/messages.js";
+export { ModelDiscoveryCache } from "./core/discovery-cache.js";
+export type { DiscoveredModel, DiscoveryResult, ProviderDiscoveryAdapter } from "./core/model-discovery.js";
+export { getDiscoverableProviders, getDiscoveryAdapter } from "./core/model-discovery.js";
 export { ModelRegistry } from "./core/model-registry.js";
+export { ModelsJsonWriter } from "./core/models-json-writer.js";
 export type {
   PackageManager,
   PathMetadata,

@@ -307,6 +311,7 @@ export {
   LoginDialogComponent,
   ModelSelectorComponent,
   OAuthSelectorComponent,
+  ProviderManagerComponent,
   type RenderDiffOptions,
   rawKeyHint,
   renderDiff,

@@ -11,7 +11,7 @@ import { createInterface } from "readline";
 import { type Args, parseArgs, printHelp } from "./cli/args.js";
 import { selectConfig } from "./cli/config-selector.js";
 import { processFileArguments } from "./cli/file-processor.js";
-import { listModels } from "./cli/list-models.js";
+import { discoverAndPrintModels, listModels } from "./cli/list-models.js";
 import { selectSession } from "./cli/session-picker.js";
 import { APP_NAME, getAgentDir, getModelsPath, VERSION } from "./config.js";
 import { AuthStorage } from "./core/auth-storage.js";

@@ -660,9 +660,26 @@ export async function main(args: string[]) {
     process.exit(0);
   }
 
+  if (parsed.addProvider) {
+    const { ModelsJsonWriter } = await import("./core/models-json-writer.js");
+    const writer = new ModelsJsonWriter();
+    writer.setProvider(parsed.addProvider, {
+      baseUrl: parsed.addProviderBaseUrl,
+      apiKey: parsed.apiKey,
+    });
+    console.log(`Provider "${parsed.addProvider}" added to models.json`);
+    process.exit(0);
+  }
+
+  if (parsed.discoverModels !== undefined) {
+    const provider = typeof parsed.discoverModels === "string" ? parsed.discoverModels : undefined;
+    await discoverAndPrintModels(modelRegistry, provider);
+    process.exit(0);
+  }
+
   if (parsed.listModels !== undefined) {
     const searchPattern = typeof parsed.listModels === "string" ? parsed.listModels : undefined;
-    await listModels(modelRegistry, searchPattern);
+    await listModels(modelRegistry, { searchPattern, discover: parsed.discover });
     process.exit(0);
   }
 

@@ -18,6 +18,7 @@ export { appKey, appKeyHint, editorKey, keyHint, rawKeyHint } from "./keybinding
 export { LoginDialogComponent } from "./login-dialog.js";
 export { ModelSelectorComponent } from "./model-selector.js";
 export { OAuthSelectorComponent } from "./oauth-selector.js";
+export { ProviderManagerComponent } from "./provider-manager.js";
 export { type ModelsCallbacks, type ModelsConfig, ScopedModelsSelectorComponent } from "./scoped-models-selector.js";
 export { SessionSelectorComponent } from "./session-selector.js";
 export { type SettingsCallbacks, type SettingsConfig, SettingsSelectorComponent } from "./settings-selector.js";

@@ -160,7 +160,7 @@ export class ModelSelectorComponent extends Container implements Focusable {
 
     // Load available models (built-in models still work even if models.json failed)
     try {
-      const availableModels = await this.modelRegistry.getAvailable();
+      const availableModels = this.modelRegistry.getAvailable();
       models = availableModels.map((model: Model<any>) => ({
         provider: model.provider,
         id: model.id,

@@ -0,0 +1,163 @@
/**
 * TUI component for managing provider configurations.
 * Shows providers with auth status, discovery support, and model counts.
 */

import {
  Container,
  type Focusable,
  getEditorKeybindings,
  Spacer,
  Text,
  type TUI,
} from "@gsd/pi-tui";
import type { AuthStorage } from "../../../core/auth-storage.js";
import { getDiscoverableProviders } from "../../../core/model-discovery.js";
import type { ModelRegistry } from "../../../core/model-registry.js";
import { theme } from "../theme/theme.js";
import { rawKeyHint } from "./keybinding-hints.js";

interface ProviderInfo {
  name: string;
  hasAuth: boolean;
  supportsDiscovery: boolean;
  modelCount: number;
}

export class ProviderManagerComponent extends Container implements Focusable {
  private _focused = false;
  get focused(): boolean {
    return this._focused;
  }
  set focused(value: boolean) {
    this._focused = value;
  }

  private providers: ProviderInfo[] = [];
  private selectedIndex = 0;
  private listContainer: Container;
  private tui: TUI;
  private authStorage: AuthStorage;
  private modelRegistry: ModelRegistry;
  private onDone: () => void;
  private onDiscover: (provider: string) => void;

  constructor(
    tui: TUI,
    authStorage: AuthStorage,
    modelRegistry: ModelRegistry,
    onDone: () => void,
    onDiscover: (provider: string) => void,
  ) {
    super();

    this.tui = tui;
    this.authStorage = authStorage;
    this.modelRegistry = modelRegistry;
    this.onDone = onDone;
    this.onDiscover = onDiscover;

    // Header
    this.addChild(new Text(theme.fg("accent", "Provider Manager"), 0, 0));
    this.addChild(new Spacer(1));

    // Hints
    const hints = [
      rawKeyHint("d", "discover"),
      rawKeyHint("r", "remove auth"),
      rawKeyHint("esc", "close"),
    ].join(" ");
    this.addChild(new Text(hints, 0, 0));
    this.addChild(new Spacer(1));

    // List
    this.listContainer = new Container();
    this.addChild(this.listContainer);

    this.loadProviders();
    this.updateList();
  }

  private loadProviders(): void {
    const discoverableSet = new Set(getDiscoverableProviders());
    const allModels = this.modelRegistry.getAll();

    // Group models by provider
    const providerModelCounts = new Map<string, number>();
    for (const model of allModels) {
      providerModelCounts.set(model.provider, (providerModelCounts.get(model.provider) ?? 0) + 1);
    }

    // Build provider list from all known providers
    const providerNames = new Set([
      ...providerModelCounts.keys(),
      ...discoverableSet,
    ]);

    this.providers = Array.from(providerNames)
      .sort()
      .map((name) => ({
        name,
        hasAuth: this.authStorage.hasAuth(name),
        supportsDiscovery: discoverableSet.has(name),
        modelCount: providerModelCounts.get(name) ?? 0,
      }));
  }

  private updateList(): void {
    this.listContainer.clear();

    for (let i = 0; i < this.providers.length; i++) {
      const p = this.providers[i];
      const isSelected = i === this.selectedIndex;

      const authBadge = p.hasAuth ? theme.fg("success", "[auth]") : theme.fg("muted", "[no auth]");
      const discoveryBadge = p.supportsDiscovery ? theme.fg("accent", "[discovery]") : "";
      const countBadge = theme.fg("muted", `(${p.modelCount} models)`);

      const prefix = isSelected ? theme.fg("accent", "> ") : "  ";
      const nameText = isSelected ? theme.fg("accent", p.name) : p.name;

      const parts = [prefix, nameText, " ", authBadge];
      if (discoveryBadge) parts.push(" ", discoveryBadge);
      parts.push(" ", countBadge);

      this.listContainer.addChild(new Text(parts.join(""), 0, 0));
    }

    if (this.providers.length === 0) {
      this.listContainer.addChild(new Text(theme.fg("muted", " No providers configured"), 0, 0));
    }
  }

  handleInput(keyData: string): void {
    const kb = getEditorKeybindings();

    if (kb.matches(keyData, "selectUp")) {
      if (this.providers.length === 0) return;
      this.selectedIndex = this.selectedIndex === 0 ? this.providers.length - 1 : this.selectedIndex - 1;
      this.updateList();
      this.tui.requestRender();
    } else if (kb.matches(keyData, "selectDown")) {
      if (this.providers.length === 0) return;
      this.selectedIndex = this.selectedIndex === this.providers.length - 1 ? 0 : this.selectedIndex + 1;
      this.updateList();
      this.tui.requestRender();
    } else if (kb.matches(keyData, "selectCancel")) {
      this.onDone();
    } else if (keyData === "d" || keyData === "D") {
      const provider = this.providers[this.selectedIndex];
      if (provider?.supportsDiscovery) {
        this.onDiscover(provider.name);
      }
    } else if (keyData === "r" || keyData === "R") {
      const provider = this.providers[this.selectedIndex];
      if (provider?.hasAuth) {
        this.authStorage.remove(provider.name);
        this.loadProviders();
        this.updateList();
        this.tui.requestRender();
      }
    }
  }
}

@@ -83,6 +83,7 @@ import { appKey, appKeyHint, editorKey, formatKeyForDisplay, keyHint, rawKeyHint
 import { LoginDialogComponent } from "./components/login-dialog.js";
 import { ModelSelectorComponent } from "./components/model-selector.js";
 import { OAuthSelectorComponent } from "./components/oauth-selector.js";
+import { ProviderManagerComponent } from "./components/provider-manager.js";
 import { ScopedModelsSelectorComponent } from "./components/scoped-models-selector.js";
 import { SessionSelectorComponent } from "./components/session-selector.js";
 import { SelectSubmenu, SettingsSelectorComponent, THINKING_DESCRIPTIONS } from "./components/settings-selector.js";

@@ -1997,6 +1998,11 @@ export class InteractiveMode {
       this.editor.setText("");
       return;
     }
+    if (text === "/provider") {
+      this.showProviderManager();
+      this.editor.setText("");
+      return;
+    }
     if (text === "/login") {
       this.showOAuthSelector("login");
       this.editor.setText("");

@@ -3746,6 +3752,37 @@ export class InteractiveMode {
     this.showStatus("Resumed session");
   }
 
+  private showProviderManager(): void {
+    this.showSelector((done) => {
+      const component = new ProviderManagerComponent(
+        this.ui,
+        this.session.modelRegistry.authStorage,
+        this.session.modelRegistry,
+        () => {
+          done();
+          this.ui.requestRender();
+        },
+        async (provider: string) => {
+          this.showStatus(`Discovering models for ${provider}...`);
+          try {
+            const results = await this.session.modelRegistry.discoverModels([provider]);
+            const result = results[0];
+            if (result?.error) {
+              this.showError(`Discovery failed: ${result.error}`);
+            } else {
+              this.showStatus(`Discovered ${result?.models.length ?? 0} models from ${provider}`);
+            }
+          } catch (error) {
+            this.showError(error instanceof Error ? error.message : String(error));
+          }
+          done();
+          this.ui.requestRender();
+        },
+      );
+      return { component, focus: component };
+    });
+  }
+
   private async showOAuthSelector(mode: "login" | "logout"): Promise<void> {
     if (mode === "logout") {
       const providers = this.session.modelRegistry.authStorage.list();

@@ -747,7 +747,7 @@ async function runRemoteQuestionsStep(
   })
   if (p.isCancel(channelId) || !channelId) return null
 
-  const { saveRemoteQuestionsConfig } = await import('./resources/extensions/remote-questions/remote-command.js')
+  const { saveRemoteQuestionsConfig } = await import('./remote-questions-config.js')
   saveRemoteQuestionsConfig('slack', (channelId as string).trim())
   p.log.success(`Slack channel: ${pc.green((channelId as string).trim())}`)
   return 'Slack'

@@ -852,7 +852,7 @@ async function runDiscordChannelStep(p: ClackModule, pc: PicoModule, token: stri
   }
 
   // Save remote questions config
-  const { saveRemoteQuestionsConfig } = await import('./resources/extensions/remote-questions/remote-command.js')
+  const { saveRemoteQuestionsConfig } = await import('./remote-questions-config.js')
   saveRemoteQuestionsConfig('discord', channelId)
   const channelName = channels.find(ch => ch.id === channelId)?.name
   p.log.success(`Discord channel: ${pc.green(channelName ? `#${channelName}` : channelId)}`)

40
src/remote-questions-config.ts
Normal file
@@ -0,0 +1,40 @@
/**
 * Remote Questions Config Helper
 *
 * Extracted from remote-questions extension so onboarding.ts can import
 * it without crossing the compiled/uncompiled boundary. The extension
 * files in src/resources/ are shipped as raw .ts and loaded via jiti,
 * but onboarding.ts is compiled by tsc — dynamic imports from compiled
 * JS to uncompiled .ts fail at runtime (#592).
 */

import { existsSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
import { dirname } from "node:path";
import { getGlobalGSDPreferencesPath } from "./resources/extensions/gsd/preferences.js";

export function saveRemoteQuestionsConfig(channel: "slack" | "discord", channelId: string): void {
  const prefsPath = getGlobalGSDPreferencesPath();
  const block = [
    "remote_questions:",
    `  channel: ${channel}`,
    `  channel_id: "${channelId}"`,
    "  timeout_minutes: 5",
    "  poll_interval_seconds: 5",
  ].join("\n");

  const content = existsSync(prefsPath) ? readFileSync(prefsPath, "utf-8") : "";
  const fmMatch = content.match(/^---\n([\s\S]*?)\n---/);
  let next = content;

  if (fmMatch) {
    let frontmatter = fmMatch[1];
    const regex = /remote_questions:[\s\S]*?(?=\n[a-zA-Z_]|\n---|$)/;
    frontmatter = regex.test(frontmatter) ? frontmatter.replace(regex, block) : `${frontmatter.trimEnd()}\n${block}`;
    next = `---\n${frontmatter}\n---${content.slice(fmMatch[0].length)}`;
  } else {
    next = `---\n${block}\n---\n\n${content}`;
  }

  mkdirSync(dirname(prefsPath), { recursive: true });
  writeFileSync(prefsPath, next, "utf-8");
}

@@ -8,7 +8,7 @@
  * Diagnostic extraction is handled by session-forensics.ts.
  */
 
-import { writeFileSync, mkdirSync, readdirSync, unlinkSync, statSync, openSync, closeSync, constants } from "node:fs";
+import { writeFileSync, writeSync, mkdirSync, readdirSync, unlinkSync, statSync, openSync, closeSync, constants } from "node:fs";
 import { createHash } from "node:crypto";
 import { join } from "node:path";
 

@@ -23,6 +23,15 @@ interface ActivityLogState {
 
 const activityLogState = new Map<string, ActivityLogState>();
 
+/**
+ * Clear accumulated activity log state (#611).
+ * Call when auto-mode stops to prevent unbounded memory growth
+ * from lastSnapshotKeyByUnit maps accumulating across units.
+ */
+export function clearActivityLogState(): void {
+  activityLogState.clear();
+}
+
 function scanNextSequence(activityDir: string): number {
   let maxSeq = 0;
   try {

@@ -46,9 +55,21 @@ function getActivityState(activityDir: string): ActivityLogState {
   return state;
 }
 
-function snapshotKey(unitType: string, unitId: string, content: string): string {
-  const digest = createHash("sha1").update(content).digest("hex");
-  return `${unitType}\0${unitId}\0${digest}`;
-}
+/**
+ * Build a lightweight dedup key from session entries without serializing
+ * the entire content to a string (#611). Uses entry count + hash of
+ * the last few entries as a fingerprint instead of hashing megabytes.
+ */
+function snapshotKey(unitType: string, unitId: string, entries: unknown[]): string {
+  const hash = createHash("sha1");
+  hash.update(`${unitType}\0${unitId}\0${entries.length}\0`);
+  // Hash only the last 3 entries as a fingerprint — if the session grew,
+  // the count change alone detects it; if content changed, the tail hash catches it.
+  const tail = entries.slice(-3);
+  for (const entry of tail) {
+    hash.update(JSON.stringify(entry));
+  }
+  return hash.digest("hex");
+}
 
 function nextActivityFilePath(

@@ -91,14 +112,23 @@ export function saveActivityLog(
     mkdirSync(activityDir, { recursive: true });
 
     const safeUnitId = unitId.replace(/\//g, "-");
-    const content = `${entries.map(entry => JSON.stringify(entry)).join("\n")}\n`;
     const state = getActivityState(activityDir);
     const unitKey = `${unitType}\0${safeUnitId}`;
-    const key = snapshotKey(unitType, safeUnitId, content);
+    // Use lightweight fingerprint instead of serializing all entries (#611)
+    const key = snapshotKey(unitType, safeUnitId, entries);
     if (state.lastSnapshotKeyByUnit.get(unitKey) === key) return;
 
     const filePath = nextActivityFilePath(activityDir, state, unitType, safeUnitId);
-    writeFileSync(filePath, content, "utf-8");
+    // Stream entries to disk line-by-line instead of building one massive string (#611).
+    // For large sessions, the single-string approach allocated hundreds of MB.
+    const fd = openSync(filePath, "w");
+    try {
+      for (const entry of entries) {
+        writeSync(fd, JSON.stringify(entry) + "\n");
+      }
+    } finally {
+      closeSync(fd);
+    }
     state.nextSeq += 1;
     state.lastSnapshotKeyByUnit.set(unitKey, key);
   } catch (e) {

@@ -10,7 +10,7 @@ import type { ExtensionContext, ExtensionCommandContext } from "@gsd/pi-coding-a
 import type { GSDState } from "./types.js";
 import { getCurrentBranch } from "./worktree.js";
 import { getActiveHook } from "./post-unit-hooks.js";
-import { getLedger, getProjectTotals, formatCost, formatTokenCount } from "./metrics.js";
+import { getLedger, getProjectTotals, formatCost, formatTokenCount, formatTierSavings } from "./metrics.js";
 import {
   resolveMilestoneFile,
   resolveSliceFile,

@@ -39,6 +39,8 @@ export interface AutoDashboardData {
   projectedRemainingCost?: number;
+  /** Whether token profile has been auto-downgraded due to budget prediction */
+  profileDowngraded?: boolean;
   /** Number of pending captures awaiting triage (0 if none or file missing) */
   pendingCaptureCount: number;
 }
 
 // ─── Unit Description Helpers ─────────────────────────────────────────────────

@@ -239,6 +241,7 @@ export function updateProgressWidget(
   unitId: string,
   state: GSDState,
   accessors: WidgetStateAccessors,
+  tierBadge?: string,
 ): void {
   if (!ctx.hasUI) return;
 

@@ -319,7 +322,8 @@ export function updateProgressWidget(
 
   const target = task ? `${task.id}: ${task.title}` : unitId;
   const actionLeft = `${pad}${theme.fg("accent", "▸")} ${theme.fg("accent", verb)} ${theme.fg("text", target)}`;
-  const phaseBadge = theme.fg("dim", phaseLabel);
+  const tierTag = tierBadge ? theme.fg("dim", `[${tierBadge}] `) : "";
+  const phaseBadge = `${tierTag}${theme.fg("dim", phaseLabel)}`;
   lines.push(rightAlign(actionLeft, phaseBadge, width));
   lines.push("");
 

@@ -414,6 +418,14 @@ export function updateProgressWidget(
     ? `${modelPhase}${theme.fg("dim", modelDisplay)}`
     : "";
   lines.push(rightAlign(`${pad}${sLeft}`, sRight, width));
 
+  // Dynamic routing savings summary
+  if (mLedger && mLedger.units.some(u => u.tier)) {
+    const savings = formatTierSavings(mLedger.units);
+    if (savings) {
+      lines.push(truncateToWidth(theme.fg("dim", `${pad}${savings}`), width));
+    }
+  }
 }
 
 const hintParts: string[] = [];

@@ -96,7 +96,7 @@ export async function inlineDependencySummaries(
 export async function inlineGsdRootFile(
   base: string, filename: string, label: string,
 ): Promise<string | null> {
-  const key = filename.replace(/\.md$/i, "").toUpperCase() as "PROJECT" | "DECISIONS" | "QUEUE" | "STATE" | "REQUIREMENTS";
+  const key = filename.replace(/\.md$/i, "").toUpperCase() as "PROJECT" | "DECISIONS" | "QUEUE" | "STATE" | "REQUIREMENTS" | "KNOWLEDGE";
   const absPath = resolveGsdRootFile(base, key);
   if (!existsSync(absPath)) return null;
   return inlineFileOptional(absPath, relGsdRootFile(key), label);

@@ -384,6 +384,8 @@ export async function buildResearchMilestonePrompt(mid: string, midTitle: string
   if (requirementsInline) inlined.push(requirementsInline);
   const decisionsInline = await inlineGsdRootFile(base, "decisions.md", "Decisions");
   if (decisionsInline) inlined.push(decisionsInline);
+  const knowledgeInlineRM = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
+  if (knowledgeInlineRM) inlined.push(knowledgeInlineRM);
   inlined.push(inlineTemplate("research", "Research"));
 
   const inlinedContext = `## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`;

@@ -394,7 +396,7 @@
     milestoneId: mid, milestoneTitle: midTitle,
     milestonePath: relMilestonePath(base, mid),
     contextPath: contextRel,
-    outputPath: outputRelPath,
+    outputPath: join(base, outputRelPath),
     inlinedContext,
     ...buildSkillDiscoveryVars(),
   });

@@ -420,6 +422,8 @@ export async function buildPlanMilestonePrompt(mid: string, midTitle: string, ba
   if (requirementsInline) inlined.push(requirementsInline);
   const decisionsInline = inlineLevel !== "minimal" ? await inlineGsdRootFile(base, "decisions.md", "Decisions") : null;
   if (decisionsInline) inlined.push(decisionsInline);
+  const knowledgeInlinePM = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
+  if (knowledgeInlinePM) inlined.push(knowledgeInlinePM);
   inlined.push(inlineTemplate("roadmap", "Roadmap"));
   if (inlineLevel === "full") {
     inlined.push(inlineTemplate("decisions", "Decisions"));

@@ -435,14 +439,14 @@
   const inlinedContext = `## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`;
 
   const outputRelPath = relMilestoneFile(base, mid, "ROADMAP");
-  const secretsOutputPath = relMilestoneFile(base, mid, "SECRETS");
+  const secretsOutputPath = join(base, relMilestoneFile(base, mid, "SECRETS"));
   return loadPrompt("plan-milestone", {
     workingDirectory: base,
     milestoneId: mid, milestoneTitle: midTitle,
     milestonePath: relMilestonePath(base, mid),
     contextPath: contextRel,
     researchPath: researchRel,
-    outputPath: outputRelPath,
+    outputPath: join(base, outputRelPath),
     secretsOutputPath,
     inlinedContext,
   });

@@ -468,6 +472,8 @@ export async function buildResearchSlicePrompt(
   if (decisionsInline) inlined.push(decisionsInline);
   const requirementsInline = await inlineGsdRootFile(base, "requirements.md", "Requirements");
   if (requirementsInline) inlined.push(requirementsInline);
+  const knowledgeInlineRS = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
+  if (knowledgeInlineRS) inlined.push(knowledgeInlineRS);
   inlined.push(inlineTemplate("research", "Research"));
 
   const depContent = await inlineDependencySummaries(mid, sid, base);

@@ -485,7 +491,7 @@
     roadmapPath: roadmapRel,
     contextPath: contextRel,
     milestoneResearchPath: milestoneResearchRel,
-    outputPath: outputRelPath,
+    outputPath: join(base, outputRelPath),
     inlinedContext,
     dependencySummaries: depContent,
     ...buildSkillDiscoveryVars(),

@@ -511,6 +517,8 @@ export async function buildPlanSlicePrompt(
     const requirementsInline = await inlineGsdRootFile(base, "requirements.md", "Requirements");
     if (requirementsInline) inlined.push(requirementsInline);
   }
+  const knowledgeInlinePS = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
+  if (knowledgeInlinePS) inlined.push(knowledgeInlinePS);
   inlined.push(inlineTemplate("plan", "Slice Plan"));
   if (inlineLevel === "full") {
     inlined.push(inlineTemplate("task-plan", "Task Plan"));

@@ -530,7 +538,7 @@
     slicePath: relSlicePath(base, mid, sid),
     roadmapPath: roadmapRel,
     researchPath: researchRel,
-    outputPath: outputRelPath,
+    outputPath: join(base, outputRelPath),
     inlinedContext,
     dependencySummaries: depContent,
   });

@@ -585,14 +593,19 @@ export async function buildExecuteTaskPrompt(
     ? priorSummaries.slice(-1)
     : priorSummaries;
   const carryForwardSection = await buildCarryForwardSection(effectivePriorSummaries, base);
 
+  // Inline project knowledge if available
+  const knowledgeInlineET = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
+
   const inlinedTemplates = inlineLevel === "minimal"
     ? inlineTemplate("task-summary", "Task Summary")
     : [
         inlineTemplate("task-summary", "Task Summary"),
         inlineTemplate("decisions", "Decisions"),
+        ...(knowledgeInlineET ? [knowledgeInlineET] : []),
       ].join("\n\n---\n\n");
 
-  const taskSummaryPath = `${relSlicePath(base, mid, sid)}/tasks/${tid}-SUMMARY.md`;
+  const taskSummaryPath = join(base, `${relSlicePath(base, mid, sid)}/tasks/${tid}-SUMMARY.md`);
 
   const activeOverrides = await loadActiveOverrides(base);
   const overridesSection = formatOverridesSection(activeOverrides);

@@ -601,7 +614,7 @@
     overridesSection,
     workingDirectory: base,
     milestoneId: mid, sliceId: sid, sliceTitle: sTitle, taskId: tid, taskTitle: tTitle,
-    planPath: relSliceFile(base, mid, sid, "PLAN"),
+    planPath: join(base, relSliceFile(base, mid, sid, "PLAN")),
     slicePath: relSlicePath(base, mid, sid),
     taskPlanPath: taskPlanRelPath,
     taskPlanInline,

@@ -631,6 +644,8 @@ export async function buildCompleteSlicePrompt(
     const requirementsInline = await inlineGsdRootFile(base, "requirements.md", "Requirements");
     if (requirementsInline) inlined.push(requirementsInline);
   }
+  const knowledgeInlineCS = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
+  if (knowledgeInlineCS) inlined.push(knowledgeInlineCS);
 
   // Inline all task summaries for this slice
   const tDir = resolveTasksDir(base, mid, sid);

@@ -657,14 +672,14 @@
   const inlinedContext = `## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`;
 
   const sliceRel = relSlicePath(base, mid, sid);
-  const sliceSummaryPath = `${sliceRel}/${sid}-SUMMARY.md`;
-  const sliceUatPath = `${sliceRel}/${sid}-UAT.md`;
+  const sliceSummaryPath = join(base, `${sliceRel}/${sid}-SUMMARY.md`);
+  const sliceUatPath = join(base, `${sliceRel}/${sid}-UAT.md`);
 
   return loadPrompt("complete-slice", {
     workingDirectory: base,
     milestoneId: mid, sliceId: sid, sliceTitle: sTitle,
     slicePath: sliceRel,
-    roadmapPath: roadmapRel,
+    roadmapPath: join(base, roadmapRel),
     inlinedContext,
     sliceSummaryPath,
     sliceUatPath,

@@ -704,6 +719,8 @@ export async function buildCompleteMilestonePrompt(
     const projectInline = await inlineGsdRootFile(base, "project.md", "Project");
     if (projectInline) inlined.push(projectInline);
   }
+  const knowledgeInlineCM = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
+  if (knowledgeInlineCM) inlined.push(knowledgeInlineCM);
   // Inline milestone context file (milestone-level, not GSD root)
   const contextPath = resolveMilestoneFile(base, mid, "CONTEXT");
   const contextRel = relMilestoneFile(base, mid, "CONTEXT");

@@ -713,7 +730,7 @@
 
   const inlinedContext = `## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`;
 
-  const milestoneSummaryPath = `${relMilestonePath(base, mid)}/${mid}-SUMMARY.md`;
+  const milestoneSummaryPath = join(base, `${relMilestonePath(base, mid)}/${mid}-SUMMARY.md`);
 
   return loadPrompt("complete-milestone", {
     workingDirectory: base,

@@ -765,7 +782,21 @@
 
   const inlinedContext = `## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`;
 
-  const replanPath = `${relSlicePath(base, mid, sid)}/${sid}-REPLAN.md`;
+  const replanPath = join(base, `${relSlicePath(base, mid, sid)}/${sid}-REPLAN.md`);
+
+  // Build capture context for replan prompt (captures that triggered this replan)
+  let captureContext = "(none)";
+  try {
+    const { loadReplanCaptures } = await import("./triage-resolution.js");
+    const replanCaptures = loadReplanCaptures(base);
+    if (replanCaptures.length > 0) {
+      captureContext = replanCaptures.map(c =>
+        `- **${c.id}**: "${c.text}" — ${c.rationale ?? "no rationale"}`
+      ).join("\n");
+    }
+  } catch {
+    // Non-fatal — captures module may not be available
+  }
 
   return loadPrompt("replan-slice", {
     workingDirectory: base,

@@ -773,10 +804,11 @@ export async function buildReplanSlicePrompt(
sliceId: sid,
sliceTitle: sTitle,
slicePath: relSlicePath(base, mid, sid),
planPath: slicePlanRel,
planPath: join(base, slicePlanRel),
blockerTaskId,
inlinedContext,
replanPath,
captureContext,
});
}
@@ -798,7 +830,7 @@ export async function buildRunUatPrompt(
const inlinedContext = `## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`;

const uatResultPath = relSliceFile(base, mid, sliceId, "UAT-RESULT");
const uatResultPath = join(base, relSliceFile(base, mid, sliceId, "UAT-RESULT"));
const uatType = extractUatType(uatContent) ?? "human-experience";

return loadPrompt("run-uat", {
@@ -832,10 +864,26 @@ export async function buildReassessRoadmapPrompt(
const decisionsInline = await inlineGsdRootFile(base, "decisions.md", "Decisions");
if (decisionsInline) inlined.push(decisionsInline);
}
const knowledgeInlineRA = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge");
if (knowledgeInlineRA) inlined.push(knowledgeInlineRA);

const inlinedContext = `## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`;

const assessmentPath = relSliceFile(base, mid, completedSliceId, "ASSESSMENT");
const assessmentPath = join(base, relSliceFile(base, mid, completedSliceId, "ASSESSMENT"));

// Build deferred captures context for reassess prompt
let deferredCaptures = "(none)";
try {
const { loadDeferredCaptures } = await import("./triage-resolution.js");
const deferred = loadDeferredCaptures(base);
if (deferred.length > 0) {
deferredCaptures = deferred.map(c =>
`- **${c.id}**: "${c.text}" — ${c.rationale ?? "deferred during triage"}`
).join("\n");
}
} catch {
// Non-fatal — captures module may not be available
}

return loadPrompt("reassess-roadmap", {
workingDirectory: base,
@@ -846,6 +894,7 @@ export async function buildReassessRoadmapPrompt(
completedSliceSummaryPath: summaryRel,
assessmentPath,
inlinedContext,
deferredCaptures,
});
}
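The deferred-captures branch above flattens capture records into a markdown bullet list, using `??` to supply a default when a capture carries no rationale. A minimal standalone sketch of that formatting step (the `Capture` interface is trimmed to only the fields the diff actually touches):

```typescript
// Sketch of the capture-to-bullet-list formatting seen in the reassess prompt
// builder. The Capture shape here is an assumption reduced to the used fields.
interface Capture {
  id: string;
  text: string;
  rationale?: string;
}

function formatDeferred(captures: Capture[]): string {
  // Mirror the "(none)" sentinel the diff initializes before the try block.
  if (captures.length === 0) return "(none)";
  return captures
    .map(c => `- **${c.id}**: "${c.text}" — ${c.rationale ?? "deferred during triage"}`)
    .join("\n");
}
```

Because `??` only falls through on `null`/`undefined`, a capture with an explicitly empty rationale string would be rendered as-is rather than replaced by the default.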
@@ -14,8 +14,10 @@ import {
removeWorktree,
worktreePath,
} from "./worktree-manager.js";
import { detectWorktreeName } from "./worktree.js";
import {
MergeConflictError,
readIntegrationBranch,
} from "./git-service.js";
import { parseRoadmap } from "./files.js";
import { loadEffectiveGSDPreferences } from "./preferences.js";
@@ -90,7 +92,12 @@ export function autoWorktreeBranch(milestoneId: string): string {
*/
export function createAutoWorktree(basePath: string, milestoneId: string): string {
const branch = autoWorktreeBranch(milestoneId);
const info = createWorktree(basePath, milestoneId, { branch });

// Use the integration branch recorded in META.json as the start point.
// This ensures the worktree branch is created from the branch the user
// was on when they started the milestone (e.g. f-setup-gsd-2), not main.
const integrationBranch = readIntegrationBranch(basePath, milestoneId) ?? undefined;
const info = createWorktree(basePath, milestoneId, { branch, startPoint: integrationBranch });

// Copy .gsd/ planning artifacts from the source repo into the new worktree.
// Worktrees are fresh git checkouts — untracked files don't carry over.
@@ -224,6 +231,27 @@ export function getAutoWorktreeOriginalBase(): string | null {
return originalBase;
}

export function getActiveAutoWorktreeContext(): {
originalBase: string;
worktreeName: string;
branch: string;
} | null {
if (!originalBase) return null;
const cwd = process.cwd();
const resolvedBase = existsSync(originalBase) ? realpathSync(originalBase) : originalBase;
const wtDir = join(resolvedBase, ".gsd", "worktrees");
if (!cwd.startsWith(wtDir)) return null;
const worktreeName = detectWorktreeName(cwd);
if (!worktreeName) return null;
const branch = nativeGetCurrentBranch(cwd);
if (!branch.startsWith("milestone/")) return null;
return {
originalBase,
worktreeName,
branch,
};
}
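The guard chain in `getActiveAutoWorktreeContext` is a series of early `null` returns, so any non-null result implies every invariant held: a recorded base, a cwd inside the worktree directory, a detectable worktree name, and a `milestone/` branch. A pure sketch of that shape, with the cwd and branch passed in as parameters (assumptions standing in for `process.cwd()` and the git call):

```typescript
// Hypothetical pure sketch of the early-return guard chain; real code reads
// process.cwd() and the current git branch instead of taking parameters.
function activeWorktreeSketch(
  originalBase: string | null,
  cwd: string,
  branch: string,
): { originalBase: string; branch: string } | null {
  if (!originalBase) return null;                     // no recorded project root
  const wtDir = `${originalBase}/.gsd/worktrees`;
  if (!cwd.startsWith(wtDir)) return null;            // cwd not inside a worktree
  if (!branch.startsWith("milestone/")) return null;  // not on an auto branch
  return { originalBase, branch };
}
```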
// ─── Merge Milestone -> Main ───────────────────────────────────────────────

/**

@@ -279,11 +307,12 @@ export function mergeMilestoneToMain(
const previousCwd = process.cwd();
process.chdir(originalBasePath_);

// 4. Resolve main branch from preferences
// 4. Resolve integration branch — prefer milestone metadata, fall back to preferences / "main"
const prefs = loadEffectiveGSDPreferences()?.preferences?.git ?? {};
const mainBranch = prefs.main_branch || "main";
const integrationBranch = readIntegrationBranch(originalBasePath_, milestoneId);
const mainBranch = integrationBranch ?? prefs.main_branch ?? "main";

// 5. Checkout main
// 5. Checkout integration branch
nativeCheckoutBranch(originalBasePath_, mainBranch);

// 6. Build rich commit message
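The hunk above replaces `prefs.main_branch || "main"` with a `??` chain. The distinction matters: `||` falls through on any falsy value (including an empty string), while `??` falls through only on `null`/`undefined`, which matches "prefer recorded metadata, then preference, then default" semantics. A standalone sketch of the resolution order (the function name is hypothetical):

```typescript
// Hypothetical sketch mirroring `integrationBranch ?? prefs.main_branch ?? "main"`.
function resolveMergeTarget(
  integrationBranch: string | null,   // from milestone META.json, if recorded
  prefMainBranch: string | undefined, // from user git preferences
): string {
  // ?? only skips null/undefined, so a recorded branch always wins,
  // and the literal "main" is reached only when both sources are absent.
  return integrationBranch ?? prefMainBranch ?? "main";
}
```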
@@ -19,6 +19,7 @@ import type {
import { deriveState, invalidateStateCache } from "./state.js";
import type { BudgetEnforcementMode, GSDState } from "./types.js";
import { loadFile, parseRoadmap, getManifestStatus, resolveAllOverrides } from "./files.js";
import { loadPrompt } from "./prompt-loader.js";
export { inlinePriorMilestoneSummary } from "./files.js";
import { collectSecretsFromManifest } from "../get-secrets-from-user.js";
import {
@@ -29,7 +30,7 @@ import {
buildMilestoneFileName, buildSliceFileName, buildTaskFileName,
} from "./paths.js";
import { invalidateAllCaches } from "./cache.js";
import { saveActivityLog } from "./activity-log.js";
import { saveActivityLog, clearActivityLogState } from "./activity-log.js";
import { synthesizeCrashRecovery, getDeepDiagnostic } from "./session-forensics.js";
import { writeLock, clearLock, readCrashLock, formatCrashInfo, isLockProcessAlive } from "./crash-recovery.js";
import {
@@ -39,9 +40,12 @@ import {
readUnitRuntimeRecord,
writeUnitRuntimeRecord,
} from "./unit-runtime.js";
import { resolveAutoSupervisorConfig, resolveModelWithFallbacksForUnit, loadEffectiveGSDPreferences, resolveSkillDiscoveryMode } from "./preferences.js";
import { resolveAutoSupervisorConfig, resolveModelWithFallbacksForUnit, loadEffectiveGSDPreferences, resolveSkillDiscoveryMode, resolveDynamicRoutingConfig } from "./preferences.js";
import { sendDesktopNotification } from "./notifications.js";
import type { GSDPreferences } from "./preferences.js";
import { classifyUnitComplexity, tierLabel } from "./complexity-classifier.js";
import { resolveModelForComplexity } from "./model-router.js";
import { initRoutingHistory, resetRoutingHistory, recordOutcome } from "./routing-history.js";
import {
checkPostUnitHooks,
getActiveHook,
@@ -92,7 +96,9 @@ import {
getAutoWorktreePath,
getAutoWorktreeOriginalBase,
mergeMilestoneToMain,
autoWorktreeBranch,
} from "./auto-worktree.js";
import { pruneQueueOrder } from "./queue-order.js";
import { showNextAction } from "../shared/next-action-ui.js";
import {
resolveExpectedArtifactPath,
@@ -127,6 +133,7 @@ import {
deregisterSigtermHandler as _deregisterSigtermHandler,
detectWorkingTreeActivity,
} from "./auto-supervisor.js";
import { hasPendingCaptures, loadPendingCaptures, countPendingCaptures } from "./captures.js";

// ─── State ────────────────────────────────────────────────────────────────────
@@ -196,6 +203,33 @@ function shouldUseWorktreeIsolation(): boolean {
return true; // default: worktree
}

/**
* Detect and escape a stale worktree cwd (#608).
*
* After milestone completion + merge, the worktree directory is removed but
* the process cwd may still point inside `.gsd/worktrees/<MID>/`.
* When a new session starts, `process.cwd()` is passed as `base` to startAuto
* and all subsequent writes land in the wrong directory. This function detects
* that scenario and chdirs back to the project root.
*
* Returns the corrected base path.
*/
function escapeStaleWorktree(base: string): string {
const marker = `${pathSep}.gsd${pathSep}worktrees${pathSep}`;
const idx = base.indexOf(marker);
if (idx === -1) return base;

// base is inside .gsd/worktrees/<something> — extract the project root
const projectRoot = base.slice(0, idx);
try {
process.chdir(projectRoot);
} catch {
// If chdir fails, return the original — caller will handle errors downstream
return base;
}
return projectRoot;
}

/** Crash recovery prompt — set by startAuto, consumed by first dispatchNextUnit */
let pendingCrashRecovery: string | null = null;
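The path logic in `escapeStaleWorktree` is just an `indexOf`/`slice` split on the worktree marker. A pure sketch of that step, without the `process.chdir` side effect (hard-coding `/` separators, where the real code uses the platform `pathSep`):

```typescript
// Pure sketch of the stale-worktree path split (no chdir). Assumes "/"
// separators; the original builds the marker from the platform pathSep.
function staleWorktreeProjectRoot(base: string): string {
  const marker = "/.gsd/worktrees/";
  const idx = base.indexOf(marker);
  // Not inside a worktree: base is already the project root.
  if (idx === -1) return base;
  // Everything before the marker is the original project root.
  return base.slice(0, idx);
}
```

Note the trailing separator in the marker: it keeps a sibling directory such as `.gsd/worktrees-backup` from matching.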
@@ -204,6 +238,9 @@ let autoStartTime: number = 0;
let completedUnits: { type: string; id: string; startedAt: number; finishedAt: number }[] = [];
let currentUnit: { type: string; id: string; startedAt: number } | null = null;

/** Track dynamic routing decision for the current unit (for metrics) */
let currentUnitRouting: { tier: string; modelDowngraded: boolean } | null = null;

/** Track current milestone to detect transitions */
let currentMilestoneId: string | null = null;
let lastBudgetAlertLevel: BudgetAlertLevel = 0;
@@ -228,6 +265,9 @@ const DISPATCH_GAP_TIMEOUT_MS = 5_000; // 5 seconds
/** SIGTERM handler registered while auto-mode is active — cleared on stop/pause. */
let _sigtermHandler: (() => void) | null = null;

/** Tool calls currently being executed — prevents false idle detection during long-running tools. */
const inFlightTools = new Set<string>();

type BudgetAlertLevel = 0 | 75 | 90 | 100;

export function getBudgetAlertLevel(budgetPct: number): BudgetAlertLevel {
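The diff cuts off before the body of `getBudgetAlertLevel`, so the following threshold mapping is an assumption inferred only from the `BudgetAlertLevel` union (`0 | 75 | 90 | 100`) and the percentage parameter, not the project's actual implementation:

```typescript
// Assumed threshold mapping; the real getBudgetAlertLevel body is not shown
// in the diff. BudgetAlertLevel is copied from the visible type declaration.
type BudgetAlertLevel = 0 | 75 | 90 | 100;

function budgetAlertLevelSketch(budgetPct: number): BudgetAlertLevel {
  if (budgetPct >= 100) return 100; // budget exhausted
  if (budgetPct >= 90) return 90;   // critical warning band
  if (budgetPct >= 75) return 75;   // first warning band
  return 0;                         // below all alert thresholds
}
```

Pairing this with the `lastBudgetAlertLevel` variable above suggests each level fires once per crossing rather than on every poll.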
@@ -269,6 +309,15 @@ export { type AutoDashboardData } from "./auto-dashboard.js";
export function getAutoDashboardData(): AutoDashboardData {
const ledger = getLedger();
const totals = ledger ? getProjectTotals(ledger.units) : null;
// Pending capture count — lazy check, non-fatal
let pendingCaptureCount = 0;
try {
if (basePath) {
pendingCaptureCount = countPendingCaptures(basePath);
}
} catch {
// Non-fatal — captures module may not be loaded
}
return {
active,
paused,
@@ -280,6 +329,7 @@ export function getAutoDashboardData(): AutoDashboardData {
basePath,
totalCost: totals?.cost ?? 0,
totalTokens: totals?.tokens.total ?? 0,
pendingCaptureCount,
};
}
@@ -293,6 +343,22 @@ export function isAutoPaused(): boolean {
return paused;
}

/**
* Mark a tool execution as in-flight. Called from index.ts on tool_execution_start.
* Prevents the idle watchdog from declaring the agent idle while tools are executing.
*/
export function markToolStart(toolCallId: string): void {
if (!active) return;
inFlightTools.add(toolCallId);
}

/**
* Mark a tool execution as completed. Called from index.ts on tool_execution_end.
*/
export function markToolEnd(toolCallId: string): void {
inFlightTools.delete(toolCallId);
}

/**
* Return the base path to use for the auto.lock file.
* Always uses the original project root (not the worktree) so that
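The `markToolStart`/`markToolEnd` pair reduces idle detection to a set-membership question: the watchdog may treat the agent as idle only when no tool-call IDs remain in flight. A minimal standalone sketch of that bookkeeping (names are hypothetical stand-ins for the module state above):

```typescript
// Sketch of in-flight tool bookkeeping. A Set gives O(1) add/delete and
// makes duplicate start events for the same call ID harmless.
const inFlight = new Set<string>();

function toolStart(id: string): void {
  inFlight.add(id);
}

function toolEnd(id: string): void {
  inFlight.delete(id);
}

function looksIdle(): boolean {
  // The watchdog's precondition: nothing still executing.
  return inFlight.size === 0;
}

toolStart("call-1");
toolStart("call-2");
toolEnd("call-1"); // one tool still running, so not idle yet
```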
@@ -345,6 +411,7 @@ function clearUnitTimeout(): void {
clearInterval(idleWatchdogHandle);
idleWatchdogHandle = null;
}
inFlightTools.clear();
clearDispatchGapWatchdog();
}
@@ -426,14 +493,18 @@ export async function stopAuto(ctx?: ExtensionContext, pi?: ExtensionAPI): Promi
`Auto-worktree teardown failed: ${err instanceof Error ? err.message : String(err)}`,
"warning",
);
// Force basePath back to original even if teardown failed
if (originalBasePath) {
basePath = originalBasePath;
try { process.chdir(basePath); } catch { /* best-effort */ }
}
}
}

// Always restore cwd to project root on stop (#608).
// Even if isInAutoWorktree returned false (e.g., module state was already
// cleared by mergeMilestoneToMain), the process cwd may still be inside
// the worktree directory. Force it back to originalBasePath.
if (originalBasePath) {
basePath = originalBasePath;
try { process.chdir(basePath); } catch { /* best-effort */ }
}

const ledger = getLedger();
if (ledger && ledger.units.length > 0) {
const totals = getProjectTotals(ledger.units);
@@ -451,6 +522,7 @@ export async function stopAuto(ctx?: ExtensionContext, pi?: ExtensionAPI): Promi
}

resetMetrics();
resetRoutingHistory();
resetHookState();
if (basePath) clearPersistedHookState(basePath);
active = false;
@@ -458,12 +530,15 @@ export async function stopAuto(ctx?: ExtensionContext, pi?: ExtensionAPI): Promi
stepMode = false;
unitDispatchCount.clear();
unitRecoveryCount.clear();
inFlightTools.clear();
lastBudgetAlertLevel = 0;
unitLifetimeDispatches.clear();
currentUnit = null;
currentMilestoneId = null;
originalBasePath = "";
completedUnits = [];
clearSliceProgressCache();
clearActivityLogState();
pendingCrashRecovery = null;
_handlingAgentEnd = false;
ctx?.ui.setStatus("gsd-auto", undefined);
@@ -519,6 +594,11 @@ export async function startAuto(
): Promise<void> {
const requestedStepMode = options?.step ?? false;

// Escape stale worktree cwd from a previous milestone (#608).
// After milestone merge + worktree removal, the process cwd may still point
// inside .gsd/worktrees/<MID>/ — detect and chdir back to project root.
base = escapeStaleWorktree(base);

// If resuming from paused state, just re-activate and dispatch next unit.
// The conversation is still intact — no need to reinitialize everything.
if (paused) {
@@ -569,17 +649,17 @@ export async function startAuto(
ctx.ui.setFooter(hideFooter);
ctx.ui.notify(stepMode ? "Step-mode resumed." : "Auto-mode resumed.", "info");
// Restore hook state from disk in case session was interrupted
restoreHookState(base);
restoreHookState(basePath);
// Rebuild disk state before resuming — user interaction during pause may have changed files
try { await rebuildState(base); } catch { /* non-fatal */ }
try { await rebuildState(basePath); } catch { /* non-fatal */ }
try {
const report = await runGSDDoctor(base, { fix: true });
const report = await runGSDDoctor(basePath, { fix: true });
if (report.fixesApplied.length > 0) {
ctx.ui.notify(`Resume: applied ${report.fixesApplied.length} fix(es) to state.`, "info");
}
} catch { /* non-fatal */ }
// Self-heal: clear stale runtime records where artifacts already exist
await selfHealRuntimeRecords(base, ctx, completedKeySet);
await selfHealRuntimeRecords(basePath, ctx, completedKeySet);
invalidateAllCaches();
await dispatchNextUnit(ctx, pi);
return;
@@ -748,6 +828,9 @@ export async function startAuto(
// Initialize metrics — loads existing ledger from disk
initMetrics(base);

// Initialize routing history for adaptive learning
initRoutingHistory(base);

// Snapshot installed skills so we can detect new ones after research
if (resolveSkillDiscoveryMode() !== "off") {
snapshotSkills();
@@ -950,7 +1033,7 @@ export async function handleAgentEnd(
const hookStartedAt = Date.now();
if (currentUnit) {
const modelId = ctx.model?.id ?? "unknown";
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);
}
currentUnit = { type: hookUnit.unitType, id: hookUnit.unitId, startedAt: hookStartedAt };
@@ -1045,6 +1128,108 @@ export async function handleAgentEnd(
}
}

// ── Triage check: dispatch triage unit if pending captures exist ──────────
// Fires after hooks complete, before normal dispatch. Follows the same
// early-dispatch-and-return pattern as hooks and fix-merge.
// Skip for: step mode (shows wizard instead), triage units (prevent triage-on-triage),
// hook units (hooks run before triage conceptually).
if (
!stepMode &&
currentUnit &&
!currentUnit.type.startsWith("hook/") &&
currentUnit.type !== "triage-captures" &&
currentUnit.type !== "quick-task"
) {
try {
if (hasPendingCaptures(basePath)) {
const pending = loadPendingCaptures(basePath);
if (pending.length > 0) {
const state = await deriveState(basePath);
const mid = state.activeMilestone?.id;
const sid = state.activeSlice?.id;

if (mid && sid) {
// Build triage prompt with current context
let currentPlan = "";
let roadmapContext = "";
const planFile = resolveSliceFile(basePath, mid, sid, "PLAN");
if (planFile) currentPlan = (await loadFile(planFile)) ?? "";
const roadmapFile = resolveMilestoneFile(basePath, mid, "ROADMAP");
if (roadmapFile) roadmapContext = (await loadFile(roadmapFile)) ?? "";

const capturesList = pending.map(c =>
`- **${c.id}**: "${c.text}" (captured: ${c.timestamp})`
).join("\n");

const prompt = loadPrompt("triage-captures", {
pendingCaptures: capturesList,
currentPlan: currentPlan || "(no active slice plan)",
roadmapContext: roadmapContext || "(no active roadmap)",
});

ctx.ui.notify(
`Triaging ${pending.length} pending capture${pending.length === 1 ? "" : "s"}...`,
"info",
);

// Close out previous unit metrics
if (currentUnit) {
const modelId = ctx.model?.id ?? "unknown";
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);
}

// Dispatch triage as a new unit (early-dispatch-and-return)
const triageUnitType = "triage-captures";
const triageUnitId = `${mid}/${sid}/triage`;
const triageStartedAt = Date.now();
currentUnit = { type: triageUnitType, id: triageUnitId, startedAt: triageStartedAt };
writeUnitRuntimeRecord(basePath, triageUnitType, triageUnitId, triageStartedAt, {
phase: "dispatched",
wrapupWarningSent: false,
timeoutAt: null,
lastProgressAt: triageStartedAt,
progressCount: 0,
lastProgressKind: "dispatch",
});
updateProgressWidget(ctx, triageUnitType, triageUnitId, state);

const result = await cmdCtx!.newSession();
if (result.cancelled) {
await stopAuto(ctx, pi);
return;
}
const sessionFile = ctx.sessionManager.getSessionFile();
writeLock(basePath, triageUnitType, triageUnitId, completedUnits.length, sessionFile);

// Start unit timeout for triage (use same supervisor config as hooks)
clearUnitTimeout();
const supervisor = resolveAutoSupervisorConfig();
const triageTimeoutMs = (supervisor.hard_timeout_minutes ?? 30) * 60 * 1000;
unitTimeoutHandle = setTimeout(async () => {
unitTimeoutHandle = null;
if (!active) return;
ctx.ui.notify(
`Triage unit exceeded timeout. Pausing auto-mode.`,
"warning",
);
await pauseAuto(ctx, pi);
}, triageTimeoutMs);

if (!active) return;
pi.sendMessage(
{ customType: "gsd-auto", content: prompt, display: verbose },
{ triggerTurn: true },
);
return; // handleAgentEnd will fire again when triage session completes
}
}
}
} catch {
// Triage check failure is non-fatal — proceed to normal dispatch
}
}

// In step mode, pause and show a wizard instead of immediately dispatching
if (stepMode) {
await showStepWizard(ctx, pi);
@@ -1166,7 +1351,10 @@ function updateProgressWidget(
unitId: string,
state: GSDState,
): void {
_updateProgressWidget(ctx, unitType, unitId, state, widgetStateAccessors);
const badge = currentUnitRouting?.tier
? ({ light: "L", standard: "S", heavy: "H" }[currentUnitRouting.tier] ?? undefined)
: undefined;
_updateProgressWidget(ctx, unitType, unitId, state, widgetStateAccessors, badge);
}

/** State accessors for the widget — closures over module globals. */
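The badge expression above combines optional chaining with an object-literal lookup: known tiers map to a one-letter badge, and the `?? undefined` turns a miss on an unrecognized tier into no badge at all. Extracted into a standalone helper (the function name is hypothetical):

```typescript
// Hypothetical helper isolating the tier-to-badge lookup from the diff.
// Unknown tiers and a missing tier both yield undefined (no badge shown).
function tierBadge(tier: string | undefined): string | undefined {
  if (!tier) return undefined;
  return ({ light: "L", standard: "S", heavy: "H" } as Record<string, string>)[tier] ?? undefined;
}
```

The `?? undefined` is what the index signature cast makes necessary: an unlisted key yields `undefined` from the lookup, and the coalesce keeps that explicit rather than leaking a possibly-widened type.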
@@ -1245,12 +1433,90 @@ async function dispatchNextUnit(
"info",
);
sendDesktopNotification("GSD", `Milestone ${currentMilestoneId} complete!`, "success", "milestone");
// Hint: visualizer available after milestone transition
const vizPrefs = loadEffectiveGSDPreferences()?.preferences;
if (vizPrefs?.auto_visualize) {
ctx.ui.notify("Run /gsd visualize to see progress overview.", "info");
}
// Reset stuck detection for new milestone
unitDispatchCount.clear();
unitRecoveryCount.clear();
unitLifetimeDispatches.clear();
// Capture integration branch for the new milestone and update git service
captureIntegrationBranch(originalBasePath || basePath, mid, { commitDocs: loadEffectiveGSDPreferences()?.preferences?.git?.commit_docs });
// Clear completed-units.json for the finished milestone
try {
const file = completedKeysPath(basePath);
if (existsSync(file)) writeFileSync(file, JSON.stringify([]), "utf-8");
completedKeySet.clear();
} catch { /* non-fatal */ }

// ── Worktree lifecycle on milestone transition (#616) ──────────────
// When transitioning from M_old to M_new inside a worktree, we must:
// 1. Merge the completed milestone's worktree back to main
// 2. Re-derive state from the project root
// 3. Create a new worktree for the incoming milestone
// Without this, M_new runs inside M_old's worktree on the wrong branch,
// and artifact paths resolve against the wrong .gsd/ directory.
if (isInAutoWorktree(basePath) && originalBasePath && shouldUseWorktreeIsolation()) {
try {
const roadmapPath = resolveMilestoneFile(originalBasePath, currentMilestoneId, "ROADMAP");
if (roadmapPath) {
const roadmapContent = readFileSync(roadmapPath, "utf-8");
const mergeResult = mergeMilestoneToMain(originalBasePath, currentMilestoneId, roadmapContent);
ctx.ui.notify(
`Milestone ${currentMilestoneId} merged to main.${mergeResult.pushed ? " Pushed to remote." : ""}`,
"info",
);
} else {
// No roadmap found — teardown worktree without merge
teardownAutoWorktree(originalBasePath, currentMilestoneId);
ctx.ui.notify(`Exited worktree for ${currentMilestoneId} (no roadmap for merge).`, "info");
}
} catch (err) {
ctx.ui.notify(
`Milestone merge failed during transition: ${err instanceof Error ? err.message : String(err)}`,
"warning",
);
// Force cwd back to project root even if merge failed
if (originalBasePath) {
try { process.chdir(originalBasePath); } catch { /* best-effort */ }
}
}

// Update basePath to project root (mergeMilestoneToMain already chdir'd)
basePath = originalBasePath;
gitService = new GitServiceImpl(basePath, loadEffectiveGSDPreferences()?.preferences?.git ?? {});
invalidateAllCaches();

// Re-derive state from project root before creating new worktree
state = await deriveState(basePath);
mid = state.activeMilestone?.id;
midTitle = state.activeMilestone?.title;

// Create new worktree for the incoming milestone
if (mid) {
captureIntegrationBranch(basePath, mid, { commitDocs: loadEffectiveGSDPreferences()?.preferences?.git?.commit_docs });
try {
const wtPath = createAutoWorktree(basePath, mid);
basePath = wtPath;
gitService = new GitServiceImpl(basePath, loadEffectiveGSDPreferences()?.preferences?.git ?? {});
ctx.ui.notify(`Created auto-worktree for ${mid} at ${wtPath}`, "info");
} catch (err) {
ctx.ui.notify(
`Auto-worktree creation for ${mid} failed: ${err instanceof Error ? err.message : String(err)}. Continuing in project root.`,
"warning",
);
}
}
} else {
// Not in worktree — just capture integration branch for the new milestone
captureIntegrationBranch(originalBasePath || basePath, mid, { commitDocs: loadEffectiveGSDPreferences()?.preferences?.git?.commit_docs });
}

// Prune completed milestone from queue order file
const pendingIds = state.registry
.filter(m => m.status !== "complete")
.map(m => m.id);
pruneQueueOrder(basePath, pendingIds);
}
if (mid) {
currentMilestoneId = mid;
@@ -1261,7 +1527,7 @@ async function dispatchNextUnit(
// Save final session before stopping
if (currentUnit) {
const modelId = ctx.model?.id ?? "unknown";
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);
}
sendDesktopNotification("GSD", "All milestones complete!", "success", "milestone");
@@ -1289,7 +1555,7 @@ async function dispatchNextUnit(
if (!mid || !midTitle) {
if (currentUnit) {
const modelId = ctx.model?.id ?? "unknown";
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);
}
await stopAuto(ctx, pi);
@@ -1304,7 +1570,7 @@ async function dispatchNextUnit(
if (state.phase === "complete") {
if (currentUnit) {
const modelId = ctx.model?.id ?? "unknown";
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);
}
// Clear completed-units.json for the finished milestone so it doesn't grow unbounded.
@@ -1331,6 +1597,39 @@ async function dispatchNextUnit(
`Milestone merge failed: ${err instanceof Error ? err.message : String(err)}`,
"warning",
);
// Ensure cwd is restored even if merge failed partway through (#608).
// mergeMilestoneToMain may have chdir'd but then thrown, leaving us
// in an indeterminate location.
if (originalBasePath) {
basePath = originalBasePath;
try { process.chdir(basePath); } catch { /* best-effort */ }
}
}
} else if (currentMilestoneId && !isInAutoWorktree(basePath)) {
// Branch isolation mode (#603): no worktree, but we may be on a milestone/* branch.
// Squash-merge back to the integration branch (or main) before stopping.
try {
const currentBranch = getCurrentBranch(basePath);
const milestoneBranch = autoWorktreeBranch(currentMilestoneId);
if (currentBranch === milestoneBranch) {
const roadmapPath = resolveMilestoneFile(basePath, currentMilestoneId, "ROADMAP");
if (roadmapPath) {
const roadmapContent = readFileSync(roadmapPath, "utf-8");
// mergeMilestoneToMain handles: auto-commit, checkout integration branch,
// squash merge, commit, optional push, branch deletion.
const mergeResult = mergeMilestoneToMain(basePath, currentMilestoneId, roadmapContent);
gitService = new GitServiceImpl(basePath, loadEffectiveGSDPreferences()?.preferences?.git ?? {});
ctx.ui.notify(
`Milestone ${currentMilestoneId} merged (branch mode).${mergeResult.pushed ? " Pushed to remote." : ""}`,
"info",
);
}
}
} catch (err) {
ctx.ui.notify(
`Milestone merge failed (branch mode): ${err instanceof Error ? err.message : String(err)}`,
"warning",
);
}
}
sendDesktopNotification("GSD", `Milestone ${mid} complete!`, "success", "milestone");
@@ -1341,7 +1640,7 @@ async function dispatchNextUnit(
if (state.phase === "blocked") {
if (currentUnit) {
const modelId = ctx.model?.id ?? "unknown";
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);
}
await stopAuto(ctx, pi);
@@ -1449,7 +1748,7 @@ async function dispatchNextUnit(
if (dispatchResult.action === "stop") {
if (currentUnit) {
const modelId = ctx.model?.id ?? "unknown";
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);
}
await stopAuto(ctx, pi);
@@ -1559,7 +1858,7 @@ async function dispatchNextUnit(
   if (lifetimeCount > MAX_LIFETIME_DISPATCHES) {
     if (currentUnit) {
       const modelId = ctx.model?.id ?? "unknown";
-      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
+      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
     }
     saveActivityLog(ctx, basePath, unitType, unitId);
     const expected = diagnoseExpectedArtifact(unitType, unitId, basePath);
@@ -1573,7 +1872,7 @@ async function dispatchNextUnit(
   if (prevCount >= MAX_UNIT_DISPATCHES) {
     if (currentUnit) {
       const modelId = ctx.model?.id ?? "unknown";
-      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
+      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
     }
     saveActivityLog(ctx, basePath, unitType, unitId);

@@ -1731,9 +2030,19 @@ async function dispatchNextUnit(
   // The session still holds the previous unit's data (newSession hasn't fired yet).
   if (currentUnit) {
     const modelId = ctx.model?.id ?? "unknown";
-    snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
+    snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
     saveActivityLog(ctx, basePath, currentUnit.type, currentUnit.id);

+    // Record routing outcome for adaptive learning
+    if (currentUnitRouting) {
+      const isRetry = currentUnit.type === unitType && currentUnit.id === unitId;
+      recordOutcome(
+        currentUnit.type,
+        currentUnitRouting.tier as "light" | "standard" | "heavy",
+        !isRetry, // success = not being retried
+      );
+    }
+
     // Only mark the previous unit as completed if:
     // 1. We're not about to re-dispatch the same unit (retry scenario)
     // 2. The expected artifact actually exists on disk
@@ -1757,6 +2066,10 @@ async function dispatchNextUnit(
       startedAt: currentUnit.startedAt,
       finishedAt: Date.now(),
     });
+    // Cap to last 200 entries to prevent unbounded growth (#611)
+    if (completedUnits.length > 200) {
+      completedUnits = completedUnits.slice(-200);
+    }
     clearUnitRuntimeRecord(basePath, currentUnit.type, currentUnit.id);
     unitDispatchCount.delete(`${currentUnit.type}/${currentUnit.id}`);
     unitRecoveryCount.delete(`${currentUnit.type}/${currentUnit.id}`);
@@ -1832,7 +2145,54 @@ async function dispatchNextUnit(
   const modelConfig = resolveModelWithFallbacksForUnit(unitType);
   if (modelConfig) {
     const availableModels = ctx.modelRegistry.getAvailable();
-    const modelsToTry = [modelConfig.primary, ...modelConfig.fallbacks];
+
+    // ─── Dynamic Model Routing ─────────────────────────────────────────
+    // If enabled, classify unit complexity and potentially downgrade to a
+    // cheaper model. The user's configured model is the ceiling.
+    const routingConfig = resolveDynamicRoutingConfig();
+    let effectiveModelConfig = modelConfig;
+    let routingTierLabel = "";
+    currentUnitRouting = null;
+
+    if (routingConfig.enabled) {
+      // Compute budget pressure if budget ceiling is set
+      let budgetPct: number | undefined;
+      if (routingConfig.budget_pressure !== false) {
+        const budgetCeiling = prefs?.budget_ceiling;
+        if (budgetCeiling !== undefined && budgetCeiling > 0) {
+          const currentLedger = getLedger();
+          const totalCost = currentLedger ? getProjectTotals(currentLedger.units).cost : 0;
+          budgetPct = totalCost / budgetCeiling;
+        }
+      }
+
+      // Classify complexity (hook routing controlled by config.hooks)
+      const isHook = unitType.startsWith("hook/");
+      const shouldClassify = !isHook || routingConfig.hooks !== false;
+
+      if (shouldClassify) {
+        const classification = classifyUnitComplexity(unitType, unitId, basePath, budgetPct);
+        const availableModelIds = availableModels.map(m => m.id);
+        const routing = resolveModelForComplexity(classification, modelConfig, routingConfig, availableModelIds);
+
+        if (routing.wasDowngraded) {
+          effectiveModelConfig = {
+            primary: routing.modelId,
+            fallbacks: routing.fallbacks,
+          };
+          if (verbose) {
+            ctx.ui.notify(
+              `Dynamic routing [${tierLabel(classification.tier)}]: ${routing.modelId} (${classification.reason})`,
+              "info",
+            );
+          }
+        }
+        routingTierLabel = ` [${tierLabel(classification.tier)}]`;
+        currentUnitRouting = { tier: classification.tier, modelDowngraded: routing.wasDowngraded };
+      }
+    }
+
+    const modelsToTry = [effectiveModelConfig.primary, ...effectiveModelConfig.fallbacks];
     let modelSet = false;

     for (const modelId of modelsToTry) {
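The budget-pressure computation in the hunk above reduces to a spend-to-ceiling ratio that stays undefined when no positive ceiling is configured; a minimal standalone sketch (function name hypothetical, extracted for illustration):

```typescript
// Budget pressure: fraction of the configured cost ceiling already spent.
// Returns undefined when no positive ceiling is set, matching the guard
// around `budgetPct` in the routing block above.
function budgetPressure(totalCost: number, budgetCeiling: number | undefined): number | undefined {
  if (budgetCeiling === undefined || budgetCeiling <= 0) return undefined;
  return totalCost / budgetCeiling;
}

console.log(budgetPressure(7.5, 10)); // 0.75 — 75% of the budget consumed
console.log(budgetPressure(7.5, 0));  // undefined — no ceiling configured
```

A classifier can then treat a high ratio as a signal to bias units toward cheaper tiers.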
@@ -1897,11 +2257,11 @@ async function dispatchNextUnit(

       const ok = await pi.setModel(model, { persist: false });
       if (ok) {
-        const fallbackNote = modelId === modelConfig.primary
+        const fallbackNote = modelId === effectiveModelConfig.primary
           ? ""
-          : ` (fallback from ${modelConfig.primary})`;
+          : ` (fallback from ${effectiveModelConfig.primary})`;
         const phase = unitPhaseLabel(unitType);
-        ctx.ui.notify(`Model [${phase}]: ${model.provider}/${model.id}${fallbackNote}`, "info");
+        ctx.ui.notify(`Model [${phase}]${routingTierLabel}: ${model.provider}/${model.id}${fallbackNote}`, "info");
         modelSet = true;
         break;
       } else {
@@ -1957,6 +2317,16 @@ async function dispatchNextUnit(
     if (!runtime) return;
     if (Date.now() - runtime.lastProgressAt < idleTimeoutMs) return;

+    // Agent has tool calls currently executing (await_job, long bash, etc.) —
+    // not idle, just waiting for tool completion.
+    if (inFlightTools.size > 0) {
+      writeUnitRuntimeRecord(basePath, unitType, unitId, currentUnit.startedAt, {
+        lastProgressAt: Date.now(),
+        lastProgressKind: "tool-in-flight",
+      });
+      return;
+    }
+
     // Before triggering recovery, check if the agent is actually producing
     // work on disk. `git status --porcelain` is cheap and catches any
     // staged/unstaged/untracked changes the agent made since lastProgressAt.
@@ -1970,7 +2340,7 @@ async function dispatchNextUnit(

     if (currentUnit) {
       const modelId = ctx.model?.id ?? "unknown";
-      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
+      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
     }
     saveActivityLog(ctx, basePath, unitType, unitId);

@@ -1996,7 +2366,7 @@ async function dispatchNextUnit(
         timeoutAt: Date.now(),
       });
       const modelId = ctx.model?.id ?? "unknown";
-      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId);
+      snapshotUnitMetrics(ctx, currentUnit.type, currentUnit.id, currentUnit.startedAt, modelId, currentUnitRouting ?? undefined);
     }
     saveActivityLog(ctx, basePath, unitType, unitId);

384  src/resources/extensions/gsd/captures.ts  Normal file
@@ -0,0 +1,384 @@
/**
 * GSD Captures — Fire-and-forget thought capture with triage classification
 *
 * Append-only capture file at `.gsd/CAPTURES.md`. Each capture is an H3 section
 * with bold metadata fields, parseable by the same patterns used in files.ts.
 *
 * Worktree-aware: captures always resolve to the original project root's
 * `.gsd/CAPTURES.md`, not the worktree's local `.gsd/`.
 */

import { existsSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
import { join, resolve, sep } from "node:path";
import { randomUUID } from "node:crypto";
import { gsdRoot } from "./paths.js";

// ─── Types ────────────────────────────────────────────────────────────────────

export type Classification = "quick-task" | "inject" | "defer" | "replan" | "note";

export interface CaptureEntry {
  id: string;
  text: string;
  timestamp: string;
  status: "pending" | "triaged" | "resolved";
  classification?: Classification;
  resolution?: string;
  rationale?: string;
  resolvedAt?: string;
}

export interface TriageResult {
  captureId: string;
  classification: Classification;
  rationale: string;
  affectedFiles?: string[];
  targetSlice?: string;
}
// ─── Constants ────────────────────────────────────────────────────────────────

const CAPTURES_FILENAME = "CAPTURES.md";
const VALID_CLASSIFICATIONS: readonly string[] = [
  "quick-task", "inject", "defer", "replan", "note",
];

// ─── Path Resolution ──────────────────────────────────────────────────────────

/**
 * Resolve the path to CAPTURES.md, aware of worktree context.
 *
 * In worktree-isolated mode, basePath is `.gsd/worktrees/<MID>/`.
 * Captures must resolve to the *original* project root's `.gsd/CAPTURES.md`,
 * not the worktree-local `.gsd/`. This ensures all captures go to one file
 * regardless of which worktree the agent is running in.
 *
 * Detection: if basePath contains `/.gsd/worktrees/`, walk up to the
 * directory that contains `.gsd/worktrees/` — that's the project root.
 */
export function resolveCapturesPath(basePath: string): string {
  const resolved = resolve(basePath);
  const worktreeMarker = `${sep}.gsd${sep}worktrees${sep}`;
  const idx = resolved.indexOf(worktreeMarker);
  if (idx !== -1) {
    // basePath is inside a worktree — resolve to project root
    const projectRoot = resolved.slice(0, idx);
    return join(projectRoot, ".gsd", CAPTURES_FILENAME);
  }
  return join(gsdRoot(basePath), CAPTURES_FILENAME);
}
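The worktree detection in `resolveCapturesPath` can be exercised on its own; this sketch repeats the marker-slicing logic, with the `gsdRoot` fallback simplified to a plain `join` (an assumption made only to keep the sketch self-contained):

```typescript
import { join, resolve, sep } from "node:path";

// If basePath sits inside `.gsd/worktrees/<MID>/`, slice off everything from
// the marker onward to recover the original project root; otherwise treat
// basePath itself as the root (gsdRoot simplified away for this sketch).
function capturesPath(basePath: string): string {
  const resolved = resolve(basePath);
  const marker = `${sep}.gsd${sep}worktrees${sep}`;
  const idx = resolved.indexOf(marker);
  if (idx !== -1) {
    return join(resolved.slice(0, idx), ".gsd", "CAPTURES.md");
  }
  return join(resolved, ".gsd", "CAPTURES.md");
}
```

So a worktree path like `/proj/.gsd/worktrees/M1` and the project root `/proj` both resolve to the same `/proj/.gsd/CAPTURES.md`, which is what keeps all captures in one file.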
// ─── File I/O ─────────────────────────────────────────────────────────────────

/**
 * Append a new capture entry to CAPTURES.md.
 * Creates `.gsd/` and the file if they don't exist.
 * Returns the generated capture ID.
 */
export function appendCapture(basePath: string, text: string): string {
  const filePath = resolveCapturesPath(basePath);
  const dir = join(filePath, "..");
  if (!existsSync(dir)) {
    mkdirSync(dir, { recursive: true });
  }

  const id = `CAP-${randomUUID().slice(0, 8)}`;
  const timestamp = new Date().toISOString();

  const entry = [
    `### ${id}`,
    `**Text:** ${text}`,
    `**Captured:** ${timestamp}`,
    `**Status:** pending`,
    "",
  ].join("\n");

  if (existsSync(filePath)) {
    const existing = readFileSync(filePath, "utf-8");
    writeFileSync(filePath, existing.trimEnd() + "\n\n" + entry, "utf-8");
  } else {
    const header = `# Captures\n\n`;
    writeFileSync(filePath, header + entry, "utf-8");
  }

  return id;
}
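For reference, `appendCapture` writes entries in this shape (the ID, text, and timestamp below are illustrative, not from the source):

```markdown
# Captures

### CAP-1a2b3c4d
**Text:** Consider caching roadmap parse results between units
**Captured:** 2025-01-01T12:00:00.000Z
**Status:** pending
```

This is exactly the form `parseCapturesContent` splits on (`^### `) and that the `**Status:** pending` regex in `hasPendingCaptures` scans for.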
/**
 * Parse all capture entries from CAPTURES.md.
 * Returns entries in file order (oldest first).
 */
export function loadAllCaptures(basePath: string): CaptureEntry[] {
  const filePath = resolveCapturesPath(basePath);
  if (!existsSync(filePath)) return [];

  const content = readFileSync(filePath, "utf-8");
  return parseCapturesContent(content);
}

/**
 * Load only pending (unresolved) captures.
 */
export function loadPendingCaptures(basePath: string): CaptureEntry[] {
  return loadAllCaptures(basePath).filter(c => c.status === "pending");
}

/**
 * Fast check for pending captures without a full parse.
 * Reads the file and scans for `**Status:** pending` via regex.
 * Returns false if the file doesn't exist.
 */
export function hasPendingCaptures(basePath: string): boolean {
  const filePath = resolveCapturesPath(basePath);
  if (!existsSync(filePath)) return false;
  try {
    const content = readFileSync(filePath, "utf-8");
    return /\*\*Status:\*\*\s*pending/i.test(content);
  } catch {
    return false;
  }
}

/**
 * Count pending captures without a full parse — single file read.
 * Uses regex to count `**Status:** pending` occurrences.
 * Returns 0 if the file doesn't exist or on error.
 */
export function countPendingCaptures(basePath: string): number {
  const filePath = resolveCapturesPath(basePath);
  if (!existsSync(filePath)) return 0;
  try {
    const content = readFileSync(filePath, "utf-8");
    const matches = content.match(/\*\*Status:\*\*\s*pending/gi);
    return matches ? matches.length : 0;
  } catch {
    return 0;
  }
}

/**
 * Mark a capture as resolved with classification and rationale.
 * Rewrites the entry in place, preserving other entries.
 */
export function markCaptureResolved(
  basePath: string,
  captureId: string,
  classification: Classification,
  resolution: string,
  rationale: string,
): void {
  const filePath = resolveCapturesPath(basePath);
  if (!existsSync(filePath)) return;

  const content = readFileSync(filePath, "utf-8");
  const resolvedAt = new Date().toISOString();

  // Find the section for this capture ID and rewrite its fields
  const sectionRegex = new RegExp(
    `(### ${escapeRegex(captureId)}\\n(?:(?!### ).)*?)(?=### |$)`,
    "s",
  );
  const match = sectionRegex.exec(content);
  if (!match) return;

  let section = match[1];

  // Update Status field
  section = section.replace(
    /\*\*Status:\*\*\s*.+/,
    `**Status:** resolved`,
  );

  // Append classification, resolution, rationale, and timestamp if not present
  const newFields = [
    `**Classification:** ${classification}`,
    `**Resolution:** ${resolution}`,
    `**Rationale:** ${rationale}`,
    `**Resolved:** ${resolvedAt}`,
  ];

  // Remove any existing classification/resolution/rationale/resolved fields
  // (in case of re-triage)
  section = section.replace(/\*\*Classification:\*\*\s*.+\n?/g, "");
  section = section.replace(/\*\*Resolution:\*\*\s*.+\n?/g, "");
  section = section.replace(/\*\*Rationale:\*\*\s*.+\n?/g, "");
  section = section.replace(/\*\*Resolved:\*\*\s*.+\n?/g, "");

  // Add new fields after Status line
  section = section.trimEnd() + "\n" + newFields.join("\n") + "\n";

  const updated = content.replace(sectionRegex, section);
  writeFileSync(filePath, updated, "utf-8");
}
// ─── Parser ───────────────────────────────────────────────────────────────────

/**
 * Parse CAPTURES.md content into CaptureEntry array.
 */
function parseCapturesContent(content: string): CaptureEntry[] {
  const entries: CaptureEntry[] = [];

  // Split on H3 headings
  const sections = content.split(/^### /m).slice(1); // skip content before first H3

  for (const section of sections) {
    const lines = section.split("\n");
    const id = lines[0]?.trim();
    if (!id) continue;

    const body = lines.slice(1).join("\n");
    const text = extractBoldField(body, "Text");
    const timestamp = extractBoldField(body, "Captured");
    const statusRaw = extractBoldField(body, "Status");
    const classification = extractBoldField(body, "Classification") as Classification | null;
    const resolution = extractBoldField(body, "Resolution");
    const rationale = extractBoldField(body, "Rationale");
    const resolvedAt = extractBoldField(body, "Resolved");

    if (!text || !timestamp) continue;

    const status = (statusRaw === "resolved" || statusRaw === "triaged")
      ? statusRaw
      : "pending";

    entries.push({
      id,
      text,
      timestamp,
      status,
      ...(classification && VALID_CLASSIFICATIONS.includes(classification) ? { classification } : {}),
      ...(resolution ? { resolution } : {}),
      ...(rationale ? { rationale } : {}),
      ...(resolvedAt ? { resolvedAt } : {}),
    });
  }

  return entries;
}

/**
 * Extract value from a bold-prefixed line like "**Key:** Value".
 * Local copy of the pattern from files.ts to keep this module self-contained.
 */
function extractBoldField(text: string, key: string): string | null {
  const regex = new RegExp(`^\\*\\*${escapeRegex(key)}:\\*\\*\\s*(.+)$`, "m");
  const match = regex.exec(text);
  return match ? match[1].trim() : null;
}

function escapeRegex(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
// ─── Triage Output Parser ─────────────────────────────────────────────────────

/**
 * Parse LLM triage output into a TriageResult array.
 *
 * Handles:
 * - Clean JSON array
 * - JSON wrapped in a fenced code block (```json ... ```)
 * - JSON with leading/trailing prose
 * - Single object (not array) — wraps it in an array
 * - Malformed JSON — returns an empty array (caller should fall back to note)
 * - Partial results — valid entries are kept, invalid ones skipped
 */
export function parseTriageOutput(llmResponse: string): TriageResult[] {
  if (!llmResponse || !llmResponse.trim()) return [];

  // Try to extract JSON from fenced code blocks first
  const fenced = llmResponse.match(/```(?:json)?\s*\n?([\s\S]*?)\n?\s*```/);
  const jsonStr = fenced ? fenced[1] : extractJsonSubstring(llmResponse);

  if (!jsonStr) return [];

  try {
    const parsed = JSON.parse(jsonStr);
    const arr = Array.isArray(parsed) ? parsed : [parsed];
    return arr
      .filter(isValidTriageResult)
      .map(normalizeTriageResult);
  } catch {
    return [];
  }
}

/**
 * Try to find a JSON array or object substring in prose text.
 * Looks for the first [ or { and finds its matching bracket.
 */
function extractJsonSubstring(text: string): string | null {
  // Find first [ or {
  const arrStart = text.indexOf("[");
  const objStart = text.indexOf("{");

  let start: number;
  let openChar: string;
  let closeChar: string;

  if (arrStart === -1 && objStart === -1) return null;
  if (arrStart === -1) {
    start = objStart;
    openChar = "{";
    closeChar = "}";
  } else if (objStart === -1) {
    start = arrStart;
    openChar = "[";
    closeChar = "]";
  } else {
    start = Math.min(arrStart, objStart);
    openChar = start === arrStart ? "[" : "{";
    closeChar = start === arrStart ? "]" : "}";
  }

  // Find matching bracket
  let depth = 0;
  let inString = false;
  let escape = false;

  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (escape) {
      escape = false;
      continue;
    }
    if (ch === "\\") {
      escape = true;
      continue;
    }
    if (ch === '"') {
      inString = !inString;
      continue;
    }
    if (inString) continue;
    if (ch === openChar) depth++;
    if (ch === closeChar) depth--;
    if (depth === 0) {
      return text.slice(start, i + 1);
    }
  }

  return null;
}

function isValidTriageResult(obj: unknown): boolean {
  if (!obj || typeof obj !== "object") return false;
  const o = obj as Record<string, unknown>;
  return (
    typeof o.captureId === "string" &&
    typeof o.classification === "string" &&
    VALID_CLASSIFICATIONS.includes(o.classification) &&
    typeof o.rationale === "string"
  );
}

function normalizeTriageResult(obj: Record<string, unknown>): TriageResult {
  return {
    captureId: obj.captureId as string,
    classification: obj.classification as Classification,
    rationale: obj.rationale as string,
    ...(Array.isArray(obj.affectedFiles) ? { affectedFiles: obj.affectedFiles as string[] } : {}),
    ...(typeof obj.targetSlice === "string" ? { targetSlice: obj.targetSlice } : {}),
  };
}
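The bracket-matching fallback used by `parseTriageOutput` can be tried standalone; this sketch mirrors the core loop of `extractJsonSubstring` (renamed `extractJson` here, with the start-position selection condensed into one regex) to show how JSON embedded in prose is recovered:

```typescript
// Scan for the first [ or {, then walk forward to its matching close
// bracket while skipping string contents and backslash escapes.
function extractJson(text: string): string | null {
  const start = text.search(/[[{]/);
  if (start === -1) return null;
  const openChar = text[start];
  const closeChar = openChar === "[" ? "]" : "}";

  let depth = 0;
  let inString = false;
  let escape = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (escape) { escape = false; continue; }
    if (ch === "\\") { escape = true; continue; }
    if (ch === '"') { inString = !inString; continue; }
    if (inString) continue;
    if (ch === openChar) depth++;
    if (ch === closeChar) depth--;
    if (depth === 0) return text.slice(start, i + 1);
  }
  return null; // unbalanced — caller falls back to an empty result
}

const reply = 'Sure, here is the triage result:\n[{"captureId":"CAP-1","classification":"note","rationale":"informational"}]\nLet me know.';
console.log(extractJson(reply));
```

Note that brackets inside JSON strings (e.g. a rationale containing `]`) are ignored thanks to the `inString` flag, which is why a plain `indexOf`/`lastIndexOf` pair would not be robust here.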
@@ -11,8 +11,11 @@ import { join, dirname } from "node:path";
 import { fileURLToPath } from "node:url";
 import { deriveState } from "./state.js";
 import { GSDDashboardOverlay } from "./dashboard-overlay.js";
+import { GSDVisualizerOverlay } from "./visualizer-overlay.js";
 import { showQueue, showDiscuss } from "./guided-flow.js";
 import { startAuto, stopAuto, pauseAuto, isAutoActive, isAutoPaused, isStepMode, stopAutoRemote } from "./auto.js";
+import { resolveProjectRoot } from "./worktree.js";
+import { appendCapture, hasPendingCaptures, loadPendingCaptures } from "./captures.js";
 import {
   getGlobalGSDPreferencesPath,
   getLegacyGlobalGSDPreferencesPath,
@@ -22,7 +25,7 @@ import {
   loadEffectiveGSDPreferences,
   resolveAllSkillReferences,
 } from "./preferences.js";
-import { loadFile, saveFile, appendOverride } from "./files.js";
+import { loadFile, saveFile, appendOverride, appendKnowledge } from "./files.js";
 import {
   formatDoctorIssuesForPrompt,
   formatDoctorReport,
@@ -56,14 +59,20 @@ function dispatchDoctorHeal(pi: ExtensionAPI, scope: string | undefined, reportT
   );
 }

+/** Resolve the effective project root, accounting for worktree paths. */
+function projectRoot(): string {
+  return resolveProjectRoot(process.cwd());
+}
+
 export function registerGSDCommand(pi: ExtensionAPI): void {
   pi.registerCommand("gsd", {
-    description: "GSD — Get Shit Done: /gsd next|auto|stop|pause|status|queue|history|undo|skip|export|cleanup|prefs|config|hooks|doctor|migrate|remote|steer",
+    description: "GSD — Get Shit Done: /gsd next|auto|stop|pause|status|visualize|queue|capture|triage|history|undo|skip|export|cleanup|prefs|config|hooks|doctor|migrate|remote|steer|knowledge",
     getArgumentCompletions: (prefix: string) => {
       const subcommands = [
-        "next", "auto", "stop", "pause", "status", "queue", "discuss",
+        "next", "auto", "stop", "pause", "status", "visualize", "queue", "discuss",
+        "capture", "triage",
         "history", "undo", "skip", "export", "cleanup", "prefs",
-        "config", "hooks", "doctor", "migrate", "remote", "steer",
+        "config", "hooks", "doctor", "migrate", "remote", "steer", "knowledge",
       ];
       const parts = prefix.trim().split(/\s+/);

@@ -126,6 +135,13 @@ export function registerGSDCommand(pi: ExtensionAPI): void {
         .map((cmd) => ({ value: `cleanup ${cmd}`, label: cmd }));
     }

+    if (parts[0] === "knowledge" && parts.length <= 2) {
+      const subPrefix = parts[1] ?? "";
+      return ["rule", "pattern", "lesson"]
+        .filter((cmd) => cmd.startsWith(subPrefix))
+        .map((cmd) => ({ value: `knowledge ${cmd}`, label: cmd }));
+    }
+
     if (parts[0] === "doctor") {
       const modePrefix = parts[1] ?? "";
       const modes = ["fix", "heal", "audit"];
@@ -150,6 +166,11 @@ export function registerGSDCommand(pi: ExtensionAPI): void {
       return;
     }

+    if (trimmed === "visualize") {
+      await handleVisualize(ctx);
+      return;
+    }
+
     if (trimmed === "prefs" || trimmed.startsWith("prefs ")) {
       await handlePrefs(trimmed.replace(/^prefs\s*/, "").trim(), ctx);
       return;
@@ -162,24 +183,24 @@ export function registerGSDCommand(pi: ExtensionAPI): void {

     if (trimmed === "next" || trimmed.startsWith("next ")) {
       if (trimmed.includes("--dry-run")) {
-        await handleDryRun(ctx, process.cwd());
+        await handleDryRun(ctx, projectRoot());
         return;
       }
       const verboseMode = trimmed.includes("--verbose");
-      await startAuto(ctx, pi, process.cwd(), verboseMode, { step: true });
+      await startAuto(ctx, pi, projectRoot(), verboseMode, { step: true });
       return;
     }

     if (trimmed === "auto" || trimmed.startsWith("auto ")) {
       const verboseMode = trimmed.includes("--verbose");
-      await startAuto(ctx, pi, process.cwd(), verboseMode);
+      await startAuto(ctx, pi, projectRoot(), verboseMode);
       return;
     }

     if (trimmed === "stop") {
       if (!isAutoActive() && !isAutoPaused()) {
         // Not running in this process — check for a remote auto-mode session
-        const result = stopAutoRemote(process.cwd());
+        const result = stopAutoRemote(projectRoot());
         if (result.found) {
           ctx.ui.notify(`Sent stop signal to auto-mode session (PID ${result.pid}). It will shut down gracefully.`, "info");
         } else if (result.error) {
@@ -207,42 +228,52 @@ export function registerGSDCommand(pi: ExtensionAPI): void {
     }

     if (trimmed === "history" || trimmed.startsWith("history ")) {
-      await handleHistory(trimmed.replace(/^history\s*/, "").trim(), ctx, process.cwd());
+      await handleHistory(trimmed.replace(/^history\s*/, "").trim(), ctx, projectRoot());
       return;
     }

     if (trimmed === "undo" || trimmed.startsWith("undo ")) {
-      await handleUndo(trimmed.replace(/^undo\s*/, "").trim(), ctx, pi, process.cwd());
+      await handleUndo(trimmed.replace(/^undo\s*/, "").trim(), ctx, pi, projectRoot());
       return;
     }

     if (trimmed.startsWith("skip ")) {
-      await handleSkip(trimmed.replace(/^skip\s*/, "").trim(), ctx, process.cwd());
+      await handleSkip(trimmed.replace(/^skip\s*/, "").trim(), ctx, projectRoot());
       return;
     }

     if (trimmed === "export" || trimmed.startsWith("export ")) {
-      await handleExport(trimmed.replace(/^export\s*/, "").trim(), ctx, process.cwd());
+      await handleExport(trimmed.replace(/^export\s*/, "").trim(), ctx, projectRoot());
       return;
     }

     if (trimmed === "cleanup branches") {
-      await handleCleanupBranches(ctx, process.cwd());
+      await handleCleanupBranches(ctx, projectRoot());
       return;
     }

     if (trimmed === "cleanup snapshots") {
-      await handleCleanupSnapshots(ctx, process.cwd());
+      await handleCleanupSnapshots(ctx, projectRoot());
       return;
     }

     if (trimmed === "queue") {
-      await showQueue(ctx, pi, process.cwd());
+      await showQueue(ctx, pi, projectRoot());
       return;
     }

     if (trimmed === "discuss") {
-      await showDiscuss(ctx, pi, process.cwd());
+      await showDiscuss(ctx, pi, projectRoot());
       return;
     }

+    if (trimmed.startsWith("capture ") || trimmed === "capture") {
+      await handleCapture(trimmed.replace(/^capture\s*/, "").trim(), ctx);
+      return;
+    }
+
+    if (trimmed === "triage") {
+      await handleTriage(ctx, pi, process.cwd());
+      return;
+    }
+
@@ -266,6 +297,15 @@ export function registerGSDCommand(pi: ExtensionAPI): void {
       return;
     }

+    if (trimmed.startsWith("knowledge ")) {
+      await handleKnowledge(trimmed.replace(/^knowledge\s+/, "").trim(), ctx);
+      return;
+    }
+    if (trimmed === "knowledge") {
+      ctx.ui.notify("Usage: /gsd knowledge <rule|pattern|lesson> <description>. Example: /gsd knowledge rule Use real DB for integration tests", "warning");
+      return;
+    }
+
     if (trimmed === "migrate" || trimmed.startsWith("migrate ")) {
       const { handleMigrate } = await import("./migrate/command.js");
       await handleMigrate(trimmed.replace(/^migrate\s*/, "").trim(), ctx, pi);
@@ -279,12 +319,12 @@ export function registerGSDCommand(pi: ExtensionAPI): void {

     if (trimmed === "") {
       // Bare /gsd defaults to step mode
-      await startAuto(ctx, pi, process.cwd(), false, { step: true });
+      await startAuto(ctx, pi, projectRoot(), false, { step: true });
       return;
     }

     ctx.ui.notify(
-      `Unknown: /gsd ${trimmed}. Use /gsd next|auto|stop|pause|status|queue|discuss|history|undo|skip <unit>|export|cleanup|prefs|config|hooks|doctor|migrate|remote|steer <change>.`,
+      `Unknown: /gsd ${trimmed}. Use /gsd next|auto|stop|pause|status|visualize|queue|capture|triage|discuss|history|undo|skip <unit>|export|cleanup|prefs|config|hooks|doctor|migrate|remote|steer <change>|knowledge <type> <entry>.`,
       "warning",
     );
   },
@@ -292,7 +332,7 @@ export function registerGSDCommand(pi: ExtensionAPI): void {
 }

 async function handleStatus(ctx: ExtensionCommandContext): Promise<void> {
-  const basePath = process.cwd();
+  const basePath = projectRoot();
   const state = await deriveState(basePath);

   if (state.registry.length === 0) {
@@ -322,6 +362,28 @@ export async function fireStatusViaCommand(
   await handleStatus(ctx as ExtensionCommandContext);
 }

+async function handleVisualize(ctx: ExtensionCommandContext): Promise<void> {
+  if (!ctx.hasUI) {
+    ctx.ui.notify("Visualizer requires an interactive terminal.", "warning");
+    return;
+  }
+
+  await ctx.ui.custom<void>(
+    (tui, theme, _kb, done) => {
+      return new GSDVisualizerOverlay(tui, theme, () => done());
+    },
+    {
+      overlay: true,
+      overlayOptions: {
+        width: "80%",
+        minWidth: 80,
+        maxHeight: "90%",
+        anchor: "center",
+      },
+    },
+  );
+}
+
 async function handlePrefs(args: string, ctx: ExtensionCommandContext): Promise<void> {
   const trimmed = args.trim();

@@ -376,9 +438,9 @@ async function handleDoctor(args: string, ctx: ExtensionCommandContext, pi: Exte
   const parts = trimmed ? trimmed.split(/\s+/) : [];
   const mode = parts[0] === "fix" || parts[0] === "heal" || parts[0] === "audit" ? parts[0] : "doctor";
   const requestedScope = mode === "doctor" ? parts[0] : parts[1];
-  const scope = await selectDoctorScope(process.cwd(), requestedScope);
+  const scope = await selectDoctorScope(projectRoot(), requestedScope);
   const effectiveScope = mode === "audit" ? requestedScope : scope;
-  const report = await runGSDDoctor(process.cwd(), {
+  const report = await runGSDDoctor(projectRoot(), {
     fix: mode === "fix" || mode === "heal",
     scope: effectiveScope,
   });
@@ -495,8 +557,10 @@ async function handlePrefsWizard(
    prefs.auto_supervisor = autoSup;
  }

-  // ─── Git main branch ────────────────────────────────────────────────────
+  // ─── Git settings ───────────────────────────────────────────────────────
  const git: Record<string, unknown> = (prefs.git as Record<string, unknown>) ?? {};

  // main_branch
  const currentBranch = git.main_branch ? String(git.main_branch) : "";
  const branchInput = await ctx.ui.input(
    `Git main branch${currentBranch ? ` (current: ${currentBranch})` : ""}:`,

@@ -510,6 +574,90 @@ async function handlePrefsWizard(
      delete git.main_branch;
    }
  }

  // Boolean git toggles
  const gitBooleanFields = [
    { key: "auto_push", label: "Auto-push commits after committing", defaultVal: false },
    { key: "push_branches", label: "Push milestone branches to remote", defaultVal: false },
    { key: "snapshots", label: "Create WIP snapshot commits during long tasks", defaultVal: false },
  ] as const;

  for (const field of gitBooleanFields) {
    const current = git[field.key];
    const currentStr = current !== undefined ? String(current) : "";
    const choice = await ctx.ui.select(
      `${field.label}${currentStr ? ` (current: ${currentStr})` : ` (default: ${field.defaultVal})`}:`,
      ["true", "false", "(keep current)"],
    );
    if (choice && choice !== "(keep current)") {
      git[field.key] = choice === "true";
    }
  }

  // remote
  const currentRemote = git.remote ? String(git.remote) : "";
  const remoteInput = await ctx.ui.input(
    `Git remote name${currentRemote ? ` (current: ${currentRemote})` : " (default: origin)"}:`,
    currentRemote || "origin",
  );
  if (remoteInput !== null && remoteInput !== undefined) {
    const val = remoteInput.trim();
    if (val && val !== "origin") {
      git.remote = val;
    } else if (!val && currentRemote) {
      delete git.remote;
    }
  }

  // pre_merge_check
  const currentPreMerge = git.pre_merge_check !== undefined ? String(git.pre_merge_check) : "";
  const preMergeChoice = await ctx.ui.select(
    `Pre-merge check${currentPreMerge ? ` (current: ${currentPreMerge})` : " (default: false)"}:`,
    ["true", "false", "auto", "(keep current)"],
  );
  if (preMergeChoice && preMergeChoice !== "(keep current)") {
    if (preMergeChoice === "auto") {
      git.pre_merge_check = "auto";
    } else {
      git.pre_merge_check = preMergeChoice === "true";
    }
  }

  // commit_type
  const currentCommitType = git.commit_type ? String(git.commit_type) : "";
  const commitTypes = ["feat", "fix", "refactor", "docs", "test", "chore", "perf", "ci", "build", "style", "(inferred — default)", "(keep current)"];
  const commitChoice = await ctx.ui.select(
    `Default commit type${currentCommitType ? ` (current: ${currentCommitType})` : ""}:`,
    commitTypes,
  );
  if (commitChoice && typeof commitChoice === "string" && commitChoice !== "(keep current)") {
    if ((commitChoice as string).startsWith("(inferred")) {
      delete git.commit_type;
    } else {
      git.commit_type = commitChoice;
    }
  }

  // merge_strategy
  const currentMerge = git.merge_strategy ? String(git.merge_strategy) : "";
  const mergeChoice = await ctx.ui.select(
    `Merge strategy${currentMerge ? ` (current: ${currentMerge})` : ""}:`,
    ["squash", "merge", "(keep current)"],
  );
  if (mergeChoice && mergeChoice !== "(keep current)") {
    git.merge_strategy = mergeChoice;
  }

  // isolation
  const currentIsolation = git.isolation ? String(git.isolation) : "";
  const isolationChoice = await ctx.ui.select(
    `Git isolation strategy${currentIsolation ? ` (current: ${currentIsolation})` : " (default: worktree)"}:`,
    ["worktree", "branch", "(keep current)"],
  );
  if (isolationChoice && isolationChoice !== "(keep current)") {
    git.isolation = isolationChoice;
  }

  // ─── Git commit_docs ────────────────────────────────────────────────────
  const currentCommitDocs = git.commit_docs;
  const commitDocsChoice = await ctx.ui.select(

@@ -544,6 +692,89 @@ async function handlePrefsWizard(
    prefs.unique_milestone_ids = uniqueChoice === "true";
  }

  // ─── Budget & cost control ────────────────────────────────────────────
  const currentCeiling = prefs.budget_ceiling;
  const ceilingStr = currentCeiling !== undefined ? String(currentCeiling) : "";
  const ceilingInput = await ctx.ui.input(
    `Budget ceiling (USD)${ceilingStr ? ` (current: $${ceilingStr})` : " (default: no limit)"}:`,
    ceilingStr || "",
  );
  if (ceilingInput !== null && ceilingInput !== undefined) {
    const val = ceilingInput.trim().replace(/^\$/, "");
    if (val && !isNaN(Number(val)) && isFinite(Number(val))) {
      prefs.budget_ceiling = Number(val);
    } else if (val && (isNaN(Number(val)) || !isFinite(Number(val)))) {
      ctx.ui.notify(`Invalid budget ceiling "${val}" — must be a number. Keeping previous value.`, "warning");
    } else if (!val && ceilingStr) {
      delete prefs.budget_ceiling;
    }
  }

  const currentEnforcement = (prefs.budget_enforcement as string) ?? "";
  const enforcementChoice = await ctx.ui.select(
    `Budget enforcement${currentEnforcement ? ` (current: ${currentEnforcement})` : " (default: pause)"}:`,
    ["warn", "pause", "halt", "(keep current)"],
  );
  if (enforcementChoice && enforcementChoice !== "(keep current)") {
    prefs.budget_enforcement = enforcementChoice;
  }

  const currentContextPause = prefs.context_pause_threshold;
  const contextPauseStr = currentContextPause !== undefined ? String(currentContextPause) : "";
  const contextPauseInput = await ctx.ui.input(
    `Context pause threshold (0-100%, 0=disabled)${contextPauseStr ? ` (current: ${contextPauseStr}%)` : " (default: 0)"}:`,
    contextPauseStr || "0",
  );
  if (contextPauseInput !== null && contextPauseInput !== undefined) {
    const val = contextPauseInput.trim().replace(/%$/, "");
    if (val && !isNaN(Number(val)) && Number(val) >= 0 && Number(val) <= 100) {
      const num = Number(val);
      if (num === 0) {
        delete prefs.context_pause_threshold;
      } else {
        prefs.context_pause_threshold = num;
      }
    } else if (val && (isNaN(Number(val)) || Number(val) < 0 || Number(val) > 100)) {
      ctx.ui.notify(`Invalid context pause threshold "${val}" — must be 0-100. Keeping previous value.`, "warning");
    }
  }

  // ─── Notifications ────────────────────────────────────────────────────
  const notif: Record<string, boolean> = (prefs.notifications as Record<string, boolean>) ?? {};
  const notifFields = [
    { key: "enabled", label: "Notifications enabled (master toggle)", defaultVal: true },
    { key: "on_complete", label: "Notify on unit completion", defaultVal: true },
    { key: "on_error", label: "Notify on errors", defaultVal: true },
    { key: "on_budget", label: "Notify on budget thresholds", defaultVal: true },
    { key: "on_milestone", label: "Notify on milestone completion", defaultVal: true },
    { key: "on_attention", label: "Notify when manual attention needed", defaultVal: true },
  ] as const;

  for (const field of notifFields) {
    const current = notif[field.key];
    const currentStr = current !== undefined ? String(current) : "";
    const choice = await ctx.ui.select(
      `${field.label}${currentStr ? ` (current: ${currentStr})` : ` (default: ${field.defaultVal})`}:`,
      ["true", "false", "(keep current)"],
    );
    if (choice && choice !== "(keep current)") {
      notif[field.key] = choice === "true";
    }
  }
  if (Object.keys(notif).length > 0) {
    prefs.notifications = notif;
  }

  // ─── UAT dispatch ─────────────────────────────────────────────────────
  const currentUat = prefs.uat_dispatch;
  const uatChoice = await ctx.ui.select(
    `UAT dispatch mode${currentUat !== undefined ? ` (current: ${currentUat})` : " (default: false)"}:`,
    ["true", "false", "(keep current)"],
  );
  if (uatChoice && uatChoice !== "(keep current)") {
    prefs.uat_dispatch = uatChoice === "true";
  }

  // ─── Serialize to frontmatter ───────────────────────────────────────────
  prefs.version = prefs.version || 1;
  const frontmatter = serializePreferencesToFrontmatter(prefs);

@@ -634,7 +865,10 @@ function serializePreferencesToFrontmatter(prefs: Record<string, unknown>): stri
  const orderedKeys = [
    "version", "always_use_skills", "prefer_skills", "avoid_skills",
    "skill_rules", "custom_instructions", "models", "skill_discovery",
-    "auto_supervisor", "uat_dispatch", "unique_milestone_ids", "budget_ceiling", "remote_questions", "git",
+    "auto_supervisor", "uat_dispatch", "unique_milestone_ids",
+    "budget_ceiling", "budget_enforcement", "context_pause_threshold",
+    "notifications", "remote_questions", "git",
    "post_unit_hooks", "pre_dispatch_hooks",
  ];

  const seen = new Set<string>();

@@ -972,6 +1206,131 @@ async function handleCleanupSnapshots(ctx: ExtensionCommandContext, basePath: st
  ctx.ui.notify(`Pruned ${pruned} old snapshot refs. ${refs.length - pruned} remain.`, "success");
}

async function handleKnowledge(args: string, ctx: ExtensionCommandContext): Promise<void> {
  const parts = args.split(/\s+/);
  const typeArg = parts[0]?.toLowerCase();

  if (!typeArg || !["rule", "pattern", "lesson"].includes(typeArg)) {
    ctx.ui.notify(
      "Usage: /gsd knowledge <rule|pattern|lesson> <description>\nExample: /gsd knowledge rule Use real DB for integration tests",
      "warning",
    );
    return;
  }

  const entryText = parts.slice(1).join(" ").trim();
  if (!entryText) {
    ctx.ui.notify(`Usage: /gsd knowledge ${typeArg} <description>`, "warning");
    return;
  }

  const type = typeArg as "rule" | "pattern" | "lesson";
  const basePath = process.cwd();
  const state = await deriveState(basePath);
  const scope = state.activeMilestone?.id
    ? `${state.activeMilestone.id}${state.activeSlice ? `/${state.activeSlice.id}` : ""}`
    : "global";

  await appendKnowledge(basePath, type, entryText, scope);
  ctx.ui.notify(`Added ${type} to KNOWLEDGE.md: "${entryText}"`, "success");
}

// ─── Capture Command ──────────────────────────────────────────────────────────

/**
 * Handle `/gsd capture "..."` — fire-and-forget thought capture.
 * Appends to `.gsd/CAPTURES.md` without interrupting auto-mode.
 * Works in all modes: auto running, paused, stopped, no project.
 */
async function handleCapture(args: string, ctx: ExtensionCommandContext): Promise<void> {
  // Strip surrounding quotes from the argument
  let text = args.trim();
  if (!text) {
    ctx.ui.notify('Usage: /gsd capture "your thought here"', "warning");
    return;
  }
  // Remove wrapping quotes (single or double)
  if ((text.startsWith('"') && text.endsWith('"')) || (text.startsWith("'") && text.endsWith("'"))) {
    text = text.slice(1, -1);
  }
  if (!text) {
    ctx.ui.notify('Usage: /gsd capture "your thought here"', "warning");
    return;
  }

  const basePath = process.cwd();

  // Ensure .gsd/ exists — capture should work even without a milestone
  const gsdDir = join(basePath, ".gsd");
  if (!existsSync(gsdDir)) {
    mkdirSync(gsdDir, { recursive: true });
  }

  const id = appendCapture(basePath, text);
  ctx.ui.notify(`Captured: ${id} — "${text.length > 60 ? text.slice(0, 57) + "..." : text}"`, "info");
}

// ─── Triage Command ───────────────────────────────────────────────────────────

/**
 * Handle `/gsd triage` — manually trigger triage of pending captures.
 * Dispatches the triage prompt to the LLM for classification.
 * Triage result handling (confirmation UI) is wired in T03.
 */
async function handleTriage(ctx: ExtensionCommandContext, pi: ExtensionAPI, basePath: string): Promise<void> {
  if (!hasPendingCaptures(basePath)) {
    ctx.ui.notify("No pending captures to triage.", "info");
    return;
  }

  const pending = loadPendingCaptures(basePath);
  ctx.ui.notify(`Triaging ${pending.length} pending capture${pending.length === 1 ? "" : "s"}...`, "info");

  // Build context for the triage prompt
  const state = await deriveState(basePath);
  let currentPlan = "";
  let roadmapContext = "";

  if (state.activeMilestone && state.activeSlice) {
    const { resolveSliceFile, resolveMilestoneFile } = await import("./paths.js");
    const planFile = resolveSliceFile(basePath, state.activeMilestone.id, state.activeSlice.id, "PLAN");
    if (planFile) {
      const { loadFile: load } = await import("./files.js");
      currentPlan = (await load(planFile)) ?? "";
    }
    const roadmapFile = resolveMilestoneFile(basePath, state.activeMilestone.id, "ROADMAP");
    if (roadmapFile) {
      const { loadFile: load } = await import("./files.js");
      roadmapContext = (await load(roadmapFile)) ?? "";
    }
  }

  // Format pending captures for the prompt
  const capturesList = pending.map(c =>
    `- **${c.id}**: "${c.text}" (captured: ${c.timestamp})`
  ).join("\n");

  // Dispatch triage prompt
  const { loadPrompt } = await import("./prompt-loader.js");
  const prompt = loadPrompt("triage-captures", {
    pendingCaptures: capturesList,
    currentPlan: currentPlan || "(no active slice plan)",
    roadmapContext: roadmapContext || "(no active roadmap)",
  });

  const workflowPath = process.env.GSD_WORKFLOW_PATH ?? join(process.env.HOME ?? "~", ".pi", "GSD-WORKFLOW.md");
  const workflow = readFileSync(workflowPath, "utf-8");

  pi.sendMessage(
    {
      customType: "gsd-triage",
      content: `Read the following GSD workflow protocol and execute exactly.\n\n${workflow}\n\n## Your Task\n\n${prompt}`,
      display: false,
    },
    { triggerTurn: true },
  );
}

async function handleSteer(change: string, ctx: ExtensionCommandContext, pi: ExtensionAPI): Promise<void> {
  const basePath = process.cwd();
  const state = await deriveState(basePath);

src/resources/extensions/gsd/complexity-classifier.ts (new file, 322 lines)

@@ -0,0 +1,322 @@
// GSD Extension — Complexity Classifier
// Classifies unit complexity for dynamic model routing.
// Pure heuristics + adaptive learning — no LLM calls. Sub-millisecond classification.

import { existsSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { gsdRoot } from "./paths.js";
import { getAdaptiveTierAdjustment } from "./routing-history.js";

// ─── Types ───────────────────────────────────────────────────────────────────

export type ComplexityTier = "light" | "standard" | "heavy";

export interface ClassificationResult {
  tier: ComplexityTier;
  reason: string;
  downgraded: boolean; // true if budget pressure lowered the tier
}

export interface TaskMetadata {
  fileCount?: number;
  dependencyCount?: number;
  isNewFile?: boolean;
  tags?: string[];
  estimatedLines?: number;
  codeBlockCount?: number; // number of fenced code blocks in plan
  complexityKeywords?: string[]; // detected complexity signals
}

// ─── Unit Type → Default Tier Mapping ────────────────────────────────────────

const UNIT_TYPE_TIERS: Record<string, ComplexityTier> = {
  // Tier 1 — Light: structured summaries, completion, UAT
  "complete-slice": "light",
  "run-uat": "light",

  // Tier 2 — Standard: research, routine planning
  "research-milestone": "standard",
  "research-slice": "standard",
  "plan-milestone": "standard",
  "plan-slice": "standard",

  // Tier 3 — Heavy: execution, replanning (requires deep reasoning)
  "execute-task": "standard", // default standard, upgraded by metadata
  "replan-slice": "heavy",
  "reassess-roadmap": "heavy",
};

// ─── Public API ──────────────────────────────────────────────────────────────

/**
 * Classify unit complexity to determine which model tier to use.
 *
 * @param unitType The type of unit being dispatched
 * @param unitId The unit ID (e.g. "M001/S01/T01")
 * @param basePath Project base path (for reading task plans)
 * @param budgetPct Current budget usage as fraction (0.0-1.0+), or undefined if no budget
 * @param metadata Optional pre-parsed task metadata
 */
export function classifyUnitComplexity(
  unitType: string,
  unitId: string,
  basePath: string,
  budgetPct?: number,
  metadata?: TaskMetadata,
): ClassificationResult {
  // Hook units default to light
  if (unitType.startsWith("hook/")) {
    const result: ClassificationResult = { tier: "light", reason: "hook unit", downgraded: false };
    return applyBudgetPressure(result, budgetPct);
  }

  // Start with the default tier for this unit type
  let tier = UNIT_TYPE_TIERS[unitType] ?? "standard";
  let reason = `unit type: ${unitType}`;

  // For execute-task, analyze task metadata for complexity signals
  if (unitType === "execute-task") {
    const taskAnalysis = analyzeTaskComplexity(unitId, basePath, metadata);
    tier = taskAnalysis.tier;
    reason = taskAnalysis.reason;
  }

  // For plan-slice, check if the slice has many tasks (complex planning)
  if (unitType === "plan-slice" || unitType === "plan-milestone") {
    const planAnalysis = analyzePlanComplexity(unitId, basePath);
    if (planAnalysis) {
      tier = planAnalysis.tier;
      reason = planAnalysis.reason;
    }
  }

  // Adaptive learning: check if history suggests bumping the tier
  const tags = metadata?.tags ?? extractTaskMetadata(unitId, basePath).tags;
  const adaptiveAdjustment = getAdaptiveTierAdjustment(unitType, tier, tags);
  if (adaptiveAdjustment && tierOrdinal(adaptiveAdjustment) > tierOrdinal(tier)) {
    reason = `${reason} (adaptive: high failure rate at ${tier})`;
    tier = adaptiveAdjustment;
  }

  const result: ClassificationResult = { tier, reason, downgraded: false };
  return applyBudgetPressure(result, budgetPct);
}

/**
 * Get a short label for the tier (for dashboard display).
 */
export function tierLabel(tier: ComplexityTier): string {
  switch (tier) {
    case "light": return "L";
    case "standard": return "S";
    case "heavy": return "H";
  }
}

/**
 * Get the tier ordering value (for comparison).
 */
export function tierOrdinal(tier: ComplexityTier): number {
  switch (tier) {
    case "light": return 0;
    case "standard": return 1;
    case "heavy": return 2;
  }
}

// ─── Task Complexity Analysis ────────────────────────────────────────────────

interface TaskAnalysis {
  tier: ComplexityTier;
  reason: string;
}

function analyzeTaskComplexity(
  unitId: string,
  basePath: string,
  metadata?: TaskMetadata,
): TaskAnalysis {
  // Try to read task plan for complexity signals
  const meta = metadata ?? extractTaskMetadata(unitId, basePath);

  // Heavy signals
  if (meta.dependencyCount && meta.dependencyCount >= 3) {
    return { tier: "heavy", reason: `${meta.dependencyCount} dependencies` };
  }
  if (meta.fileCount && meta.fileCount >= 6) {
    return { tier: "heavy", reason: `${meta.fileCount} files to modify` };
  }
  if (meta.estimatedLines && meta.estimatedLines >= 500) {
    return { tier: "heavy", reason: `~${meta.estimatedLines} lines estimated` };
  }

  // Heavy signals from complexity keywords (Phase 4)
  if (meta.complexityKeywords && meta.complexityKeywords.length >= 2) {
    return { tier: "heavy", reason: `complex: ${meta.complexityKeywords.join(", ")}` };
  }
  if (meta.codeBlockCount && meta.codeBlockCount >= 5) {
    return { tier: "heavy", reason: `${meta.codeBlockCount} code blocks in plan` };
  }

  // Standard signals from single complexity keyword
  if (meta.complexityKeywords && meta.complexityKeywords.length === 1) {
    return { tier: "standard", reason: `${meta.complexityKeywords[0]} task` };
  }

  // Light signals (simple tasks)
  if (meta.tags?.some(t => /^(docs?|readme|comment|config|typo|rename)$/i.test(t))) {
    return { tier: "light", reason: `simple task: ${meta.tags.join(", ")}` };
  }
  if (meta.fileCount !== undefined && meta.fileCount <= 1 && !meta.isNewFile) {
    return { tier: "light", reason: "single file modification" };
  }

  // Standard by default
  return { tier: "standard", reason: "standard execution task" };
}

function analyzePlanComplexity(
  unitId: string,
  basePath: string,
): TaskAnalysis | null {
  // Check if this is a milestone-level plan (more complex) vs single slice
  const parts = unitId.split("/");
  if (parts.length === 1) {
    // Milestone-level planning is always at least standard
    return { tier: "standard", reason: "milestone-level planning" };
  }

  // For slice planning, try to read the context/research to gauge complexity
  // If research exists and is large, bump to heavy
  const [mid, sid] = parts;
  const researchPath = join(gsdRoot(basePath), mid, "slices", sid, "RESEARCH.md");
  try {
    if (existsSync(researchPath)) {
      const content = readFileSync(researchPath, "utf-8");
      const lineCount = content.split("\n").length;
      if (lineCount > 200) {
        return { tier: "heavy", reason: `complex slice: ${lineCount}-line research` };
      }
    }
  } catch {
    // Non-fatal
  }

  return null; // Use default tier
}

/**
 * Extract task metadata from the task plan file on disk.
 */
function extractTaskMetadata(unitId: string, basePath: string): TaskMetadata {
  const meta: TaskMetadata = {};
  const parts = unitId.split("/");
  if (parts.length !== 3) return meta;

  const [mid, sid, tid] = parts;
  const taskPlanPath = join(gsdRoot(basePath), mid, "slices", sid, "tasks", `${tid}-PLAN.md`);

  try {
    if (!existsSync(taskPlanPath)) return meta;
    const content = readFileSync(taskPlanPath, "utf-8");
    const lines = content.split("\n");

    // Count files mentioned in "Files:" or "- Files:" lines
    const fileLines = lines.filter(l => /^\s*-?\s*files?\s*:/i.test(l));
    if (fileLines.length > 0) {
      // Count comma-separated or bullet-pointed files
      const allFiles = new Set<string>();
      for (const line of fileLines) {
        const filesStr = line.replace(/^\s*-?\s*files?\s*:\s*/i, "");
        const files = filesStr.split(/[,;]/).map(f => f.trim()).filter(Boolean);
        files.forEach(f => allFiles.add(f));
      }
      meta.fileCount = allFiles.size;
    }

    // Check for "new file" or "create" keywords
    meta.isNewFile = lines.some(l => /\b(create|new file|scaffold|bootstrap)\b/i.test(l));

    // Look for tags/labels in frontmatter or content
    const tags: string[] = [];
    if (content.match(/\b(refactor|migration|architect)/i)) tags.push("refactor");
    if (content.match(/\b(test|spec|coverage)\b/i)) tags.push("test");
    if (content.match(/\b(doc|readme|comment|jsdoc)\b/i)) tags.push("docs");
    if (content.match(/\b(config|env|setting)\b/i)) tags.push("config");
    if (content.match(/\b(rename|typo|spelling)\b/i)) tags.push("rename");
    meta.tags = tags;

    // Try to extract estimated lines from content
    const estimateMatch = content.match(/~?\s*(\d+)\s*lines?\b/i);
    if (estimateMatch) {
      meta.estimatedLines = parseInt(estimateMatch[1], 10);
    }

    // Phase 4: Deeper introspection signals

    // Count fenced code blocks (```) — more code blocks = more complex implementation
    const codeBlockMatches = content.match(/^```/gm);
    meta.codeBlockCount = codeBlockMatches ? Math.floor(codeBlockMatches.length / 2) : 0;

    // Detect complexity keywords that suggest harder tasks
    const complexityKeywords: string[] = [];
    if (content.match(/\b(migration|migrate|schema change)\b/i)) complexityKeywords.push("migration");
    if (content.match(/\b(architect|design pattern|system design)\b/i)) complexityKeywords.push("architecture");
    if (content.match(/\b(security|auth|encrypt|credential|vulnerability)\b/i)) complexityKeywords.push("security");
    if (content.match(/\b(performance|optimize|cache|index)\b/i)) complexityKeywords.push("performance");
    if (content.match(/\b(concurrent|parallel|race condition|mutex|lock)\b/i)) complexityKeywords.push("concurrency");
    if (content.match(/\b(backward.?compat|breaking change|deprecat)\b/i)) complexityKeywords.push("compatibility");
    meta.complexityKeywords = complexityKeywords;
  } catch {
    // Non-fatal — metadata extraction is best-effort
  }

  return meta;
}

// ─── Budget Pressure ─────────────────────────────────────────────────────────

/**
 * Apply budget pressure to a classification result.
 * As budget usage increases, more aggressively downgrade tiers.
 *
 * - <50%: Normal classification (no change)
 * - 50-75%: standard → light
 * - 75-90%: only heavy keeps its configured tier; standard → light
 * - >90%: heavy → standard; everything else → light
 */
function applyBudgetPressure(
  result: ClassificationResult,
  budgetPct?: number,
): ClassificationResult {
  if (budgetPct === undefined || budgetPct < 0.5) return result;

  const original = result.tier;

  if (budgetPct >= 0.9) {
    // >90%: almost everything goes to light
    if (result.tier !== "heavy") {
      result.tier = "light";
    } else {
      // Even heavy gets downgraded to standard
      result.tier = "standard";
    }
  } else if (budgetPct >= 0.75) {
    // 75-90%: only heavy stays, everything else goes to light
    if (result.tier === "standard") {
      result.tier = "light";
    }
  } else {
    // 50-75%: standard → light
    if (result.tier === "standard") {
      result.tier = "light";
    }
  }

  if (result.tier !== original) {
    result.downgraded = true;
    result.reason = `${result.reason} (budget pressure: ${Math.round(budgetPct * 100)}%)`;
  }

  return result;
}
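The tier-downgrade thresholds above can be exercised with a tiny standalone sketch. The `pressured` helper below is an illustrative re-implementation of just the threshold logic, not part of the module itself:

```typescript
type Tier = "light" | "standard" | "heavy";

// Mirrors the budget-pressure thresholds: downgrade tiers as budget usage climbs.
function pressured(tier: Tier, budgetPct?: number): Tier {
  if (budgetPct === undefined || budgetPct < 0.5) return tier; // <50%: untouched
  if (budgetPct >= 0.9) return tier === "heavy" ? "standard" : "light"; // >90%
  return tier === "standard" ? "light" : tier; // 50-90%: standard drops to light
}

console.log(pressured("standard", 0.6)); // "light" — mid-budget downgrade
console.log(pressured("heavy", 0.8));    // "heavy" — heavy survives until 90%
console.log(pressured("heavy", 0.95));   // "standard"
```

Note that a downgrade is one-way per classification: tiers are only ever lowered by budget pressure, never raised.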

@@ -40,6 +40,9 @@ function unitLabel(type: string): string {
    case "execute-task": return "Execute";
    case "complete-slice": return "Complete";
    case "reassess-roadmap": return "Reassess";
+    case "triage-captures": return "Triage";
+    case "quick-task": return "Quick Task";
+    case "replan-slice": return "Replan";
    default: return type;
  }
}

@@ -346,6 +349,13 @@ export class GSDDashboardOverlay {
      lines.push(blank());
    }

+    // Pending captures badge — only shown when captures are waiting for triage
+    if (this.dashData.pendingCaptureCount > 0) {
+      const count = this.dashData.pendingCaptureCount;
+      lines.push(row(th.fg("warning", `📌 ${count} pending capture${count === 1 ? "" : "s"} awaiting triage`)));
+      lines.push(blank());
+    }
+
    if (this.loading) {
      lines.push(centered(th.fg("dim", "Loading dashboard…")));
      return lines;

@@ -5,7 +5,7 @@ import { readFileSync } from "node:fs";
import { readdirSync } from "node:fs";
import { resolveMilestoneFile, milestonesDir } from "./paths.js";
import { parseRoadmapSlices } from "./roadmap-slices.js";
-import { extractMilestoneSeq, milestoneIdSort } from "./guided-flow.js";
+import { findMilestoneIds } from "./guided-flow.js";

const SLICE_DISPATCH_TYPES = new Set([
  "research-slice",

@@ -43,24 +43,12 @@ export function getPriorSliceCompletionBlocker(base: string, _mainBranch: string
  const [targetMid, targetSid] = unitId.split("/");
  if (!targetMid || !targetSid) return null;

-  const targetSeq = extractMilestoneSeq(targetMid);
-  if (targetSeq === 0) return null;
-
-  // Scan actual milestone directories instead of iterating by number
-  let milestoneIds: string[];
-  try {
-    milestoneIds = readdirSync(milestonesDir(base), { withFileTypes: true })
-      .filter(d => d.isDirectory())
-      .map(d => {
-        const match = d.name.match(/^(M\d+(?:-[a-z0-9]{6})?)/);
-        return match ? match[1] : null;
-      })
-      .filter((id): id is string => id !== null)
-      .sort(milestoneIdSort)
-      .filter(id => extractMilestoneSeq(id) <= targetSeq);
-  } catch {
-    return null;
-  }
+  // Use findMilestoneIds to respect custom queue order.
+  // Only check milestones that come BEFORE the target in queue order.
+  const allIds = findMilestoneIds(base);
+  const targetIdx = allIds.indexOf(targetMid);
+  if (targetIdx < 0) return null;
+  const milestoneIds = allIds.slice(0, targetIdx + 1);

  for (const mid of milestoneIds) {
    // Read from disk (working tree) — always has the latest state

@@ -80,9 +80,9 @@ Setting `prefer_skills: []` does **not** disable skill discovery — it just mea

- `skill_rules`: situational rules with a human-readable `when` trigger and one or more of `use`, `prefer`, or `avoid`.

-- `custom_instructions`: extra durable instructions related to skill use.
+- `custom_instructions`: extra durable instructions related to skill use. For operational project knowledge (recurring rules, gotchas, patterns), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically and agents can append to it during execution.

-- `models`: per-stage model selection for auto-mode. Keys: `research`, `planning`, `execution`, `completion`. Values can be:
+- `models`: per-stage model selection for auto-mode. Keys: `research`, `planning`, `execution`, `execution_simple`, `completion`, `subagent`. Values can be:
  - Simple string: `"claude-sonnet-4-6"` — single model, no fallbacks
  - Provider-qualified string: `"bedrock/claude-sonnet-4-6"` — targets a specific provider when the same model ID exists across multiple providers
  - Object with fallbacks: `{ model: "claude-opus-4-6", fallbacks: ["glm-5", "minimax-m2.5"] }` — tries fallbacks in order if primary fails

@@ -108,10 +108,75 @@ Setting `prefer_skills: []` does **not** disable skill discovery — it just mea
 - `pre_merge_check`: boolean or `"auto"` — run pre-merge checks before merging a worktree back to the integration branch. `true` always runs, `false` never runs, `"auto"` runs when CI is detected. Default: `false`.
 - `commit_type`: string — override the conventional commit type prefix. Must be one of: `feat`, `fix`, `refactor`, `docs`, `test`, `chore`, `perf`, `ci`, `build`, `style`. Default: inferred from diff content.
 - `main_branch`: string — the primary branch name for new git repos (e.g., `"main"`, `"master"`, `"trunk"`). Also used by `getMainBranch()` as the preferred branch when auto-detection is ambiguous. Default: `"main"`.
 - `merge_strategy`: `"squash"` or `"merge"` — controls how worktree branches are merged back. `"squash"` combines all commits into one; `"merge"` preserves individual commits. Default: `"squash"`.
 - `isolation`: `"worktree"` or `"branch"` — controls auto-mode git isolation strategy. `"worktree"` creates a milestone worktree for isolated work; `"branch"` works directly in the project root (useful for submodule-heavy repos). Default: `"worktree"`.
 - `commit_docs`: boolean — when `false`, prevents GSD from committing `.gsd/` planning artifacts to git. The `.gsd/` folder is added to `.gitignore` and kept local-only. Useful for teams where only some members use GSD, or when company policy requires a clean repository. Default: `true`.
+
+- `unique_milestone_ids`: boolean — when `true`, generates milestone IDs in `M{seq}-{rand6}` format (e.g. `M001-eh88as`) instead of plain sequential `M001`. Prevents ID collisions in team workflows where multiple contributors create milestones concurrently. Both formats coexist — existing `M001`-style milestones remain valid. Default: `false`.
+
+- `budget_ceiling`: number — maximum dollar amount to spend on auto-mode. When reached, behavior is controlled by `budget_enforcement`. Default: no limit.
+
+- `budget_enforcement`: `"warn"`, `"pause"`, or `"halt"` — action taken when `budget_ceiling` is reached.
+  - `warn` — log a warning but continue execution.
+  - `pause` — pause auto-mode and wait for user confirmation.
+  - `halt` — stop auto-mode immediately.
+  - Default: `"pause"`.
+
+- `context_pause_threshold`: number (0-100) — context window usage percentage at which auto-mode should pause to suggest checkpointing. Set to `0` to disable. Default: `0` (disabled).
+
+- `token_profile`: `"budget"`, `"balanced"`, or `"quality"` — coordinates model selection, phase skipping, and context compression. `budget` skips research/reassessment and uses cheaper models; `balanced` (default) runs all phases; `quality` prefers higher-quality models. See token-optimization docs.
+
+- `phases`: fine-grained control over which phases run. Usually set by `token_profile`, but can be overridden. Keys:
+  - `skip_research`: boolean — skip milestone-level research. Default: `false`.
+  - `skip_reassess`: boolean — skip roadmap reassessment after each slice. Default: `false`.
+  - `skip_slice_research`: boolean — skip per-slice research. Default: `false`.
+
+- `remote_questions`: route interactive questions to Slack/Discord for headless auto-mode. Keys:
+  - `channel`: `"slack"` or `"discord"` — channel type.
+  - `channel_id`: string or number — channel ID.
+  - `timeout_minutes`: number — question timeout in minutes (clamped 1-30).
+  - `poll_interval_seconds`: number — poll interval in seconds (clamped 2-30).
+
+- `notifications`: configures desktop notification behavior during auto-mode. Keys:
+  - `enabled`: boolean — master toggle for all notifications. Default: `true`.
+  - `on_complete`: boolean — notify when a unit completes. Default: `true`.
+  - `on_error`: boolean — notify on errors. Default: `true`.
+  - `on_budget`: boolean — notify when budget thresholds are reached. Default: `true`.
+  - `on_milestone`: boolean — notify when a milestone finishes. Default: `true`.
+  - `on_attention`: boolean — notify when manual attention is needed. Default: `true`.
+
+- `uat_dispatch`: boolean — when `true`, enables UAT (User Acceptance Testing) dispatch mode. Default: `false`.
+
+- `post_unit_hooks`: array — hooks that fire after a unit completes. Each entry has:
+  - `name`: string — unique hook identifier.
+  - `after`: string[] — unit types that trigger this hook (e.g., `["execute-task"]`).
+  - `prompt`: string — prompt sent to the LLM. Supports `{milestoneId}`, `{sliceId}`, `{taskId}` substitutions.
+  - `max_cycles`: number — max times this hook fires per trigger (default: 1, max: 10).
+  - `model`: string — optional model override.
+  - `artifact`: string — expected output file name (relative to task/slice dir). Hook is skipped if file already exists (idempotent).
+  - `retry_on`: string — if this file is produced instead of the artifact, re-run the trigger unit then re-run hooks.
+  - `agent`: string — agent definition file to use for hook execution.
+  - `enabled`: boolean — toggle without removing (default: `true`).
+
+- `pre_dispatch_hooks`: array — hooks that fire before a unit is dispatched. Each entry has:
+  - `name`: string — unique hook identifier.
+  - `before`: string[] — unit types to intercept.
+  - `action`: `"modify"`, `"skip"`, or `"replace"` — what to do with the unit.
+  - `prepend`: string — text prepended to unit prompt (for `"modify"` action).
+  - `append`: string — text appended to unit prompt (for `"modify"` action).
+  - `prompt`: string — replacement prompt (for `"replace"` action; required when action is `"replace"`).
+  - `unit_type`: string — override unit type label (for `"replace"` action).
+  - `skip_if`: string — for `"skip"` action: only skip if this file exists (relative to unit dir).
+  - `model`: string — optional model override when this hook fires.
+  - `enabled`: boolean — toggle without removing (default: `true`).
+
+**Action validation:**
+- `"modify"` requires at least one of `prepend` or `append`.
+- `"replace"` requires `prompt`.
+- `"skip"` is valid with no additional fields.
+
+**Known unit types for `before`/`after`:** `research-milestone`, `plan-milestone`, `research-slice`, `plan-slice`, `execute-task`, `complete-slice`, `replan-slice`, `reassess-roadmap`, `run-uat`.

 ---

 ## Best Practices
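The action-validation rules documented above are mechanical enough to sketch in code. This is a hypothetical standalone validator illustrating those three rules, not the extension's actual implementation:

```typescript
// Shape of a pre-dispatch hook entry, reduced to the fields validation needs.
type PreDispatchHook = {
  action: "modify" | "skip" | "replace";
  prepend?: string;
  append?: string;
  prompt?: string;
};

// Returns an error message, or null when the hook is valid.
function validateHook(h: PreDispatchHook): string | null {
  if (h.action === "modify" && !h.prepend && !h.append) {
    return '"modify" requires at least one of prepend or append';
  }
  if (h.action === "replace" && !h.prompt) {
    return '"replace" requires prompt';
  }
  return null; // "skip" is valid with no additional fields
}

console.log(validateHook({ action: "modify" }));                  // error string
console.log(validateHook({ action: "replace", prompt: "TDD." })); // null
console.log(validateHook({ action: "skip" }));                    // null
```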
@@ -277,3 +342,137 @@ git:
 ```

 All git fields are optional. Omit any field to use the default behavior. Project-level preferences override global preferences on a per-field basis.
+
+---
+
+## Budget & Cost Control Example
+
+```yaml
+---
+version: 1
+budget_ceiling: 10.00
+budget_enforcement: pause
+context_pause_threshold: 80
+---
+```
+
+Sets a $10 budget ceiling. Auto-mode pauses when the ceiling is reached. Context window pauses at 80% usage for checkpointing.
+
+---
+
+## Notifications Example
+
+```yaml
+---
+version: 1
+notifications:
+  enabled: true
+  on_complete: false
+  on_error: true
+  on_budget: true
+  on_milestone: true
+  on_attention: true
+---
+```
+
+Disables per-unit completion notifications (noisy in long runs) while keeping error, budget, milestone, and attention notifications enabled.
+
+---
+
+## Post-Unit Hooks Example
+
+```yaml
+---
+version: 1
+post_unit_hooks:
+  - name: code-review
+    after:
+      - execute-task
+    prompt: "Review the code changes in {sliceId}/{taskId} for quality, security, and test coverage."
+    max_cycles: 1
+    artifact: REVIEW.md
+---
+```
+
+Runs an automated code review after each task execution. Skips if `REVIEW.md` already exists (idempotent).
+
+---
+
+## Pre-Dispatch Hooks Examples
+
+**Modify — inject instructions before every task:**
+
+```yaml
+---
+version: 1
+pre_dispatch_hooks:
+  - name: enforce-standards
+    before:
+      - execute-task
+    action: modify
+    prepend: "Follow our TypeScript coding standards and always run linting."
+---
+```
+
+**Skip — skip per-slice research when a research file already exists:**
+
+```yaml
+---
+version: 1
+pre_dispatch_hooks:
+  - name: skip-existing-research
+    before:
+      - research-slice
+    action: skip
+    skip_if: RESEARCH.md
+---
+```
+
+**Replace — substitute a custom prompt for task execution:**
+
+```yaml
+---
+version: 1
+pre_dispatch_hooks:
+  - name: tdd-execute
+    before:
+      - execute-task
+    action: replace
+    prompt: "Implement the task using strict TDD. Write failing tests first, then implement, then refactor."
+    model: claude-opus-4-6
+---
+```
+
+---
+
+## Token Profile & Phases Example
+
+```yaml
+---
+version: 1
+token_profile: budget
+phases:
+  skip_research: true
+  skip_reassess: true
+  skip_slice_research: false
+---
+```
+
+Uses the `budget` profile to minimize token usage, with explicit override to keep slice-level research enabled.
+
+---
+
+## Remote Questions Example
+
+```yaml
+---
+version: 1
+remote_questions:
+  channel: slack
+  channel_id: "C0123456789"
+  timeout_minutes: 15
+  poll_interval_seconds: 10
+---
+```
+
+Routes interactive questions to a Slack channel for headless auto-mode sessions. Questions time out after 15 minutes if unanswered.
@@ -849,7 +849,7 @@ export function parseContextDependsOn(content: string | null): string[] {
   const fm = parseFrontmatterMap(fmLines);
   const raw = fm['depends_on'];
   if (!Array.isArray(raw) || raw.length === 0) return [];
-  return (raw as string[]).map(s => String(s).toUpperCase().trim()).filter(Boolean);
+  return (raw as string[]).map(s => String(s).trim()).filter(Boolean);
 }

 /**
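The one-line change above stops upper-casing `depends_on` IDs, presumably because unique milestone IDs carry a lowercase random suffix (the `[a-z0-9]{6}` part of `M001-eh88as`) that upper-casing would corrupt. A before/after sketch on sample input:

```typescript
const raw = ["m001-eh88as", " M002 ", ""];

// Old behavior: trims, drops empties, but destroys the lowercase suffix.
const before = raw.map(s => String(s).toUpperCase().trim()).filter(Boolean);
// New behavior: trims and drops empties, preserving original case.
const after = raw.map(s => String(s).trim()).filter(Boolean);

console.log(before); // ["M001-EH88AS", "M002"]
console.log(after);  // ["m001-eh88as", "M002"]
```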
@@ -951,6 +951,128 @@ export async function appendOverride(basePath: string, change: string, appliedAt
   }
 }

+export async function appendKnowledge(
+  basePath: string,
+  type: "rule" | "pattern" | "lesson",
+  entry: string,
+  scope: string,
+): Promise<void> {
+  const knowledgePath = resolveGsdRootFile(basePath, "KNOWLEDGE");
+  const existing = await loadFile(knowledgePath);
+
+  if (existing) {
+    // Find the next ID for this type
+    const prefix = type === "rule" ? "K" : type === "pattern" ? "P" : "L";
+    const idPattern = new RegExp(`^\\| ${prefix}(\\d+)`, "gm");
+    let maxId = 0;
+    let match;
+    while ((match = idPattern.exec(existing)) !== null) {
+      const num = parseInt(match[1], 10);
+      if (num > maxId) maxId = num;
+    }
+    const nextId = `${prefix}${String(maxId + 1).padStart(3, "0")}`;
+
+    // Build the table row
+    let row: string;
+    if (type === "rule") {
+      row = `| ${nextId} | ${scope} | ${entry} | — | manual |`;
+    } else if (type === "pattern") {
+      row = `| ${nextId} | ${entry} | — | ${scope} |`;
+    } else {
+      row = `| ${nextId} | ${entry} | — | — | ${scope} |`;
+    }
+
+    // Find the right section and append after the table header
+    const sectionHeading = type === "rule" ? "## Rules" : type === "pattern" ? "## Patterns" : "## Lessons Learned";
+    const sectionIdx = existing.indexOf(sectionHeading);
+    if (sectionIdx !== -1) {
+      // Find the end of the table header row (the |---|...| line)
+      const afterHeading = existing.indexOf("\n", sectionIdx);
+      // Find the next section or end
+      const nextSection = existing.indexOf("\n## ", afterHeading + 1);
+      const insertPoint = nextSection !== -1 ? nextSection : existing.length;
+
+      // Insert row before the next section (or at end)
+      const before = existing.slice(0, insertPoint).trimEnd();
+      const after = existing.slice(insertPoint);
+      await saveFile(knowledgePath, before + "\n" + row + "\n" + after);
+    } else {
+      // Section not found — append at end
+      await saveFile(knowledgePath, existing.trimEnd() + "\n\n" + row + "\n");
+    }
+  } else {
+    // Create file from scratch with template header
+    const header = [
+      "# Project Knowledge",
+      "",
+      "Append-only register of project-specific rules, patterns, and lessons learned.",
+      "Agents read this before every unit. Add entries when you discover something worth remembering.",
+      "",
+    ].join("\n");
+
+    let content: string;
+    if (type === "rule") {
+      content = header + [
+        "## Rules",
+        "",
+        "| # | Scope | Rule | Why | Added |",
+        "|---|-------|------|-----|-------|",
+        `| K001 | ${scope} | ${entry} | — | manual |`,
+        "",
+        "## Patterns",
+        "",
+        "| # | Pattern | Where | Notes |",
+        "|---|---------|-------|-------|",
+        "",
+        "## Lessons Learned",
+        "",
+        "| # | What Happened | Root Cause | Fix | Scope |",
+        "|---|--------------|------------|-----|-------|",
+        "",
+      ].join("\n");
+    } else if (type === "pattern") {
+      content = header + [
+        "## Rules",
+        "",
+        "| # | Scope | Rule | Why | Added |",
+        "|---|-------|------|-----|-------|",
+        "",
+        "## Patterns",
+        "",
+        "| # | Pattern | Where | Notes |",
+        "|---|---------|-------|-------|",
+        `| P001 | ${entry} | — | ${scope} |`,
+        "",
+        "## Lessons Learned",
+        "",
+        "| # | What Happened | Root Cause | Fix | Scope |",
+        "|---|--------------|------------|-----|-------|",
+        "",
+      ].join("\n");
+    } else {
+      content = header + [
+        "## Rules",
+        "",
+        "| # | Scope | Rule | Why | Added |",
+        "|---|-------|------|-----|-------|",
+        "",
+        "## Patterns",
+        "",
+        "| # | Pattern | Where | Notes |",
+        "|---|---------|-------|-------|",
+        "",
+        "## Lessons Learned",
+        "",
+        "| # | What Happened | Root Cause | Fix | Scope |",
+        "|---|--------------|------------|-----|-------|",
+        `| L001 | ${entry} | — | — | ${scope} |`,
+        "",
+      ].join("\n");
+    }
+    await saveFile(knowledgePath, content);
+  }
+}
+
 export async function loadActiveOverrides(basePath: string): Promise<Override[]> {
   const overridesPath = resolveGsdRootFile(basePath, "OVERRIDES");
   const content = await loadFile(overridesPath);
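The ID allocation in `appendKnowledge` above scans existing table rows for the highest K/P/L number and zero-pads the next one to three digits. A standalone sketch of just that piece (the helper name is hypothetical; the regex matches the diff):

```typescript
// Scan markdown table rows like "| K009 | ... |" for the highest existing
// number with the given prefix, then return the next zero-padded ID.
function nextKnowledgeId(existing: string, prefix: "K" | "P" | "L"): string {
  const idPattern = new RegExp(`^\\| ${prefix}(\\d+)`, "gm");
  let maxId = 0;
  let match: RegExpExecArray | null;
  while ((match = idPattern.exec(existing)) !== null) {
    const num = parseInt(match[1], 10);
    if (num > maxId) maxId = num;
  }
  return `${prefix}${String(maxId + 1).padStart(3, "0")}`;
}

const table = "| K001 | repo | Never force-push |\n| K009 | ci | Pin node 20 |";
console.log(nextKnowledgeId(table, "K")); // "K010"
console.log(nextKnowledgeId(table, "P")); // "P001" (no P rows yet)
```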
@@ -22,11 +22,12 @@ import {
 } from "./paths.js";
 import { randomInt } from "node:crypto";
 import { join } from "node:path";
-import { readFileSync, existsSync, mkdirSync, readdirSync, rmSync, unlinkSync } from "node:fs";
+import { readFileSync, writeFileSync, existsSync, mkdirSync, readdirSync, rmSync, unlinkSync } from "node:fs";
 import { nativeIsRepo, nativeInit, nativeAddPaths, nativeCommit } from "./native-git-bridge.js";
 import { ensureGitignore, ensurePreferences, untrackRuntimeFiles } from "./gitignore.js";
 import { loadEffectiveGSDPreferences } from "./preferences.js";
 import { showConfirm } from "../shared/confirm-ui.js";
+import { loadQueueOrder, sortByQueueOrder, saveQueueOrder } from "./queue-order.js";

 // ─── Auto-start after discuss ─────────────────────────────────────────────────
@@ -203,13 +204,16 @@ function buildDiscussPrompt(nextId: string, preamble: string, _basePath: string)
 export function findMilestoneIds(basePath: string): string[] {
   const dir = milestonesDir(basePath);
   try {
-    return readdirSync(dir, { withFileTypes: true })
+    const ids = readdirSync(dir, { withFileTypes: true })
       .filter((d) => d.isDirectory())
       .map((d) => {
         const match = d.name.match(/^(M\d+(?:-[a-z0-9]{6})?)/);
         return match ? match[1] : d.name;
-      })
-      .sort(milestoneIdSort);
+      });
+
+    // Apply custom queue order if available, else fall back to numeric sort
+    const customOrder = loadQueueOrder(basePath);
+    return sortByQueueOrder(ids, customOrder);
   } catch {
     return [];
   }
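`findMilestoneIds` above derives IDs from directory names via the regex `^(M\d+(?:-[a-z0-9]{6})?)`, which accepts both plain and unique ID formats. A sketch of that matching in isolation (the helper name is hypothetical):

```typescript
// Extract the milestone ID prefix from a directory name, accepting both
// plain (M001) and unique (M001-eh88as) formats; null for non-milestone dirs.
function extractMilestoneId(dirName: string): string | null {
  const match = dirName.match(/^(M\d+(?:-[a-z0-9]{6})?)/);
  return match ? match[1] : null;
}

console.log(extractMilestoneId("M001-eh88as-add-auth")); // "M001-eh88as"
console.log(extractMilestoneId("M012"));                 // "M012"
console.log(extractMilestoneId("notes"));                // null
```

The six-character suffix group is optional, which is how the two ID formats coexist in one directory tree.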
@@ -305,6 +309,235 @@ export async function showQueue(
     return;
   }

+  // ── Count pending milestones ────────────────────────────────────────
+  const pendingMilestones = state.registry.filter(
+    m => m.status === "pending" || m.status === "active",
+  );
+  const completeCount = state.registry.filter(m => m.status === "complete").length;
+
+  // ── If multiple pending milestones, show queue management hub ──────
+  if (pendingMilestones.length > 1) {
+    const choice = await showNextAction(ctx, {
+      title: "GSD — Queue Management",
+      summary: [
+        `${completeCount} complete, ${pendingMilestones.length} pending.`,
+      ],
+      actions: [
+        {
+          id: "reorder",
+          label: "Reorder queue",
+          description: `Change execution order of ${pendingMilestones.length} pending milestones.`,
+          recommended: true,
+        },
+        {
+          id: "add",
+          label: "Add new work",
+          description: "Queue new milestones via discussion.",
+        },
+      ],
+      notYetMessage: "Run /gsd queue when ready.",
+    });
+
+    if (choice === "reorder") {
+      await handleQueueReorder(ctx, basePath, state);
+      return;
+    }
+    if (choice === "not_yet") return;
+    // "add" falls through to existing queue-add logic below
+  }
+
+  // ── Existing queue-add flow ─────────────────────────────────────────
+  await showQueueAdd(ctx, pi, basePath, state);
+}
+
+async function handleQueueReorder(
+  ctx: ExtensionCommandContext,
+  basePath: string,
+  state: Awaited<ReturnType<typeof deriveState>>,
+): Promise<void> {
+  const { showQueueReorder: showReorderUI } = await import("./queue-reorder-ui.js");
+  const { invalidateStateCache } = await import("./state.js");
+
+  const completed = state.registry
+    .filter(m => m.status === "complete")
+    .map(m => ({ id: m.id, title: m.title, dependsOn: m.dependsOn }));
+
+  const pending = state.registry
+    .filter(m => m.status !== "complete")
+    .map(m => ({ id: m.id, title: m.title, dependsOn: m.dependsOn }));
+
+  const result = await showReorderUI(ctx, completed, pending);
+  if (!result) {
+    ctx.ui.notify("Queue reorder cancelled.", "info");
+    return;
+  }
+
+  // Save the new order
+  saveQueueOrder(basePath, result.order);
+  invalidateStateCache();
+
+  // Remove conflicting depends_on entries from CONTEXT.md files
+  if (result.depsToRemove.length > 0) {
+    removeDependsOnFromContextFiles(basePath, result.depsToRemove);
+  }
+
+  // Sync PROJECT.md milestone sequence table
+  syncProjectMdSequence(basePath, state.registry, result.order);
+
+  // Commit the change
+  const filesToAdd = [".gsd/QUEUE-ORDER.json", ".gsd/PROJECT.md"];
+  for (const r of result.depsToRemove) {
+    filesToAdd.push(`.gsd/milestones/${r.milestone}/${r.milestone}-CONTEXT.md`);
+  }
+  try {
+    nativeAddPaths(basePath, filesToAdd);
+    nativeCommit(basePath, "docs: reorder queue");
+  } catch {
+    // Commit may fail if nothing changed or git hooks block — non-fatal
+  }
+
+  const depInfo = result.depsToRemove.length > 0
+    ? ` (removed ${result.depsToRemove.length} depends_on)`
+    : "";
+  ctx.ui.notify(`Queue reordered: ${result.order.join(" → ")}${depInfo}`, "info");
+}
+
+/**
+ * Remove specific depends_on entries from milestone CONTEXT.md frontmatter.
+ */
+function removeDependsOnFromContextFiles(
+  basePath: string,
+  depsToRemove: Array<{ milestone: string; dep: string }>,
+): void {
+  // Group removals by milestone
+  const byMilestone = new Map<string, string[]>();
+  for (const { milestone, dep } of depsToRemove) {
+    const existing = byMilestone.get(milestone) ?? [];
+    existing.push(dep);
+    byMilestone.set(milestone, existing);
+  }
+
+  for (const [mid, depsToRemoveForMid] of byMilestone) {
+    const contextFile = resolveMilestoneFile(basePath, mid, "CONTEXT");
+    if (!contextFile || !existsSync(contextFile)) continue;
+
+    const content = readFileSync(contextFile, "utf-8");
+
+    // Parse frontmatter
+    const trimmed = content.trimStart();
+    if (!trimmed.startsWith("---")) continue;
+    const afterFirst = trimmed.indexOf("\n");
+    if (afterFirst === -1) continue;
+    const rest = trimmed.slice(afterFirst + 1);
+    const endIdx = rest.indexOf("\n---");
+    if (endIdx === -1) continue;
+
+    const fmText = rest.slice(0, endIdx);
+    const body = rest.slice(endIdx + 4);
+
+    // Parse depends_on line(s)
+    const fmLines = fmText.split("\n");
+    const removeSet = new Set(depsToRemoveForMid.map(d => d.toUpperCase()));
+
+    // Handle inline format: depends_on: [M009, M010]
+    const inlineMatch = fmLines.findIndex(l => /^depends_on:\s*\[/.test(l));
+    if (inlineMatch >= 0) {
+      const line = fmLines[inlineMatch];
+      const inner = line.match(/\[([^\]]*)\]/);
+      if (inner) {
+        const remaining = inner[1]
+          .split(",")
+          .map(s => s.trim())
+          .filter(s => s && !removeSet.has(s.toUpperCase()));
+        if (remaining.length === 0) {
+          fmLines.splice(inlineMatch, 1);
+        } else {
+          fmLines[inlineMatch] = `depends_on: [${remaining.join(", ")}]`;
+        }
+      }
+    } else {
+      // Handle multi-line format
+      const keyIdx = fmLines.findIndex(l => /^depends_on:\s*$/.test(l));
+      if (keyIdx >= 0) {
+        let end = keyIdx + 1;
+        while (end < fmLines.length && /^\s+-\s/.test(fmLines[end])) {
+          const val = fmLines[end].replace(/^\s+-\s*/, "").trim().toUpperCase();
+          if (removeSet.has(val)) {
+            fmLines.splice(end, 1);
+          } else {
+            end++;
+          }
+        }
+        if (end === keyIdx + 1 || (end <= fmLines.length && !/^\s+-\s/.test(fmLines[keyIdx + 1] ?? ""))) {
+          fmLines.splice(keyIdx, 1);
+        }
+      }
+    }
+
+    // Rebuild file
+    const newFm = fmLines.filter(l => l !== undefined).join("\n");
+    const newContent = newFm.trim()
+      ? `---\n${newFm}\n---${body}`
+      : body.replace(/^\n+/, "");
+    writeFileSync(contextFile, newContent, "utf-8");
+  }
+}
+
+function syncProjectMdSequence(
+  basePath: string,
+  registry: Array<{ id: string; title: string; status: string }>,
+  newOrder: string[],
+): void {
+  const projectPath = resolveGsdRootFile(basePath, "PROJECT");
+  if (!projectPath || !existsSync(projectPath)) return;
+
+  const content = readFileSync(projectPath, "utf-8");
+  const lines = content.split("\n");
+
+  const headerIdx = lines.findIndex(l => /^##\s+Milestone Sequence/.test(l));
+  if (headerIdx < 0) return;
+
+  let tableStart = headerIdx + 1;
+  while (tableStart < lines.length && !lines[tableStart].startsWith("|")) tableStart++;
+  if (tableStart >= lines.length) return;
+
+  let tableEnd = tableStart + 1;
+  while (tableEnd < lines.length && lines[tableEnd].startsWith("|")) tableEnd++;
+
+  const registryMap = new Map(registry.map(m => [m.id, m]));
+  const completedSet = new Set(registry.filter(m => m.status === "complete").map(m => m.id));
+
+  const newRows: string[] = [];
+  for (const m of registry) {
+    if (m.status === "complete") {
+      newRows.push(`| ${m.id} | ${m.title} | ✅ Complete |`);
+    }
+  }
+  let isFirst = true;
+  for (const id of newOrder) {
+    if (completedSet.has(id)) continue;
+    const m = registryMap.get(id);
+    if (!m) continue;
+    const status = isFirst ? "📋 Next" : "📋 Queued";
+    newRows.push(`| ${m.id} | ${m.title} | ${status} |`);
+    isFirst = false;
+  }
+
+  const headerLine = lines[tableStart];
+  const separatorLine = lines[tableStart + 1];
+  const newTable = [headerLine, separatorLine, ...newRows];
+  lines.splice(tableStart, tableEnd - tableStart, ...newTable);
+  writeFileSync(projectPath, lines.join("\n"), "utf-8");
+}
+
+async function showQueueAdd(
+  ctx: ExtensionCommandContext,
+  pi: ExtensionAPI,
+  basePath: string,
+  state: Awaited<ReturnType<typeof deriveState>>,
+): Promise<void> {
+  const milestoneIds = findMilestoneIds(basePath);
+
   // ── Build existing milestones context for the prompt ────────────────
   const existingContext = await buildExistingMilestonesContext(basePath, milestoneIds, state);
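The inline-format branch of `removeDependsOnFromContextFiles` above rewrites a `depends_on: [M009, M010]` frontmatter line, or drops it entirely when nothing remains. A standalone sketch of that rewrite (hypothetical helper, same regex and filter logic as the diff):

```typescript
// Rewrite an inline depends_on line, removing the given IDs case-insensitively.
// Returns the rewritten line, or null when the line should be deleted.
function stripInlineDeps(line: string, remove: string[]): string | null {
  const removeSet = new Set(remove.map(d => d.toUpperCase()));
  const inner = line.match(/\[([^\]]*)\]/);
  if (!inner) return line; // not an inline list; leave untouched
  const remaining = inner[1]
    .split(",")
    .map(s => s.trim())
    .filter(s => s && !removeSet.has(s.toUpperCase()));
  return remaining.length === 0 ? null : `depends_on: [${remaining.join(", ")}]`;
}

console.log(stripInlineDeps("depends_on: [M009, M010]", ["M009"])); // "depends_on: [M010]"
console.log(stripInlineDeps("depends_on: [M009]", ["M009"]));       // null (line removed)
```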
@@ -28,10 +28,11 @@ import { createBashTool, createWriteTool, createReadTool, createEditTool, isTool
 import { registerGSDCommand, loadToolApiKeys } from "./commands.js";
 import { registerExitCommand } from "./exit-command.js";
 import { registerWorktreeCommand, getWorktreeOriginalCwd, getActiveWorktreeName } from "./worktree-command.js";
+import { getActiveAutoWorktreeContext } from "./auto-worktree.js";
 import { saveFile, formatContinue, loadFile, parseContinue, parseSummary, loadActiveOverrides, formatOverridesSection } from "./files.js";
 import { loadPrompt } from "./prompt-loader.js";
 import { deriveState } from "./state.js";
-import { isAutoActive, isAutoPaused, handleAgentEnd, pauseAuto, getAutoDashboardData } from "./auto.js";
+import { isAutoActive, isAutoPaused, handleAgentEnd, pauseAuto, getAutoDashboardData, markToolStart, markToolEnd } from "./auto.js";
 import { saveActivityLog } from "./activity-log.js";
 import { checkAutoStartAfterDiscuss, getDiscussionMilestoneId } from "./guided-flow.js";
 import { GSDDashboardOverlay } from "./dashboard-overlay.js";
@@ -47,10 +48,11 @@ import {
   resolveSlicePath, resolveSliceFile, resolveTaskFile, resolveTaskFiles, resolveTasksDir,
   relSliceFile, relSlicePath, relTaskFile,
   buildSliceFileName, buildMilestoneFileName, gsdRoot, resolveMilestonePath,
+  resolveGsdRootFile,
 } from "./paths.js";
 import { Key } from "@gsd/pi-tui";
 import { join } from "node:path";
-import { existsSync } from "node:fs";
+import { existsSync, readFileSync } from "node:fs";
 import { shortcutDesc } from "../shared/terminal.js";
 import { Text } from "@gsd/pi-tui";
 import { pauseAutoForProviderError } from "./provider-error-pause.js";
@@ -272,6 +274,20 @@ export default function (pi: ExtensionAPI) {
       }
     }

+    // Load project knowledge if available
+    let knowledgeBlock = "";
+    const knowledgePath = resolveGsdRootFile(process.cwd(), "KNOWLEDGE");
+    if (existsSync(knowledgePath)) {
+      try {
+        const content = readFileSync(knowledgePath, "utf-8").trim();
+        if (content) {
+          knowledgeBlock = `\n\n[PROJECT KNOWLEDGE — Rules, patterns, and lessons learned]\n\n${content}`;
+        }
+      } catch {
+        // File read error — skip knowledge injection
+      }
+    }
+
     // Detect skills installed during this auto-mode session
     let newSkillsBlock = "";
     if (hasSkillSnapshot()) {
@@ -287,6 +303,7 @@ export default function (pi: ExtensionAPI) {
     let worktreeBlock = "";
     const worktreeName = getActiveWorktreeName();
     const worktreeMainCwd = getWorktreeOriginalCwd();
+    const autoWorktree = getActiveAutoWorktreeContext();
     if (worktreeName && worktreeMainCwd) {
       worktreeBlock = [
         "",
@@ -304,10 +321,27 @@ export default function (pi: ExtensionAPI) {
         "All file operations, bash commands, and GSD state resolve against the worktree path above.",
         "Use /worktree merge to merge changes back. Use /worktree return to switch back to the main tree.",
       ].join("\n");
+    } else if (autoWorktree) {
+      worktreeBlock = [
+        "",
+        "",
+        "[WORKTREE CONTEXT — OVERRIDES CURRENT WORKING DIRECTORY ABOVE]",
+        `IMPORTANT: Ignore the "Current working directory" shown earlier in this prompt.`,
+        `The actual current working directory is: ${process.cwd()}`,
+        "",
+        "You are working inside a GSD auto-worktree.",
+        `- Milestone worktree: ${autoWorktree.worktreeName}`,
+        `- Worktree path (this is the real cwd): ${process.cwd()}`,
+        `- Main project: ${autoWorktree.originalBase}`,
+        `- Branch: ${autoWorktree.branch}`,
+        "",
+        "All file operations, bash commands, and GSD state resolve against the worktree path above.",
+        "Write every .gsd artifact in the worktree path above, never in the main project tree.",
+      ].join("\n");
     }

     return {
-      systemPrompt: `${event.systemPrompt}\n\n[SYSTEM CONTEXT — GSD]\n\n${systemContent}${preferenceBlock}${newSkillsBlock}${worktreeBlock}`,
+      systemPrompt: `${event.systemPrompt}\n\n[SYSTEM CONTEXT — GSD]\n\n${systemContent}${preferenceBlock}${knowledgeBlock}${newSkillsBlock}${worktreeBlock}`,
       ...(injection
         ? {
             message: {
@@ -542,6 +576,16 @@ export default function (pi: ExtensionAPI) {
     const existing = await loadFile(discussionPath) ?? `# ${milestoneId} Discussion Log\n\n`;
     await saveFile(discussionPath, existing + newBlock);
   });

+  // ── tool_execution_start/end: track in-flight tools for idle detection ──
+  pi.on("tool_execution_start", async (event) => {
+    if (!isAutoActive()) return;
+    markToolStart(event.toolCallId);
+  });
+
+  pi.on("tool_execution_end", async (event) => {
+    markToolEnd(event.toolCallId);
+  });
 }

 async function buildGuidedExecuteContextInjection(prompt: string, basePath: string): Promise<string | null> {
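The handlers above feed `markToolStart`/`markToolEnd`, whose real implementation lives in `auto.ts` and is not shown in this diff. A minimal sketch of what in-flight tracking for idle detection plausibly looks like (an assumption, not the extension's actual code):

```typescript
// Track tool calls that have started but not yet finished; the session is
// "idle" when no tool call is in flight.
const inFlight = new Set<string>();

const markToolStart = (toolCallId: string): void => { inFlight.add(toolCallId); };
const markToolEnd = (toolCallId: string): void => { inFlight.delete(toolCallId); };
const isIdle = (): boolean => inFlight.size === 0;

markToolStart("call_1");
markToolStart("call_2");
markToolEnd("call_1");
console.log(isIdle()); // false — call_2 still running
markToolEnd("call_2");
console.log(isIdle()); // true
```

Note that `tool_execution_end` is handled unconditionally in the diff, so a tool that started while auto-mode was active is still cleared even if auto-mode stops before it finishes.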
@@ -43,6 +43,8 @@ export interface UnitMetrics {
   contextWindowTokens?: number;
   truncationSections?: number;
   continueHereFired?: boolean;
+  tier?: string; // complexity tier (light/standard/heavy) if dynamic routing active
+  modelDowngraded?: boolean; // true if dynamic routing used a cheaper model
 }

 /** Budget state passed to snapshotUnitMetrics for persistence in the metrics ledger. */
@@ -115,7 +117,7 @@ export function snapshotUnitMetrics(
   unitId: string,
   startedAt: number,
   model: string,
-  budgetInfo?: BudgetInfo,
+  opts?: { tier?: string; modelDowngraded?: boolean; contextWindowTokens?: number; truncationSections?: number; continueHereFired?: boolean },
 ): UnitMetrics | null {
   if (!ledger) return null;
@@ -168,11 +170,11 @@ export function snapshotUnitMetrics(
    toolCalls,
    assistantMessages,
    userMessages,
-   ...(budgetInfo && {
-     ...(budgetInfo.contextWindowTokens !== undefined && { contextWindowTokens: budgetInfo.contextWindowTokens }),
-     ...(budgetInfo.truncationSections !== undefined && { truncationSections: budgetInfo.truncationSections }),
-     ...(budgetInfo.continueHereFired !== undefined && { continueHereFired: budgetInfo.continueHereFired }),
-   }),
+   ...(opts?.tier ? { tier: opts.tier } : {}),
+   ...(opts?.modelDowngraded !== undefined ? { modelDowngraded: opts.modelDowngraded } : {}),
+   ...(opts?.contextWindowTokens !== undefined ? { contextWindowTokens: opts.contextWindowTokens } : {}),
+   ...(opts?.truncationSections !== undefined ? { truncationSections: opts.truncationSections } : {}),
+   ...(opts?.continueHereFired !== undefined ? { continueHereFired: opts.continueHereFired } : {}),
  };

  ledger.units.push(unit);
@@ -321,6 +323,49 @@ export function getProjectTotals(units: UnitMetrics[]): ProjectTotals {
  return totals;
}

// ─── Tier Aggregation ────────────────────────────────────────────────────────

export interface TierAggregate {
  tier: string;
  units: number;
  tokens: TokenCounts;
  cost: number;
  downgraded: number; // units that were downgraded by dynamic routing
}

export function aggregateByTier(units: UnitMetrics[]): TierAggregate[] {
  const map = new Map<string, TierAggregate>();
  for (const u of units) {
    const tier = u.tier ?? "unknown";
    let agg = map.get(tier);
    if (!agg) {
      agg = { tier, units: 0, tokens: emptyTokens(), cost: 0, downgraded: 0 };
      map.set(tier, agg);
    }
    agg.units++;
    agg.tokens = addTokens(agg.tokens, u.tokens);
    agg.cost += u.cost;
    if (u.modelDowngraded) agg.downgraded++;
  }
  const order = ["light", "standard", "heavy", "unknown"];
  return order.map(t => map.get(t)).filter((a): a is TierAggregate => !!a);
}

/**
 * Format a summary of savings from dynamic routing.
 * Returns an empty string if no units were downgraded.
 */
export function formatTierSavings(units: UnitMetrics[]): string {
  const downgraded = units.filter(u => u.modelDowngraded);
  if (downgraded.length === 0) return "";

  const downgradedCost = downgraded.reduce((sum, u) => sum + u.cost, 0);
  const totalUnits = units.filter(u => u.tier).length;
  const pct = totalUnits > 0 ? Math.round((downgraded.length / totalUnits) * 100) : 0;

  return `Dynamic routing: ${downgraded.length}/${totalUnits} units downgraded (${pct}%), cost: ${formatCost(downgradedCost)}`;
}

// ─── Formatting helpers ──────────────────────────────────────────────────────

export function formatCost(cost: number): string {
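The aggregation above is a standard group-by-with-Map pattern. A minimal self-contained sketch of the same logic, with a simplified unit shape and hypothetical sample data (the real `UnitMetrics` also carries token counts):

```typescript
// Simplified stand-ins for UnitMetrics / TierAggregate (tokens omitted).
interface UnitLike { tier?: string; cost: number; modelDowngraded?: boolean }
interface Agg { tier: string; units: number; cost: number; downgraded: number }

function aggregateByTierSketch(units: UnitLike[]): Agg[] {
  const map = new Map<string, Agg>();
  for (const u of units) {
    const tier = u.tier ?? "unknown";           // missing tier falls into "unknown"
    let agg = map.get(tier);
    if (!agg) {
      agg = { tier, units: 0, cost: 0, downgraded: 0 };
      map.set(tier, agg);
    }
    agg.units++;
    agg.cost += u.cost;
    if (u.modelDowngraded) agg.downgraded++;
  }
  // Fixed display order; tiers with no units are dropped by the filter.
  const order = ["light", "standard", "heavy", "unknown"];
  return order.map(t => map.get(t)).filter((a): a is Agg => !!a);
}

// Hypothetical sample: two light units (one downgraded) and one heavy unit.
const sampleUnits: UnitLike[] = [
  { tier: "light", cost: 0.01, modelDowngraded: true },
  { tier: "light", cost: 0.02 },
  { tier: "heavy", cost: 0.5 },
];
const aggs = aggregateByTierSketch(sampleUnits);
```

The fixed `order` array is what guarantees a stable light/standard/heavy display order regardless of the order units were recorded in.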
65 src/resources/extensions/gsd/model-cost-table.ts Normal file

@@ -0,0 +1,65 @@
// GSD Extension — Model Cost Table
// Static cost reference for known models, used by the dynamic router
// for cross-provider cost comparison.
//
// Costs are approximate per-1K-token rates in USD, for input and output tokens.
// Updated with GSD releases. Users can override via preferences.

export interface ModelCostEntry {
  /** Model ID (bare, without provider prefix) */
  id: string;
  /** Approximate cost per 1K input tokens in USD */
  inputPer1k: number;
  /** Approximate cost per 1K output tokens in USD */
  outputPer1k: number;
  /** Last updated date */
  updatedAt: string;
}

/**
 * Bundled cost table for known models.
 * Updated periodically with GSD releases.
 */
export const BUNDLED_COST_TABLE: ModelCostEntry[] = [
  // Anthropic
  { id: "claude-opus-4-6", inputPer1k: 0.015, outputPer1k: 0.075, updatedAt: "2025-03-15" },
  { id: "claude-sonnet-4-6", inputPer1k: 0.003, outputPer1k: 0.015, updatedAt: "2025-03-15" },
  { id: "claude-haiku-4-5", inputPer1k: 0.0008, outputPer1k: 0.004, updatedAt: "2025-03-15" },
  { id: "claude-sonnet-4-5-20250514", inputPer1k: 0.003, outputPer1k: 0.015, updatedAt: "2025-03-15" },
  { id: "claude-3-5-sonnet-latest", inputPer1k: 0.003, outputPer1k: 0.015, updatedAt: "2025-03-15" },
  { id: "claude-3-5-haiku-latest", inputPer1k: 0.0008, outputPer1k: 0.004, updatedAt: "2025-03-15" },
  { id: "claude-3-opus-latest", inputPer1k: 0.015, outputPer1k: 0.075, updatedAt: "2025-03-15" },

  // OpenAI
  { id: "gpt-4o", inputPer1k: 0.0025, outputPer1k: 0.01, updatedAt: "2025-03-15" },
  { id: "gpt-4o-mini", inputPer1k: 0.00015, outputPer1k: 0.0006, updatedAt: "2025-03-15" },
  { id: "o1", inputPer1k: 0.015, outputPer1k: 0.06, updatedAt: "2025-03-15" },
  { id: "o3", inputPer1k: 0.015, outputPer1k: 0.06, updatedAt: "2025-03-15" },
  { id: "gpt-4-turbo", inputPer1k: 0.01, outputPer1k: 0.03, updatedAt: "2025-03-15" },

  // Google
  { id: "gemini-2.0-flash", inputPer1k: 0.0001, outputPer1k: 0.0004, updatedAt: "2025-03-15" },
  { id: "gemini-flash-2.0", inputPer1k: 0.0001, outputPer1k: 0.0004, updatedAt: "2025-03-15" },
  { id: "gemini-2.5-pro", inputPer1k: 0.00125, outputPer1k: 0.005, updatedAt: "2025-03-15" },

  // DeepSeek
  { id: "deepseek-chat", inputPer1k: 0.00014, outputPer1k: 0.00028, updatedAt: "2025-03-15" },
];

/**
 * Look up the cost for a model ID. Returns undefined if not found.
 */
export function lookupModelCost(modelId: string): ModelCostEntry | undefined {
  const bareId = modelId.includes("/") ? modelId.split("/").pop()! : modelId;
  return BUNDLED_COST_TABLE.find(e => e.id === bareId)
    ?? BUNDLED_COST_TABLE.find(e => bareId.includes(e.id) || e.id.includes(bareId));
}

/**
 * Compare two models by input cost. Returns a negative number if a is cheaper.
 */
export function compareModelCost(modelIdA: string, modelIdB: string): number {
  const costA = lookupModelCost(modelIdA)?.inputPer1k ?? 999;
  const costB = lookupModelCost(modelIdB)?.inputPer1k ?? 999;
  return costA - costB;
}
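The lookup in `lookupModelCost` is two-stage: strip an optional `provider/` prefix, try an exact bare-ID match, then fall back to a substring match in either direction so dated variants still resolve. A self-contained sketch of that resolution order, with a deliberately tiny table (the values mirror two of the bundled entries; the dated variant ID is hypothetical):

```typescript
// Illustrative two-entry cost table (input cost per 1K tokens, USD).
const COSTS: Record<string, number> = {
  "claude-opus-4-6": 0.015,
  "gpt-4o-mini": 0.00015,
};

function lookupInputCost(modelId: string): number | undefined {
  // Strip an optional "provider/" prefix to get the bare model ID.
  const bareId = modelId.includes("/") ? modelId.split("/").pop()! : modelId;

  // Stage 1: exact match on the bare ID.
  if (COSTS[bareId] !== undefined) return COSTS[bareId];

  // Stage 2: substring match in either direction, so a dated variant
  // like "gpt-4o-mini-2024-07" still resolves to the base entry.
  for (const [knownId, cost] of Object.entries(COSTS)) {
    if (bareId.includes(knownId) || knownId.includes(bareId)) return cost;
  }
  return undefined;
}

const exact = lookupInputCost("anthropic/claude-opus-4-6"); // prefix stripped, exact hit
const fuzzy = lookupInputCost("gpt-4o-mini-2024-07");       // substring fallback
const miss = lookupInputCost("mystery-model");              // no match → undefined
```

The bidirectional `includes` check is loose by design; it trades a small false-match risk for resilience against dated or suffixed model IDs.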
256 src/resources/extensions/gsd/model-router.ts Normal file

@@ -0,0 +1,256 @@
// GSD Extension — Dynamic Model Router
// Maps complexity tiers to models, enforcing downgrade-only semantics.
// The user's configured model is always the ceiling.

import type { ComplexityTier, ClassificationResult } from "./complexity-classifier.js";
import { tierOrdinal } from "./complexity-classifier.js";
import type { ResolvedModelConfig } from "./preferences.js";

// ─── Types ───────────────────────────────────────────────────────────────────

export interface DynamicRoutingConfig {
  enabled?: boolean;
  tier_models?: {
    light?: string;
    standard?: string;
    heavy?: string;
  };
  escalate_on_failure?: boolean; // default: true
  budget_pressure?: boolean; // default: true
  cross_provider?: boolean; // default: true
  hooks?: boolean; // default: true
}

export interface RoutingDecision {
  /** The model ID to use (may be downgraded from the configured primary) */
  modelId: string;
  /** Fallback chain tried after modelId: configured fallbacks, then the configured primary */
  fallbacks: string[];
  /** The complexity tier that drove this decision */
  tier: ComplexityTier;
  /** True if the model was downgraded from the configured primary */
  wasDowngraded: boolean;
  /** Human-readable reason for this decision */
  reason: string;
}

// ─── Known Model Tiers ───────────────────────────────────────────────────────
// Maps known model IDs to their capability tier. Used when tier_models is not
// explicitly configured to pick the best available model for each tier.

const MODEL_CAPABILITY_TIER: Record<string, ComplexityTier> = {
  // Light-tier models (cheapest)
  "claude-haiku-4-5": "light",
  "claude-3-5-haiku-latest": "light",
  "claude-3-haiku-20240307": "light",
  "gpt-4o-mini": "light",
  "gemini-2.0-flash": "light",
  "gemini-flash-2.0": "light",

  // Standard-tier models
  "claude-sonnet-4-6": "standard",
  "claude-sonnet-4-5-20250514": "standard",
  "claude-3-5-sonnet-latest": "standard",
  "gpt-4o": "standard",
  "gemini-2.5-pro": "standard",
  "deepseek-chat": "standard",

  // Heavy-tier models (most capable)
  "claude-opus-4-6": "heavy",
  "claude-3-opus-latest": "heavy",
  "gpt-4-turbo": "heavy",
  "o1": "heavy",
  "o3": "heavy",
};

// ─── Cost Table (per 1K input tokens, approximate USD) ───────────────────────
// Used for cross-provider cost comparison when multiple providers offer
// the same capability tier.

const MODEL_COST_PER_1K_INPUT: Record<string, number> = {
  "claude-haiku-4-5": 0.0008,
  "claude-3-5-haiku-latest": 0.0008,
  "claude-sonnet-4-6": 0.003,
  "claude-sonnet-4-5-20250514": 0.003,
  "claude-opus-4-6": 0.015,
  "gpt-4o-mini": 0.00015,
  "gpt-4o": 0.0025,
  "gemini-2.0-flash": 0.0001,
  "gemini-2.5-pro": 0.00125,
  "deepseek-chat": 0.00014,
};

// ─── Public API ──────────────────────────────────────────────────────────────

/**
 * Resolve the model to use for a given complexity tier.
 *
 * Downgrade-only: the returned model is always equal to or cheaper than
 * the user's configured primary model. Never upgrades beyond configuration.
 *
 * @param classification The complexity classification result
 * @param phaseConfig The user's configured model for this phase (ceiling)
 * @param routingConfig Dynamic routing configuration
 * @param availableModelIds List of available model IDs (from registry)
 */
export function resolveModelForComplexity(
  classification: ClassificationResult,
  phaseConfig: ResolvedModelConfig | undefined,
  routingConfig: DynamicRoutingConfig,
  availableModelIds: string[],
): RoutingDecision {
  // If there is no phase config or routing is disabled, pass through
  if (!phaseConfig || !routingConfig.enabled) {
    return {
      modelId: phaseConfig?.primary ?? "",
      fallbacks: phaseConfig?.fallbacks ?? [],
      tier: classification.tier,
      wasDowngraded: false,
      reason: "dynamic routing disabled or no phase config",
    };
  }

  const configuredPrimary = phaseConfig.primary;
  const configuredTier = getModelTier(configuredPrimary);
  const requestedTier = classification.tier;

  // Downgrade-only: if the requested tier >= configured tier, no change
  if (tierOrdinal(requestedTier) >= tierOrdinal(configuredTier)) {
    return {
      modelId: configuredPrimary,
      fallbacks: phaseConfig.fallbacks,
      tier: requestedTier,
      wasDowngraded: false,
      reason: `tier ${requestedTier} >= configured ${configuredTier}`,
    };
  }

  // Find the best model for the requested tier
  const targetModelId = findModelForTier(
    requestedTier,
    routingConfig,
    availableModelIds,
    routingConfig.cross_provider !== false,
  );

  if (!targetModelId) {
    // No suitable model found — use the configured primary
    return {
      modelId: configuredPrimary,
      fallbacks: phaseConfig.fallbacks,
      tier: requestedTier,
      wasDowngraded: false,
      reason: `no ${requestedTier}-tier model available`,
    };
  }

  // Build fallback chain: [downgraded_model, ...configured_fallbacks, configured_primary]
  const fallbacks = [
    ...phaseConfig.fallbacks.filter(f => f !== targetModelId),
    configuredPrimary,
  ].filter(f => f !== targetModelId);

  return {
    modelId: targetModelId,
    fallbacks,
    tier: requestedTier,
    wasDowngraded: true,
    reason: classification.reason,
  };
}

/**
 * Escalate to the next tier after a failure.
 * Returns the new tier, or null if already at heavy (max).
 */
export function escalateTier(currentTier: ComplexityTier): ComplexityTier | null {
  switch (currentTier) {
    case "light": return "standard";
    case "standard": return "heavy";
    case "heavy": return null;
  }
}

/**
 * Get the default routing config (routing disabled; all sub-features default to enabled).
 */
export function defaultRoutingConfig(): DynamicRoutingConfig {
  return {
    enabled: false,
    escalate_on_failure: true,
    budget_pressure: true,
    cross_provider: true,
    hooks: true,
  };
}

// ─── Internal ────────────────────────────────────────────────────────────────

function getModelTier(modelId: string): ComplexityTier {
  // Strip the provider prefix if present
  const bareId = modelId.includes("/") ? modelId.split("/").pop()! : modelId;

  // Check for an exact match first
  if (MODEL_CAPABILITY_TIER[bareId]) return MODEL_CAPABILITY_TIER[bareId];

  // Check if any known model ID is a prefix/suffix match
  for (const [knownId, tier] of Object.entries(MODEL_CAPABILITY_TIER)) {
    if (bareId.includes(knownId) || knownId.includes(bareId)) return tier;
  }

  // Unknown models are assumed heavy (safest assumption)
  return "heavy";
}

function findModelForTier(
  tier: ComplexityTier,
  config: DynamicRoutingConfig,
  availableModelIds: string[],
  crossProvider: boolean,
): string | null {
  // 1. Check the explicit tier_models config
  const explicitModel = config.tier_models?.[tier];
  if (explicitModel && availableModelIds.includes(explicitModel)) {
    return explicitModel;
  }
  // Also check with the provider prefix stripped
  if (explicitModel) {
    const match = availableModelIds.find(id => {
      const bareAvail = id.includes("/") ? id.split("/").pop()! : id;
      const bareExplicit = explicitModel.includes("/") ? explicitModel.split("/").pop()! : explicitModel;
      return bareAvail === bareExplicit;
    });
    if (match) return match;
  }

  // 2. Auto-detect: find the cheapest available model in the requested tier
  const candidates = availableModelIds
    .filter(id => {
      const modelTier = getModelTier(id);
      return modelTier === tier;
    })
    .sort((a, b) => {
      if (!crossProvider) return 0;
      const costA = getModelCost(a);
      const costB = getModelCost(b);
      return costA - costB;
    });

  return candidates[0] ?? null;
}

function getModelCost(modelId: string): number {
  const bareId = modelId.includes("/") ? modelId.split("/").pop()! : modelId;

  if (MODEL_COST_PER_1K_INPUT[bareId] !== undefined) {
    return MODEL_COST_PER_1K_INPUT[bareId];
  }

  // Check partial matches
  for (const [knownId, cost] of Object.entries(MODEL_COST_PER_1K_INPUT)) {
    if (bareId.includes(knownId) || knownId.includes(bareId)) return cost;
  }

  // Unknown cost — assume expensive to avoid routing to unknown cheap models
  return 999;
}
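The core of the router is the downgrade-only comparison: a requested tier can only lower the model choice, never raise it past the configured ceiling. A minimal sketch of just that rule, with `tierOrdinal` inlined as a lookup table (the real ordering comes from `complexity-classifier.js`):

```typescript
// Sketch of the downgrade-only clamp. ORDINAL stands in for tierOrdinal.
type Tier = "light" | "standard" | "heavy";
const ORDINAL: Record<Tier, number> = { light: 0, standard: 1, heavy: 2 };

function pickTier(requested: Tier, configuredCeiling: Tier): { tier: Tier; downgraded: boolean } {
  // Requested at or above the ceiling: clamp to the ceiling, never upgrade.
  if (ORDINAL[requested] >= ORDINAL[configuredCeiling]) {
    return { tier: configuredCeiling, downgraded: false };
  }
  // Requested below the ceiling: use the cheaper tier and flag the downgrade.
  return { tier: requested, downgraded: true };
}

const clampedUp = pickTier("heavy", "standard");   // cannot exceed the ceiling
const downgraded = pickTier("light", "standard");  // cheaper tier, flagged
const unchanged = pickTier("standard", "standard");
```

This is why `wasDowngraded` can only ever be true when the selected model is cheaper than the configured primary: the upgrade branch simply does not exist.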
@@ -15,6 +15,9 @@ import { nativeScanGsdTree, type GsdTreeEntry } from "./native-parser-bridge.js"

// ─── Directory Listing Cache ──────────────────────────────────────────────────

/** Max entries before eviction. Prevents unbounded growth in long sessions (#611). */
const DIR_CACHE_MAX = 200;

const dirEntryCache = new Map<string, Dirent[]>();
const dirListCache = new Map<string, string[]>();

@@ -85,6 +88,7 @@ function cachedReaddirWithTypes(dirPath: string): Dirent[] {
      d.isSocket = () => false;
      return d;
    });
    if (dirEntryCache.size >= DIR_CACHE_MAX) dirEntryCache.clear();
    dirEntryCache.set(dirPath, dirents);
    return dirents;
  }

@@ -92,6 +96,7 @@ function cachedReaddirWithTypes(dirPath: string): Dirent[] {
  }

  const entries = readdirSync(dirPath, { withFileTypes: true });
  if (dirEntryCache.size >= DIR_CACHE_MAX) dirEntryCache.clear();
  dirEntryCache.set(dirPath, entries);
  return entries;
}

@@ -107,6 +112,7 @@ function cachedReaddir(dirPath: string): string[] {
  const treeEntries = nativeTreeCache.get(key);
  if (treeEntries) {
    const names = treeEntries.map(e => e.name);
    if (dirListCache.size >= DIR_CACHE_MAX) dirListCache.clear();
    dirListCache.set(dirPath, names);
    return names;
  }

@@ -114,6 +120,7 @@ function cachedReaddir(dirPath: string): string[] {
  }

  const entries = readdirSync(dirPath);
  if (dirListCache.size >= DIR_CACHE_MAX) dirListCache.clear();
  dirListCache.set(dirPath, entries);
  return entries;
}

@@ -248,6 +255,7 @@ export const GSD_ROOT_FILES = {
  STATE: "STATE.md",
  REQUIREMENTS: "REQUIREMENTS.md",
  OVERRIDES: "OVERRIDES.md",
  KNOWLEDGE: "KNOWLEDGE.md",
} as const;

export type GSDRootFileKey = keyof typeof GSD_ROOT_FILES;

@@ -259,6 +267,7 @@ const LEGACY_GSD_ROOT_FILES: Record<GSDRootFileKey, string> = {
  STATE: "state.md",
  REQUIREMENTS: "requirements.md",
  OVERRIDES: "overrides.md",
  KNOWLEDGE: "knowledge.md",
};

export function gsdRoot(basePath: string): string {
@@ -60,7 +60,8 @@ export function checkPostUnitHooks(
  }

  // Don't trigger hooks for other hook units (prevent hook-on-hook chains)
- if (completedUnitType.startsWith("hook/")) return null;
+ // Don't trigger hooks for triage units (prevent hook-on-triage chains)
+ if (completedUnitType.startsWith("hook/") || completedUnitType === "triage-captures") return null;

  // Check if any hooks are configured for this unit type
  const hooks = resolvePostUnitHooks().filter(h =>
@@ -1,9 +1,11 @@
- import { existsSync, readdirSync, readFileSync, statSync } from "node:fs";
+ import { existsSync, readdirSync, readFileSync, statSync, writeFileSync } from "node:fs";
import { homedir } from "node:os";
import { isAbsolute, join } from "node:path";
import { getAgentDir } from "@gsd/pi-coding-agent";
import type { GitPreferences } from "./git-service.js";
import type { PostUnitHookConfig, PreDispatchHookConfig, BudgetEnforcementMode, NotificationPreferences, TokenProfile, InlineLevel, PhaseSkipPreferences } from "./types.js";
+ import type { DynamicRoutingConfig } from "./model-router.js";
+ import { defaultRoutingConfig } from "./model-router.js";
import { VALID_BRANCH_NAME } from "./git-service.js";

const GLOBAL_PREFERENCES_PATH = join(homedir(), ".gsd", "preferences.md");

@@ -36,8 +38,10 @@ const KNOWN_PREFERENCE_KEYS = new Set<string>([
  "git",
  "post_unit_hooks",
  "pre_dispatch_hooks",
  "dynamic_routing",
  "token_profile",
  "phases",
  "auto_visualize",
]);

export interface GSDSkillRule {
@@ -128,8 +132,10 @@ export interface GSDPreferences {
  git?: GitPreferences;
  post_unit_hooks?: PostUnitHookConfig[];
  pre_dispatch_hooks?: PreDispatchHookConfig[];
  dynamic_routing?: DynamicRoutingConfig;
  token_profile?: TokenProfile;
  phases?: PhaseSkipPreferences;
  auto_visualize?: boolean;
}

export interface LoadedGSDPreferences {
@@ -674,6 +680,20 @@ export function resolveModelWithFallbacksForUnit(unitType: string): ResolvedMode
  };
}

/**
 * Resolve the dynamic routing configuration from effective preferences.
 * Returns the merged config with defaults applied.
 */
export function resolveDynamicRoutingConfig(): DynamicRoutingConfig {
  const prefs = loadEffectiveGSDPreferences();
  const configured = prefs?.preferences.dynamic_routing;
  if (!configured) return defaultRoutingConfig();
  return {
    ...defaultRoutingConfig(),
    ...configured,
  };
}

export function resolveAutoSupervisorConfig(): AutoSupervisorConfig {
  const prefs = loadEffectiveGSDPreferences();
  const configured = prefs?.preferences.auto_supervisor ?? {};
@@ -780,6 +800,9 @@ function mergePreferences(base: GSDPreferences, override: GSDPreferences): GSDPr
      : undefined,
    post_unit_hooks: mergePostUnitHooks(base.post_unit_hooks, override.post_unit_hooks),
    pre_dispatch_hooks: mergePreDispatchHooks(base.pre_dispatch_hooks, override.pre_dispatch_hooks),
    dynamic_routing: (base.dynamic_routing || override.dynamic_routing)
      ? { ...(base.dynamic_routing ?? {}), ...(override.dynamic_routing ?? {}) } as DynamicRoutingConfig
      : undefined,
    token_profile: override.token_profile ?? base.token_profile,
    phases: (base.phases || override.phases)
      ? { ...(base.phases ?? {}), ...(override.phases ?? {}) }
@@ -1100,6 +1123,56 @@ export function validatePreferences(preferences: GSDPreferences): {
    }
  }

  // ─── Dynamic Routing ─────────────────────────────────────────────────
  if (preferences.dynamic_routing !== undefined) {
    if (typeof preferences.dynamic_routing === "object" && preferences.dynamic_routing !== null) {
      const dr = preferences.dynamic_routing as unknown as Record<string, unknown>;
      const validDr: Partial<DynamicRoutingConfig> = {};

      if (dr.enabled !== undefined) {
        if (typeof dr.enabled === "boolean") validDr.enabled = dr.enabled;
        else errors.push("dynamic_routing.enabled must be a boolean");
      }
      if (dr.escalate_on_failure !== undefined) {
        if (typeof dr.escalate_on_failure === "boolean") validDr.escalate_on_failure = dr.escalate_on_failure;
        else errors.push("dynamic_routing.escalate_on_failure must be a boolean");
      }
      if (dr.budget_pressure !== undefined) {
        if (typeof dr.budget_pressure === "boolean") validDr.budget_pressure = dr.budget_pressure;
        else errors.push("dynamic_routing.budget_pressure must be a boolean");
      }
      if (dr.cross_provider !== undefined) {
        if (typeof dr.cross_provider === "boolean") validDr.cross_provider = dr.cross_provider;
        else errors.push("dynamic_routing.cross_provider must be a boolean");
      }
      if (dr.hooks !== undefined) {
        if (typeof dr.hooks === "boolean") validDr.hooks = dr.hooks;
        else errors.push("dynamic_routing.hooks must be a boolean");
      }
      if (dr.tier_models !== undefined) {
        if (typeof dr.tier_models === "object" && dr.tier_models !== null) {
          const tm = dr.tier_models as Record<string, unknown>;
          const validTm: Record<string, string> = {};
          for (const tier of ["light", "standard", "heavy"]) {
            if (tm[tier] !== undefined) {
              if (typeof tm[tier] === "string") validTm[tier] = tm[tier] as string;
              else errors.push(`dynamic_routing.tier_models.${tier} must be a string`);
            }
          }
          if (Object.keys(validTm).length > 0) validDr.tier_models = validTm as DynamicRoutingConfig["tier_models"];
        } else {
          errors.push("dynamic_routing.tier_models must be an object");
        }
      }

      if (Object.keys(validDr).length > 0) {
        validated.dynamic_routing = validDr as unknown as DynamicRoutingConfig;
      }
    } else {
      errors.push("dynamic_routing must be an object");
    }
  }

  // ─── Git Preferences ───────────────────────────────────────────────────
  if (preferences.git && typeof preferences.git === "object") {
    const git: Record<string, unknown> = {};
@@ -1252,3 +1325,61 @@ export function resolvePreDispatchHooks(): PreDispatchHookConfig[] {
  return (prefs?.preferences.pre_dispatch_hooks ?? [])
    .filter(h => h.enabled !== false);
}

/**
 * Validate a model ID string.
 * Returns true if the ID looks like a valid model identifier.
 */
export function validateModelId(modelId: string): boolean {
  if (!modelId || typeof modelId !== "string") return false;
  const trimmed = modelId.trim();
  if (trimmed.length === 0 || trimmed.length > 256) return false;
  // Allow alphanumeric, hyphens, underscores, dots, slashes, colons
  return /^[a-zA-Z0-9\-_./:]+$/.test(trimmed);
}

/**
 * Update the models section of the global GSD preferences file.
 * Performs a safe read-modify-write: reads the current content, updates the
 * models YAML block, and writes back. Creates the file if it doesn't exist.
 */
export function updatePreferencesModels(models: GSDModelConfigV2): void {
  const prefsPath = getGlobalGSDPreferencesPath();

  let content = "";
  if (existsSync(prefsPath)) {
    content = readFileSync(prefsPath, "utf-8");
  }

  // Build the new models block
  const lines: string[] = ["models:"];
  for (const [phase, value] of Object.entries(models)) {
    if (typeof value === "string") {
      lines.push(`  ${phase}: ${value}`);
    } else if (value && typeof value === "object") {
      const config = value as GSDPhaseModelConfig;
      lines.push(`  ${phase}:`);
      lines.push(`    model: ${config.model}`);
      if (config.provider) {
        lines.push(`    provider: ${config.provider}`);
      }
      if (config.fallbacks && config.fallbacks.length > 0) {
        lines.push(`    fallbacks:`);
        for (const fb of config.fallbacks) {
          lines.push(`      - ${fb}`);
        }
      }
    }
  }
  const modelsBlock = lines.join("\n");

  // Replace the existing models block (the models: line plus any following
  // lines that don't start a new top-level key) or append a fresh one.
  const modelsRegex = /^models:[^\n]*(?:\n(?![a-z_])[^\n]*)*/m;
  if (modelsRegex.test(content)) {
    content = content.replace(modelsRegex, modelsBlock);
  } else {
    content = content.trimEnd() + "\n\n" + modelsBlock + "\n";
  }

  writeFileSync(prefsPath, content, "utf-8");
}
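The block-replacement idea can be exercised in isolation. A minimal sketch, assuming the preferences file layout implied above: top-level keys start at column zero in lowercase, and the `models:` block is the `models:` line plus any following lines that don't start a new top-level key. The sample YAML content here is hypothetical.

```typescript
// Match "models:" plus every subsequent line that does not open a new
// top-level key (i.e. does not start with [a-z_] at column zero).
const modelsRegex = /^models:[^\n]*(?:\n(?![a-z_])[^\n]*)*/m;

function replaceModelsBlock(content: string, modelsBlock: string): string {
  if (modelsRegex.test(content)) {
    return content.replace(modelsRegex, modelsBlock);
  }
  // No existing block: append one, separated by a blank line.
  return content.trimEnd() + "\n\n" + modelsBlock + "\n";
}

// Hypothetical file: a models block sandwiched between other keys.
const before = "theme: dark\nmodels:\n  plan: old-model\nother: x\n";
const replaced = replaceModelsBlock(before, "models:\n  plan: new-model");

// Hypothetical file with no models block: the block gets appended.
const appended = replaceModelsBlock("theme: dark\n", "models:\n  plan: new-model");
```

Note the indented `  plan: old-model` line must be consumed by the match; a regex that stops at the first line boundary would leave stale entries behind after the replace.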
@@ -7,15 +7,17 @@
 * Templates live at prompts/ relative to this module's directory.
 * They use {{variableName}} syntax for substitution.
 *
- * Templates are cached on first read per session. This prevents a running
- * session from being invalidated when another `gsd` launch overwrites
- * ~/.gsd/agent/ with newer templates via initResources(). Without caching,
- * the in-memory extension code (which knows variable set A) can read a
- * newer template from disk (which expects variable set B), causing a
- * "template declares {{X}} but no value was provided" crash mid-session.
+ * All templates are eagerly loaded into cache at module init via warmCache().
+ * This prevents a running session from being invalidated when another `gsd`
+ * launch overwrites ~/.gsd/agent/ with newer templates via initResources().
+ * Without eager caching, the in-memory extension code (which knows variable
+ * set A) can read a newer template from disk (which expects variable set B),
+ * causing a "template declares {{X}} but no value was provided" crash
+ * mid-session — especially for late-loading templates like complete-milestone
+ * that aren't read until the end of a long auto-mode run.
 */

- import { readFileSync } from "node:fs";
+ import { readFileSync, readdirSync } from "node:fs";
import { join, dirname } from "node:path";
import { fileURLToPath } from "node:url";
@@ -23,10 +25,44 @@ const __extensionDir = dirname(fileURLToPath(import.meta.url));
const promptsDir = join(__extensionDir, "prompts");
const templatesDir = join(__extensionDir, "templates");

- // Cache templates on first read — a running session uses the template versions
- // that were on disk when it first loaded them, immune to later overwrites.
+ // Cache all templates eagerly at module load — a running session uses the
+ // template versions that were on disk at startup, immune to later overwrites.
const templateCache = new Map<string, string>();

/**
 * Eagerly read all .md files from prompts/ and templates/ into cache.
 * Called once at module init so that every template is snapshotted before
 * a concurrent initResources() can overwrite files on disk.
 */
function warmCache(): void {
  try {
    for (const file of readdirSync(promptsDir)) {
      if (!file.endsWith(".md")) continue;
      const name = file.slice(0, -3);
      if (!templateCache.has(name)) {
        templateCache.set(name, readFileSync(join(promptsDir, file), "utf-8"));
      }
    }
  } catch {
    // prompts/ may not exist in test environments — lazy loading still works
  }

  try {
    for (const file of readdirSync(templatesDir)) {
      if (!file.endsWith(".md")) continue;
      const cacheKey = `tpl:${file.slice(0, -3)}`;
      if (!templateCache.has(cacheKey)) {
        templateCache.set(cacheKey, readFileSync(join(templatesDir, file), "utf-8"));
      }
    }
  } catch {
    // templates/ may not exist in test environments — lazy loading still works
  }
}

// Snapshot all templates at module load time
warmCache();

/**
 * Load a prompt template and substitute variables.
 *
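The snapshot-at-startup behavior is easy to demonstrate end to end: read every template into memory once, then show that a later overwrite of the file on disk does not change what the cache serves. Paths and file contents below are hypothetical stand-ins for the real `prompts/*.md` files.

```typescript
import { mkdtempSync, writeFileSync, readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Hypothetical templates directory with one template file.
const dir = mkdtempSync(join(tmpdir(), "tpl-"));
writeFileSync(join(dir, "greet.md"), "hello {{name}}");

// Eager warm: snapshot every .md file into the cache, keyed without extension.
const cache = new Map<string, string>();
for (const file of readdirSync(dir)) {
  if (!file.endsWith(".md")) continue;
  cache.set(file.slice(0, -3), readFileSync(join(dir, file), "utf-8"));
}

// Simulate a newer install overwriting the template on disk with a version
// that expects a different variable set.
writeFileSync(join(dir, "greet.md"), "hi {{firstName}}");

// The running session still sees the snapshot it loaded at startup.
const template = cache.get("greet")!;
```

With lazy caching, a template not yet read before the overwrite would pick up the new variable set and mismatch the in-memory code; eager warming closes that window at module load.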
@@ -54,11 +54,12 @@ Then:
- Don't fix symptoms. Understand *why* something fails before changing code. A test that passes after a change you don't understand is luck, not a fix.
11. **Blocker discovery:** If execution reveals that the remaining slice plan is fundamentally invalid — not just a bug or minor deviation, but a plan-invalidating finding like a wrong API, missing capability, or architectural mismatch — set `blocker_discovered: true` in the task summary frontmatter and describe the blocker clearly in the summary narrative. Do NOT set `blocker_discovered: true` for ordinary debugging, minor deviations, or issues that can be fixed within the current task or the remaining plan. This flag triggers an automatic replan of the slice.
12. If you made an architectural, pattern, library, or observability decision during this task that downstream work should know about, append it to `.gsd/DECISIONS.md` (read the template at `~/.gsd/agent/extensions/gsd/templates/decisions.md` if the file doesn't exist yet). Not every task produces decisions — only append when a meaningful choice was made.
- 13. Read the template at `~/.gsd/agent/extensions/gsd/templates/task-summary.md`
- 14. Write `{{taskSummaryPath}}`
- 15. Mark {{taskId}} done in `{{planPath}}` (change `[ ]` to `[x]`)
- 16. Do not commit manually — the system auto-commits your changes after this unit completes.
- 17. Update `.gsd/STATE.md`
+ 13. If you discover a non-obvious rule, recurring gotcha, or useful pattern during execution, append it to `.gsd/KNOWLEDGE.md`. Only add entries that would save future agents from repeating your investigation. Don't add obvious things.
+ 14. Read the template at `~/.gsd/agent/extensions/gsd/templates/task-summary.md`
+ 15. Write `{{taskSummaryPath}}`
+ 16. Mark {{taskId}} done in `{{planPath}}` (change `[ ]` to `[x]`)
+ 17. Do not commit manually — the system auto-commits your changes after this unit completes.
+ 18. Update `.gsd/STATE.md`

All work stays in your working directory: `{{workingDirectory}}`.
@@ -16,6 +16,12 @@ All relevant context has been preloaded below — the current roadmap, completed

{{inlinedContext}}

## Deferred Captures

The following user thoughts were captured during execution and deferred to future slices during triage. Consider whether any should influence the remaining roadmap:

{{deferredCaptures}}

If a `GSD Skill Preferences` block is present in system context, use it to decide which skills to load and follow during reassessment, without relaxing required verification or artifact rules.

Then assess whether the remaining roadmap still makes sense given what was just built.
@@ -12,6 +12,14 @@ All relevant context has been preloaded below — the roadmap, current slice pla

{{inlinedContext}}

## Capture Context

The following user-captured thoughts triggered or informed this replan:

{{captureContext}}

Consider these captures when rewriting the remaining tasks — they represent the user's real-time insights about what needs to change.

## Hard Constraints

- **Do NOT renumber or remove completed tasks.** All `[x]` tasks and their IDs must remain exactly as they are in the plan.
@@ -65,6 +65,7 @@ Titles live inside file content (headings, frontmatter), not in file or director
PROJECT.md (living doc - what the project is right now)
REQUIREMENTS.md (requirement contract - tracks active/validated/deferred/out-of-scope)
DECISIONS.md (append-only register of architectural and pattern decisions)
KNOWLEDGE.md (append-only register of project-specific rules, patterns, and lessons learned)
OVERRIDES.md (user-issued overrides that supersede plan content via /gsd steer)
QUEUE.md (append-only log of queued milestones via /gsd queue)
STATE.md

@@ -100,6 +101,7 @@ All auto-mode work happens inside a worktree at `.gsd/worktrees/<MID>/`. This is
- **PROJECT.md** is a living document describing what the project is right now - current state only, updated at slice completion when stale
- **REQUIREMENTS.md** tracks the requirement contract — requirements move between Active, Validated, Deferred, Blocked, and Out of Scope as slices prove or invalidate them. Update at slice completion when evidence supports a status change.
- **DECISIONS.md** is an append-only register of architectural and pattern decisions - read it during planning/research, append to it during execution when a meaningful decision is made
- **KNOWLEDGE.md** is an append-only register of project-specific rules, patterns, and lessons learned. Read it at the start of every unit. Append to it when you discover a recurring issue, a non-obvious pattern, or a rule that future agents should follow.
- **CONTEXT.md** files (milestone or slice level) capture the brief — scope, goals, constraints, and key decisions from discussion. When present, they are the authoritative source for what a milestone or slice is trying to achieve. Read them before planning or executing.
- **Milestones** are major project phases (M001, M002, ...)
- **Slices** are demoable vertical increments (S01, S02, ...) ordered by risk. After each slice completes, the roadmap is reassessed before the next slice begins.
62 src/resources/extensions/gsd/prompts/triage-captures.md Normal file
@@ -0,0 +1,62 @@
You are triaging user-captured thoughts during a GSD session.

## UNIT: Triage Captures

The user captured thoughts during execution using `/gsd capture`. Your job is to classify each capture, present your proposals, get user confirmation, and update CAPTURES.md with the final classifications.

## Pending Captures

{{pendingCaptures}}

## Current Slice Plan

{{currentPlan}}

## Current Roadmap

{{roadmapContext}}

## Classification Criteria

For each capture, classify it as one of:

- **quick-task**: Small, self-contained, no downstream impact. Can be done in minutes without modifying the plan. Examples: fix a typo, add a missing import, tweak a config value.
- **inject**: Belongs in the current slice but wasn't planned. Needs a new task added to the slice plan. Examples: add error handling to a module being built, add a missing test case for current work.
- **defer**: Belongs in a future slice or milestone. Not urgent for current work. Examples: performance optimization, feature that depends on unbuilt infrastructure, nice-to-have enhancement.
- **replan**: Changes the shape of remaining work in the current slice. Existing incomplete tasks may need rewriting. Examples: "the approach is wrong, we need to use X instead of Y", discovering a fundamental constraint.
- **note**: Informational only. No action needed right now. Good context for future reference. Examples: "remember that the API has a rate limit", observations about code quality.

## Decision Guidelines

- Prefer **quick-task** when the work is clearly small and self-contained.
- Prefer **inject** over **replan** when only a new task is needed, not rewriting existing ones.
- Prefer **defer** over **inject** when the work doesn't belong in the current slice's scope.
- Use **replan** only when remaining incomplete tasks need to change — not just for adding work.
- Use **note** for observations that don't require action.
- When unsure between quick-task and inject, consider: will this take more than 10 minutes? If yes, inject.

## Instructions

1. **Classify** each pending capture using the criteria above.

2. **Present** your classifications to the user using `ask_user_questions`. For each capture, show:
   - The capture text
   - Your proposed classification
   - Your rationale
   - If applicable, which files would be affected

   For captures classified as **note** or **defer**, auto-confirm without asking — these are low-impact.
   For captures classified as **quick-task**, **inject**, or **replan**, ask the user to confirm or choose a different classification.

3. **Update** `.gsd/CAPTURES.md` — for each capture, update its section with the confirmed classification:
   - Change `**Status:** pending` to `**Status:** resolved`
   - Add `**Classification:** <type>`
   - Add `**Resolution:** <brief description of what will happen>`
   - Add `**Rationale:** <why this classification>`
   - Add `**Resolved:** <current ISO timestamp>`

4. **Summarize** what was triaged: how many captures, what classifications were assigned, and what actions are pending (e.g., "2 quick-tasks ready for execution, 1 deferred to S03").

**Important:** Do NOT execute any resolutions. Only classify and update CAPTURES.md. Resolution execution happens separately (in auto-mode dispatch or manually by the user).

When done, say: "Triage complete."
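For reference, a CAPTURES.md entry that has been through step 3 would look roughly like this (the ID and text are hypothetical; the field names, the `CAP-` prefix, and the pending-to-resolved transition come from the instructions above):

```markdown
### CAP-0042

**Text:** remember that the API has a rate limit
**Status:** resolved
**Classification:** note
**Resolution:** Recorded for reference; no action taken.
**Rationale:** Informational only; does not change current or future work.
**Resolved:** 2025-06-01T12:00:00Z
```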
231 src/resources/extensions/gsd/queue-order.ts Normal file
@@ -0,0 +1,231 @@
/**
 * GSD Queue Order — Custom milestone execution ordering.
 *
 * Stores an explicit execution order in `.gsd/QUEUE-ORDER.json`.
 * When present, `findMilestoneIds()` uses this order instead of
 * the default numeric sort (milestoneIdSort).
 *
 * The file is committed to git (not gitignored) so ordering
 * survives branch switches and is shared across sessions.
 */

import { readFileSync, writeFileSync, existsSync } from "node:fs";
import { join } from "node:path";
import { gsdRoot } from "./paths.js";
import { milestoneIdSort } from "./guided-flow.js";

// ─── Types ───────────────────────────────────────────────────────────────────

interface QueueOrderFile {
  order: string[];
  updatedAt: string;
}

export interface DependencyViolation {
  milestone: string;
  dependsOn: string;
  type: 'would_block' | 'circular' | 'missing_dep';
  message: string;
}

export interface DependencyRedundancy {
  milestone: string;
  dependsOn: string;
}

export interface DependencyValidation {
  valid: boolean;
  violations: DependencyViolation[];
  redundant: DependencyRedundancy[];
}

// ─── Path ────────────────────────────────────────────────────────────────────

function queueOrderPath(basePath: string): string {
  return join(gsdRoot(basePath), "QUEUE-ORDER.json");
}

// ─── Read / Write ────────────────────────────────────────────────────────────

/**
 * Load the custom queue order. Returns null if no file exists or if
 * the file is corrupt/unreadable.
 */
export function loadQueueOrder(basePath: string): string[] | null {
  const p = queueOrderPath(basePath);
  if (!existsSync(p)) return null;
  try {
    const data: QueueOrderFile = JSON.parse(readFileSync(p, "utf-8"));
    if (!Array.isArray(data.order)) return null;
    return data.order;
  } catch {
    return null;
  }
}

/**
 * Save a custom queue order to disk.
 */
export function saveQueueOrder(basePath: string, order: string[]): void {
  const data: QueueOrderFile = {
    order,
    updatedAt: new Date().toISOString(),
  };
  writeFileSync(queueOrderPath(basePath), JSON.stringify(data, null, 2) + "\n", "utf-8");
}

// ─── Sorting ─────────────────────────────────────────────────────────────────

/**
 * Sort milestone IDs respecting a custom order.
 *
 * - IDs present in `customOrder` appear in that exact sequence.
 * - IDs on disk but NOT in `customOrder` are appended at the end,
 *   sorted by the default `milestoneIdSort` (numeric).
 * - IDs in `customOrder` but NOT on disk are silently skipped.
 * - When `customOrder` is null, falls back to `milestoneIdSort`.
 */
export function sortByQueueOrder(ids: string[], customOrder: string[] | null): string[] {
  if (!customOrder) return [...ids].sort(milestoneIdSort);

  const idSet = new Set(ids);
  const ordered: string[] = [];

  // First: IDs from customOrder that exist on disk
  for (const id of customOrder) {
    if (idSet.has(id)) {
      ordered.push(id);
      idSet.delete(id);
    }
  }

  // Then: remaining IDs not in customOrder, in default sort order
  const remaining = [...idSet].sort(milestoneIdSort);
  return [...ordered, ...remaining];
}

// ─── Pruning ─────────────────────────────────────────────────────────────────

/**
 * Remove IDs from the queue order file that are no longer valid
 * (completed or deleted milestones). No-op if file doesn't exist.
 */
export function pruneQueueOrder(basePath: string, validIds: string[]): void {
  const order = loadQueueOrder(basePath);
  if (!order) return;

  const validSet = new Set(validIds);
  const pruned = order.filter(id => validSet.has(id));

  if (pruned.length !== order.length) {
    saveQueueOrder(basePath, pruned);
  }
}

// ─── Validation ──────────────────────────────────────────────────────────────

/**
 * Validate a proposed queue order against dependency constraints.
 *
 * Checks:
 * - would_block: A milestone is placed before one of its dependencies
 * - circular: Two or more milestones form a dependency cycle
 * - missing_dep: A milestone depends on an ID that doesn't exist
 * - redundant: A dependency is satisfied by queue position (dep comes earlier)
 */
export function validateQueueOrder(
  order: string[],
  depsMap: Map<string, string[]>,
  completedIds: Set<string>,
): DependencyValidation {
  const violations: DependencyViolation[] = [];
  const redundant: DependencyRedundancy[] = [];

  const positionMap = new Map<string, number>();
  for (let i = 0; i < order.length; i++) {
    positionMap.set(order[i], i);
  }

  const allKnownIds = new Set([...order, ...completedIds]);

  for (const [mid, deps] of depsMap) {
    const midPos = positionMap.get(mid);
    if (midPos === undefined) continue; // not in pending order

    for (const dep of deps) {
      // Dep already completed — always satisfied
      if (completedIds.has(dep)) continue;

      // Dep doesn't exist anywhere
      if (!allKnownIds.has(dep)) {
        violations.push({
          milestone: mid,
          dependsOn: dep,
          type: 'missing_dep',
          message: `${mid} depends on ${dep}, but ${dep} does not exist.`,
        });
        continue;
      }

      const depPos = positionMap.get(dep);
      if (depPos === undefined) continue; // dep not in pending order (edge case)

      if (depPos > midPos) {
        // Dep comes AFTER this milestone in the order — violation
        violations.push({
          milestone: mid,
          dependsOn: dep,
          type: 'would_block',
          message: `${mid} cannot run before ${dep} — ${mid} depends_on: [${dep}].`,
        });
      } else {
        // Dep comes before — satisfied by position, redundant
        redundant.push({ milestone: mid, dependsOn: dep });
      }
    }
  }

  // Check for circular dependencies
  const visited = new Set<string>();
  const inStack = new Set<string>();

  function hasCycle(node: string, path: string[]): string[] | null {
    if (inStack.has(node)) return [...path, node];
    if (visited.has(node)) return null;

    visited.add(node);
    inStack.add(node);

    const deps = depsMap.get(node) ?? [];
    for (const dep of deps) {
      if (completedIds.has(dep)) continue;
      const cycle = hasCycle(dep, [...path, node]);
      if (cycle) return cycle;
    }

    inStack.delete(node);
    return null;
  }

  for (const mid of order) {
    if (!visited.has(mid)) {
      const cycle = hasCycle(mid, []);
      if (cycle) {
        const cycleStr = cycle.join(' → ');
        violations.push({
          milestone: cycle[0],
          dependsOn: cycle[cycle.length - 2],
          type: 'circular',
          message: `Circular dependency: ${cycleStr}`,
        });
        break; // one cycle report is enough
      }
    }
  }

  return {
    valid: violations.length === 0,
    violations,
    redundant,
  };
}
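The ordering contract of `sortByQueueOrder` is easiest to see with a concrete run. A minimal standalone sketch (not part of the PR; a plain string sort stands in for `milestoneIdSort`, which is equivalent for zero-padded IDs like M001):

```typescript
// Standalone illustration of the sortByQueueOrder contract.
// Plain string sort stands in for milestoneIdSort, which is
// equivalent for zero-padded IDs such as M001, M002, ...
function sortByQueueOrderSketch(ids: string[], customOrder: string[] | null): string[] {
  if (!customOrder) return [...ids].sort();

  const idSet = new Set(ids);
  const ordered: string[] = [];

  // IDs named in customOrder come first, in that exact sequence;
  // IDs in customOrder that no longer exist on disk are skipped.
  for (const id of customOrder) {
    if (idSet.has(id)) {
      ordered.push(id);
      idSet.delete(id);
    }
  }

  // Everything else is appended in default order.
  return [...ordered, ...[...idSet].sort()];
}

console.log(sortByQueueOrderSketch(["M001", "M002", "M003", "M004"], ["M003", "M001", "M999"]));
// → ["M003", "M001", "M002", "M004"]  (M999 skipped; M002, M004 appended)
```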
263 src/resources/extensions/gsd/queue-reorder-ui.ts Normal file
@@ -0,0 +1,263 @@
/**
 * GSD Queue Reorder UI
 *
 * Interactive TUI overlay for reordering pending milestones.
 * ↑/↓ navigates cursor. Space grabs/releases item for moving.
 * While grabbed, ↑/↓ swaps the item with its neighbor.
 * Enter confirms all changes. Esc cancels.
 * Conflicting depends_on entries are auto-removed on confirm.
 */

import type { ExtensionContext } from "@gsd/pi-coding-agent";
import { type Theme } from "@gsd/pi-coding-agent";
import { Key, matchesKey, truncateToWidth, type TUI } from "@gsd/pi-tui";
import { makeUI, GLYPH } from "../shared/ui.js";
import { validateQueueOrder, type DependencyValidation } from "./queue-order.js";

export interface ReorderItem {
  id: string;
  title: string;
  dependsOn?: string[];
}

export interface ReorderResult {
  order: string[];
  /** depends_on entries to remove from CONTEXT.md files */
  depsToRemove: Array<{ milestone: string; dep: string }>;
}

/**
 * Show the queue reorder overlay.
 * Returns the new order + deps to remove, or null if cancelled.
 */
export async function showQueueReorder(
  ctx: ExtensionContext,
  completed: ReorderItem[],
  pending: ReorderItem[],
): Promise<ReorderResult | null> {
  if (!ctx.hasUI) return null;
  if (pending.length < 2) return null;

  return ctx.ui.custom<ReorderResult | null>((tui: TUI, theme: Theme, _kb, done) => {
    const items = [...pending];
    let cursor = 0;
    let grabbed = false;
    let cachedLines: string[] | undefined;
    let validation: DependencyValidation;

    // Mutable deps map — tracks removals during this session
    const liveDeps = new Map<string, string[]>();
    for (const item of [...completed, ...pending]) {
      if (item.dependsOn && item.dependsOn.length > 0) {
        liveDeps.set(item.id, [...item.dependsOn]);
      }
    }

    const removedDeps: Array<{ milestone: string; dep: string }> = [];
    const completedIds = new Set(completed.map(c => c.id));

    function revalidate() {
      validation = validateQueueOrder(items.map(i => i.id), liveDeps, completedIds);
    }

    revalidate();

    function refresh() {
      cachedLines = undefined;
      tui.requestRender();
    }

    function swapItems(fromIdx: number, toIdx: number) {
      if (toIdx < 0 || toIdx >= items.length) return;
      const [item] = items.splice(fromIdx, 1);
      items.splice(toIdx, 0, item);
      cursor = toIdx;
      revalidate();
      refresh();
    }

    function removeDep(milestone: string, dep: string) {
      const deps = liveDeps.get(milestone);
      if (!deps) return;
      const idx = deps.indexOf(dep);
      if (idx >= 0) {
        deps.splice(idx, 1);
        if (deps.length === 0) liveDeps.delete(milestone);
        removedDeps.push({ milestone, dep });
        const item = items.find(i => i.id === milestone);
        if (item?.dependsOn) {
          item.dependsOn = item.dependsOn.filter(d => d !== dep);
        }
        revalidate();
        refresh();
      }
    }

    function handleInput(data: string) {
      if (matchesKey(data, Key.escape) || matchesKey(data, Key.ctrl("c"))) {
        done(null);
        return;
      }

      // Confirm — auto-resolve would_block violations
      if (matchesKey(data, Key.enter)) {
        const wouldBlock = validation.violations.filter(v => v.type === 'would_block');
        for (const v of wouldBlock) {
          removeDep(v.milestone, v.dependsOn);
        }
        done({ order: items.map(i => i.id), depsToRemove: removedDeps });
        return;
      }

      // Space — toggle grab mode
      if (data === " ") {
        grabbed = !grabbed;
        refresh();
        return;
      }

      // ↑/↓ — move grabbed item OR navigate cursor
      if (matchesKey(data, Key.up)) {
        if (grabbed) {
          swapItems(cursor, cursor - 1);
        } else {
          cursor = Math.max(0, cursor - 1);
          refresh();
        }
        return;
      }
      if (matchesKey(data, Key.down)) {
        if (grabbed) {
          swapItems(cursor, cursor + 1);
        } else {
          cursor = Math.min(items.length - 1, cursor + 1);
          refresh();
        }
        return;
      }

      // 'd' — manually remove a dep on the cursor item
      if (data === "d" || data === "D") {
        const item = items[cursor];
        const deps = liveDeps.get(item.id);
        if (deps) {
          const activeDep = deps.find(d => !completedIds.has(d));
          if (activeDep) removeDep(item.id, activeDep);
        }
        return;
      }
    }

    function render(width: number): string[] {
      if (cachedLines) return cachedLines;

      const ui = makeUI(theme, width);
      const lines: string[] = [];
      const push = (...rows: string[][]) => { for (const r of rows) lines.push(...r); };
      const add = (s: string) => truncateToWidth(s, width);

      const headerText = grabbed ? " Queue Reorder — Moving Item" : " Queue Reorder";
      push(ui.bar(), ui.blank(), ui.header(headerText), ui.blank());

      // Completed milestones (dimmed)
      if (completed.length > 0) {
        lines.push(add(theme.fg("dim", " Completed:")));
        for (const m of completed) {
          const label = m.title && m.title !== m.id ? `${m.id} ${m.title}` : m.id;
          lines.push(add(`   ${theme.fg("dim", `${GLYPH.statusDone} ${label}`)}`));
        }
        push(ui.blank());
      }

      // Pending milestones
      const queueLabel = grabbed ? " Queue (space to release, ↑/↓ to move):" : " Queue (space to grab, ↑/↓ to navigate):";
      lines.push(add(theme.fg("text", queueLabel)));

      const violatedPairs = new Set(
        validation.violations.filter(v => v.type === 'would_block').map(v => `${v.milestone}:${v.dependsOn}`),
      );
      const redundantPairs = new Set(
        validation.redundant.map(r => `${r.milestone}:${r.dependsOn}`),
      );

      for (let i = 0; i < items.length; i++) {
        const item = items[i];
        const isCursor = i === cursor;
        const num = i + 1;
        const label = item.title && item.title !== item.id ? `${item.id} ${item.title}` : item.id;

        if (isCursor && grabbed) {
          lines.push(add(` ${theme.fg("warning", `▸▸ ${num}. ${label}`)}`));
        } else if (isCursor) {
          lines.push(add(` ${theme.fg("accent", `${GLYPH.cursor} ${num}. ${label}`)}`));
        } else {
          lines.push(add(`   ${theme.fg("text", `${num}. ${label}`)}`));
        }

        // depends_on annotations
        const deps = liveDeps.get(item.id) ?? [];
        for (const dep of deps) {
          if (completedIds.has(dep)) continue;
          const pairKey = `${item.id}:${dep}`;
          if (violatedPairs.has(pairKey)) {
            lines.push(add(`      ${theme.fg("warning", `${GLYPH.statusWarning} depends_on: ${dep} — auto-removed on confirm`)}`));
          } else if (redundantPairs.has(pairKey)) {
            lines.push(add(`      ${theme.fg("dim", `↳ depends_on: ${dep} (redundant)`)}`));
          } else {
            lines.push(add(`      ${theme.fg("dim", `↳ depends_on: ${dep}`)}`));
          }
        }

        // Missing deps
        for (const v of validation.violations.filter(v => v.milestone === item.id && v.type === 'missing_dep')) {
          lines.push(add(`      ${theme.fg("error", `${GLYPH.statusWarning} depends_on: ${v.dependsOn} (does not exist)`)}`));
        }
      }

      // Removed deps feedback
      if (removedDeps.length > 0) {
        push(ui.blank());
        for (const r of removedDeps) {
          lines.push(add(` ${theme.fg("success", `${GLYPH.statusDone} Removed: ${r.milestone} depends_on ${r.dep}`)}`));
        }
      }

      // Circular warning
      const circ = validation.violations.find(v => v.type === 'circular');
      if (circ) {
        push(ui.blank());
        lines.push(add(` ${theme.fg("error", `${GLYPH.statusWarning} ${circ.message}`)}`));
      }

      push(ui.blank());

      // Hints — context-sensitive based on grab state
      const hints: string[] = [];
      if (grabbed) {
        hints.push("↑/↓ move item", "space release");
      } else {
        hints.push("↑/↓ navigate", "space grab");
      }
      const hasDeps = liveDeps.get(items[cursor]?.id)?.some(d => !completedIds.has(d));
      if (hasDeps) hints.push("d del dep");

      const wouldBlockCount = validation.violations.filter(v => v.type === 'would_block').length;
      if (wouldBlockCount > 0) {
        hints.push(`enter (fixes ${wouldBlockCount} dep)`);
      } else {
        hints.push("enter ok");
      }
      hints.push("esc");

      push(ui.hints(hints), ui.bar());

      cachedLines = lines;
      return lines;
    }

    return { render, invalidate: () => { cachedLines = undefined; }, handleInput };
  }, {
    overlay: true,
    overlayOptions: { width: "70%", minWidth: 50, maxHeight: "80%", anchor: "center" },
  });
}
@@ -224,9 +224,21 @@ async function _deriveStateImpl(basePath: string): Promise<GSDState> {
         const draftFile = resolveMilestoneFile(basePath, mid, "CONTEXT-DRAFT");
         if (draftFile) activeMilestoneHasDraft = true;
       }
-      activeMilestone = { id: mid, title: mid };
-      activeMilestoneFound = true;
-      registry.push({ id: mid, title: mid, status: 'active' });
+
+      // Check milestone-level dependencies before promoting to active.
+      // Without this, a queued milestone with depends_on in its CONTEXT
+      // frontmatter would be promoted to active even when its deps are unmet
+      // (the dep check only existed in the has-roadmap path previously).
+      const contextContent = contextFile ? await cachedLoadFile(contextFile) : null;
+      const deps = parseContextDependsOn(contextContent);
+      const depsUnmet = deps.some(dep => !completeMilestoneIds.has(dep));
+      if (depsUnmet) {
+        registry.push({ id: mid, title: mid, status: 'pending', dependsOn: deps });
+      } else {
+        activeMilestone = { id: mid, title: mid };
+        activeMilestoneFound = true;
+        registry.push({ id: mid, title: mid, status: 'active', ...(deps.length > 0 ? { dependsOn: deps } : {}) });
+      }
     } else {
       registry.push({ id: mid, title: mid, status: 'pending' });
     }
19 src/resources/extensions/gsd/templates/knowledge.md Normal file
@@ -0,0 +1,19 @@
# Project Knowledge

Append-only register of project-specific rules, patterns, and lessons learned.
Agents read this before every unit. Add entries when you discover something worth remembering.

## Rules

| # | Scope | Rule | Why | Added |
|---|-------|------|-----|-------|

## Patterns

| # | Pattern | Where | Notes |
|---|---------|-------|-------|

## Lessons Learned

| # | What Happened | Root Cause | Fix | Scope |
|---|--------------|------------|-----|-------|
@@ -15,7 +15,21 @@ git:
snapshots:
pre_merge_check:
commit_type:
main_branch:
merge_strategy:
isolation:
unique_milestone_ids:
budget_ceiling:
budget_enforcement:
context_pause_threshold:
notifications:
enabled:
on_complete:
on_error:
on_budget:
on_milestone:
on_attention:
uat_dispatch:
---

# GSD Skill Preferences
@@ -17,6 +17,7 @@ import {
  getAutoWorktreePath,
  enterAutoWorktree,
  getAutoWorktreeOriginalBase,
  getActiveAutoWorktreeContext,
} from "../auto-worktree.ts";

import { createTestContext } from "./test-helpers.ts";
@@ -76,6 +77,15 @@ async function main(): Promise<void> {

  // ─── getAutoWorktreeOriginalBase ─────────────────────────────────
  assertEq(getAutoWorktreeOriginalBase(), tempDir, "originalBase returns temp dir");
  assertEq(
    getActiveAutoWorktreeContext(),
    {
      originalBase: tempDir,
      worktreeName: "M003",
      branch: "milestone/M003",
    },
    "active auto-worktree context reflects the worktree cwd",
  );

  // ─── getAutoWorktreePath ─────────────────────────────────────────
  assertEq(getAutoWorktreePath(tempDir, "M003"), wtPath, "getAutoWorktreePath returns correct path");
@@ -88,6 +98,7 @@ async function main(): Promise<void> {
  assertTrue(!existsSync(wtPath), "worktree directory removed after teardown");
  assertTrue(!isInAutoWorktree(tempDir), "isInAutoWorktree returns false after teardown");
  assertEq(getAutoWorktreeOriginalBase(), null, "originalBase is null after teardown");
  assertEq(getActiveAutoWorktreeContext(), null, "active auto-worktree context clears after teardown");

  // ─── Re-entry: create again, exit without teardown, re-enter ─────
  console.log("\n=== re-entry ===");
@@ -103,6 +114,15 @@ async function main(): Promise<void> {
  assertEq(process.cwd(), entered, "re-entered worktree via enterAutoWorktree");
  assertEq(getAutoWorktreeOriginalBase(), tempDir, "originalBase restored on re-entry");
  assertTrue(isInAutoWorktree(tempDir), "isInAutoWorktree true after re-entry");
  assertEq(
    getActiveAutoWorktreeContext(),
    {
      originalBase: tempDir,
      worktreeName: "M003",
      branch: "milestone/M003",
    },
    "active auto-worktree context is restored on re-entry",
  );

  // Cleanup
  teardownAutoWorktree(tempDir, "M003");
438 src/resources/extensions/gsd/tests/captures.test.ts Normal file
@@ -0,0 +1,438 @@
/**
 * Unit tests for GSD Captures — file I/O, parsing, and worktree path resolution.
 *
 * Exercises the boundary contract that S02 (auto-mode dispatch) depends on:
 * - appendCapture creates/appends entries to CAPTURES.md
 * - loadAllCaptures / loadPendingCaptures parse and filter correctly
 * - hasPendingCaptures does a fast regex check without a full parse
 * - markCaptureResolved updates an entry in place
 * - resolveCapturesPath handles worktree paths
 * - parseTriageOutput handles valid, malformed, and partial JSON
 */

import test from "node:test";
import assert from "node:assert/strict";
import { mkdirSync, readFileSync, writeFileSync, rmSync, existsSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import {
  appendCapture,
  loadAllCaptures,
  loadPendingCaptures,
  hasPendingCaptures,
  markCaptureResolved,
  resolveCapturesPath,
  parseTriageOutput,
} from "../captures.ts";

function makeTempDir(prefix: string): string {
  const dir = join(
    tmpdir(),
    `${prefix}-${Date.now()}-${Math.random().toString(36).slice(2)}`,
  );
  mkdirSync(dir, { recursive: true });
  return dir;
}

// ─── appendCapture ────────────────────────────────────────────────────────────

test("captures: appendCapture creates CAPTURES.md on first call", () => {
  const tmp = makeTempDir("cap-create");
  try {
    const id = appendCapture(tmp, "first thought");
    assert.ok(id.startsWith("CAP-"), "ID should start with CAP-");
    assert.ok(
      existsSync(join(tmp, ".gsd", "CAPTURES.md")),
      "CAPTURES.md should exist",
    );
    const content = readFileSync(join(tmp, ".gsd", "CAPTURES.md"), "utf-8");
    assert.ok(content.includes("# Captures"), "should have header");
    assert.ok(content.includes(`### ${id}`), "should have entry heading");
    assert.ok(
      content.includes("**Text:** first thought"),
      "should have text field",
    );
    assert.ok(
      content.includes("**Status:** pending"),
      "should have pending status",
    );
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: appendCapture appends to existing file", () => {
  const tmp = makeTempDir("cap-append");
  try {
    const id1 = appendCapture(tmp, "thought one");
    const id2 = appendCapture(tmp, "thought two");
    assert.notStrictEqual(id1, id2, "IDs should be unique");

    const content = readFileSync(join(tmp, ".gsd", "CAPTURES.md"), "utf-8");
    assert.ok(content.includes(`### ${id1}`), "should have first entry");
    assert.ok(content.includes(`### ${id2}`), "should have second entry");
    assert.ok(
      content.includes("**Text:** thought one"),
      "should have first text",
    );
    assert.ok(
      content.includes("**Text:** thought two"),
      "should have second text",
    );
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

// ─── loadAllCaptures / loadPendingCaptures ────────────────────────────────────

test("captures: loadAllCaptures parses entries correctly", () => {
  const tmp = makeTempDir("cap-load");
  try {
    appendCapture(tmp, "alpha");
    appendCapture(tmp, "beta");

    const all = loadAllCaptures(tmp);
    assert.strictEqual(all.length, 2, "should have 2 entries");
    assert.strictEqual(all[0].text, "alpha");
    assert.strictEqual(all[1].text, "beta");
    assert.strictEqual(all[0].status, "pending");
    assert.strictEqual(all[1].status, "pending");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: loadAllCaptures returns empty array when no file", () => {
  const tmp = makeTempDir("cap-nofile");
  try {
    const all = loadAllCaptures(tmp);
    assert.strictEqual(all.length, 0);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: loadPendingCaptures filters resolved entries", () => {
  const tmp = makeTempDir("cap-pending");
  try {
    const id1 = appendCapture(tmp, "pending one");
    appendCapture(tmp, "pending two");

    // Resolve the first one
    markCaptureResolved(tmp, id1, "note", "acknowledged", "just a note");

    const pending = loadPendingCaptures(tmp);
    assert.strictEqual(pending.length, 1, "should have 1 pending");
    assert.strictEqual(pending[0].text, "pending two");

    const all = loadAllCaptures(tmp);
    assert.strictEqual(all.length, 2, "all should still have 2");
    assert.strictEqual(all[0].status, "resolved");
    assert.strictEqual(all[1].status, "pending");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

// ─── hasPendingCaptures ───────────────────────────────────────────────────────

test("captures: hasPendingCaptures returns false when no file", () => {
  const tmp = makeTempDir("cap-has-nofile");
  try {
    assert.strictEqual(hasPendingCaptures(tmp), false);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: hasPendingCaptures returns true with pending entries", () => {
  const tmp = makeTempDir("cap-has-true");
  try {
    appendCapture(tmp, "something");
    assert.strictEqual(hasPendingCaptures(tmp), true);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: hasPendingCaptures returns false when all resolved", () => {
  const tmp = makeTempDir("cap-has-false");
  try {
    const id = appendCapture(tmp, "will resolve");
    markCaptureResolved(tmp, id, "note", "done", "resolved it");
    assert.strictEqual(hasPendingCaptures(tmp), false);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

// ─── markCaptureResolved ──────────────────────────────────────────────────────

test("captures: markCaptureResolved updates entry in place", () => {
  const tmp = makeTempDir("cap-resolve");
  try {
    const id1 = appendCapture(tmp, "keep pending");
    const id2 = appendCapture(tmp, "will resolve");
    appendCapture(tmp, "also pending");

    markCaptureResolved(tmp, id2, "quick-task", "executed inline", "small fix");

    const all = loadAllCaptures(tmp);
    assert.strictEqual(all.length, 3, "should still have 3 entries");

    const resolved = all.find((c) => c.id === id2)!;
    assert.strictEqual(resolved.status, "resolved");
    assert.strictEqual(resolved.classification, "quick-task");
    assert.strictEqual(resolved.resolution, "executed inline");
    assert.strictEqual(resolved.rationale, "small fix");
    assert.ok(resolved.resolvedAt, "should have resolved timestamp");

    // Others should be unaffected
    const kept = all.find((c) => c.id === id1)!;
    assert.strictEqual(kept.status, "pending");
    assert.strictEqual(kept.classification, undefined);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

// ─── resolveCapturesPath ──────────────────────────────────────────────────────

test("captures: resolveCapturesPath returns .gsd/CAPTURES.md for normal path", () => {
  const base = join(tmpdir(), "cap-test-project");
  const result = resolveCapturesPath(base);
  assert.ok(result.endsWith(join(".gsd", "CAPTURES.md")));
  assert.ok(result.startsWith(base));
});

test("captures: resolveCapturesPath resolves worktree path to project root", () => {
  const base = join(tmpdir(), "cap-test-project");
  const worktreePath = join(base, ".gsd", "worktrees", "M004");
  const result = resolveCapturesPath(worktreePath);
  assert.ok(
    result.endsWith(join(".gsd", "CAPTURES.md")),
    `should end with .gsd/CAPTURES.md, got: ${result}`,
  );
  // Should resolve to project root, not worktree root
  assert.ok(
    !result.includes("worktrees"),
    `should not contain worktrees, got: ${result}`,
  );
  assert.ok(
    result.startsWith(base),
    `should start with ${base}, got: ${result}`,
  );
});

// ─── parseTriageOutput ────────────────────────────────────────────────────────

test("triage: parseTriageOutput handles valid JSON array", () => {
  const input = JSON.stringify([
    {
      captureId: "CAP-abc123",
      classification: "quick-task",
      rationale: "Small fix",
      affectedFiles: ["src/foo.ts"],
    },
    {
      captureId: "CAP-def456",
      classification: "defer",
      rationale: "Future work",
      targetSlice: "S03",
    },
  ]);

  const results = parseTriageOutput(input);
  assert.strictEqual(results.length, 2);
  assert.strictEqual(results[0].captureId, "CAP-abc123");
  assert.strictEqual(results[0].classification, "quick-task");
  assert.deepStrictEqual(results[0].affectedFiles, ["src/foo.ts"]);
  assert.strictEqual(results[1].classification, "defer");
  assert.strictEqual(results[1].targetSlice, "S03");
});

test("triage: parseTriageOutput handles fenced code block", () => {
  const input = `Here are my classifications:

\`\`\`json
[
  {
    "captureId": "CAP-aaa",
    "classification": "note",
    "rationale": "Just informational"
  }
]
\`\`\`

That's my analysis.`;

  const results = parseTriageOutput(input);
  assert.strictEqual(results.length, 1);
  assert.strictEqual(results[0].captureId, "CAP-aaa");
  assert.strictEqual(results[0].classification, "note");
});

test("triage: parseTriageOutput handles JSON with leading/trailing prose", () => {
  const input = `I've analyzed the captures. Here are my results:
[{"captureId": "CAP-bbb", "classification": "inject", "rationale": "Needs a new task"}]
Let me know if you need changes.`;

  const results = parseTriageOutput(input);
  assert.strictEqual(results.length, 1);
  assert.strictEqual(results[0].classification, "inject");
});

test("triage: parseTriageOutput returns empty array on malformed JSON", () => {
  const results = parseTriageOutput("this is not json at all");
  assert.strictEqual(results.length, 0);
});

test("triage: parseTriageOutput returns empty array on empty input", () => {
  assert.strictEqual(parseTriageOutput("").length, 0);
  assert.strictEqual(parseTriageOutput("   ").length, 0);
});

test("triage: parseTriageOutput filters invalid entries from partial results", () => {
  const input = JSON.stringify([
    {
      captureId: "CAP-good",
      classification: "note",
      rationale: "Valid entry",
    },
    {
      captureId: "CAP-bad",
      classification: "invalid-type",
      rationale: "Bad classification",
    },
    {
      // Missing required fields
      captureId: "CAP-incomplete",
    },
    {
      captureId: "CAP-also-good",
      classification: "replan",
      rationale: "Needs restructuring",
    },
  ]);

  const results = parseTriageOutput(input);
  assert.strictEqual(results.length, 2, "should keep only valid entries");
  assert.strictEqual(results[0].captureId, "CAP-good");
  assert.strictEqual(results[1].captureId, "CAP-also-good");
});

test("triage: parseTriageOutput wraps single object in array", () => {
  const input = JSON.stringify({
    captureId: "CAP-single",
    classification: "quick-task",
    rationale: "Just one",
  });

  const results = parseTriageOutput(input);
  assert.strictEqual(results.length, 1);
  assert.strictEqual(results[0].captureId, "CAP-single");
});

test("triage: parseTriageOutput handles all five classification types", () => {
  const types = [
    "quick-task",
    "inject",
    "defer",
    "replan",
    "note",
  ] as const;

  const input = JSON.stringify(
    types.map((t, i) => ({
      captureId: `CAP-${i}`,
      classification: t,
      rationale: `Type: ${t}`,
    })),
  );

  const results = parseTriageOutput(input);
  assert.strictEqual(results.length, 5);
  for (let i = 0; i < types.length; i++) {
    assert.strictEqual(results[i].classification, types[i]);
  }
});

// ─── Edge Cases ───────────────────────────────────────────────────────────────

test("captures: appendCapture handles special characters in text", () => {
  const tmp = makeTempDir("cap-special");
  try {
    appendCapture(tmp, 'text with "quotes" and **bold** and `code`');
    const all = loadAllCaptures(tmp);
    assert.strictEqual(all.length, 1);
    assert.ok(all[0].text.includes('"quotes"'), "should preserve quotes");
    assert.ok(all[0].text.includes("**bold**"), "should preserve bold");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: markCaptureResolved is no-op for non-existent ID", () => {
  const tmp = makeTempDir("cap-noop");
  try {
    appendCapture(tmp, "real capture");
    // Should not throw
    markCaptureResolved(tmp, "CAP-nonexistent", "note", "test", "test");
    const all = loadAllCaptures(tmp);
    assert.strictEqual(all.length, 1);
    assert.strictEqual(all[0].status, "pending", "original should be unchanged");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: markCaptureResolved is no-op when no file exists", () => {
  const tmp = makeTempDir("cap-nofile-resolve");
  try {
    // Should not throw
    markCaptureResolved(tmp, "CAP-abc", "note", "test", "test");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("captures: re-resolving a capture overwrites previous resolution", () => {
  const tmp = makeTempDir("cap-reresolve");
  try {
    const id = appendCapture(tmp, "will re-resolve");
    markCaptureResolved(tmp, id, "note", "first resolution", "first rationale");
    markCaptureResolved(tmp, id, "inject", "second resolution", "second rationale");

    const all = loadAllCaptures(tmp);
    assert.strictEqual(all.length, 1);
    assert.strictEqual(all[0].classification, "inject", "should have updated classification");
    assert.strictEqual(all[0].resolution, "second resolution");
    assert.strictEqual(all[0].rationale, "second rationale");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("triage: parseTriageOutput preserves affectedFiles and targetSlice", () => {
  const input = JSON.stringify([
    {
      captureId: "CAP-files",
      classification: "quick-task",
      rationale: "Has files",
      affectedFiles: ["src/a.ts", "src/b.ts"],
    },
    {
      captureId: "CAP-target",
      classification: "defer",
      rationale: "Has target",
      targetSlice: "S04",
    },
  ]);

  const results = parseTriageOutput(input);
  assert.deepStrictEqual(results[0].affectedFiles, ["src/a.ts", "src/b.ts"]);
  assert.strictEqual(results[0].targetSlice, undefined);
  assert.strictEqual(results[1].targetSlice, "S04");
  assert.strictEqual(results[1].affectedFiles, undefined);
});
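The lenient-parsing behaviors pinned down by the triage tests above (fenced blocks, surrounding prose, single-object wrapping, invalid-entry filtering) can be sketched in a self-contained form. This is a hypothetical reimplementation for illustration, not the actual `../captures.ts` code; the names `extractTriage`, `stripFence`, and `TriageResult` are assumptions.

```typescript
type Classification = "quick-task" | "inject" | "defer" | "replan" | "note";
interface TriageResult { captureId: string; classification: Classification; rationale: string; }

const VALID = new Set(["quick-task", "inject", "defer", "replan", "note"]);
const FENCE = "`".repeat(3); // built at runtime to avoid a literal triple-backtick

// If the text carries a fenced code block, keep only its body.
function stripFence(raw: string): string {
  const open = raw.indexOf(FENCE);
  if (open === -1) return raw;
  const bodyStart = raw.indexOf("\n", open);
  const close = raw.indexOf(FENCE, open + FENCE.length);
  if (bodyStart === -1 || close === -1 || bodyStart > close) return raw;
  return raw.slice(bodyStart + 1, close);
}

// Lenient extraction: locate the outermost [...] or {...}, parse it,
// wrap a lone object in an array, and drop entries that fail validation.
function extractTriage(raw: string): TriageResult[] {
  const body = stripFence(raw);
  const start = body.search(/[[{]/);
  if (start === -1) return [];
  const end = body.lastIndexOf(body[start] === "[" ? "]" : "}");
  if (end <= start) return [];
  let parsed: unknown;
  try { parsed = JSON.parse(body.slice(start, end + 1)); } catch { return []; }
  const items = Array.isArray(parsed) ? parsed : [parsed];
  return items.filter((e): e is TriageResult => {
    const r = e as TriageResult;
    return typeof r?.captureId === "string" &&
      VALID.has(r.classification) &&
      typeof r.rationale === "string";
  });
}
```

The key design choice the tests force is failure containment: every malformed input degrades to an empty array rather than an exception, so a bad model response never aborts the dispatch loop.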
src/resources/extensions/gsd/tests/complexity-classifier.test.ts — 181 lines (new file)
@@ -0,0 +1,181 @@
import test from "node:test";
import assert from "node:assert/strict";

import { classifyUnitComplexity, tierLabel, tierOrdinal } from "../complexity-classifier.js";
import type { TaskMetadata } from "../complexity-classifier.js";

// ─── tierLabel ───────────────────────────────────────────────────────────────

test("tierLabel returns correct short labels", () => {
  assert.equal(tierLabel("light"), "L");
  assert.equal(tierLabel("standard"), "S");
  assert.equal(tierLabel("heavy"), "H");
});

// ─── tierOrdinal ─────────────────────────────────────────────────────────────

test("tierOrdinal returns correct ordering", () => {
  assert.ok(tierOrdinal("light") < tierOrdinal("standard"));
  assert.ok(tierOrdinal("standard") < tierOrdinal("heavy"));
});

// ─── Unit Type Classification ────────────────────────────────────────────────

test("complete-slice classifies as light", () => {
  const result = classifyUnitComplexity("complete-slice", "M001/S01", "/tmp/fake");
  assert.equal(result.tier, "light");
});

test("run-uat classifies as light", () => {
  const result = classifyUnitComplexity("run-uat", "M001/S01", "/tmp/fake");
  assert.equal(result.tier, "light");
});

test("research-milestone classifies as standard", () => {
  const result = classifyUnitComplexity("research-milestone", "M001", "/tmp/fake");
  assert.equal(result.tier, "standard");
});

test("research-slice classifies as standard", () => {
  const result = classifyUnitComplexity("research-slice", "M001/S01", "/tmp/fake");
  assert.equal(result.tier, "standard");
});

test("plan-milestone classifies as standard", () => {
  const result = classifyUnitComplexity("plan-milestone", "M001", "/tmp/fake");
  assert.equal(result.tier, "standard");
});

test("plan-slice classifies as standard", () => {
  const result = classifyUnitComplexity("plan-slice", "M001/S01", "/tmp/fake");
  assert.equal(result.tier, "standard");
});

test("replan-slice classifies as heavy", () => {
  const result = classifyUnitComplexity("replan-slice", "M001/S01", "/tmp/fake");
  assert.equal(result.tier, "heavy");
});

test("reassess-roadmap classifies as heavy", () => {
  const result = classifyUnitComplexity("reassess-roadmap", "M001", "/tmp/fake");
  assert.equal(result.tier, "heavy");
});

test("hook units classify as light", () => {
  const result = classifyUnitComplexity("hook/verify", "M001/S01/T01", "/tmp/fake");
  assert.equal(result.tier, "light");
  assert.match(result.reason, /hook/);
});

test("unknown unit types default to standard", () => {
  const result = classifyUnitComplexity("custom-thing", "M001", "/tmp/fake");
  assert.equal(result.tier, "standard");
});

// ─── Task Metadata Classification ────────────────────────────────────────────

test("execute-task with many dependencies classifies as heavy", () => {
  const metadata: TaskMetadata = { dependencyCount: 4 };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "heavy");
  assert.match(result.reason, /dependencies/);
});

test("execute-task with many files classifies as heavy", () => {
  const metadata: TaskMetadata = { fileCount: 8 };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "heavy");
  assert.match(result.reason, /files/);
});

test("execute-task with large estimated lines classifies as heavy", () => {
  const metadata: TaskMetadata = { estimatedLines: 600 };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "heavy");
  assert.match(result.reason, /lines/);
});

test("execute-task with docs tags classifies as light", () => {
  const metadata: TaskMetadata = { tags: ["docs"] };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "light");
});

test("execute-task with single file modification classifies as light", () => {
  const metadata: TaskMetadata = { fileCount: 1, isNewFile: false };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "light");
});

test("execute-task with no metadata classifies as standard", () => {
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake");
  assert.equal(result.tier, "standard");
});

// ─── Budget Pressure ─────────────────────────────────────────────────────────

test("no budget pressure below 50%", () => {
  const result = classifyUnitComplexity("research-slice", "M001/S01", "/tmp/fake", 0.3);
  assert.equal(result.tier, "standard");
  assert.equal(result.downgraded, false);
});

test("budget pressure at 50% downgrades standard to light", () => {
  const result = classifyUnitComplexity("research-slice", "M001/S01", "/tmp/fake", 0.55);
  assert.equal(result.tier, "light");
  assert.equal(result.downgraded, true);
  assert.match(result.reason, /budget pressure/);
});

test("budget pressure at 75% keeps heavy as heavy", () => {
  const result = classifyUnitComplexity("replan-slice", "M001/S01", "/tmp/fake", 0.80);
  assert.equal(result.tier, "heavy");
  assert.equal(result.downgraded, false);
});

test("budget pressure at 90% downgrades heavy to standard", () => {
  const result = classifyUnitComplexity("replan-slice", "M001/S01", "/tmp/fake", 0.95);
  assert.equal(result.tier, "standard");
  assert.equal(result.downgraded, true);
});

test("budget pressure at 90% downgrades standard to light", () => {
  const result = classifyUnitComplexity("research-slice", "M001/S01", "/tmp/fake", 0.95);
  assert.equal(result.tier, "light");
  assert.equal(result.downgraded, true);
});

test("budget pressure at 90% keeps light as light", () => {
  const result = classifyUnitComplexity("complete-slice", "M001/S01", "/tmp/fake", 0.95);
  assert.equal(result.tier, "light");
});

// ─── Phase 4: Task Plan Introspection ────────────────────────────────────────

test("execute-task with multiple complexity keywords classifies as heavy", () => {
  const metadata: TaskMetadata = { complexityKeywords: ["migration", "security"] };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "heavy");
  assert.match(result.reason, /migration/);
  assert.match(result.reason, /security/);
});

test("execute-task with single complexity keyword classifies as standard", () => {
  const metadata: TaskMetadata = { complexityKeywords: ["performance"] };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "standard");
  assert.match(result.reason, /performance/);
});

test("execute-task with many code blocks classifies as heavy", () => {
  const metadata: TaskMetadata = { codeBlockCount: 6 };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "heavy");
  assert.match(result.reason, /code blocks/);
});

test("execute-task with few code blocks stays standard", () => {
  const metadata: TaskMetadata = { codeBlockCount: 2 };
  const result = classifyUnitComplexity("execute-task", "M001/S01/T01", "/tmp/fake", undefined, metadata);
  assert.equal(result.tier, "standard");
});
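The budget-pressure tests above imply a two-threshold downgrade rule: once roughly half the context budget is spent, standard work drops to light, while heavy work only drops (one step, to standard) past the 90% mark. A minimal sketch consistent with those thresholds, assuming the real classifier applies the pressure step after the base-tier decision; `applyBudgetPressure` is a hypothetical name, not the classifier's actual internal:

```typescript
type Tier = "light" | "standard" | "heavy";

// Downgrade one tier under budget pressure. Thresholds mirror the tests:
// standard → light at >= 50% budget used; heavy → standard only at >= 90%.
function applyBudgetPressure(tier: Tier, used: number): { tier: Tier; downgraded: boolean } {
  if (tier === "standard" && used >= 0.5) return { tier: "light", downgraded: true };
  if (tier === "heavy" && used >= 0.9) return { tier: "standard", downgraded: true };
  return { tier, downgraded: false };
}
```

Note that heavy work never drops two steps in one pass: at 95% usage it lands on standard, matching the "downgrades heavy to standard" expectation rather than collapsing everything to light.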
@@ -303,6 +303,105 @@ async function main(): Promise<void> {
    }
  }

  // ─── Test Group 7: unique-id-deps ──────────────────────────────────────
  // M004-0zjrg0 is complete, M005-b0m2hl depends_on M004-0zjrg0 → M005 should activate.
  // Regression: parseContextDependsOn() used .toUpperCase(), converting "M004-0zjrg0"
  // to "M004-0ZJRG0", breaking the case-sensitive lookup in completeMilestoneIds.
  console.log('\n=== unique-id-deps: unique milestone IDs with lowercase hex suffix ===');
  {
    const base = createFixtureBase();
    try {
      // M004-0zjrg0: complete (all slices done + SUMMARY present)
      writeRoadmap(base, 'M004-0zjrg0', `# M004-0zjrg0: First Unique Milestone

**Vision:** Complete milestone with unique ID.

## Slices

- [x] **S01: Done** \`risk:low\` \`depends:[]\`
  > After this: Done.
`);
      writeMilestoneSummary(base, 'M004-0zjrg0', '# M004-0zjrg0 Summary\n\nComplete.');

      // M005-b0m2hl: depends on M004-0zjrg0 (lowercase hex suffix)
      writeContext(base, 'M005-b0m2hl', 'depends_on: [M004-0zjrg0]');

      const state = await deriveState(base);

      assertEq(state.registry.find(e => e.id === 'M004-0zjrg0')?.status, 'complete',
        'unique-id-deps: M004-0zjrg0 is complete');
      assertEq(state.registry.find(e => e.id === 'M005-b0m2hl')?.status, 'active',
        'unique-id-deps: M005-b0m2hl is active (dep on M004-0zjrg0 met)');
      assertEq(state.activeMilestone?.id, 'M005-b0m2hl',
        'unique-id-deps: activeMilestone is M005-b0m2hl');
      assertTrue(state.phase !== 'blocked',
        'unique-id-deps: phase is not blocked');
    } finally {
      cleanup(base);
    }
  }

  // ─── Test Group 8: unique-id-deps-blocked ─────────────────────────────
  // M004-0zjrg0 is NOT complete, M005-b0m2hl depends_on M004-0zjrg0 → M005 should be pending
  console.log('\n=== unique-id-deps-blocked: unique ID dep not yet met ===');
  {
    const base = createFixtureBase();
    try {
      // M004-0zjrg0: incomplete (slice not done)
      writeRoadmap(base, 'M004-0zjrg0', `# M004-0zjrg0: Incomplete Unique Milestone

**Vision:** Still in progress.

## Slices

- [ ] **S01: In Progress** \`risk:low\` \`depends:[]\`
  > After this: Done.
`);
      writeSlicePlan(base, 'M004-0zjrg0', 'S01', `# S01: In Progress

**Goal:** Test dep blocking with unique IDs.

## Tasks

- [ ] **T01: Work** \`est:15m\`
  Still doing work.
`);

      // M005-b0m2hl: depends on M004-0zjrg0 (still incomplete)
      writeContext(base, 'M005-b0m2hl', 'depends_on: [M004-0zjrg0]');

      const state = await deriveState(base);

      assertEq(state.activeMilestone?.id, 'M004-0zjrg0',
        'unique-id-deps-blocked: activeMilestone is M004-0zjrg0');
      assertEq(state.registry.find(e => e.id === 'M005-b0m2hl')?.status, 'pending',
        'unique-id-deps-blocked: M005-b0m2hl is pending (dep not met)');
    } finally {
      cleanup(base);
    }
  }

  // ─── Test Group 9: parseContextDependsOn preserves case ───────────────
  // Direct unit test: verify the parsed dep ID matches the input exactly
  console.log('\n=== parseContextDependsOn: preserves case of unique IDs ===');
  {
    const { parseContextDependsOn } = await import('../files.ts');

    const deps1 = parseContextDependsOn('---\ndepends_on: [M004-0zjrg0]\n---\n');
    assertEq(deps1[0], 'M004-0zjrg0',
      'parseContextDependsOn preserves lowercase hex suffix');

    const deps2 = parseContextDependsOn('---\ndepends_on: [M001, M004-abc123]\n---\n');
    assertEq(deps2[0], 'M001', 'preserves classic uppercase ID');
    assertEq(deps2[1], 'M004-abc123', 'preserves mixed-case unique ID');

    const deps3 = parseContextDependsOn('---\ndepends_on: []\n---\n');
    assertEq(deps3.length, 0, 'empty deps returns empty array');

    const deps4 = parseContextDependsOn(null);
    assertEq(deps4.length, 0, 'null content returns empty array');
  }

  report();
}
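The contract Test Group 9 pins down — parse the front-matter `depends_on: [...]` list without normalizing case, since unique milestone IDs carry lowercase hex suffixes like `M004-0zjrg0` — can be sketched as follows. This is a hedged illustration of the behavior, not the actual `../files.ts` implementation; `parseDependsOn` is a hypothetical name:

```typescript
// Extract depends_on IDs from YAML-ish front matter, preserving case.
// The regression being guarded against was a stray .toUpperCase() that
// turned "M004-0zjrg0" into "M004-0ZJRG0" and broke case-sensitive lookups.
function parseDependsOn(content: string | null): string[] {
  if (!content) return [];
  const m = content.match(/depends_on:\s*\[([^\]]*)\]/);
  if (!m || !m[1].trim()) return [];
  return m[1].split(",").map(s => s.trim()).filter(Boolean); // no case normalization
}
```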
@@ -0,0 +1,434 @@
/**
 * feature-branch-lifecycle.test.ts — Integration tests for the feature-branch workflow.
 *
 * Proves the core invariant: when auto-mode starts on a feature branch,
 * the milestone worktree branches from that feature branch and merges
 * back to it. `main` is never touched.
 *
 * Scenarios:
 * 1. Full lifecycle: feature branch → worktree → slices → merge back to feature branch
 * 2. Uncommitted changes on feature branch are included via pre-worktree commit
 * 3. Unique milestone IDs (M001-abc123 format) work end-to-end
 * 4. Main branch is completely untouched throughout
 */

import {
  mkdtempSync, mkdirSync, writeFileSync, rmSync,
  existsSync, realpathSync, readFileSync,
} from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { execSync } from "node:child_process";

import {
  createAutoWorktree,
  mergeMilestoneToMain,
  autoWorktreeBranch,
} from "../auto-worktree.ts";
import { captureIntegrationBranch, getSliceBranchName } from "../worktree.ts";
import { writeIntegrationBranch, readIntegrationBranch } from "../git-service.ts";
import { nextMilestoneId, generateMilestoneSuffix } from "../guided-flow.ts";

import { createTestContext } from "./test-helpers.ts";

const { assertEq, assertTrue, assertMatch, report } = createTestContext();

// ─── Helpers ────────────────────────────────────────────────────────────────

function run(cmd: string, cwd: string): string {
  return execSync(cmd, { cwd, stdio: ["ignore", "pipe", "pipe"], encoding: "utf-8" }).trim();
}

function commitCount(cwd: string, branch: string): number {
  return parseInt(run(`git rev-list --count ${branch}`, cwd), 10);
}

function headSha(cwd: string, ref: string): string {
  return run(`git rev-parse ${ref}`, cwd);
}

function branchExists(cwd: string, branch: string): boolean {
  try {
    run(`git show-ref --verify --quiet refs/heads/${branch}`, cwd);
    return true;
  } catch {
    return false;
  }
}

function allBranches(cwd: string): string[] {
  return run("git branch --format='%(refname:short)'", cwd)
    .split("\n")
    .map(b => b.replace(/^'|'$/g, ""))
    .filter(Boolean);
}

/**
 * Create a temp repo with an initial commit on main and a feature branch.
 * Returns the repo path, with HEAD left on the feature branch.
 */
function createFeatureBranchRepo(featureBranch: string): string {
  const dir = realpathSync(mkdtempSync(join(tmpdir(), "gsd-fb-lifecycle-")));
  run("git init", dir);
  run("git config user.email test@test.com", dir);
  run("git config user.name Test", dir);

  // Initial commit on main
  writeFileSync(join(dir, "README.md"), "# project\n");
  mkdirSync(join(dir, ".gsd"), { recursive: true });
  writeFileSync(join(dir, ".gsd", "STATE.md"), "# State\n");
  run("git add .", dir);
  run("git commit -m init", dir);
  run("git branch -M main", dir);

  // Create and switch to feature branch
  run(`git checkout -b ${featureBranch}`, dir);

  // Add a commit on the feature branch so it diverges from main
  writeFileSync(join(dir, "feature-setup.ts"), "export const setup = true;\n");
  run("git add .", dir);
  run("git commit -m \"feat: feature branch setup\"", dir);

  return dir;
}

function makeRoadmap(
  milestoneId: string,
  title: string,
  slices: Array<{ id: string; title: string }>,
): string {
  const sliceLines = slices.map(s => `- [x] **${s.id}: ${s.title}**`).join("\n");
  return `# ${milestoneId}: ${title}\n\n## Slices\n${sliceLines}\n`;
}

/** Add commits to a slice branch on the worktree, then merge it into the milestone branch. */
function addSliceToMilestone(
  wtPath: string,
  milestoneId: string,
  sliceId: string,
  sliceTitle: string,
  commits: Array<{ file: string; content: string; message: string }>,
): void {
  const normalizedPath = wtPath.replaceAll("\\", "/");
  const marker = "/.gsd/worktrees/";
  const idx = normalizedPath.indexOf(marker);
  const worktreeName = idx !== -1
    ? normalizedPath.slice(idx + marker.length).split("/")[0]
    : null;

  const sliceBranch = getSliceBranchName(milestoneId, sliceId, worktreeName);

  run(`git checkout -b ${sliceBranch}`, wtPath);
  for (const c of commits) {
    writeFileSync(join(wtPath, c.file), c.content);
    run("git add .", wtPath);
    run(`git commit -m "${c.message}"`, wtPath);
  }
  run(`git checkout milestone/${milestoneId}`, wtPath);
  run(
    `git merge --no-ff ${sliceBranch} -m "feat(${milestoneId}/${sliceId}): ${sliceTitle}"`,
    wtPath,
  );
  run(`git branch -d ${sliceBranch}`, wtPath);
}

// ─── Tests ──────────────────────────────────────────────────────────────────

async function main(): Promise<void> {
  const savedCwd = process.cwd();
  const tempDirs: string[] = [];

  function fresh(featureBranch: string): string {
    const d = createFeatureBranchRepo(featureBranch);
    tempDirs.push(d);
    return d;
  }

  try {
    // ================================================================
    // Test 1: Full feature-branch lifecycle with unique milestone IDs
    //
    // Start on f-new-shiny-thing with uncommitted changes, create
    // worktree, add slices, merge back. Assert main is untouched.
    // ================================================================
    console.log("\n=== Feature-branch lifecycle with unique milestone IDs ===");
    {
      const featureBranch = "f-new-shiny-thing";
      const repo = fresh(featureBranch);

      // Generate a unique milestone ID (M001-xxxxxx format)
|
||||
const milestoneId = nextMilestoneId([], true);
|
||||
assertMatch(milestoneId, /^M001-[a-z0-9]{6}$/, "unique milestone ID format");
|
||||
|
||||
// Snapshot main before anything happens
|
||||
const mainShaBefore = headSha(repo, "main");
|
||||
const mainCommitsBefore = commitCount(repo, "main");
|
||||
|
||||
// ── Add uncommitted changes on the feature branch ──
|
||||
// Simulates a user with dirty working tree when they start auto-mode.
|
||||
writeFileSync(join(repo, "wip-config.ts"), "export const config = { debug: true };\n");
|
||||
writeFileSync(join(repo, "wip-types.ts"), "export type AppState = { ready: boolean };\n");
|
||||
|
||||
// Verify files are uncommitted
|
||||
const statusBefore = run("git status --short", repo);
|
||||
assertTrue(statusBefore.includes("wip-config.ts"), "wip-config.ts is uncommitted");
|
||||
assertTrue(statusBefore.includes("wip-types.ts"), "wip-types.ts is uncommitted");
|
||||
|
||||
// ── Simulate what startAuto does: commit dirty state, capture integration branch ──
|
||||
// startAuto bootstraps .gsd/ which commits .gsd/ files. It also calls
|
||||
// captureIntegrationBranch which commits META.json. But user's dirty
|
||||
// files need to be committed first so the worktree branches from a
|
||||
// commit that includes them.
|
||||
//
|
||||
// In production, the first dispatch unit (research-milestone) would
|
||||
// auto-commit via autoCommitCurrentBranch. But the worktree is created
|
||||
// BEFORE any unit runs. So we simulate the pre-worktree state:
|
||||
// GSD bootstraps .gsd/ and captureIntegrationBranch commits metadata.
|
||||
// The user's dirty files are NOT auto-committed pre-worktree — they
|
||||
// stay in the original working directory.
|
||||
|
||||
// Create milestone directory (happens during guided-flow)
|
||||
mkdirSync(join(repo, ".gsd", "milestones", milestoneId), { recursive: true });
|
||||
|
||||
// Write integration branch metadata (what captureIntegrationBranch does)
|
||||
writeIntegrationBranch(repo, milestoneId, featureBranch);
|
||||
|
||||
// Verify integration branch recorded
|
||||
const recorded = readIntegrationBranch(repo, milestoneId);
|
||||
assertEq(recorded, featureBranch, "integration branch recorded as feature branch");
|
||||
|
||||
// Snapshot feature branch SHA after metadata commit (HEAD may have advanced)
|
||||
const featureShaBeforeWorktree = headSha(repo, featureBranch);
|
||||
|
||||
// ── Create the auto-worktree ──
|
||||
const wtPath = createAutoWorktree(repo, milestoneId);
|
||||
tempDirs.push(wtPath);
|
||||
assertTrue(existsSync(wtPath), "worktree directory created");
|
||||
|
||||
// Worktree should be on milestone/<unique-id> branch
|
||||
const wtBranch = run("git branch --show-current", wtPath);
|
||||
assertEq(wtBranch, `milestone/${milestoneId}`, "worktree is on milestone branch");
|
||||
|
||||
// Milestone branch should be rooted at the feature branch, not main
|
||||
const milestoneBranchBase = headSha(repo, `milestone/${milestoneId}`);
|
||||
assertEq(
|
||||
milestoneBranchBase,
|
||||
featureShaBeforeWorktree,
|
||||
"milestone branch starts from feature branch HEAD",
|
||||
);
|
||||
|
||||
// Feature-branch-only file should be in the worktree
|
||||
assertTrue(
|
||||
existsSync(join(wtPath, "feature-setup.ts")),
|
||||
"feature branch file (feature-setup.ts) exists in worktree",
|
||||
);
|
||||
|
||||
// Main should be completely untouched at this point
|
||||
assertEq(headSha(repo, "main"), mainShaBefore, "main SHA unchanged after worktree creation");
|
||||
|
||||
// ── Do work in slices ──
|
||||
addSliceToMilestone(wtPath, milestoneId, "S01", "Auth module", [
|
||||
{ file: "auth.ts", content: "export const auth = true;\n", message: "feat: add auth" },
|
||||
{ file: "auth-utils.ts", content: "export const hash = () => {};\n", message: "feat: auth utils" },
|
||||
]);
|
||||
addSliceToMilestone(wtPath, milestoneId, "S02", "Dashboard", [
|
||||
{ file: "dashboard.ts", content: "export const dash = true;\n", message: "feat: add dashboard" },
|
||||
]);
|
||||
|
||||
// ── Merge milestone back to feature branch ──
|
||||
const roadmap = makeRoadmap(milestoneId, "New shiny feature", [
|
||||
{ id: "S01", title: "Auth module" },
|
||||
{ id: "S02", title: "Dashboard" },
|
||||
]);
|
||||
|
||||
process.chdir(wtPath);
|
||||
const result = mergeMilestoneToMain(repo, milestoneId, roadmap);
|
||||
process.chdir(savedCwd);
|
||||
|
||||
// ── Assert: feature branch received the merge ──
|
||||
const currentBranch = run("git branch --show-current", repo);
|
||||
assertEq(currentBranch, featureBranch, "repo is on feature branch after merge");
|
||||
|
||||
// Exactly one new commit on feature branch (the squash merge)
|
||||
const featureLog = run(`git log --oneline ${featureBranch}`, repo);
|
||||
assertTrue(
|
||||
featureLog.includes(`feat(${milestoneId})`),
|
||||
"feature branch has milestone merge commit",
|
||||
);
|
||||
|
||||
// Slice files are on the feature branch
|
||||
assertTrue(existsSync(join(repo, "auth.ts")), "auth.ts on feature branch");
|
||||
assertTrue(existsSync(join(repo, "dashboard.ts")), "dashboard.ts on feature branch");
|
||||
assertTrue(existsSync(join(repo, "auth-utils.ts")), "auth-utils.ts on feature branch");
|
||||
|
||||
// Original feature branch file still present
|
||||
assertTrue(existsSync(join(repo, "feature-setup.ts")), "feature-setup.ts still on feature branch");
|
||||
|
||||
// Commit message is well-formed
|
||||
assertTrue(result.commitMessage.includes("New shiny feature"), "commit message has milestone title");
|
||||
assertTrue(result.commitMessage.includes("S01: Auth module"), "commit message lists S01");
|
||||
assertTrue(result.commitMessage.includes("S02: Dashboard"), "commit message lists S02");
|
||||
assertTrue(
|
||||
result.commitMessage.includes(`milestone/${milestoneId}`),
|
||||
"commit message references milestone branch with unique ID",
|
||||
);
|
||||
|
||||
// ── Assert: main is COMPLETELY untouched ──
|
||||
assertEq(headSha(repo, "main"), mainShaBefore, "main SHA unchanged after merge");
|
||||
assertEq(commitCount(repo, "main"), mainCommitsBefore, "main commit count unchanged");
|
||||
|
||||
// Main should NOT have any of the milestone files
|
||||
run("git checkout main", repo);
|
||||
assertTrue(!existsSync(join(repo, "auth.ts")), "auth.ts NOT on main");
|
||||
assertTrue(!existsSync(join(repo, "dashboard.ts")), "dashboard.ts NOT on main");
|
||||
assertTrue(!existsSync(join(repo, "feature-setup.ts")), "feature-setup.ts NOT on main");
|
||||
run(`git checkout ${featureBranch}`, repo);
|
||||
|
||||
// ── Assert: worktree cleaned up ──
|
||||
const worktreeDir = join(repo, ".gsd", "worktrees", milestoneId);
|
||||
assertTrue(!existsSync(worktreeDir), "worktree directory removed");
|
||||
|
||||
// Milestone branch deleted
|
||||
assertTrue(
|
||||
!branchExists(repo, `milestone/${milestoneId}`),
|
||||
"milestone branch deleted after merge",
|
||||
);
|
||||
|
||||
// Only expected branches remain
|
||||
const branches = allBranches(repo);
|
||||
assertTrue(branches.includes("main"), "main branch exists");
|
||||
assertTrue(branches.includes(featureBranch), "feature branch exists");
|
||||
assertTrue(
|
||||
!branches.some(b => b.startsWith("milestone/")),
|
||||
"no milestone branches remain",
|
||||
);
|
||||
}
|
||||
|
||||
// ================================================================
|
||||
// Test 2: Uncommitted .gsd/ planning files are available in worktree
|
||||
//
|
||||
// When auto-mode starts, .gsd/ files may be untracked/uncommitted.
|
||||
// copyPlanningArtifacts should carry them into the worktree even if
|
||||
// they weren't committed on the feature branch.
|
||||
// ================================================================
|
||||
console.log("\n=== Untracked planning files copied to worktree ===");
|
||||
{
|
||||
const featureBranch = "f-planning-test";
|
||||
const repo = fresh(featureBranch);
|
||||
const milestoneId = nextMilestoneId([], true);
|
||||
|
||||
// Write planning files that are NOT committed
|
||||
mkdirSync(join(repo, ".gsd", "milestones", milestoneId, "slices", "S01", "tasks"), { recursive: true });
|
||||
writeFileSync(
|
||||
join(repo, ".gsd", "milestones", milestoneId, `${milestoneId}-ROADMAP.md`),
|
||||
makeRoadmap(milestoneId, "Planning test", [{ id: "S01", title: "First" }]),
|
||||
);
|
||||
writeFileSync(
|
||||
join(repo, ".gsd", "milestones", milestoneId, "slices", "S01", "S01-PLAN.md"),
|
||||
"# S01: First\n\n**Goal:** Test\n**Demo:** Test\n\n## Tasks\n- [ ] **T01: Do it** `est:10m`\n",
|
||||
);
|
||||
writeFileSync(join(repo, ".gsd", "PROJECT.md"), "# Planning Test Project\n");
|
||||
writeFileSync(join(repo, ".gsd", "DECISIONS.md"), "# Decisions\n\n## D001\nTest decision.\n");
|
||||
|
||||
// These files are untracked
|
||||
assertTrue(run("git status --short", repo).length > 0, "repo has untracked files");
|
||||
|
||||
// Record integration branch and create worktree
|
||||
writeIntegrationBranch(repo, milestoneId, featureBranch);
|
||||
const wtPath = createAutoWorktree(repo, milestoneId);
|
||||
tempDirs.push(wtPath);
|
||||
|
||||
// Planning files should exist in the worktree (via copyPlanningArtifacts)
|
||||
assertTrue(
|
||||
existsSync(join(wtPath, ".gsd", "milestones", milestoneId, `${milestoneId}-ROADMAP.md`)),
|
||||
"ROADMAP.md copied to worktree",
|
||||
);
|
||||
assertTrue(
|
||||
existsSync(join(wtPath, ".gsd", "milestones", milestoneId, "slices", "S01", "S01-PLAN.md")),
|
||||
"S01-PLAN.md copied to worktree",
|
||||
);
|
||||
assertTrue(
|
||||
existsSync(join(wtPath, ".gsd", "PROJECT.md")),
|
||||
"PROJECT.md copied to worktree",
|
||||
);
|
||||
assertTrue(
|
||||
existsSync(join(wtPath, ".gsd", "DECISIONS.md")),
|
||||
"DECISIONS.md copied to worktree",
|
||||
);
|
||||
|
||||
// Clean up: chdir back before teardown
|
||||
process.chdir(savedCwd);
|
||||
}
|
||||
|
||||
// ================================================================
|
||||
// Test 3: Multiple milestones on the same feature branch
|
||||
//
|
||||
// Proves that unique IDs prevent collision when running successive
|
||||
// milestones, and each merge lands on the feature branch.
|
||||
// ================================================================
|
||||
console.log("\n=== Multiple unique milestones on same feature branch ===");
|
||||
{
|
||||
const featureBranch = "f-multi-milestone";
|
||||
const repo = fresh(featureBranch);
|
||||
|
||||
const mainShaBefore = headSha(repo, "main");
|
||||
|
||||
// First milestone
|
||||
const mid1 = nextMilestoneId([], true);
|
||||
mkdirSync(join(repo, ".gsd", "milestones", mid1), { recursive: true });
|
||||
writeIntegrationBranch(repo, mid1, featureBranch);
|
||||
|
||||
const wt1 = createAutoWorktree(repo, mid1);
|
||||
tempDirs.push(wt1);
|
||||
addSliceToMilestone(wt1, mid1, "S01", "First milestone work", [
|
||||
{ file: "m1-feature.ts", content: "export const m1 = true;\n", message: "feat: m1" },
|
||||
]);
|
||||
process.chdir(wt1);
|
||||
mergeMilestoneToMain(repo, mid1, makeRoadmap(mid1, "First", [{ id: "S01", title: "First milestone work" }]));
|
||||
process.chdir(savedCwd);
|
||||
|
||||
assertTrue(existsSync(join(repo, "m1-feature.ts")), "m1 file on feature branch");
|
||||
|
||||
// Second milestone — different unique ID
|
||||
const mid2 = nextMilestoneId([mid1], true);
|
||||
assertTrue(mid1 !== mid2, "second milestone has different ID");
|
||||
assertMatch(mid2, /^M002-[a-z0-9]{6}$/, "second milestone is M002-xxxxxx");
|
||||
|
||||
mkdirSync(join(repo, ".gsd", "milestones", mid2), { recursive: true });
|
||||
writeIntegrationBranch(repo, mid2, featureBranch);
|
||||
|
||||
const wt2 = createAutoWorktree(repo, mid2);
|
||||
tempDirs.push(wt2);
|
||||
addSliceToMilestone(wt2, mid2, "S01", "Second milestone work", [
|
||||
{ file: "m2-feature.ts", content: "export const m2 = true;\n", message: "feat: m2" },
|
||||
]);
|
||||
process.chdir(wt2);
|
||||
mergeMilestoneToMain(repo, mid2, makeRoadmap(mid2, "Second", [{ id: "S01", title: "Second milestone work" }]));
|
||||
process.chdir(savedCwd);
|
||||
|
||||
// Both milestone files on feature branch
|
||||
assertTrue(existsSync(join(repo, "m1-feature.ts")), "m1 file still on feature branch");
|
||||
assertTrue(existsSync(join(repo, "m2-feature.ts")), "m2 file on feature branch");
|
||||
|
||||
// Main completely untouched
|
||||
assertEq(headSha(repo, "main"), mainShaBefore, "main unchanged after two milestones");
|
||||
|
||||
// No milestone branches remain
|
||||
const branches = allBranches(repo);
|
||||
assertTrue(
|
||||
!branches.some(b => b.startsWith("milestone/")),
|
||||
"no milestone branches remain after two milestones",
|
||||
);
|
||||
}
|
||||
|
||||
} finally {
|
||||
process.chdir(savedCwd);
|
||||
for (const d of tempDirs) {
|
||||
try { rmSync(d, { recursive: true, force: true }); } catch { /* ignore */ }
|
||||
}
|
||||
}
|
||||
|
||||
report();
|
||||
}
|
||||
|
||||
main();
|
||||
|
@@ -0,0 +1,79 @@
/**
 * In-flight tool tracking tests — verifies that markToolStart/markToolEnd
 * correctly manage the in-flight tools set used by the idle watchdog to
 * distinguish "agent waiting on long-running tool" from "agent is idle".
 *
 * Background: The idle watchdog checks every 15s for agent progress. Without
 * in-flight tool tracking, agents waiting on await_job or async_bash (which
 * can run 20+ minutes for evaluations, deployments, test suites) are falsely
 * declared idle and interrupted by recovery steering messages.
 *
 * The fix hooks tool_execution_start/end events to track active tool calls.
 * When tools are in-flight, the watchdog resets lastProgressAt instead of
 * triggering idle recovery.
 */

import { markToolStart, markToolEnd, isAutoActive } from "../auto.ts";
import { createTestContext } from "./test-helpers.ts";

const { assertEq, assertTrue, report } = createTestContext();

// ═══ markToolStart / markToolEnd basic behavior ═════════════════════════════

{
  console.log("\n=== markToolStart: no-op when auto-mode is not active ===");
  // When auto-mode is not active, markToolStart should silently ignore
  // (the guard `if (!active) return` prevents set pollution outside auto-mode)
  assertTrue(!isAutoActive(), "auto-mode should not be active in tests");
  markToolStart("tool-1");
  // We can't directly inspect the set, but markToolEnd should be a safe no-op
  markToolEnd("tool-1");
  // If we got here without error, the guard works
  assertTrue(true, "markToolStart/markToolEnd are safe no-ops when inactive");
}

{
  console.log("\n=== markToolEnd: no-op for unknown toolCallId ===");
  // Set.delete on a non-existent key is a no-op — verify no crash
  markToolEnd("nonexistent-tool-call-id");
  assertTrue(true, "markToolEnd handles unknown IDs gracefully");
}

{
  console.log("\n=== markToolEnd: idempotent — double-end does not crash ===");
  markToolEnd("some-id");
  markToolEnd("some-id");
  assertTrue(true, "double markToolEnd is safe");
}

// ═══ Integration contract: expected exports from auto.ts ════════════════════

{
  console.log("\n=== auto.ts exports markToolStart and markToolEnd ===");
  assertEq(typeof markToolStart, "function", "markToolStart should be a function");
  assertEq(typeof markToolEnd, "function", "markToolEnd should be a function");
}

{
  console.log("\n=== markToolStart accepts string toolCallId ===");
  // Verify the function signature handles string input without error
  // (when inactive, this is a no-op but should not throw)
  try {
    markToolStart("toolu_01ABC123");
    assertTrue(true, "accepts standard Claude tool call ID format");
  } catch (e) {
    assertTrue(false, `should not throw: ${e}`);
  }
}

{
  console.log("\n=== markToolEnd accepts string toolCallId ===");
  try {
    markToolEnd("toolu_01ABC123");
    assertTrue(true, "accepts standard Claude tool call ID format");
  } catch (e) {
    assertTrue(false, `should not throw: ${e}`);
  }
}

report();
161 src/resources/extensions/gsd/tests/knowledge.test.ts Normal file
@@ -0,0 +1,161 @@
/**
 * Unit tests for KNOWLEDGE.md integration.
 *
 * Tests:
 * - KNOWLEDGE is registered in GSD_ROOT_FILES
 * - resolveGsdRootFile resolves KNOWLEDGE paths correctly
 * - inlineGsdRootFile works with the KNOWLEDGE key
 * - before_agent_start hook includes/omits knowledge block appropriately
 */

import test from 'node:test';
import assert from 'node:assert/strict';
import { mkdtempSync, mkdirSync, writeFileSync, readFileSync, rmSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';
import { GSD_ROOT_FILES, resolveGsdRootFile } from '../paths.ts';
import { inlineGsdRootFile } from '../auto-prompts.ts';
import { appendKnowledge } from '../files.ts';

// ─── KNOWLEDGE is registered in GSD_ROOT_FILES ─────────────────────────────

test('knowledge: KNOWLEDGE key exists in GSD_ROOT_FILES', () => {
  assert.ok('KNOWLEDGE' in GSD_ROOT_FILES, 'GSD_ROOT_FILES should have KNOWLEDGE key');
  assert.strictEqual(GSD_ROOT_FILES.KNOWLEDGE, 'KNOWLEDGE.md');
});

// ─── resolveGsdRootFile resolves KNOWLEDGE.md ───────────────────────────────

test('knowledge: resolveGsdRootFile returns canonical path when KNOWLEDGE.md exists', () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });
  writeFileSync(join(gsdDir, 'KNOWLEDGE.md'), '# Project Knowledge\n');

  const resolved = resolveGsdRootFile(tmp, 'KNOWLEDGE');
  assert.strictEqual(resolved, join(gsdDir, 'KNOWLEDGE.md'));

  rmSync(tmp, { recursive: true, force: true });
});

test('knowledge: resolveGsdRootFile resolves when legacy knowledge.md exists', () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });
  writeFileSync(join(gsdDir, 'knowledge.md'), '# Project Knowledge\n');

  const resolved = resolveGsdRootFile(tmp, 'KNOWLEDGE');
  // On case-insensitive filesystems (macOS), the canonical path matches;
  // on case-sensitive ones (Linux), the legacy path matches. Either is valid.
  const canonical = join(gsdDir, 'KNOWLEDGE.md');
  const legacy = join(gsdDir, 'knowledge.md');
  assert.ok(
    resolved === canonical || resolved === legacy,
    `resolved path should be canonical or legacy, got: ${resolved}`,
  );

  rmSync(tmp, { recursive: true, force: true });
});

test('knowledge: resolveGsdRootFile returns canonical path when file does not exist', () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });

  const resolved = resolveGsdRootFile(tmp, 'KNOWLEDGE');
  assert.strictEqual(resolved, join(gsdDir, 'KNOWLEDGE.md'));

  rmSync(tmp, { recursive: true, force: true });
});

// ─── inlineGsdRootFile works with knowledge.md ─────────────────────────────

test('knowledge: inlineGsdRootFile returns content when KNOWLEDGE.md exists', async () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });
  writeFileSync(join(gsdDir, 'KNOWLEDGE.md'), '# Project Knowledge\n\n## Rules\n\nK001: Use real DB');

  const result = await inlineGsdRootFile(tmp, 'knowledge.md', 'Project Knowledge');
  assert.ok(result !== null, 'should return content');
  assert.ok(result!.includes('Project Knowledge'), 'should include label');
  assert.ok(result!.includes('K001'), 'should include knowledge content');

  rmSync(tmp, { recursive: true, force: true });
});

test('knowledge: inlineGsdRootFile returns null when KNOWLEDGE.md does not exist', async () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });

  const result = await inlineGsdRootFile(tmp, 'knowledge.md', 'Project Knowledge');
  assert.strictEqual(result, null, 'should return null when file does not exist');

  rmSync(tmp, { recursive: true, force: true });
});

// ─── appendKnowledge creates file and appends entries ──────────────────────

test('knowledge: appendKnowledge creates KNOWLEDGE.md with rule when file does not exist', async () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });

  await appendKnowledge(tmp, 'rule', 'Use real DB for integration tests', 'M001/S01');

  const content = readFileSync(join(gsdDir, 'KNOWLEDGE.md'), 'utf-8');
  assert.ok(content.includes('# Project Knowledge'), 'should have header');
  assert.ok(content.includes('K001'), 'should have K001 id');
  assert.ok(content.includes('Use real DB for integration tests'), 'should have rule text');
  assert.ok(content.includes('M001/S01'), 'should have scope');

  rmSync(tmp, { recursive: true, force: true });
});

test('knowledge: appendKnowledge appends to existing KNOWLEDGE.md with auto-incrementing ID', async () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });

  // Create initial file with one rule
  await appendKnowledge(tmp, 'rule', 'First rule', 'M001');
  // Add second rule
  await appendKnowledge(tmp, 'rule', 'Second rule', 'M001/S02');

  const content = readFileSync(join(gsdDir, 'KNOWLEDGE.md'), 'utf-8');
  assert.ok(content.includes('K001'), 'should have K001');
  assert.ok(content.includes('K002'), 'should have K002');
  assert.ok(content.includes('First rule'), 'should have first rule');
  assert.ok(content.includes('Second rule'), 'should have second rule');

  rmSync(tmp, { recursive: true, force: true });
});

test('knowledge: appendKnowledge handles pattern type', async () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });

  await appendKnowledge(tmp, 'pattern', 'Middleware chain for auth', 'M001');

  const content = readFileSync(join(gsdDir, 'KNOWLEDGE.md'), 'utf-8');
  assert.ok(content.includes('P001'), 'should have P001 id');
  assert.ok(content.includes('Middleware chain for auth'), 'should have pattern text');

  rmSync(tmp, { recursive: true, force: true });
});

test('knowledge: appendKnowledge handles lesson type', async () => {
  const tmp = mkdtempSync(join(tmpdir(), 'gsd-knowledge-'));
  const gsdDir = join(tmp, '.gsd');
  mkdirSync(gsdDir, { recursive: true });

  await appendKnowledge(tmp, 'lesson', 'API timeout on large payloads', 'M002');

  const content = readFileSync(join(gsdDir, 'KNOWLEDGE.md'), 'utf-8');
  assert.ok(content.includes('L001'), 'should have L001 id');
  assert.ok(content.includes('API timeout on large payloads'), 'should have lesson text');

  rmSync(tmp, { recursive: true, force: true });
});
@@ -0,0 +1,87 @@
/**
 * memory-leak-guards.test.ts — Tests for #611 memory leak fixes.
 *
 * Verifies that module-level state accumulators are properly bounded
 * and cleared to prevent OOM during long-running auto-mode sessions.
 */

import test from "node:test";
import assert from "node:assert/strict";
import { mkdtempSync, rmSync, existsSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

import { saveActivityLog, clearActivityLogState } from "../activity-log.ts";
import { clearPathCache } from "../paths.ts";
import type { ExtensionContext } from "@gsd/pi-coding-agent";

function createCtx(entries: unknown[]) {
  return { sessionManager: { getEntries: () => entries } } as unknown as ExtensionContext;
}

// ─── activity-log: clearActivityLogState ─────────────────────────────────────

test("clearActivityLogState resets dedup state so identical saves write again", () => {
  clearActivityLogState();
  const baseDir = mkdtempSync(join(tmpdir(), "gsd-memleak-test-"));
  try {
    const entries = [{ role: "assistant", content: "test entry" }];
    const ctx = createCtx(entries);

    // First save
    saveActivityLog(ctx, baseDir, "execute-task", "M001/S01/T01");

    const actDir = join(baseDir, ".gsd", "activity");
    assert.equal(readdirSync(actDir).length, 1, "first save creates one file");

    // Same content, same unit — deduped
    saveActivityLog(ctx, baseDir, "execute-task", "M001/S01/T01");
    assert.equal(readdirSync(actDir).length, 1, "dedup prevents duplicate write");

    // Clear state
    clearActivityLogState();

    // Same content again — after clear, writes again (fresh state)
    saveActivityLog(ctx, baseDir, "execute-task", "M001/S01/T01");
    assert.equal(readdirSync(actDir).length, 2, "after clear, dedup state is reset");
  } finally {
    rmSync(baseDir, { recursive: true, force: true });
  }
});

// ─── activity-log: streaming JSONL write ────────────────────────────────────

test("saveActivityLog writes valid JSONL via streaming", () => {
  clearActivityLogState();
  const baseDir = mkdtempSync(join(tmpdir(), "gsd-memleak-jsonl-"));
  try {
    const entries = [
      { type: "message", message: { role: "user", content: "hello" } },
      { type: "message", message: { role: "assistant", content: "world" } },
      { type: "message", message: { role: "user", content: "test" } },
    ];
    const ctx = createCtx(entries);

    saveActivityLog(ctx, baseDir, "execute-task", "M002/S01/T01");

    const actDir = join(baseDir, ".gsd", "activity");
    const files = readdirSync(actDir);
    assert.equal(files.length, 1, "one file written");

    const content = readFileSync(join(actDir, files[0]), "utf-8");
    const lines = content.trim().split("\n");
    assert.equal(lines.length, 3, "three JSONL lines");

    for (const line of lines) {
      assert.doesNotThrow(() => JSON.parse(line), "line is valid JSON");
    }
  } finally {
    rmSync(baseDir, { recursive: true, force: true });
  }
});

// ─── paths.ts: directory cache bounds ───────────────────────────────────────

test("clearPathCache does not throw", () => {
  assert.doesNotThrow(() => clearPathCache(), "clearPathCache should not throw");
});
@@ -0,0 +1,144 @@
/**
 * milestone-transition-worktree.test.ts — Tests for the #616 fix.
 *
 * Verifies that when auto-mode transitions between milestones, the
 * worktree lifecycle is handled: old worktree merged, new worktree created.
 *
 * Uses source-level checks since the full auto-mode dispatch loop
 * requires the @gsd/pi-coding-agent runtime.
 */

import test from "node:test";
import assert from "node:assert/strict";
import { mkdtempSync, mkdirSync, rmSync, writeFileSync, existsSync, realpathSync, readFileSync } from "node:fs";
import { join, dirname } from "node:path";
import { tmpdir } from "node:os";
import { execSync } from "node:child_process";
import { fileURLToPath } from "node:url";

import {
  createAutoWorktree,
  teardownAutoWorktree,
  isInAutoWorktree,
  getAutoWorktreeOriginalBase,
  mergeMilestoneToMain,
} from "../auto-worktree.ts";

const __dirname = dirname(fileURLToPath(import.meta.url));

function run(command: string, cwd: string): string {
  return execSync(command, { cwd, stdio: ["ignore", "pipe", "pipe"], encoding: "utf-8" }).trim();
}

function createTempRepo(): string {
  const dir = realpathSync(mkdtempSync(join(tmpdir(), "gsd-mt-wt-test-")));
  run("git init", dir);
  run("git config user.email test@test.com", dir);
  run("git config user.name Test", dir);
  writeFileSync(join(dir, "README.md"), "# test\n");
  run("git add .", dir);
  run("git commit -m init", dir);
  run("git branch -M main", dir);
  return dir;
}

function createMilestoneArtifacts(dir: string, mid: string): void {
  const msDir = join(dir, ".gsd", "milestones", mid);
  mkdirSync(msDir, { recursive: true });
  writeFileSync(join(msDir, "CONTEXT.md"), `# ${mid} Context\n`);
  const roadmap = [
    `# ${mid}: Test Milestone`,
    "**Vision**: testing",
    "## Success Criteria",
    "- It works",
    "## Slices",
    "- [x] S01 — First slice",
  ].join("\n");
  writeFileSync(join(msDir, `${mid}-ROADMAP.md`), roadmap);
}

// ─── Milestone transition: worktree swap ─────────────────────────────────────

test("worktree swap on milestone transition: merge old, create new", () => {
  const savedCwd = process.cwd();
  let tempDir = "";

  try {
    tempDir = createTempRepo();

    // Set up M001 and M002 milestone artifacts
    createMilestoneArtifacts(tempDir, "M001");
    createMilestoneArtifacts(tempDir, "M002");
    run("git add .", tempDir);
    run("git commit -m \"add milestones\"", tempDir);

    // Phase 1: Create worktree for M001 (simulates auto-mode start)
    const wt1 = createAutoWorktree(tempDir, "M001");
    assert.equal(process.cwd(), wt1, "cwd should be in M001 worktree");
    assert.ok(isInAutoWorktree(tempDir), "should be in auto-worktree");
    assert.equal(getAutoWorktreeOriginalBase(), tempDir, "original base preserved");

    // Add a commit in the M001 worktree to simulate work
    writeFileSync(join(wt1, "feature-m001.txt"), "M001 work\n");
    run("git add .", wt1);
    run("git commit -m \"feat(M001): add feature\"", wt1);

    // Phase 2: Simulate milestone transition — merge M001, exit worktree
    const roadmapPath = join(tempDir, ".gsd", "milestones", "M001", "M001-ROADMAP.md");
    const roadmapContent = readFileSync(roadmapPath, "utf-8");
    mergeMilestoneToMain(tempDir, "M001", roadmapContent);

    // After merge: cwd should be back at the project root
    assert.equal(process.cwd(), tempDir, "cwd restored to project root after merge");
    assert.ok(!isInAutoWorktree(tempDir), "no longer in auto-worktree after merge");
// Verify M001 work was merged to main
|
||||
const mainLog = run("git log --oneline -3", tempDir);
|
||||
assert.ok(mainLog.includes("M001"), "M001 squash commit should be on main");
|
||||
|
||||
// Phase 3: Create new worktree for M002 (simulates new milestone)
|
||||
const wt2 = createAutoWorktree(tempDir, "M002");
|
||||
assert.equal(process.cwd(), wt2, "cwd should be in M002 worktree");
|
||||
assert.ok(isInAutoWorktree(tempDir), "should be in M002 auto-worktree");
|
||||
|
||||
// The new worktree should have the M001 feature file (merged to main)
|
||||
assert.ok(existsSync(join(wt2, "feature-m001.txt")), "M002 worktree inherits M001 merged work");
|
||||
|
||||
// Verify branch is correct
|
||||
const branch = run("git branch --show-current", wt2);
|
||||
assert.equal(branch, "milestone/M002", "M002 worktree on correct branch");
|
||||
|
||||
// Cleanup
|
||||
teardownAutoWorktree(tempDir, "M002");
|
||||
} finally {
|
||||
process.chdir(savedCwd);
|
||||
if (tempDir && existsSync(tempDir)) {
|
||||
rmSync(tempDir, { recursive: true, force: true });
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
// ─── Verify the transition code path exists in auto.ts ──────────────────────
|
||||
|
||||
test("auto.ts milestone transition block contains worktree lifecycle", () => {
|
||||
const autoSrc = readFileSync(
|
||||
join(__dirname, "..", "auto.ts"),
|
||||
"utf-8",
|
||||
);
|
||||
|
||||
// The fix adds worktree merge + create inside the milestone transition block
|
||||
assert.ok(
|
||||
autoSrc.includes("Worktree lifecycle on milestone transition"),
|
||||
"auto.ts should contain the worktree lifecycle comment marker",
|
||||
);
|
||||
assert.ok(
|
||||
autoSrc.includes("mergeMilestoneToMain") && autoSrc.includes("mid !== currentMilestoneId"),
|
||||
"auto.ts should call mergeMilestoneToMain during milestone transition",
|
||||
);
|
||||
assert.ok(
|
||||
autoSrc.includes("createAutoWorktree") && autoSrc.includes("Created auto-worktree for"),
|
||||
"auto.ts should create new worktree for incoming milestone",
|
||||
);
|
||||
});
|
||||
69
src/resources/extensions/gsd/tests/model-cost-table.test.ts
Normal file
@@ -0,0 +1,69 @@
import test from "node:test";
import assert from "node:assert/strict";

import { lookupModelCost, compareModelCost, BUNDLED_COST_TABLE } from "../model-cost-table.js";

// ─── lookupModelCost ─────────────────────────────────────────────────────────

test("lookupModelCost finds exact match", () => {
  const entry = lookupModelCost("claude-opus-4-6");
  assert.ok(entry);
  assert.equal(entry.id, "claude-opus-4-6");
  assert.ok(entry.inputPer1k > 0);
  assert.ok(entry.outputPer1k > 0);
});

test("lookupModelCost strips provider prefix", () => {
  const entry = lookupModelCost("anthropic/claude-opus-4-6");
  assert.ok(entry);
  assert.equal(entry.id, "claude-opus-4-6");
});

test("lookupModelCost returns undefined for unknown model", () => {
  const entry = lookupModelCost("totally-unknown-model");
  assert.equal(entry, undefined);
});

test("lookupModelCost finds haiku", () => {
  const entry = lookupModelCost("claude-haiku-4-5");
  assert.ok(entry);
  assert.ok(entry.inputPer1k < 0.001, "haiku should be cheap");
});

// ─── compareModelCost ────────────────────────────────────────────────────────

test("haiku is cheaper than opus", () => {
  assert.ok(compareModelCost("claude-haiku-4-5", "claude-opus-4-6") < 0);
});

test("opus is more expensive than sonnet", () => {
  assert.ok(compareModelCost("claude-opus-4-6", "claude-sonnet-4-6") > 0);
});

test("same model has equal cost", () => {
  assert.equal(compareModelCost("claude-opus-4-6", "claude-opus-4-6"), 0);
});

// ─── BUNDLED_COST_TABLE ──────────────────────────────────────────────────────

test("cost table has entries for all major providers", () => {
  const ids = BUNDLED_COST_TABLE.map(e => e.id);
  // Anthropic
  assert.ok(ids.includes("claude-opus-4-6"));
  assert.ok(ids.includes("claude-sonnet-4-6"));
  assert.ok(ids.includes("claude-haiku-4-5"));
  // OpenAI
  assert.ok(ids.includes("gpt-4o"));
  assert.ok(ids.includes("gpt-4o-mini"));
  // Google
  assert.ok(ids.includes("gemini-2.0-flash"));
});

test("all cost table entries have valid data", () => {
  for (const entry of BUNDLED_COST_TABLE) {
    assert.ok(entry.id, `entry missing id`);
    assert.ok(entry.inputPer1k >= 0, `${entry.id} inputPer1k should be >= 0`);
    assert.ok(entry.outputPer1k >= 0, `${entry.id} outputPer1k should be >= 0`);
    assert.ok(entry.updatedAt, `${entry.id} missing updatedAt`);
  }
});

167
src/resources/extensions/gsd/tests/model-router.test.ts
Normal file
@@ -0,0 +1,167 @@
import test from "node:test";
import assert from "node:assert/strict";

import {
  resolveModelForComplexity,
  escalateTier,
  defaultRoutingConfig,
} from "../model-router.js";
import type { DynamicRoutingConfig, RoutingDecision } from "../model-router.js";
import type { ClassificationResult } from "../complexity-classifier.js";

// ─── Helpers ─────────────────────────────────────────────────────────────────

function makeClassification(tier: "light" | "standard" | "heavy", reason = "test"): ClassificationResult {
  return { tier, reason, downgraded: false };
}

const AVAILABLE_MODELS = [
  "claude-opus-4-6",
  "claude-sonnet-4-6",
  "claude-haiku-4-5",
  "gpt-4o-mini",
];

// ─── Passthrough when disabled ───────────────────────────────────────────────

test("returns configured model when routing is disabled", () => {
  const config = { ...defaultRoutingConfig(), enabled: false };
  const result = resolveModelForComplexity(
    makeClassification("light"),
    { primary: "claude-opus-4-6", fallbacks: [] },
    config,
    AVAILABLE_MODELS,
  );
  assert.equal(result.modelId, "claude-opus-4-6");
  assert.equal(result.wasDowngraded, false);
});

test("returns configured model when no phase config", () => {
  const config = { ...defaultRoutingConfig(), enabled: true };
  const result = resolveModelForComplexity(
    makeClassification("light"),
    undefined,
    config,
    AVAILABLE_MODELS,
  );
  assert.equal(result.modelId, "");
  assert.equal(result.wasDowngraded, false);
});

// ─── Downgrade-only semantics ────────────────────────────────────────────────

test("does not downgrade when tier matches configured model tier", () => {
  const config = { ...defaultRoutingConfig(), enabled: true };
  const result = resolveModelForComplexity(
    makeClassification("heavy"),
    { primary: "claude-opus-4-6", fallbacks: [] },
    config,
    AVAILABLE_MODELS,
  );
  assert.equal(result.modelId, "claude-opus-4-6");
  assert.equal(result.wasDowngraded, false);
});

test("does not upgrade beyond configured model", () => {
  const config = { ...defaultRoutingConfig(), enabled: true };
  // Configured model is sonnet (standard), classification says heavy
  const result = resolveModelForComplexity(
    makeClassification("heavy"),
    { primary: "claude-sonnet-4-6", fallbacks: [] },
    config,
    AVAILABLE_MODELS,
  );
  assert.equal(result.modelId, "claude-sonnet-4-6");
  assert.equal(result.wasDowngraded, false);
});

test("downgrades from opus to haiku for light tier", () => {
  const config = { ...defaultRoutingConfig(), enabled: true };
  const result = resolveModelForComplexity(
    makeClassification("light"),
    { primary: "claude-opus-4-6", fallbacks: [] },
    config,
    AVAILABLE_MODELS,
  );
  // Should pick haiku or gpt-4o-mini (cheapest light tier)
  assert.ok(
    result.modelId === "claude-haiku-4-5" || result.modelId === "gpt-4o-mini",
    `Expected light-tier model, got ${result.modelId}`,
  );
  assert.equal(result.wasDowngraded, true);
});

test("downgrades from opus to sonnet for standard tier", () => {
  const config = { ...defaultRoutingConfig(), enabled: true };
  const result = resolveModelForComplexity(
    makeClassification("standard"),
    { primary: "claude-opus-4-6", fallbacks: [] },
    config,
    AVAILABLE_MODELS,
  );
  assert.equal(result.modelId, "claude-sonnet-4-6");
  assert.equal(result.wasDowngraded, true);
});

// ─── Explicit tier_models ────────────────────────────────────────────────────

test("uses explicit tier_models when configured", () => {
  const config: DynamicRoutingConfig = {
    ...defaultRoutingConfig(),
    enabled: true,
    tier_models: { light: "gpt-4o-mini", standard: "claude-sonnet-4-6" },
  };
  const result = resolveModelForComplexity(
    makeClassification("light"),
    { primary: "claude-opus-4-6", fallbacks: [] },
    config,
    AVAILABLE_MODELS,
  );
  assert.equal(result.modelId, "gpt-4o-mini");
  assert.equal(result.wasDowngraded, true);
});

// ─── Fallback chain construction ─────────────────────────────────────────────

test("fallback chain includes configured primary as last resort", () => {
  const config = { ...defaultRoutingConfig(), enabled: true };
  const result = resolveModelForComplexity(
    makeClassification("light"),
    { primary: "claude-opus-4-6", fallbacks: ["claude-sonnet-4-6"] },
    config,
    AVAILABLE_MODELS,
  );
  assert.ok(result.wasDowngraded);
  // Fallbacks should include the configured fallbacks and primary
  assert.ok(result.fallbacks.includes("claude-opus-4-6"), "primary should be in fallbacks");
  assert.ok(result.fallbacks.includes("claude-sonnet-4-6"), "configured fallback should be in fallbacks");
});

// ─── Escalation ──────────────────────────────────────────────────────────────

test("escalateTier moves light → standard", () => {
  assert.equal(escalateTier("light"), "standard");
});

test("escalateTier moves standard → heavy", () => {
  assert.equal(escalateTier("standard"), "heavy");
});

test("escalateTier returns null for heavy (max)", () => {
  assert.equal(escalateTier("heavy"), null);
});

// ─── No suitable model available ─────────────────────────────────────────────

test("falls back to configured model when no light-tier model available", () => {
  const config = { ...defaultRoutingConfig(), enabled: true };
  // Only heavy-tier models available
  const result = resolveModelForComplexity(
    makeClassification("light"),
    { primary: "claude-opus-4-6", fallbacks: [] },
    config,
    ["claude-opus-4-6"],
  );
  assert.equal(result.modelId, "claude-opus-4-6");
  assert.equal(result.wasDowngraded, false);
});

168
src/resources/extensions/gsd/tests/preferences-wizard-fields.test.ts
Normal file
@@ -0,0 +1,168 @@
/**
 * preferences-wizard-fields.test.ts — Validates that all wizard-configurable
 * preference fields are properly validated and round-trip through the schema.
 */

import { createTestContext } from "./test-helpers.ts";
import { validatePreferences } from "../preferences.ts";
import type { GSDPreferences } from "../preferences.ts";

const { assertEq, assertTrue, report } = createTestContext();

async function main(): Promise<void> {
  console.log("\n=== budget fields validate correctly ===");

  {
    const { preferences, errors } = validatePreferences({
      budget_ceiling: 25.50,
      budget_enforcement: "warn",
      context_pause_threshold: 80,
    });
    assertEq(errors.length, 0, "valid budget fields produce no errors");
    assertEq(preferences.budget_ceiling, 25.50, "budget_ceiling passes through");
    assertEq(preferences.budget_enforcement, "warn", "budget_enforcement passes through");
    assertEq(preferences.context_pause_threshold, 80, "context_pause_threshold passes through");
  }

  {
    const { preferences, errors } = validatePreferences({
      budget_enforcement: "pause",
    });
    assertEq(errors.length, 0, "budget_enforcement 'pause' is valid");
    assertEq(preferences.budget_enforcement, "pause", "pause passes through");
  }

  {
    const { preferences, errors } = validatePreferences({
      budget_enforcement: "halt",
    });
    assertEq(errors.length, 0, "budget_enforcement 'halt' is valid");
    assertEq(preferences.budget_enforcement, "halt", "halt passes through");
  }

  {
    const { errors } = validatePreferences({
      budget_enforcement: "invalid",
    } as unknown as GSDPreferences);
    assertTrue(errors.some(e => e.includes("budget_enforcement")), "invalid budget_enforcement rejected");
  }

  console.log("\n=== notification fields validate correctly ===");

  {
    const { preferences, errors } = validatePreferences({
      notifications: {
        enabled: true,
        on_complete: false,
        on_error: true,
        on_budget: true,
        on_milestone: false,
        on_attention: true,
      },
    });
    assertEq(errors.length, 0, "valid notifications produce no errors");
    assertEq(preferences.notifications?.enabled, true, "notifications.enabled passes through");
    assertEq(preferences.notifications?.on_complete, false, "notifications.on_complete passes through");
    assertEq(preferences.notifications?.on_milestone, false, "notifications.on_milestone passes through");
  }

  {
    const { errors } = validatePreferences({
      notifications: "invalid",
    } as unknown as GSDPreferences);
    assertTrue(errors.some(e => e.includes("notifications")), "invalid notifications rejected");
  }

  console.log("\n=== git fields validate correctly ===");

  {
    const { preferences, errors } = validatePreferences({
      git: {
        auto_push: true,
        push_branches: false,
        remote: "upstream",
        snapshots: true,
        pre_merge_check: "auto",
        commit_type: "feat",
        main_branch: "develop",
        merge_strategy: "squash",
        isolation: "branch",
      },
    });
    assertEq(errors.length, 0, "valid git fields produce no errors");
    assertEq(preferences.git?.auto_push, true, "git.auto_push passes through");
    assertEq(preferences.git?.push_branches, false, "git.push_branches passes through");
    assertEq(preferences.git?.remote, "upstream", "git.remote passes through");
    assertEq(preferences.git?.snapshots, true, "git.snapshots passes through");
    assertEq(preferences.git?.pre_merge_check, "auto", "git.pre_merge_check passes through");
    assertEq(preferences.git?.commit_type, "feat", "git.commit_type passes through");
    assertEq(preferences.git?.main_branch, "develop", "git.main_branch passes through");
    assertEq(preferences.git?.merge_strategy, "squash", "git.merge_strategy passes through");
    assertEq(preferences.git?.isolation, "branch", "git.isolation passes through");
  }

  console.log("\n=== uat_dispatch validates correctly ===");

  {
    const { preferences, errors } = validatePreferences({ uat_dispatch: true });
    assertEq(errors.length, 0, "valid uat_dispatch produces no errors");
    assertEq(preferences.uat_dispatch, true, "uat_dispatch true passes through");
  }

  {
    const { preferences, errors } = validatePreferences({ uat_dispatch: false });
    assertEq(errors.length, 0, "valid uat_dispatch false produces no errors");
    assertEq(preferences.uat_dispatch, false, "uat_dispatch false passes through");
  }

  console.log("\n=== unique_milestone_ids validates correctly ===");

  {
    const { preferences, errors } = validatePreferences({ unique_milestone_ids: true });
    assertEq(errors.length, 0, "valid unique_milestone_ids produces no errors");
    assertEq(preferences.unique_milestone_ids, true, "unique_milestone_ids passes through");
  }

  console.log("\n=== all wizard fields together produce no errors ===");

  {
    const fullPrefs: GSDPreferences = {
      version: 1,
      models: { research: "claude-opus-4-6", planning: "claude-sonnet-4-6" },
      auto_supervisor: { soft_timeout_minutes: 15, idle_timeout_minutes: 5, hard_timeout_minutes: 25 },
      git: {
        main_branch: "main",
        auto_push: true,
        push_branches: false,
        remote: "origin",
        snapshots: true,
        pre_merge_check: "auto",
        commit_type: "feat",
        merge_strategy: "squash",
        isolation: "worktree",
      },
      skill_discovery: "suggest",
      unique_milestone_ids: false,
      budget_ceiling: 50,
      budget_enforcement: "pause",
      context_pause_threshold: 75,
      notifications: {
        enabled: true,
        on_complete: true,
        on_error: true,
        on_budget: true,
        on_milestone: true,
        on_attention: true,
      },
      uat_dispatch: false,
    };
    const { errors, warnings } = validatePreferences(fullPrefs);
    const unknownWarnings = warnings.filter(w => w.includes("unknown"));
    assertEq(errors.length, 0, "full wizard prefs produce no errors");
    assertEq(unknownWarnings.length, 0, "full wizard prefs produce no unknown-key warnings");
  }

  report();
}

main();

204
src/resources/extensions/gsd/tests/queue-order.test.ts
Normal file
@@ -0,0 +1,204 @@
import { mkdtempSync, mkdirSync, rmSync, writeFileSync, existsSync, readFileSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

import {
  loadQueueOrder,
  saveQueueOrder,
  sortByQueueOrder,
  pruneQueueOrder,
  validateQueueOrder,
} from '../queue-order.ts';
import { createTestContext } from './test-helpers.ts';

const { assertEq, assertTrue, report } = createTestContext();

// ─── Fixture Helpers ───────────────────────────────────────────────────────

function createFixtureBase(): string {
  const base = mkdtempSync(join(tmpdir(), 'gsd-queue-order-'));
  mkdirSync(join(base, '.gsd'), { recursive: true });
  return base;
}

function cleanup(base: string): void {
  rmSync(base, { recursive: true, force: true });
}

// ═══════════════════════════════════════════════════════════════════════════
// sortByQueueOrder
// ═══════════════════════════════════════════════════════════════════════════

console.log('\n=== sortByQueueOrder ===');

// Null order → default milestoneIdSort
{
  const result = sortByQueueOrder(['M003', 'M001', 'M002'], null);
  assertEq(result, ['M001', 'M002', 'M003'], 'null order falls back to numeric sort');
}

// Custom order → exact sequence
{
  const result = sortByQueueOrder(['M001', 'M002', 'M003'], ['M003', 'M001', 'M002']);
  assertEq(result, ['M003', 'M001', 'M002'], 'custom order produces exact sequence');
}

// Custom order with new IDs → appended at end in numeric order
{
  const result = sortByQueueOrder(['M001', 'M002', 'M003', 'M004'], ['M003', 'M001']);
  assertEq(result, ['M003', 'M001', 'M002', 'M004'], 'new IDs appended in numeric order');
}

// Custom order with deleted IDs → silently skipped
{
  const result = sortByQueueOrder(['M001', 'M003'], ['M003', 'M002', 'M001']);
  assertEq(result, ['M003', 'M001'], 'deleted IDs in order are skipped');
}

// Empty custom order → all IDs in numeric order
{
  const result = sortByQueueOrder(['M002', 'M001'], []);
  assertEq(result, ['M001', 'M002'], 'empty custom order falls back to numeric sort');
}

// ═══════════════════════════════════════════════════════════════════════════
// loadQueueOrder / saveQueueOrder
// ═══════════════════════════════════════════════════════════════════════════

console.log('\n=== loadQueueOrder / saveQueueOrder ===');

// Load returns null when file doesn't exist
{
  const base = createFixtureBase();
  assertEq(loadQueueOrder(base), null, 'returns null when file missing');
  cleanup(base);
}

// Save then load round-trip
{
  const base = createFixtureBase();
  saveQueueOrder(base, ['M003', 'M001', 'M002']);
  const loaded = loadQueueOrder(base);
  assertEq(loaded, ['M003', 'M001', 'M002'], 'round-trip preserves order');

  // Verify file contains updatedAt
  const raw = JSON.parse(readFileSync(join(base, '.gsd', 'QUEUE-ORDER.json'), 'utf-8'));
  assertTrue(typeof raw.updatedAt === 'string' && raw.updatedAt.length > 0, 'file contains updatedAt');

  cleanup(base);
}

// Load returns null on corrupt JSON
{
  const base = createFixtureBase();
  writeFileSync(join(base, '.gsd', 'QUEUE-ORDER.json'), 'not json');
  assertEq(loadQueueOrder(base), null, 'returns null on corrupt JSON');
  cleanup(base);
}

// Load returns null when order field is not an array
{
  const base = createFixtureBase();
  writeFileSync(join(base, '.gsd', 'QUEUE-ORDER.json'), '{"order": "invalid"}');
  assertEq(loadQueueOrder(base), null, 'returns null when order is not array');
  cleanup(base);
}

// ═══════════════════════════════════════════════════════════════════════════
// pruneQueueOrder
// ═══════════════════════════════════════════════════════════════════════════

console.log('\n=== pruneQueueOrder ===');

// Prune removes invalid IDs
{
  const base = createFixtureBase();
  saveQueueOrder(base, ['M001', 'M002', 'M003']);
  pruneQueueOrder(base, ['M001', 'M003']);
  assertEq(loadQueueOrder(base), ['M001', 'M003'], 'prune removes invalid IDs');
  cleanup(base);
}

// Prune no-ops when file doesn't exist
{
  const base = createFixtureBase();
  pruneQueueOrder(base, ['M001']); // should not throw
  assertTrue(!existsSync(join(base, '.gsd', 'QUEUE-ORDER.json')), 'prune does not create file');
  cleanup(base);
}

// Prune no-ops when all IDs are valid
{
  const base = createFixtureBase();
  saveQueueOrder(base, ['M001', 'M002']);
  pruneQueueOrder(base, ['M001', 'M002', 'M003']);
  assertEq(loadQueueOrder(base), ['M001', 'M002'], 'prune is no-op when all valid');
  cleanup(base);
}

// ═══════════════════════════════════════════════════════════════════════════
// validateQueueOrder
// ═══════════════════════════════════════════════════════════════════════════

console.log('\n=== validateQueueOrder ===');

// Valid order with no dependencies
{
  const depsMap = new Map<string, string[]>();
  const result = validateQueueOrder(['M001', 'M002'], depsMap, new Set());
  assertTrue(result.valid, 'valid when no dependencies');
  assertEq(result.violations.length, 0, 'no violations');
  assertEq(result.redundant.length, 0, 'no redundancies');
}

// Dependency violation: M002 before M001, but M002 depends on M001
{
  const depsMap = new Map<string, string[]>([['M002', ['M001']]]);
  const result = validateQueueOrder(['M002', 'M001'], depsMap, new Set());
  assertTrue(!result.valid, 'invalid when dep violated');
  assertEq(result.violations.length, 1, 'one violation');
  assertEq(result.violations[0].type, 'would_block', 'violation type is would_block');
  assertEq(result.violations[0].milestone, 'M002', 'violation milestone is M002');
  assertEq(result.violations[0].dependsOn, 'M001', 'violation dep is M001');
}

// Redundant dependency: M002 depends on M001, M001 comes first in order
{
  const depsMap = new Map<string, string[]>([['M002', ['M001']]]);
  const result = validateQueueOrder(['M001', 'M002'], depsMap, new Set());
  assertTrue(result.valid, 'valid when dep satisfied by position');
  assertEq(result.redundant.length, 1, 'one redundancy');
  assertEq(result.redundant[0].milestone, 'M002', 'redundant milestone is M002');
}

// Completed dep is always satisfied
{
  const depsMap = new Map<string, string[]>([['M002', ['M001']]]);
  const result = validateQueueOrder(['M002'], depsMap, new Set(['M001']));
  assertTrue(result.valid, 'valid when dep is already completed');
  assertEq(result.violations.length, 0, 'no violations for completed dep');
}

// Missing dependency
{
  const depsMap = new Map<string, string[]>([['M002', ['M099']]]);
  const result = validateQueueOrder(['M001', 'M002'], depsMap, new Set());
  assertTrue(!result.valid, 'invalid when dep does not exist');
  assertEq(result.violations[0].type, 'missing_dep', 'violation type is missing_dep');
}

// Circular dependency
{
  const depsMap = new Map<string, string[]>([
    ['M001', ['M002']],
    ['M002', ['M001']],
  ]);
  const result = validateQueueOrder(['M001', 'M002'], depsMap, new Set());
  assertTrue(!result.valid, 'invalid on circular dependency');
  const circularViolation = result.violations.find(v => v.type === 'circular');
  assertTrue(!!circularViolation, 'circular violation detected');
}

// ═══════════════════════════════════════════════════════════════════════════

report();

281
src/resources/extensions/gsd/tests/queue-reorder-e2e.test.ts
Normal file
@@ -0,0 +1,281 @@
/**
 * End-to-end integration tests for the Queue Reorder feature.
 *
 * Verifies the full chain: QUEUE-ORDER.json + findMilestoneIds() + deriveState()
 * + depends_on removal from CONTEXT.md files.
 *
 * These tests simulate what happens when a user reorders milestones and confirms:
 * 1. QUEUE-ORDER.json is written with the new order
 * 2. depends_on is removed from CONTEXT.md frontmatter
 * 3. deriveState() picks the correct milestone as active
 * 4. A fresh deriveState() call (simulating a new session) also works
 */

import { mkdtempSync, mkdirSync, rmSync, writeFileSync, readFileSync, existsSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

import { deriveState, invalidateStateCache } from '../state.ts';
import { findMilestoneIds } from '../guided-flow.ts';
import { saveQueueOrder, loadQueueOrder } from '../queue-order.ts';
import { parseContextDependsOn } from '../files.ts';
import { createTestContext } from './test-helpers.ts';

const { assertEq, assertTrue, report } = createTestContext();

// ─── Fixture Helpers ───────────────────────────────────────────────────────

function createFixtureBase(): string {
  const base = mkdtempSync(join(tmpdir(), 'gsd-reorder-e2e-'));
  mkdirSync(join(base, '.gsd', 'milestones'), { recursive: true });
  return base;
}

function cleanup(base: string): void {
  rmSync(base, { recursive: true, force: true });
}

function writeMilestoneDir(base: string, mid: string): void {
  mkdirSync(join(base, '.gsd', 'milestones', mid), { recursive: true });
}

function writeContext(base: string, mid: string, frontmatter: string, body: string = ''): void {
  const dir = join(base, '.gsd', 'milestones', mid);
  mkdirSync(dir, { recursive: true });
  const fm = frontmatter ? `---\n${frontmatter}\n---\n\n` : '';
  writeFileSync(join(dir, `${mid}-CONTEXT.md`), `${fm}# ${mid}: Test\n\n${body}`);
}

function writeCompleteMilestone(base: string, mid: string): void {
  const dir = join(base, '.gsd', 'milestones', mid);
  mkdirSync(dir, { recursive: true });
  writeFileSync(join(dir, `${mid}-ROADMAP.md`), `# ${mid}: Complete

**Vision:** Done.

## Slices

- [x] **S01: Done** \`risk:low\` \`depends:[]\`
  > After this: Done.
`);
  writeFileSync(join(dir, `${mid}-SUMMARY.md`), `# ${mid} Summary\n\nComplete.`);
}

function readContextFile(base: string, mid: string): string {
  return readFileSync(join(base, '.gsd', 'milestones', mid, `${mid}-CONTEXT.md`), 'utf-8');
}

// ═══════════════════════════════════════════════════════════════════════════
// Test: Queue order changes milestone activation
// ═══════════════════════════════════════════════════════════════════════════

console.log('\n=== E2E: queue-order changes active milestone ===');
{
  const base = createFixtureBase();
  try {
    // Setup: M007 complete, M008 and M009 pending (context only, no roadmap)
    writeCompleteMilestone(base, 'M007');
    writeMilestoneDir(base, 'M008');
    writeContext(base, 'M008', '', 'Multi-Session Parallel Orchestration');
    writeMilestoneDir(base, 'M009');
    writeContext(base, 'M009', '', 'Context-Budget Visibility');

    // Without custom order: M008 comes first (numeric sort)
    invalidateStateCache();
    const stateBefore = await deriveState(base);
    assertEq(stateBefore.activeMilestone?.id, 'M008', 'before reorder: M008 is active');

    // Save custom order: M009 before M008
    saveQueueOrder(base, ['M009', 'M008']);

    // With custom order: M009 should be active
    invalidateStateCache();
    const stateAfter = await deriveState(base);
    assertEq(stateAfter.activeMilestone?.id, 'M009', 'after reorder: M009 is active');

    // findMilestoneIds respects the order
    const ids = findMilestoneIds(base);
    const m008Idx = ids.indexOf('M008');
    const m009Idx = ids.indexOf('M009');
    assertTrue(m009Idx < m008Idx, 'findMilestoneIds: M009 comes before M008');

  } finally {
    cleanup(base);
  }
}

// ═══════════════════════════════════════════════════════════════════════════
// Test: Reorder + depends_on removal = correct state
// ═══════════════════════════════════════════════════════════════════════════

console.log('\n=== E2E: reorder with depends_on removal ===');
{
  const base = createFixtureBase();
|
||||
try {
|
||||
// Setup: M007 complete, M008 depends_on M009, M009 no deps
|
||||
writeCompleteMilestone(base, 'M007');
|
||||
writeContext(base, 'M008', 'depends_on: [M009]', 'Multi-Session Parallel');
|
||||
writeContext(base, 'M009', '', 'Context-Budget Visibility');
|
||||
|
||||
// Before: M008 depends on M009, so deriveState skips M008, M009 is active
|
||||
invalidateStateCache();
|
||||
const stateBefore = await deriveState(base);
|
||||
assertEq(stateBefore.activeMilestone?.id, 'M009', 'before: M009 active (M008 dep-blocked)');
|
||||
|
||||
// Simulate reorder confirm: save order M009→M008, remove depends_on from M008
|
||||
saveQueueOrder(base, ['M009', 'M008']);
|
||||
|
||||
// Remove depends_on from M008-CONTEXT.md (simulating what handleQueueReorder does)
|
||||
const contextContent = readContextFile(base, 'M008');
|
||||
const newContent = contextContent.replace(/---\ndepends_on: \[M009\]\n---\n\n/, '');
|
||||
writeFileSync(join(base, '.gsd', 'milestones', 'M008', 'M008-CONTEXT.md'), newContent);
|
||||
|
||||
// Verify: depends_on is gone
|
||||
const updatedContent = readContextFile(base, 'M008');
|
||||
const deps = parseContextDependsOn(updatedContent);
|
||||
assertEq(deps.length, 0, 'depends_on removed from M008-CONTEXT.md');
|
||||
|
||||
// Verify: deriveState still picks M009 (it's first in queue order)
|
||||
invalidateStateCache();
|
||||
const stateAfter = await deriveState(base);
|
||||
assertEq(stateAfter.activeMilestone?.id, 'M009', 'after: M009 still active (first in queue)');
|
||||
|
||||
// Verify: M008 is now pending (not dep-blocked)
|
||||
const m008Entry = stateAfter.registry.find(m => m.id === 'M008');
|
||||
assertEq(m008Entry?.status, 'pending', 'M008 is pending (not dep-blocked)');
|
||||
assertTrue(!m008Entry?.dependsOn || m008Entry.dependsOn.length === 0, 'M008 has no dependsOn');
|
||||
|
||||
} finally {
|
||||
cleanup(base);
|
||||
}
|
||||
}
|
||||
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
// Test: Fresh deriveState (simulating new session) respects queue order
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
console.log('\n=== E2E: fresh session respects queue order ===');
|
||||
{
|
||||
const base = createFixtureBase();
|
||||
try {
|
||||
writeCompleteMilestone(base, 'M007');
|
||||
writeContext(base, 'M008', '', 'Parallel Orchestration');
|
||||
writeContext(base, 'M009', '', 'Budget Visibility');
|
||||
|
||||
// Save queue order
|
||||
saveQueueOrder(base, ['M009', 'M008']);
|
||||
|
||||
// Simulate fresh session — invalidate all caches
|
||||
invalidateStateCache();
|
||||
|
||||
// Derive state — should read QUEUE-ORDER.json from disk
|
||||
const state = await deriveState(base);
|
||||
assertEq(state.activeMilestone?.id, 'M009', 'fresh session: M009 is active');
|
||||
|
||||
// Verify queue order persisted
|
||||
const order = loadQueueOrder(base);
|
||||
assertEq(order, ['M009', 'M008'], 'QUEUE-ORDER.json persisted correctly');
|
||||
|
||||
} finally {
|
||||
cleanup(base);
|
||||
}
|
||||
}
|
||||
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
// Test: Queue order with newly added milestones
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
console.log('\n=== E2E: new milestones appended to queue ===');
|
||||
{
|
||||
const base = createFixtureBase();
|
||||
try {
|
||||
writeCompleteMilestone(base, 'M007');
|
||||
writeContext(base, 'M008', '', 'Parallel');
|
||||
writeContext(base, 'M009', '', 'Visibility');
|
||||
|
||||
// Custom order only has M009, M008
|
||||
saveQueueOrder(base, ['M009', 'M008']);
|
||||
|
||||
// Add M010 (not in queue order)
|
||||
writeContext(base, 'M010', '', 'New feature');
|
||||
|
||||
invalidateStateCache();
|
||||
const ids = findMilestoneIds(base);
|
||||
|
||||
// M009 first, M008 second, M010 appended at end
|
||||
const m009Idx = ids.indexOf('M009');
|
||||
const m008Idx = ids.indexOf('M008');
|
||||
const m010Idx = ids.indexOf('M010');
|
||||
assertTrue(m009Idx < m008Idx, 'M009 before M008');
|
||||
assertTrue(m008Idx < m010Idx, 'M008 before M010 (new milestone appended)');
|
||||
|
||||
// M009 is still active (first non-complete in queue order)
|
||||
const state = await deriveState(base);
|
||||
assertEq(state.activeMilestone?.id, 'M009', 'M009 still active after M010 added');
|
||||
|
||||
} finally {
|
||||
cleanup(base);
|
||||
}
|
||||
}
|
||||
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
// Test: No queue order file = default numeric sort (backward compat)
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
console.log('\n=== E2E: backward compat without QUEUE-ORDER.json ===');
|
||||
{
|
||||
const base = createFixtureBase();
|
||||
try {
|
||||
writeCompleteMilestone(base, 'M007');
|
||||
writeContext(base, 'M008', '', 'Parallel');
|
||||
writeContext(base, 'M009', '', 'Visibility');
|
||||
|
||||
// No QUEUE-ORDER.json — default numeric sort
|
||||
invalidateStateCache();
|
||||
const state = await deriveState(base);
|
||||
assertEq(state.activeMilestone?.id, 'M008', 'no queue order: M008 active (numeric)');
|
||||
|
||||
const ids = findMilestoneIds(base);
|
||||
assertTrue(ids.indexOf('M008') < ids.indexOf('M009'), 'default sort: M008 before M009');
|
||||
|
||||
} finally {
|
||||
cleanup(base);
|
||||
}
|
||||
}
|
||||
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
// Test: depends_on inline array format removal
|
||||
// ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
console.log('\n=== E2E: depends_on inline format preserved after partial removal ===');
|
||||
{
|
||||
const base = createFixtureBase();
|
||||
try {
|
||||
writeCompleteMilestone(base, 'M007');
|
||||
// M008 depends on both M009 and M010
|
||||
writeContext(base, 'M008', 'depends_on: [M009, M010]', 'Parallel');
|
||||
writeContext(base, 'M009', '', 'Visibility');
|
||||
writeContext(base, 'M010', '', 'Other');
|
||||
|
||||
// Verify both deps are parsed
|
||||
const contentBefore = readContextFile(base, 'M008');
|
||||
const depsBefore = parseContextDependsOn(contentBefore);
|
||||
assertEq(depsBefore.length, 2, 'M008 has 2 deps before');
|
||||
|
||||
// Simulate removing only M009 dep (keep M010)
|
||||
const content = readContextFile(base, 'M008');
|
||||
const updated = content.replace('depends_on: [M009, M010]', 'depends_on: [M010]');
|
||||
writeFileSync(join(base, '.gsd', 'milestones', 'M008', 'M008-CONTEXT.md'), updated);
|
||||
|
||||
// Verify only M010 remains
|
||||
const contentAfter = readContextFile(base, 'M008');
|
||||
const depsAfter = parseContextDependsOn(contentAfter);
|
||||
assertEq(depsAfter.length, 1, 'M008 has 1 dep after removal');
|
||||
assertEq(depsAfter[0], 'M010', 'remaining dep is M010');
|
||||
|
||||
} finally {
|
||||
cleanup(base);
|
||||
}
|
||||
}
|
||||
|
||||
report();
|
||||
@@ -1,9 +1,15 @@
import test from "node:test";
import assert from "node:assert/strict";
import { parseSlackReply, parseDiscordResponse } from "../../remote-questions/format.ts";
import { readFileSync } from "node:fs";
import { join, dirname } from "node:path";
import { fileURLToPath } from "node:url";
import { parseSlackReply, parseDiscordResponse, formatForDiscord } from "../../remote-questions/format.ts";
import { resolveRemoteConfig, isValidChannelId } from "../../remote-questions/config.ts";
import { sanitizeError } from "../../remote-questions/manager.ts";

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

test("parseSlackReply handles single-number single-question answers", () => {
  const result = parseSlackReply("2", [{
    id: "choice",
@@ -153,3 +159,223 @@ test("sanitizeError preserves short safe messages", () => {
  assert.equal(sanitizeError("Connection refused"), "Connection refused");
});

// ═══════════════════════════════════════════════════════════════════════════
// Discord Parity Tests
// ═══════════════════════════════════════════════════════════════════════════

test("formatForDiscord includes context source in footer when present", () => {
  const prompt = {
    id: "test-1",
    channel: "discord" as const,
    createdAt: Date.now(),
    timeoutAt: Date.now() + 60000,
    pollIntervalMs: 5000,
    context: { source: "auto-mode-dispatch" },
    questions: [{
      id: "q1",
      header: "Confirm",
      question: "Proceed?",
      options: [
        { label: "Yes", description: "Continue" },
        { label: "No", description: "Stop" },
      ],
      allowMultiple: false,
    }],
  };

  const { embeds } = formatForDiscord(prompt);
  assert.equal(embeds.length, 1);
  assert.ok(embeds[0].footer?.text.includes("auto-mode-dispatch"), "footer should include context source");
});

test("formatForDiscord omits source from footer when context is absent", () => {
  const prompt = {
    id: "test-2",
    channel: "discord" as const,
    createdAt: Date.now(),
    timeoutAt: Date.now() + 60000,
    pollIntervalMs: 5000,
    questions: [{
      id: "q1",
      header: "Choice",
      question: "Pick one",
      options: [
        { label: "A", description: "Alpha" },
        { label: "B", description: "Beta" },
      ],
      allowMultiple: false,
    }],
  };

  const { embeds } = formatForDiscord(prompt);
  assert.ok(!embeds[0].footer?.text.includes("Source:"), "footer should not include Source when context absent");
});

test("formatForDiscord multi-question footer includes question position", () => {
  const prompt = {
    id: "test-3",
    channel: "discord" as const,
    createdAt: Date.now(),
    timeoutAt: Date.now() + 60000,
    pollIntervalMs: 5000,
    questions: [
      {
        id: "q1",
        header: "First",
        question: "Pick",
        options: [{ label: "A", description: "a" }],
        allowMultiple: false,
      },
      {
        id: "q2",
        header: "Second",
        question: "Pick",
        options: [{ label: "B", description: "b" }],
        allowMultiple: false,
      },
    ],
  };

  const { embeds } = formatForDiscord(prompt);
  assert.equal(embeds.length, 2);
  assert.ok(embeds[0].footer?.text.includes("1/2"), "first embed footer should show 1/2");
  assert.ok(embeds[1].footer?.text.includes("2/2"), "second embed footer should show 2/2");
});

test("formatForDiscord single-question generates reaction emojis", () => {
  const prompt = {
    id: "test-4",
    channel: "discord" as const,
    createdAt: Date.now(),
    timeoutAt: Date.now() + 60000,
    pollIntervalMs: 5000,
    questions: [{
      id: "q1",
      header: "Pick",
      question: "Choose",
      options: [
        { label: "A", description: "a" },
        { label: "B", description: "b" },
        { label: "C", description: "c" },
      ],
      allowMultiple: false,
    }],
  };

  const { reactionEmojis } = formatForDiscord(prompt);
  assert.equal(reactionEmojis.length, 3, "should generate 3 reaction emojis for 3 options");
  assert.equal(reactionEmojis[0], "1️⃣");
  assert.equal(reactionEmojis[1], "2️⃣");
  assert.equal(reactionEmojis[2], "3️⃣");
});

test("formatForDiscord multi-question generates no reaction emojis", () => {
  const prompt = {
    id: "test-5",
    channel: "discord" as const,
    createdAt: Date.now(),
    timeoutAt: Date.now() + 60000,
    pollIntervalMs: 5000,
    questions: [
      {
        id: "q1",
        header: "First",
        question: "Pick",
        options: [{ label: "A", description: "a" }],
        allowMultiple: false,
      },
      {
        id: "q2",
        header: "Second",
        question: "Pick",
        options: [{ label: "B", description: "b" }],
        allowMultiple: false,
      },
    ],
  };

  const { reactionEmojis } = formatForDiscord(prompt);
  assert.equal(reactionEmojis.length, 0, "multi-question should not generate reaction emojis");
});

test("parseDiscordResponse handles multi-question text reply via semicolons", () => {
  const result = parseDiscordResponse([], "1;2", [
    {
      id: "first",
      header: "First",
      question: "Pick one",
      allowMultiple: false,
      options: [
        { label: "Alpha", description: "A" },
        { label: "Beta", description: "B" },
      ],
    },
    {
      id: "second",
      header: "Second",
      question: "Pick one",
      allowMultiple: false,
      options: [
        { label: "Gamma", description: "G" },
        { label: "Delta", description: "D" },
      ],
    },
  ]);

  assert.deepEqual(result.answers.first.answers, ["Alpha"]);
  assert.deepEqual(result.answers.second.answers, ["Delta"]);
});

test("parseDiscordResponse handles multiple reactions for allowMultiple question", () => {
  const result = parseDiscordResponse(
    [{ emoji: "1️⃣", count: 1 }, { emoji: "3️⃣", count: 1 }],
    null,
    [{
      id: "choice",
      header: "Choice",
      question: "Pick any",
      allowMultiple: true,
      options: [
        { label: "Alpha", description: "A" },
        { label: "Beta", description: "B" },
        { label: "Gamma", description: "G" },
      ],
    }],
  );

  assert.deepEqual(result.answers.choice.answers, ["Alpha", "Gamma"]);
});

test("DiscordAdapter source-level: acknowledgeAnswer method exists", () => {
  const adapterSrc = readFileSync(
    join(__dirname, "..", "..", "remote-questions", "discord-adapter.ts"),
    "utf-8",
  );
  assert.ok(adapterSrc.includes("async acknowledgeAnswer"), "should have acknowledgeAnswer method");
  assert.ok(adapterSrc.includes("✅"), "should use checkmark emoji for acknowledgement");
});

test("DiscordAdapter source-level: resolves guild ID for message URLs", () => {
  const adapterSrc = readFileSync(
    join(__dirname, "..", "..", "remote-questions", "discord-adapter.ts"),
    "utf-8",
  );
  assert.ok(adapterSrc.includes("guildId"), "should track guild ID");
  assert.ok(adapterSrc.includes("guild_id"), "should read guild_id from channel info");
  assert.ok(
    adapterSrc.includes("discord.com/channels/"),
    "should construct message URL with guild/channel/message format",
  );
});

test("DiscordAdapter source-level: sendPrompt sets threadUrl in ref", () => {
  const adapterSrc = readFileSync(
    join(__dirname, "..", "..", "remote-questions", "discord-adapter.ts"),
    "utf-8",
  );
  assert.ok(
    adapterSrc.includes("threadUrl: messageUrl"),
    "sendPrompt should set threadUrl to the constructed message URL",
  );
});
@@ -1,87 +1,240 @@
/**
 * Routing History — structural tests for adaptive learning module.
 *
 * Verifies routing-history.ts exports and structure from #579.
 * Uses source-level checks to avoid @gsd/pi-coding-agent import chain.
 */

import test from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join, dirname } from "node:path";
import { fileURLToPath } from "node:url";
import { mkdirSync, rmSync, writeFileSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

const __dirname = dirname(fileURLToPath(import.meta.url));
const historySrc = readFileSync(join(__dirname, "..", "routing-history.ts"), "utf-8");
import {
  initRoutingHistory,
  resetRoutingHistory,
  recordOutcome,
  recordFeedback,
  getAdaptiveTierAdjustment,
  clearRoutingHistory,
  getRoutingHistory,
} from "../routing-history.js";

// ═══════════════════════════════════════════════════════════════════════════
// Module Exports
// ═══════════════════════════════════════════════════════════════════════════
// ─── Test Setup ──────────────────────────────────────────────────────────────

test("routing-history: exports initRoutingHistory", () => {
  assert.ok(historySrc.includes("export function initRoutingHistory"), "should export initRoutingHistory");
function makeTmpDir(): string {
  const dir = join(tmpdir(), `gsd-routing-test-${Date.now()}-${Math.random().toString(36).slice(2)}`);
  mkdirSync(join(dir, ".gsd"), { recursive: true });
  return dir;
}

function cleanup(dir: string): void {
  try { rmSync(dir, { recursive: true, force: true }); } catch {}
  resetRoutingHistory();
}

// ─── recordOutcome ───────────────────────────────────────────────────────────

test("recordOutcome tracks success and failure counts", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordOutcome("execute-task", "standard", true);
    recordOutcome("execute-task", "standard", true);
    recordOutcome("execute-task", "standard", false);

    const history = getRoutingHistory();
    assert.ok(history);
    const pattern = history.patterns["execute-task"];
    assert.ok(pattern);
    assert.equal(pattern.standard.success, 2);
    assert.equal(pattern.standard.fail, 1);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: exports recordOutcome", () => {
  assert.ok(historySrc.includes("export function recordOutcome"), "should export recordOutcome");
test("recordOutcome tracks tag-specific patterns", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordOutcome("execute-task", "light", true, ["docs"]);

    const history = getRoutingHistory();
    assert.ok(history);
    assert.ok(history.patterns["execute-task:docs"]);
    assert.equal(history.patterns["execute-task:docs"].light.success, 1);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: exports recordFeedback", () => {
  assert.ok(historySrc.includes("export function recordFeedback"), "should export recordFeedback");
test("recordOutcome applies rolling window", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    // Record 60 successes — should be capped to 50
    for (let i = 0; i < 60; i++) {
      recordOutcome("execute-task", "standard", true);
    }

    const history = getRoutingHistory();
    assert.ok(history);
    const total = history.patterns["execute-task"].standard.success +
      history.patterns["execute-task"].standard.fail;
    assert.ok(total <= 50, `total ${total} should be <= 50`);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: exports getAdaptiveTierAdjustment", () => {
  assert.ok(historySrc.includes("export function getAdaptiveTierAdjustment"), "should export getAdaptiveTierAdjustment");
// ─── getAdaptiveTierAdjustment ───────────────────────────────────────────────

test("no adjustment when insufficient data", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordOutcome("execute-task", "light", false);
    // Only 1 data point — not enough
    const adj = getAdaptiveTierAdjustment("execute-task", "light");
    assert.equal(adj, null);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: exports resetRoutingHistory", () => {
  assert.ok(historySrc.includes("export function resetRoutingHistory"), "should export resetRoutingHistory");
test("bumps tier when failure rate exceeds threshold", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    // Record high failure rate at light tier
    recordOutcome("execute-task", "light", false);
    recordOutcome("execute-task", "light", false);
    recordOutcome("execute-task", "light", true);
    // 2/3 = 66% failure rate > 20% threshold

    const adj = getAdaptiveTierAdjustment("execute-task", "light");
    assert.equal(adj, "standard");
  } finally {
    cleanup(dir);
  }
});

// ═══════════════════════════════════════════════════════════════════════════
// Design Constants
// ═══════════════════════════════════════════════════════════════════════════

test("routing-history: uses rolling window of 50 entries", () => {
  assert.ok(historySrc.includes("ROLLING_WINDOW = 50"), "should use 50-entry rolling window");
test("no adjustment when success rate is high", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    for (let i = 0; i < 10; i++) {
      recordOutcome("execute-task", "light", true);
    }
    const adj = getAdaptiveTierAdjustment("execute-task", "light");
    assert.equal(adj, null);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: failure threshold is 20%", () => {
  assert.ok(historySrc.includes("FAILURE_THRESHOLD = 0.20"), "should use 20% failure threshold");
test("tag-specific patterns take precedence", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    // Base pattern has high success rate (tagged calls also count toward base)
    for (let i = 0; i < 15; i++) {
      recordOutcome("execute-task", "light", true);
    }
    // But docs-tagged tasks fail at light
    recordOutcome("execute-task", "light", false, ["docs"]);
    recordOutcome("execute-task", "light", false, ["docs"]);
    recordOutcome("execute-task", "light", true, ["docs"]);

    // With tags, should bump (docs pattern: 1/3 success = 66% failure)
    const adj = getAdaptiveTierAdjustment("execute-task", "light", ["docs"]);
    assert.equal(adj, "standard");

    // Without tags, should not bump (base: 16/18 success = 11% failure)
    const adjBase = getAdaptiveTierAdjustment("execute-task", "light");
    assert.equal(adjBase, null);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: feedback weight is 2x", () => {
  assert.ok(historySrc.includes("FEEDBACK_WEIGHT = 2"), "feedback should count 2x");
// ─── recordFeedback ──────────────────────────────────────────────────────────

test("recordFeedback stores feedback entries", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordFeedback("execute-task", "M001/S01/T01", "standard", "over");

    const history = getRoutingHistory();
    assert.ok(history);
    assert.equal(history.feedback.length, 1);
    assert.equal(history.feedback[0].rating, "over");
    assert.equal(history.feedback[0].tier, "standard");
  } finally {
    cleanup(dir);
  }
});

// ═══════════════════════════════════════════════════════════════════════════
// Type Structure
// ═══════════════════════════════════════════════════════════════════════════
test("recordFeedback 'under' increases failure count at tier", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordFeedback("execute-task", "M001/S01/T01", "light", "under");

test("routing-history: imports ComplexityTier from types.ts", () => {
  assert.ok(
    historySrc.includes('from "./types.js"') && historySrc.includes("ComplexityTier"),
    "should import ComplexityTier from types.ts",
  );
    const history = getRoutingHistory();
    assert.ok(history);
    // "under" adds 2 (FEEDBACK_WEIGHT) failures
    assert.equal(history.patterns["execute-task"].light.fail, 2);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: defines RoutingHistoryData interface", () => {
  assert.ok(historySrc.includes("interface RoutingHistoryData"), "should define RoutingHistoryData");
test("recordFeedback 'over' increases success count at lower tier", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordFeedback("execute-task", "M001/S01/T01", "standard", "over");

    const history = getRoutingHistory();
    assert.ok(history);
    // "over" at standard → adds 2 successes at light
    assert.equal(history.patterns["execute-task"].light.success, 2);
  } finally {
    cleanup(dir);
  }
});

test("routing-history: defines FeedbackEntry interface", () => {
  assert.ok(historySrc.includes("interface FeedbackEntry"), "should define FeedbackEntry");
// ─── clearRoutingHistory ─────────────────────────────────────────────────────

test("clearRoutingHistory resets all data", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordOutcome("execute-task", "light", true);
    clearRoutingHistory(dir);

    const history = getRoutingHistory();
    assert.ok(history);
    assert.deepEqual(history.patterns, {});
    assert.deepEqual(history.feedback, []);
  } finally {
    cleanup(dir);
  }
});

// ═══════════════════════════════════════════════════════════════════════════
// Persistence
// ═══════════════════════════════════════════════════════════════════════════
// ─── Persistence ─────────────────────────────────────────────────────────────

test("routing-history: persists to routing-history.json", () => {
  assert.ok(historySrc.includes("routing-history.json"), "should persist to routing-history.json");
});
test("routing history persists to disk and reloads", () => {
  const dir = makeTmpDir();
  try {
    initRoutingHistory(dir);
    recordOutcome("execute-task", "standard", true);
    recordOutcome("execute-task", "standard", true);
    resetRoutingHistory();

test("routing-history: has save and load functions", () => {
  assert.ok(historySrc.includes("saveHistory") || historySrc.includes("function save"), "should have save");
  assert.ok(historySrc.includes("loadHistory") || historySrc.includes("function load"), "should have load");
    // Reload from disk
    initRoutingHistory(dir);
    const history = getRoutingHistory();
    assert.ok(history);
    assert.equal(history.patterns["execute-task"].standard.success, 2);
  } finally {
    cleanup(dir);
  }
});
139 src/resources/extensions/gsd/tests/stale-worktree-cwd.test.ts Normal file
@@ -0,0 +1,139 @@
/**
 * stale-worktree-cwd.test.ts — Tests for #608 fix.
 *
 * Verifies that when process.cwd() is inside a stale .gsd/worktrees/ path,
 * startAuto escapes back to the project root before proceeding.
 */

import test from "node:test";
import assert from "node:assert/strict";
import { mkdtempSync, mkdirSync, rmSync, existsSync, realpathSync, writeFileSync } from "node:fs";
import { join, sep } from "node:path";
import { tmpdir } from "node:os";
import { execSync } from "node:child_process";

import {
  createAutoWorktree,
  teardownAutoWorktree,
  mergeMilestoneToMain,
} from "../auto-worktree.ts";

function run(command: string, cwd: string): string {
  return execSync(command, { cwd, stdio: ["ignore", "pipe", "pipe"], encoding: "utf-8" }).trim();
}

function createTempRepo(): string {
  const dir = realpathSync(mkdtempSync(join(tmpdir(), "stale-wt-test-")));
  run("git init", dir);
  run("git config user.email test@test.com", dir);
  run("git config user.name Test", dir);
  writeFileSync(join(dir, "README.md"), "# test\n");
  run("git add .", dir);
  run("git commit -m init", dir);
  run("git branch -M main", dir);
  return dir;
}

// ─── escapeStaleWorktree is called by startAuto, test the detection logic ────

test("detects stale worktree path and extracts project root", () => {
  // Simulate the path pattern: /project/.gsd/worktrees/M004/...
  const projectRoot = "/Users/test/myproject";
  const stalePath = `${projectRoot}${sep}.gsd${sep}worktrees${sep}M004`;

  const marker = `${sep}.gsd${sep}worktrees${sep}`;
  const idx = stalePath.indexOf(marker);

  assert.ok(idx !== -1, "marker found in stale path");
  assert.equal(stalePath.slice(0, idx), projectRoot, "project root extracted correctly");
});

test("does not trigger on normal project path", () => {
  const normalPath = "/Users/test/myproject";
  const marker = `${sep}.gsd${sep}worktrees${sep}`;
  const idx = normalPath.indexOf(marker);

  assert.equal(idx, -1, "marker not found in normal path");
});

// ─── Integration: mergeMilestoneToMain restores cwd ─────────────────────────

test("mergeMilestoneToMain restores cwd to project root", () => {
  const savedCwd = process.cwd();
  let tempDir = "";

  try {
    tempDir = createTempRepo();

    // Create milestone planning artifacts
    const msDir = join(tempDir, ".gsd", "milestones", "M050");
    mkdirSync(msDir, { recursive: true });
    writeFileSync(join(msDir, "CONTEXT.md"), "# M050 Context\n");
    const roadmap = [
      "# M050: Test Milestone",
      "**Vision**: testing",
      "## Success Criteria",
      "- It works",
      "## Slices",
      "- [x] S01 — First slice",
    ].join("\n");
    writeFileSync(join(msDir, "ROADMAP.md"), roadmap);
    run("git add .", tempDir);
    run("git commit -m \"add milestone\"", tempDir);

    // Create auto-worktree (enters the worktree dir)
    const wtPath = createAutoWorktree(tempDir, "M050");
    assert.equal(process.cwd(), wtPath, "cwd is in worktree after create");

    // Add a change in the worktree
    writeFileSync(join(wtPath, "feature.txt"), "new feature\n");
    run("git add .", wtPath);
    run("git commit -m \"feat: add feature\"", wtPath);

    // Merge back — should restore cwd to tempDir
    mergeMilestoneToMain(tempDir, "M050", roadmap);

    assert.equal(process.cwd(), tempDir, "cwd restored to project root after merge");
    assert.ok(!existsSync(wtPath), "worktree directory removed after merge");
  } finally {
    process.chdir(savedCwd);
    if (tempDir && existsSync(tempDir)) {
      rmSync(tempDir, { recursive: true, force: true });
    }
  }
});

// ─── Integration: stale worktree directory is detectable ────────────────────

test("process.cwd() inside removed worktree is recoverable", () => {
  const savedCwd = process.cwd();
||||
let tempDir = "";
|
||||
|
||||
try {
|
||||
tempDir = createTempRepo();
|
||||
|
||||
// Create a .gsd/worktrees/M099 directory to simulate stale state
|
||||
const staleWtDir = join(tempDir, ".gsd", "worktrees", "M099");
|
||||
mkdirSync(staleWtDir, { recursive: true });
|
||||
|
||||
// Enter the stale directory
|
||||
process.chdir(staleWtDir);
|
||||
const cwdBefore = process.cwd();
|
||||
assert.ok(cwdBefore.includes(`${sep}.gsd${sep}worktrees${sep}`), "cwd is inside worktree dir");
|
||||
|
||||
// Simulate escapeStaleWorktree logic
|
||||
const marker = `${sep}.gsd${sep}worktrees${sep}`;
|
||||
const idx = cwdBefore.indexOf(marker);
|
||||
assert.ok(idx !== -1, "marker found");
|
||||
|
||||
const projectRoot = cwdBefore.slice(0, idx);
|
||||
process.chdir(projectRoot);
|
||||
|
||||
assert.equal(process.cwd(), tempDir, "successfully escaped to project root");
|
||||
} finally {
|
||||
process.chdir(savedCwd);
|
||||
if (tempDir && existsSync(tempDir)) {
|
||||
rmSync(tempDir, { recursive: true, force: true });
|
||||
}
|
||||
}
|
||||
});
|
||||
224 src/resources/extensions/gsd/tests/triage-dispatch.test.ts Normal file

@@ -0,0 +1,224 @@

/**
 * Triage dispatch ordering contract tests.
 *
 * These tests verify structural invariants of the triage integration
 * by inspecting the actual source code of auto.ts and post-unit-hooks.ts.
 * Full behavioral testing requires the @gsd/pi-coding-agent runtime.
 */

import test from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join, dirname } from "node:path";
import { fileURLToPath } from "node:url";

const __dirname = dirname(fileURLToPath(import.meta.url));
const autoPath = join(__dirname, "..", "auto.ts");
const hooksPath = join(__dirname, "..", "post-unit-hooks.ts");
const autoPromptsPath = join(__dirname, "..", "auto-prompts.ts");

const autoSrc = readFileSync(autoPath, "utf-8");
const hooksSrc = readFileSync(hooksPath, "utf-8");
const autoPromptsSrc = (() => { try { return readFileSync(autoPromptsPath, "utf-8"); } catch { return autoSrc; } })();

// ─── Hook exclusion ──────────────────────────────────────────────────────────

test("dispatch: triage-captures excluded from post-unit hook triggering", () => {
  // post-unit-hooks.ts must return null for triage-captures unit type
  assert.ok(
    hooksSrc.includes('"triage-captures"'),
    "post-unit-hooks.ts should reference triage-captures",
  );
  assert.ok(
    hooksSrc.includes('completedUnitType === "triage-captures"'),
    "should check for triage-captures in the hook exclusion guard",
  );
});

// ─── Triage check placement ──────────────────────────────────────────────────

test("dispatch: triage check appears after hook section and before stepMode check", () => {
  const hookRetryIndex = autoSrc.indexOf("isRetryPending()");
  // Find the triage check in handleAgentEnd (not in getAutoDashboardData)
  const triageCheckIndex = autoSrc.indexOf("Triage check: dispatch triage unit");
  const stepModeIndex = autoSrc.indexOf("In step mode, pause and show a wizard");

  assert.ok(hookRetryIndex > 0, "hook retry check should exist");
  assert.ok(triageCheckIndex > 0, "triage check block should exist");
  assert.ok(stepModeIndex > 0, "step mode check should exist");

  assert.ok(
    triageCheckIndex > hookRetryIndex,
    "triage check should come after hook retry check",
  );
  assert.ok(
    triageCheckIndex < stepModeIndex,
    "triage check should come before stepMode check",
  );
});

// ─── Guard conditions ────────────────────────────────────────────────────────

test("dispatch: triage check guards against step mode", () => {
  // The triage block should check !stepMode
  const triageBlock = autoSrc.slice(
    autoSrc.indexOf("Triage check: dispatch triage unit"),
    autoSrc.indexOf("In step mode, pause and show a wizard"),
  );
  assert.ok(
    triageBlock.includes("!stepMode"),
    "triage block should guard against step mode",
  );
});

test("dispatch: triage check guards against hook unit types", () => {
  const triageBlock = autoSrc.slice(
    autoSrc.indexOf("Triage check: dispatch triage unit"),
    autoSrc.indexOf("In step mode, pause and show a wizard"),
  );
  assert.ok(
    triageBlock.includes('!currentUnit.type.startsWith("hook/")'),
    "triage block should not fire for hook units",
  );
});

test("dispatch: triage check guards against triage-on-triage", () => {
  const triageBlock = autoSrc.slice(
    autoSrc.indexOf("Triage check: dispatch triage unit"),
    autoSrc.indexOf("In step mode, pause and show a wizard"),
  );
  assert.ok(
    triageBlock.includes('currentUnit.type !== "triage-captures"'),
    "triage block should not fire for triage units",
  );
});

test("dispatch: triage check guards against quick-task triggering triage", () => {
  const triageBlock = autoSrc.slice(
    autoSrc.indexOf("Triage check: dispatch triage unit"),
    autoSrc.indexOf("In step mode, pause and show a wizard"),
  );
  assert.ok(
    triageBlock.includes('currentUnit.type !== "quick-task"'),
    "triage block should not fire for quick-task units",
  );
});

test("dispatch: triage dispatch uses early-return pattern", () => {
  const triageBlock = autoSrc.slice(
    autoSrc.indexOf("Triage check: dispatch triage unit"),
    autoSrc.indexOf("In step mode, pause and show a wizard"),
  );
  assert.ok(
    triageBlock.includes("return; // handleAgentEnd will fire again"),
    "triage dispatch should return after sending message",
  );
});

test("dispatch: triage imports hasPendingCaptures and loadPendingCaptures", () => {
  assert.ok(
    autoSrc.includes('hasPendingCaptures, loadPendingCaptures, countPendingCaptures') &&
      autoSrc.includes('from "./captures.js"'),
    "auto.ts should import capture functions including countPendingCaptures",
  );
});

// ─── Prompt integration ──────────────────────────────────────────────────────

test("dispatch: replan prompt builder loads capture context", () => {
  const src = autoPromptsSrc;
  assert.ok(
    src.includes("loadReplanCaptures"),
    "buildReplanSlicePrompt should load replan captures",
  );
  assert.ok(
    src.includes("captureContext"),
    "buildReplanSlicePrompt should pass captureContext to template",
  );
});

test("dispatch: reassess prompt builder loads deferred captures", () => {
  const src = autoPromptsSrc;
  assert.ok(
    src.includes("loadDeferredCaptures"),
    "buildReassessRoadmapPrompt should load deferred captures",
  );
  assert.ok(
    src.includes("deferredCaptures"),
    "buildReassessRoadmapPrompt should pass deferredCaptures to template",
  );
});

// ─── Prompt templates ────────────────────────────────────────────────────────

test("dispatch: replan prompt template includes captureContext variable", () => {
  const promptPath = join(__dirname, "..", "prompts", "replan-slice.md");
  const prompt = readFileSync(promptPath, "utf-8");
  assert.ok(
    prompt.includes("{{captureContext}}"),
    "replan-slice.md should include {{captureContext}}",
  );
});

test("dispatch: reassess prompt template includes deferredCaptures variable", () => {
  const promptPath = join(__dirname, "..", "prompts", "reassess-roadmap.md");
  const prompt = readFileSync(promptPath, "utf-8");
  assert.ok(
    prompt.includes("{{deferredCaptures}}"),
    "reassess-roadmap.md should include {{deferredCaptures}}",
  );
});

test("dispatch: triage prompt template exists and has classification criteria", () => {
  const promptPath = join(__dirname, "..", "prompts", "triage-captures.md");
  const prompt = readFileSync(promptPath, "utf-8");
  assert.ok(prompt.includes("quick-task"), "should have quick-task classification");
  assert.ok(prompt.includes("inject"), "should have inject classification");
  assert.ok(prompt.includes("defer"), "should have defer classification");
  assert.ok(prompt.includes("replan"), "should have replan classification");
  assert.ok(prompt.includes("note"), "should have note classification");
  assert.ok(prompt.includes("{{pendingCaptures}}"), "should have pending captures variable");
});

// ─── Dashboard integration ───────────────────────────────────────────────────

test("dashboard: AutoDashboardData includes pendingCaptureCount field", () => {
  assert.ok(
    autoSrc.includes("pendingCaptureCount"),
    "auto.ts should have pendingCaptureCount in AutoDashboardData",
  );
});

test("dashboard: getAutoDashboardData computes pendingCaptureCount", () => {
  assert.ok(
    autoSrc.includes("pendingCaptureCount = countPendingCaptures") ||
      autoSrc.includes("pendingCaptureCount = countPendingCaptures(basePath)"),
    "getAutoDashboardData should compute pendingCaptureCount from countPendingCaptures (single-read)",
  );
});

test("dashboard: overlay renders pending captures badge", () => {
  const overlayPath = join(__dirname, "..", "dashboard-overlay.ts");
  const overlaySrc = readFileSync(overlayPath, "utf-8");
  assert.ok(
    overlaySrc.includes("pendingCaptureCount"),
    "dashboard-overlay.ts should reference pendingCaptureCount",
  );
  assert.ok(
    overlaySrc.includes("pending capture"),
    "dashboard-overlay.ts should show pending captures text",
  );
});

test("dashboard: overlay labels triage-captures and quick-task unit types", () => {
  const overlayPath = join(__dirname, "..", "dashboard-overlay.ts");
  const overlaySrc = readFileSync(overlayPath, "utf-8");
  assert.ok(
    overlaySrc.includes('"triage-captures"'),
    "unitLabel should handle triage-captures",
  );
  assert.ok(
    overlaySrc.includes('"quick-task"'),
    "unitLabel should handle quick-task",
  );
});
215 src/resources/extensions/gsd/tests/triage-resolution.test.ts Normal file

@@ -0,0 +1,215 @@

/**
 * Unit tests for GSD Triage Resolution — resolution execution and file overlap detection.
 */

import test from "node:test";
import assert from "node:assert/strict";
import { mkdirSync, readFileSync, writeFileSync, rmSync, existsSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { appendCapture, markCaptureResolved, loadAllCaptures } from "../captures.ts";
// Import only the functions that don't depend on @gsd/pi-coding-agent
// (triage-ui.ts imports next-action-ui.ts which imports the unavailable package)
import { executeInject, executeReplan, detectFileOverlap, loadDeferredCaptures, loadReplanCaptures, buildQuickTaskPrompt } from "../triage-resolution.ts";

function makeTempDir(prefix: string): string {
  const dir = join(
    tmpdir(),
    `${prefix}-${Date.now()}-${Math.random().toString(36).slice(2)}`,
  );
  mkdirSync(dir, { recursive: true });
  return dir;
}

function setupPlanFile(tmp: string, mid: string, sid: string, content: string): string {
  const planDir = join(tmp, ".gsd", "milestones", mid, "slices", sid);
  mkdirSync(planDir, { recursive: true });
  const planPath = join(planDir, `${sid}-PLAN.md`);
  writeFileSync(planPath, content, "utf-8");
  return planPath;
}

const SAMPLE_PLAN = `# S01: Test Slice

**Goal:** Test
**Demo:** Test

## Must-Haves

- Something works

## Tasks

- [x] **T01: First task** \`est:1h\`
  - Why: Setup
  - Files: \`src/foo.ts\`, \`src/bar.ts\`
  - Do: Build it
  - Done when: Tests pass

- [ ] **T02: Second task** \`est:1h\`
  - Why: Feature
  - Files: \`src/baz.ts\`, \`src/qux.ts\`
  - Do: Build it
  - Done when: Tests pass

- [ ] **T03: Third task** \`est:30m\`
  - Why: Polish
  - Files: \`src/qux.ts\`, \`src/config.ts\`
  - Do: Build it
  - Done when: Tests pass

## Files Likely Touched

- \`src/foo.ts\`
- \`src/bar.ts\`
`;

// ─── executeInject ────────────────────────────────────────────────────────────

test("resolution: executeInject appends a new task to the plan", () => {
  const tmp = makeTempDir("res-inject");
  try {
    const planPath = setupPlanFile(tmp, "M001", "S01", SAMPLE_PLAN);
    const captureId = appendCapture(tmp, "add retry logic");
    const captures = loadAllCaptures(tmp);
    const capture = captures[0];

    const newId = executeInject(tmp, "M001", "S01", capture);

    assert.strictEqual(newId, "T04", "should be T04 (next after T03)");

    const updated = readFileSync(planPath, "utf-8");
    assert.ok(updated.includes("**T04:"), "should have T04 in plan");
    assert.ok(updated.includes(capture.text), "should include capture text");
    assert.ok(updated.includes("## Files Likely Touched"), "should preserve files section");

    // T04 should appear before Files Likely Touched
    const t04Pos = updated.indexOf("**T04:");
    const filesPos = updated.indexOf("## Files Likely Touched");
    assert.ok(t04Pos < filesPos, "T04 should be before Files section");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("resolution: executeInject returns null when plan doesn't exist", () => {
  const tmp = makeTempDir("res-inject-noplan");
  try {
    const captureId = appendCapture(tmp, "some task");
    const captures = loadAllCaptures(tmp);
    const result = executeInject(tmp, "M001", "S01", captures[0]);
    assert.strictEqual(result, null);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

// ─── executeReplan ────────────────────────────────────────────────────────────

test("resolution: executeReplan writes REPLAN-TRIGGER.md", () => {
  const tmp = makeTempDir("res-replan");
  try {
    setupPlanFile(tmp, "M001", "S01", SAMPLE_PLAN);
    const captureId = appendCapture(tmp, "approach is wrong, need different strategy");
    const captures = loadAllCaptures(tmp);
    const capture = captures[0];

    const result = executeReplan(tmp, "M001", "S01", capture);
    assert.strictEqual(result, true);

    const triggerPath = join(
      tmp, ".gsd", "milestones", "M001", "slices", "S01", "S01-REPLAN-TRIGGER.md",
    );
    assert.ok(existsSync(triggerPath), "trigger file should exist");

    const content = readFileSync(triggerPath, "utf-8");
    assert.ok(content.includes(capture.id), "should include capture ID");
    assert.ok(content.includes(capture.text), "should include capture text");
    assert.ok(content.includes("# Replan Trigger"), "should have header");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

// ─── detectFileOverlap ───────────────────────────────────────────────────────

test("resolution: detectFileOverlap finds overlapping incomplete tasks", () => {
  const overlaps = detectFileOverlap(["src/qux.ts"], SAMPLE_PLAN);
  assert.deepStrictEqual(overlaps, ["T02", "T03"]);
});

test("resolution: detectFileOverlap ignores completed tasks", () => {
  // T01 is [x] and uses src/foo.ts — should NOT be returned
  const overlaps = detectFileOverlap(["src/foo.ts"], SAMPLE_PLAN);
  assert.deepStrictEqual(overlaps, []);
});

test("resolution: detectFileOverlap returns empty when no overlap", () => {
  const overlaps = detectFileOverlap(["src/unrelated.ts"], SAMPLE_PLAN);
  assert.deepStrictEqual(overlaps, []);
});

test("resolution: detectFileOverlap returns empty for empty affected files", () => {
  assert.deepStrictEqual(detectFileOverlap([], SAMPLE_PLAN), []);
});

test("resolution: detectFileOverlap is case-insensitive", () => {
  const overlaps = detectFileOverlap(["SRC/QUX.TS"], SAMPLE_PLAN);
  assert.deepStrictEqual(overlaps, ["T02", "T03"]);
});

// ─── loadDeferredCaptures / loadReplanCaptures ───────────────────────────────

test("resolution: loadDeferredCaptures returns only deferred captures", () => {
  const tmp = makeTempDir("res-deferred");
  try {
    const id1 = appendCapture(tmp, "deferred one");
    const id2 = appendCapture(tmp, "note one");
    const id3 = appendCapture(tmp, "deferred two");

    markCaptureResolved(tmp, id1, "defer", "deferred to S03", "future work");
    markCaptureResolved(tmp, id2, "note", "acknowledged", "just a note");
    markCaptureResolved(tmp, id3, "defer", "deferred to S04", "later");

    const deferred = loadDeferredCaptures(tmp);
    assert.strictEqual(deferred.length, 2);
    assert.strictEqual(deferred[0].id, id1);
    assert.strictEqual(deferred[1].id, id3);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

test("resolution: loadReplanCaptures returns only replan captures", () => {
  const tmp = makeTempDir("res-replan-load");
  try {
    const id1 = appendCapture(tmp, "needs replan");
    const id2 = appendCapture(tmp, "just a note");

    markCaptureResolved(tmp, id1, "replan", "replan triggered", "approach changed");
    markCaptureResolved(tmp, id2, "note", "acknowledged", "info only");

    const replans = loadReplanCaptures(tmp);
    assert.strictEqual(replans.length, 1);
    assert.strictEqual(replans[0].id, id1);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
});

// ─── buildQuickTaskPrompt ────────────────────────────────────────────────────

test("resolution: buildQuickTaskPrompt includes capture text and ID", () => {
  const prompt = buildQuickTaskPrompt({
    id: "CAP-abc123",
    text: "add retry logic to OAuth",
    timestamp: "2026-03-15T20:00:00Z",
    status: "resolved",
    classification: "quick-task",
  });

  assert.ok(prompt.includes("CAP-abc123"), "should include capture ID");
  assert.ok(prompt.includes("add retry logic to OAuth"), "should include capture text");
  assert.ok(prompt.includes("Quick Task"), "should have Quick Task header");
  assert.ok(prompt.includes("Do NOT modify"), "should warn about plan files");
});
198 src/resources/extensions/gsd/tests/visualizer-data.test.ts Normal file

@@ -0,0 +1,198 @@

// Tests for GSD visualizer data loader.
// Verifies the VisualizerData interface shape and source-file contracts.

import { readFileSync } from "node:fs";
import { join, dirname } from "node:path";
import { fileURLToPath } from "node:url";
import { createTestContext } from "./test-helpers.ts";

const __dirname = dirname(fileURLToPath(import.meta.url));
const { assertTrue, report } = createTestContext();

const dataPath = join(__dirname, "..", "visualizer-data.ts");
const dataSrc = readFileSync(dataPath, "utf-8");

console.log("\n=== visualizer-data.ts source contracts ===");

// Interface exports
assertTrue(
  dataSrc.includes("export interface VisualizerData"),
  "exports VisualizerData interface",
);

assertTrue(
  dataSrc.includes("export interface VisualizerMilestone"),
  "exports VisualizerMilestone interface",
);

assertTrue(
  dataSrc.includes("export interface VisualizerSlice"),
  "exports VisualizerSlice interface",
);

assertTrue(
  dataSrc.includes("export interface VisualizerTask"),
  "exports VisualizerTask interface",
);

// Function export
assertTrue(
  dataSrc.includes("export async function loadVisualizerData"),
  "exports loadVisualizerData function",
);

// Data source usage
assertTrue(
  dataSrc.includes("deriveState"),
  "uses deriveState for state derivation",
);

assertTrue(
  dataSrc.includes("findMilestoneIds"),
  "uses findMilestoneIds to enumerate milestones",
);

assertTrue(
  dataSrc.includes("parseRoadmap"),
  "uses parseRoadmap for roadmap parsing",
);

assertTrue(
  dataSrc.includes("parsePlan"),
  "uses parsePlan for plan parsing",
);

assertTrue(
  dataSrc.includes("getLedger"),
  "uses getLedger for in-memory metrics",
);

assertTrue(
  dataSrc.includes("loadLedgerFromDisk"),
  "uses loadLedgerFromDisk as fallback",
);

assertTrue(
  dataSrc.includes("getProjectTotals"),
  "uses getProjectTotals for aggregation",
);

assertTrue(
  dataSrc.includes("aggregateByPhase"),
  "uses aggregateByPhase",
);

assertTrue(
  dataSrc.includes("aggregateBySlice"),
  "uses aggregateBySlice",
);

assertTrue(
  dataSrc.includes("aggregateByModel"),
  "uses aggregateByModel",
);

// Interface fields
assertTrue(
  dataSrc.includes("dependsOn: string[]"),
  "VisualizerMilestone has dependsOn field",
);

assertTrue(
  dataSrc.includes("depends: string[]"),
  "VisualizerSlice has depends field",
);

assertTrue(
  dataSrc.includes("totals: ProjectTotals | null"),
  "VisualizerData has nullable totals",
);

assertTrue(
  dataSrc.includes("units: UnitMetrics[]"),
  "VisualizerData has units array",
);

// Verify overlay source exists and imports data module
const overlayPath = join(__dirname, "..", "visualizer-overlay.ts");
const overlaySrc = readFileSync(overlayPath, "utf-8");

console.log("\n=== visualizer-overlay.ts source contracts ===");

assertTrue(
  overlaySrc.includes("export class GSDVisualizerOverlay"),
  "exports GSDVisualizerOverlay class",
);

assertTrue(
  overlaySrc.includes("loadVisualizerData"),
  "overlay uses loadVisualizerData",
);

assertTrue(
  overlaySrc.includes("renderProgressView"),
  "overlay delegates to renderProgressView",
);

assertTrue(
  overlaySrc.includes("renderDepsView"),
  "overlay delegates to renderDepsView",
);

assertTrue(
  overlaySrc.includes("renderMetricsView"),
  "overlay delegates to renderMetricsView",
);

assertTrue(
  overlaySrc.includes("renderTimelineView"),
  "overlay delegates to renderTimelineView",
);

assertTrue(
  overlaySrc.includes("handleInput"),
  "overlay has handleInput method",
);

assertTrue(
  overlaySrc.includes("dispose"),
  "overlay has dispose method",
);

assertTrue(
  overlaySrc.includes("wrapInBox"),
  "overlay has wrapInBox helper",
);

assertTrue(
  overlaySrc.includes("activeTab"),
  "overlay tracks active tab",
);

assertTrue(
  overlaySrc.includes("scrollOffsets"),
  "overlay tracks per-tab scroll offsets",
);

// Verify commands.ts integration
const commandsPath = join(__dirname, "..", "commands.ts");
const commandsSrc = readFileSync(commandsPath, "utf-8");

console.log("\n=== commands.ts integration ===");

assertTrue(
  commandsSrc.includes('"visualize"'),
  "commands.ts has visualize in subcommands array",
);

assertTrue(
  commandsSrc.includes("GSDVisualizerOverlay"),
  "commands.ts imports GSDVisualizerOverlay",
);

assertTrue(
  commandsSrc.includes("handleVisualize"),
  "commands.ts has handleVisualize handler",
);

report();
255 src/resources/extensions/gsd/tests/visualizer-views.test.ts Normal file

@@ -0,0 +1,255 @@

// Tests for GSD visualizer view renderers.
// Tests the pure view functions with mock data — no file I/O.

import {
  renderProgressView,
  renderDepsView,
  renderMetricsView,
  renderTimelineView,
} from "../visualizer-views.js";
import type { VisualizerData } from "../visualizer-data.js";
import { createTestContext } from "./test-helpers.ts";

const { assertEq, assertTrue, report } = createTestContext();

// ─── Mock theme ─────────────────────────────────────────────────────────────

const mockTheme = {
  fg: (_color: string, text: string) => text,
  bold: (text: string) => text,
} as any;

// ─── Test data factories ────────────────────────────────────────────────────

function makeVisualizerData(overrides: Partial<VisualizerData> = {}): VisualizerData {
  return {
    milestones: [],
    phase: "executing",
    totals: null,
    byPhase: [],
    bySlice: [],
    byModel: [],
    units: [],
    ...overrides,
  };
}

// ─── renderProgressView ─────────────────────────────────────────────────────

console.log("\n=== renderProgressView ===");

{
  const data = makeVisualizerData({
    milestones: [
      {
        id: "M001",
        title: "First Milestone",
        status: "active",
        dependsOn: [],
        slices: [
          {
            id: "S01",
            title: "Core Types",
            done: true,
            active: false,
            risk: "low",
            depends: [],
            tasks: [],
          },
          {
            id: "S02",
            title: "State Engine",
            done: false,
            active: true,
            risk: "high",
            depends: ["S01"],
            tasks: [
              { id: "T01", title: "Dispatch Loop", done: false, active: true },
              { id: "T02", title: "Session Mgmt", done: true, active: false },
            ],
          },
          {
            id: "S03",
            title: "Dashboard",
            done: false,
            active: false,
            risk: "medium",
            depends: ["S02"],
            tasks: [],
          },
        ],
      },
      {
        id: "M002",
        title: "Plugin Arch",
        status: "pending",
        dependsOn: ["M001"],
        slices: [],
      },
    ],
  });

  const lines = renderProgressView(data, mockTheme, 80);
  assertTrue(lines.length > 0, "progress view produces output");
  assertTrue(lines.some(l => l.includes("M001")), "shows milestone M001");
  assertTrue(lines.some(l => l.includes("S01")), "shows slice S01");
  assertTrue(lines.some(l => l.includes("T01")), "shows task T01 for active slice");
  assertTrue(lines.some(l => l.includes("M002")), "shows milestone M002");
  assertTrue(lines.some(l => l.includes("depends on M001")), "shows dependency note");
}

{
  const data = makeVisualizerData({ milestones: [] });
  const lines = renderProgressView(data, mockTheme, 80);
  assertEq(lines.length, 0, "empty milestones produce no lines");
}

// ─── renderDepsView ─────────────────────────────────────────────────────────

console.log("\n=== renderDepsView ===");

{
  const data = makeVisualizerData({
    milestones: [
      {
        id: "M001",
        title: "First",
        status: "active",
        dependsOn: [],
        slices: [
          { id: "S01", title: "A", done: false, active: true, risk: "low", depends: [], tasks: [] },
          { id: "S02", title: "B", done: false, active: false, risk: "low", depends: ["S01"], tasks: [] },
        ],
      },
      {
        id: "M002",
        title: "Second",
        status: "pending",
        dependsOn: ["M001"],
        slices: [],
      },
    ],
  });

  const lines = renderDepsView(data, mockTheme, 80);
  assertTrue(lines.length > 0, "deps view produces output");
  assertTrue(lines.some(l => l.includes("M001") && l.includes("M002")), "shows milestone dep edge");
  assertTrue(lines.some(l => l.includes("S01") && l.includes("S02")), "shows slice dep edge");
}

{
  const data = makeVisualizerData({
    milestones: [
      { id: "M001", title: "Only", status: "active", dependsOn: [], slices: [] },
    ],
  });

  const lines = renderDepsView(data, mockTheme, 80);
  assertTrue(lines.some(l => l.includes("No milestone dependencies")), "shows no-deps message");
}

// ─── renderMetricsView ──────────────────────────────────────────────────────

console.log("\n=== renderMetricsView ===");

{
  const data = makeVisualizerData({
    totals: {
      units: 5,
      tokens: { input: 1000, output: 500, cacheRead: 200, cacheWrite: 100, total: 1800 },
      cost: 2.50,
      duration: 60000,
      toolCalls: 15,
      assistantMessages: 10,
      userMessages: 5,
    },
    byPhase: [
      {
        phase: "execution",
        units: 3,
        tokens: { input: 600, output: 300, cacheRead: 100, cacheWrite: 50, total: 1050 },
        cost: 1.50,
        duration: 40000,
      },
      {
        phase: "planning",
        units: 2,
        tokens: { input: 400, output: 200, cacheRead: 100, cacheWrite: 50, total: 750 },
        cost: 1.00,
        duration: 20000,
      },
    ],
    byModel: [
      {
        model: "claude-opus-4-6",
        units: 5,
        tokens: { input: 1000, output: 500, cacheRead: 200, cacheWrite: 100, total: 1800 },
        cost: 2.50,
      },
    ],
  });

  const lines = renderMetricsView(data, mockTheme, 80);
  assertTrue(lines.length > 0, "metrics view produces output");
  assertTrue(lines.some(l => l.includes("$2.50")), "shows total cost");
  assertTrue(lines.some(l => l.includes("execution")), "shows phase name");
  assertTrue(lines.some(l => l.includes("claude-opus-4-6")), "shows model name");
}

{
  const data = makeVisualizerData({ totals: null });
  const lines = renderMetricsView(data, mockTheme, 80);
  assertTrue(lines.some(l => l.includes("No metrics data")), "shows no-data message");
|
||||
}
|
||||
|
||||
// ─── renderTimelineView ─────────────────────────────────────────────────────
|
||||
|
||||
console.log("\n=== renderTimelineView ===");
|
||||
|
||||
{
|
||||
const now = Date.now();
|
||||
const data = makeVisualizerData({
|
||||
units: [
|
||||
{
|
||||
type: "execute-task",
|
||||
id: "M001/S01/T01",
|
||||
model: "claude-opus-4-6",
|
||||
startedAt: now - 120000,
|
||||
finishedAt: now - 60000,
|
||||
tokens: { input: 500, output: 200, cacheRead: 100, cacheWrite: 50, total: 850 },
|
||||
cost: 0.42,
|
||||
toolCalls: 5,
|
||||
assistantMessages: 3,
|
||||
userMessages: 1,
|
||||
},
|
||||
{
|
||||
type: "plan-slice",
|
||||
id: "M001/S02",
|
||||
model: "claude-opus-4-6",
|
||||
startedAt: now - 60000,
|
||||
finishedAt: now - 30000,
|
||||
tokens: { input: 300, output: 150, cacheRead: 50, cacheWrite: 25, total: 525 },
|
||||
cost: 0.18,
|
||||
toolCalls: 2,
|
||||
assistantMessages: 2,
|
||||
userMessages: 1,
|
||||
},
|
||||
],
|
||||
});
|
||||
|
||||
const lines = renderTimelineView(data, mockTheme, 80);
|
||||
assertTrue(lines.length >= 2, "timeline view produces lines for each unit");
|
||||
assertTrue(lines.some(l => l.includes("execute-task")), "shows unit type");
|
||||
assertTrue(lines.some(l => l.includes("M001/S01/T01")), "shows unit id");
|
||||
assertTrue(lines.some(l => l.includes("$0.42")), "shows unit cost");
|
||||
}
|
||||
|
||||
{
|
||||
const data = makeVisualizerData({ units: [] });
|
||||
const lines = renderTimelineView(data, mockTheme, 80);
|
||||
assertTrue(lines.some(l => l.includes("No execution history")), "shows empty message");
|
||||
}
|
||||
|
||||
// ─── Report ─────────────────────────────────────────────────────────────────
|
||||
|
||||
report();
|
||||
200	src/resources/extensions/gsd/triage-resolution.ts	Normal file
@@ -0,0 +1,200 @@
/**
 * GSD Triage Resolution — Execute triage classifications
 *
 * Provides resolution executors for each capture classification type:
 *
 * - inject: appends a new task to the current slice plan
 * - replan: writes REPLAN-TRIGGER.md so next dispatchNextUnit enters replanning-slice
 * - defer/note: query helpers for loading deferred/replan captures
 *
 * Also provides detectFileOverlap() for surfacing downstream impact on quick tasks.
 */

import { existsSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import type { Classification, CaptureEntry } from "./captures.js";
import {
  loadPendingCaptures,
  loadAllCaptures,
  markCaptureResolved,
} from "./captures.js";

// ─── Resolution Executors ─────────────────────────────────────────────────────

/**
 * Inject a new task into the current slice plan.
 * Reads the plan, finds the highest task ID, appends a new task entry.
 * Returns the new task ID, or null if injection failed.
 */
export function executeInject(
  basePath: string,
  mid: string,
  sid: string,
  capture: CaptureEntry,
): string | null {
  try {
    // Resolve the plan file path
    const planPath = join(basePath, ".gsd", "milestones", mid, "slices", sid, `${sid}-PLAN.md`);
    if (!existsSync(planPath)) return null;

    const content = readFileSync(planPath, "utf-8");

    // Find the highest existing task ID
    const taskMatches = [...content.matchAll(/- \[[ x]\] \*\*T(\d+):/g)];
    if (taskMatches.length === 0) return null;

    const maxId = Math.max(...taskMatches.map(m => parseInt(m[1], 10)));
    const newId = `T${String(maxId + 1).padStart(2, "0")}`;

    // Build the new task entry
    const newTask = [
      `- [ ] **${newId}: ${capture.text}** \`est:30m\``,
      `  - Why: Injected from capture ${capture.id} during triage`,
      `  - Do: ${capture.text}`,
      `  - Done when: Capture intent fulfilled`,
    ].join("\n");

    // Find the last task entry and append after it
    // Look for the "## Files Likely Touched" section as the boundary
    const filesSection = content.indexOf("## Files Likely Touched");
    if (filesSection !== -1) {
      const updated = content.slice(0, filesSection) + newTask + "\n\n" + content.slice(filesSection);
      writeFileSync(planPath, updated, "utf-8");
    } else {
      // No Files section — append at end
      writeFileSync(planPath, content.trimEnd() + "\n\n" + newTask + "\n", "utf-8");
    }

    return newId;
  } catch {
    return null;
  }
}
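The ID-bump rule at the heart of `executeInject` can be isolated as a small sketch: scan the plan's checklist for `**Tnn:**` markers, take the highest number, and zero-pad its successor. The helper name `nextTaskId` and the sample plan text below are illustrative, not part of the actual module.

```typescript
// Minimal sketch of executeInject's task-ID derivation, assuming the same
// checklist marker format the module's regex targets. Sample plan is made up.
function nextTaskId(planContent: string): string | null {
  const matches = [...planContent.matchAll(/- \[[ x]\] \*\*T(\d+):/g)];
  if (matches.length === 0) return null; // no existing tasks, nothing to bump
  const maxId = Math.max(...matches.map(m => parseInt(m[1], 10)));
  return `T${String(maxId + 1).padStart(2, "0")}`;
}

const samplePlan = [
  "- [x] **T01: Scaffold module**",
  "- [ ] **T02: Wire up config**",
].join("\n");

console.log(nextTaskId(samplePlan)); // "T03"
```

Note the `padStart(2, "0")`: IDs stay two digits wide up to T99, after which they grow naturally to three digits.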

/**
 * Trigger replanning by writing a REPLAN-TRIGGER.md marker file.
 * The existing state.ts derivation detects this and sets phase to "replanning-slice".
 * Returns true if the trigger was written successfully.
 */
export function executeReplan(
  basePath: string,
  mid: string,
  sid: string,
  capture: CaptureEntry,
): boolean {
  try {
    const triggerPath = join(
      basePath, ".gsd", "milestones", mid, "slices", sid, `${sid}-REPLAN-TRIGGER.md`,
    );
    const content = [
      `# Replan Trigger`,
      ``,
      `**Source:** Capture ${capture.id}`,
      `**Capture:** ${capture.text}`,
      `**Rationale:** ${capture.rationale ?? "User-initiated replan via capture triage"}`,
      `**Triggered:** ${new Date().toISOString()}`,
      ``,
      `This file was created by the triage pipeline. The next dispatch cycle`,
      `will detect it and enter the replanning-slice phase.`,
    ].join("\n");

    writeFileSync(triggerPath, content, "utf-8");
    return true;
  } catch {
    return false;
  }
}

// ─── File Overlap Detection ───────────────────────────────────────────────────

/**
 * Detect file overlap between a capture's affected files and planned tasks.
 *
 * Parses the slice plan for task file references and returns task IDs
 * whose files overlap with the capture's affected files.
 *
 * @param affectedFiles - Files the capture would touch
 * @param planContent - Content of the slice plan.md
 * @returns Array of task IDs (e.g., ["T03", "T04"]) whose files overlap
 */
export function detectFileOverlap(
  affectedFiles: string[],
  planContent: string,
): string[] {
  if (!affectedFiles || affectedFiles.length === 0) return [];

  const overlappingTasks: string[] = [];

  // Normalize affected files for comparison
  const normalizedAffected = new Set(
    affectedFiles.map(f => f.replace(/^\.\//, "").toLowerCase()),
  );

  // Parse plan for incomplete tasks and their file references
  const taskPattern = /- \[ \] \*\*(T\d+):[^*]*\*\*/g;
  const tasks = [...planContent.matchAll(taskPattern)];

  for (const taskMatch of tasks) {
    const taskId = taskMatch[1];
    const taskStart = taskMatch.index!;

    // Find the end of this task (next task or end of section)
    const nextTask = planContent.indexOf("- [", taskStart + 1);
    const sectionEnd = planContent.indexOf("##", taskStart + 1);
    const taskEnd = Math.min(
      nextTask === -1 ? planContent.length : nextTask,
      sectionEnd === -1 ? planContent.length : sectionEnd,
    );

    const taskContent = planContent.slice(taskStart, taskEnd);

    // Extract file references — look for backtick-quoted paths
    const fileRefs = [...taskContent.matchAll(/`([^`]+\.[a-z]+)`/g)]
      .map(m => m[1].replace(/^\.\//, "").toLowerCase());

    // Check for overlap
    const hasOverlap = fileRefs.some(f => normalizedAffected.has(f));
    if (hasOverlap) {
      overlappingTasks.push(taskId);
    }
  }

  return overlappingTasks;
}
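The comparison rule `detectFileOverlap` applies to each side can be distilled into a standalone sketch: paths are stripped of a leading `./` and lowercased before set membership is tested. The helper names and file paths below are hypothetical, chosen only to demonstrate the normalization.

```typescript
// Sketch of detectFileOverlap's path normalization, under the assumption that
// a leading "./" and letter case are the only differences worth erasing.
function normalize(path: string): string {
  return path.replace(/^\.\//, "").toLowerCase();
}

function overlaps(affected: string[], taskFiles: string[]): boolean {
  const affectedSet = new Set(affected.map(normalize));
  return taskFiles.some(f => affectedSet.has(normalize(f)));
}

console.log(overlaps(["./src/Foo.ts"], ["src/foo.ts"])); // true
console.log(overlaps(["src/a.ts"], ["src/b.ts"]));       // false
```

Because both sides are normalized the same way, `./src/Foo.ts` and `src/foo.ts` count as the same file, while genuinely distinct paths do not collide.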

/**
 * Load deferred captures (classification === "defer") for injection into
 * reassess-roadmap prompts.
 */
export function loadDeferredCaptures(basePath: string): CaptureEntry[] {
  return loadAllCaptures(basePath).filter(c => c.classification === "defer");
}

/**
 * Load replan-triggering captures for injection into replan-slice prompts.
 */
export function loadReplanCaptures(basePath: string): CaptureEntry[] {
  return loadAllCaptures(basePath).filter(c => c.classification === "replan");
}

/**
 * Build a quick-task execution prompt from a capture.
 */
export function buildQuickTaskPrompt(capture: CaptureEntry): string {
  return [
    `You are executing a quick one-off task captured during a GSD auto-mode session.`,
    ``,
    `## Quick Task`,
    ``,
    `**Capture ID:** ${capture.id}`,
    `**Task:** ${capture.text}`,
    ``,
    `## Instructions`,
    ``,
    `1. Execute this task as a small, self-contained change.`,
    `2. Do NOT modify any \`.gsd/\` plan files — this is a one-off, not a planned task.`,
    `3. Commit your changes with a descriptive message.`,
    `4. Keep changes minimal and focused on the capture text.`,
    `5. When done, say: "Quick task complete."`,
  ].join("\n");
}
175	src/resources/extensions/gsd/triage-ui.ts	Normal file
@@ -0,0 +1,175 @@
/**
 * GSD Triage UI — Confirmation flow for programmatic triage results
 *
 * Used by auto-mode dispatch (S02) when triage fires between tasks.
 * For manual `/gsd triage`, the LLM session handles confirmation directly.
 *
 * This module provides `showTriageConfirmation` which presents each
 * triage result to the user via `showNextAction` and returns the
 * confirmed classifications.
 */

import type { ExtensionCommandContext } from "@gsd/pi-coding-agent";
import { showNextAction } from "../shared/next-action-ui.js";
import type { CaptureEntry, Classification, TriageResult } from "./captures.js";
import { markCaptureResolved } from "./captures.js";

// ─── Types ────────────────────────────────────────────────────────────────────

export interface ConfirmedTriage {
  captureId: string;
  classification: Classification;
  rationale: string;
  affectedFiles?: string[];
  targetSlice?: string;
  userOverride: boolean; // true if user changed the proposed classification
}

// ─── Classification Labels ────────────────────────────────────────────────────

const CLASSIFICATION_LABELS: Record<Classification, { label: string; description: string }> = {
  "quick-task": {
    label: "Quick task",
    description: "Execute as a one-off at the next seam — no plan modification.",
  },
  "inject": {
    label: "Inject into plan",
    description: "Add a new task to the current slice plan.",
  },
  "defer": {
    label: "Defer",
    description: "Move to a future slice or milestone — not urgent now.",
  },
  "replan": {
    label: "Replan slice",
    description: "Remaining tasks need rewriting — triggers slice replan.",
  },
  "note": {
    label: "Note",
    description: "Informational only — no action needed.",
  },
};

const ALL_CLASSIFICATIONS: Classification[] = [
  "quick-task", "inject", "defer", "replan", "note",
];

// ─── Public API ───────────────────────────────────────────────────────────────

/**
 * Present triage results to the user for confirmation.
 *
 * For each capture:
 * - note/defer: auto-confirm (no user interaction needed)
 * - quick-task/inject/replan: show confirmation UI with proposed + alternatives
 *
 * Returns confirmed results with final classifications.
 * Updates CAPTURES.md with resolved status.
 *
 * @param fileOverlaps - Map of captureId → list of planned task IDs whose files overlap
 */
export async function showTriageConfirmation(
  ctx: ExtensionCommandContext,
  triageResults: TriageResult[],
  captures: CaptureEntry[],
  basePath: string,
  fileOverlaps?: Map<string, string[]>,
): Promise<ConfirmedTriage[]> {
  const confirmed: ConfirmedTriage[] = [];
  const captureMap = new Map(captures.map(c => [c.id, c]));

  for (const result of triageResults) {
    const capture = captureMap.get(result.captureId);
    if (!capture) continue;

    // Auto-confirm note and defer — low-impact, no plan modification
    if (result.classification === "note" || result.classification === "defer") {
      const resolution = result.classification === "note"
        ? "acknowledged as note"
        : `deferred${result.targetSlice ? ` to ${result.targetSlice}` : ""}`;

      markCaptureResolved(
        basePath,
        result.captureId,
        result.classification,
        resolution,
        result.rationale,
      );

      confirmed.push({
        captureId: result.captureId,
        classification: result.classification,
        rationale: result.rationale,
        affectedFiles: result.affectedFiles,
        targetSlice: result.targetSlice,
        userOverride: false,
      });
      continue;
    }

    // Build summary lines for the confirmation UI
    const summary: string[] = [
      `"${capture.text}"`,
      "",
      `Proposed: **${CLASSIFICATION_LABELS[result.classification].label}** — ${result.rationale}`,
    ];

    // Add file overlap warning if present
    const overlaps = fileOverlaps?.get(result.captureId);
    if (overlaps && overlaps.length > 0) {
      summary.push("");
      summary.push(`⚠ Touches files planned for ${overlaps.join(", ")} — consider inject or defer`);
    }

    if (result.affectedFiles && result.affectedFiles.length > 0) {
      summary.push("");
      summary.push(`Files: ${result.affectedFiles.join(", ")}`);
    }

    // Build action options — all classifications, with the proposed one flagged as recommended
    const proposed = result.classification;
    const actions = ALL_CLASSIFICATIONS.map(cls => ({
      id: cls,
      label: CLASSIFICATION_LABELS[cls].label,
      description: CLASSIFICATION_LABELS[cls].description,
      recommended: cls === proposed,
    }));

    const choice = await showNextAction(ctx as any, {
      title: `Triage: ${result.captureId}`,
      summary,
      actions,
      notYetMessage: "Capture will remain pending for later triage.",
    });

    if (choice === "not_yet") {
      // User skipped — leave capture pending
      continue;
    }

    const finalClassification = choice as Classification;
    const userOverride = finalClassification !== proposed;
    const resolution = userOverride
      ? `user chose ${finalClassification} (was ${proposed})`
      : `confirmed as ${finalClassification}`;

    markCaptureResolved(
      basePath,
      result.captureId,
      finalClassification,
      resolution,
      userOverride ? `User override: ${result.rationale}` : result.rationale,
    );

    confirmed.push({
      captureId: result.captureId,
      classification: finalClassification,
      rationale: result.rationale,
      affectedFiles: result.affectedFiles,
      targetSlice: result.targetSlice,
      userOverride,
    });
  }

  return confirmed;
}
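The branching rule inside `showTriageConfirmation` reduces to a simple partition: "note" and "defer" resolve without a prompt, everything else goes through the confirmation UI. The sketch below isolates that split; the trimmed types, the `splitForConfirmation` helper, and the sample capture IDs are hypothetical.

```typescript
// Sketch of the auto-confirm split, assuming only the classification field
// matters for routing. Types are trimmed to what the split needs.
type Classification = "quick-task" | "inject" | "defer" | "replan" | "note";

interface MiniResult { captureId: string; classification: Classification }

function splitForConfirmation(results: MiniResult[]): {
  auto: MiniResult[]; prompt: MiniResult[];
} {
  const isAuto = (r: MiniResult) =>
    r.classification === "note" || r.classification === "defer";
  return {
    auto: results.filter(isAuto),      // resolved immediately, no UI
    prompt: results.filter(r => !isAuto(r)), // shown via showNextAction
  };
}

const { auto, prompt } = splitForConfirmation([
  { captureId: "C1", classification: "note" },
  { captureId: "C2", classification: "inject" },
  { captureId: "C3", classification: "defer" },
]);
console.log(auto.length, prompt.length); // 2 1
```

The design keeps low-impact classifications out of the interactive loop, so a batch of captures only interrupts the user for the ones that could modify the plan.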
Some files were not shown because too many files have changed in this diff.