From 330e5200bc3e5dd2757cba51f6aaff9edbc510cf Mon Sep 17 00:00:00 2001
From: Tom Boucher <trekkie@nomorestars.com>
Date: Mon, 16 Mar 2026 11:00:58 -0400
Subject: [PATCH] docs: add v2.18/v2.19 feature documentation (#631)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

New docs:
- dynamic-model-routing.md — complexity classification, tier models,
  escalation, budget pressure, cost table, adaptive learning
- captures-triage.md — fire-and-forget capture, triage pipeline,
  classification types, dashboard integration, worktree awareness
- visualizer.md — four-tab TUI overlay (progress, deps, metrics,
  timeline), controls, auto-refresh, auto_visualize preference

Updated docs:
- README.md — added links to three new docs
- commands.md — added capture, triage, visualize, knowledge, queue reorder
- configuration.md — added dynamic_routing and auto_visualize settings,
  updated full example with new config options
- auto-mode.md — added capture, visualize sections, dashboard badge,
  dynamic model routing reference
- architecture.md — updated dispatch pipeline (routing + captures steps),
  added key modules table for v2.19
- cost-management.md — added dynamic routing and visualizer tips
---
 docs/README.md                |   3 +
 docs/architecture.md          |  46 +++++++++---
 docs/auto-mode.md             |  21 ++++++
 docs/captures-triage.md       |  82 ++++++++++++++++++++++
 docs/commands.md              |   6 +-
 docs/configuration.md         |  37 +++++++++-
 docs/cost-management.md       |   2 +
 docs/dynamic-model-routing.md | 127 ++++++++++++++++++++++++++++++++++
 docs/visualizer.md            |  92 ++++++++++++++++++++++++
 9 files changed, 403 insertions(+), 13 deletions(-)
 create mode 100644 docs/captures-triage.md
 create mode 100644 docs/dynamic-model-routing.md
 create mode 100644 docs/visualizer.md

diff --git a/docs/README.md b/docs/README.md
index ce50fd528..0bba640de 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -12,6 +12,9 @@ Welcome to the GSD documentation. This covers everything from getting started to
 | [Remote Questions](./remote-questions.md) | Discord and Slack integration for headless auto-mode |
 | [Configuration](./configuration.md) | Preferences, model selection, git settings, and token profiles |
 | [Token Optimization](./token-optimization.md) | Token profiles, context compression, complexity routing, and adaptive learning (v2.17) |
+| [Dynamic Model Routing](./dynamic-model-routing.md) | Complexity-based model selection, cost tables, escalation, and budget pressure (v2.19) |
+| [Captures & Triage](./captures-triage.md) | Fire-and-forget thought capture during auto-mode with automated triage (v2.19) |
+| [Workflow Visualizer](./visualizer.md) | Interactive TUI overlay for progress, dependencies, metrics, and timeline (v2.19) |
 | [Cost Management](./cost-management.md) | Budget ceilings, cost tracking, projections, and enforcement modes |
 | [Git Strategy](./git-strategy.md) | Worktree isolation, branching model, and merge behavior |
 | [Working in Teams](./working-in-teams.md) | Unique milestone IDs, `.gitignore` setup, and shared planning artifacts |
diff --git a/docs/architecture.md b/docs/architecture.md
index 38ec524a2..3fc29d2ca 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -92,17 +92,41 @@ Performance-critical operations use a Rust N-API engine:
 The auto mode dispatch pipeline:
 
 ```
-1. Read disk state (STATE.md, roadmap, plans)
-2. Determine next unit type and ID
-3. Classify complexity → select model tier
-4. Apply budget pressure adjustments
-5. Check routing history for adaptive adjustments
-6. Resolve effective model (with fallbacks)
-7. Build dispatch prompt (applying inline level compression)
-8. Create fresh agent session
-9. Inject prompt and let LLM execute
-10. On completion: snapshot metrics, verify artifacts, persist state
-11. Loop to step 1
+1.  Read disk state (STATE.md, roadmap, plans)
+2.  Determine next unit type and ID
+3.  Classify complexity → select model tier
+4.  Apply budget pressure adjustments
+5.  Check routing history for adaptive adjustments
+6.  Dynamic model routing (if enabled) → select cheapest model for tier
+7.  Resolve effective model (with fallbacks)
+8.  Check pending captures → triage if needed
+9.  Build dispatch prompt (applying inline level compression)
+10. Create fresh agent session
+11. Inject prompt and let LLM execute
+12. On completion: snapshot metrics, verify artifacts, persist state
+13. Loop to step 1
 ```
 
 Phase skipping (from token profile) gates steps 2-3: if a phase is skipped, the corresponding unit type is never dispatched.
+
+## Key Modules (v2.19)
+
+| Module | Purpose |
+|--------|---------|
+| `auto.ts` | Auto-mode state machine and orchestration |
+| `auto-dispatch.ts` | Declarative dispatch table (phase → unit mapping) |
+| `auto-prompts.ts` | Prompt builders with inline level compression |
+| `auto-worktree.ts` | Worktree lifecycle (create, enter, merge, teardown) |
+| `complexity-classifier.ts` | Unit complexity classification (light/standard/heavy) |
+| `model-router.ts` | Dynamic model routing with cost-aware selection |
+| `model-cost-table.ts` | Built-in per-model cost data for cross-provider comparison |
+| `routing-history.ts` | Adaptive learning from routing outcomes |
+| `captures.ts` | Fire-and-forget thought capture and triage classification |
+| `triage-resolution.ts` | Capture resolution (inject, defer, replan, quick-task) |
+| `visualizer-overlay.ts` | Workflow visualizer TUI overlay |
+| `visualizer-data.ts` | Data loading for visualizer tabs |
+| `visualizer-views.ts` | Tab renderers (progress, deps, metrics, timeline) |
+| `metrics.ts` | Token and cost tracking ledger |
+| `state.ts` | State derivation from disk |
+| `preferences.ts` | Preference loading, merging, validation |
+| `queue-order.ts` | Milestone queue ordering |
diff --git a/docs/auto-mode.md b/docs/auto-mode.md
index f930cee55..6b548e127 100644
--- a/docs/auto-mode.md
+++ b/docs/auto-mode.md
@@ -120,6 +120,22 @@ Stops auto mode gracefully. Can be run from a different terminal.
 
 Hard-steer plan documents during execution without stopping the pipeline. Changes are picked up at the next phase boundary.
 
+### Capture
+
+```
+/gsd capture "add rate limiting to API endpoints"
+```
+
+Fire-and-forget thought capture. Captures are triaged automatically between tasks. See [Captures & Triage](./captures-triage.md).
+
+### Visualize
+
+```
+/gsd visualize
+```
+
+Open the workflow visualizer — interactive tabs for progress, dependencies, metrics, and timeline. See [Workflow Visualizer](./visualizer.md).
+
 ## Dashboard
 
 `Ctrl+Alt+G` or `/gsd status` shows real-time progress:
@@ -129,6 +145,7 @@ Hard-steer plan documents during execution without stopping the pipeline. Change
 - Per-unit cost and token breakdown
 - Cost projections
 - Completed and in-progress units
+- Pending capture count (when captures are awaiting triage)
 
 ## Phase Skipping
 
@@ -141,3 +158,7 @@ Token profiles can skip certain phases to reduce cost:
 | Reassess Roadmap | Skipped | Runs | Runs |
 
 See [Token Optimization](./token-optimization.md) for details.
+
+## Dynamic Model Routing
+
+When enabled, auto-mode selects cheaper models for simple units (slice completion, UAT) and reserves expensive models for complex work (replanning, architectural tasks). See [Dynamic Model Routing](./dynamic-model-routing.md).
diff --git a/docs/captures-triage.md b/docs/captures-triage.md
new file mode 100644
index 000000000..1c5f7e3f7
--- /dev/null
+++ b/docs/captures-triage.md
@@ -0,0 +1,82 @@
+# Captures & Triage
+
+*Introduced in v2.19.0*
+
+Captures let you fire-and-forget thoughts during auto-mode execution. Instead of pausing auto-mode to steer, you can capture ideas, bugs, or scope changes and let GSD triage them at natural seams between tasks.
+
+## Quick Start
+
+While auto-mode is running (or any time):
+
+```
+/gsd capture "add rate limiting to the API endpoints"
+/gsd capture "the auth flow should support OAuth, not just JWT"
+```
+
+Captures are appended to `.gsd/CAPTURES.md` and triaged automatically between tasks.
+
+## How It Works
+
+### Pipeline
+
+```
+capture → triage → confirm → resolve → resume
+```
+
+1. **Capture** — `/gsd capture "thought"` appends to `.gsd/CAPTURES.md` with a timestamp and unique ID
+2. **Triage** — at natural seams between tasks (in `handleAgentEnd`), GSD detects pending captures and classifies them
+3. **Confirm** — the user is shown the proposed resolution and confirms or adjusts
+4. **Resolve** — the resolution is applied (task injection, replan trigger, deferral, etc.)
+5. **Resume** — auto-mode continues
+
+### Classification Types
+
+Each capture is classified into one of five types:
+
+| Type | Meaning | Resolution |
+|------|---------|------------|
+| `quick-task` | Small, self-contained fix | Inline quick task executed immediately |
+| `inject` | New task needed in current slice | Task injected into the active slice plan |
+| `defer` | Important but not urgent | Deferred to roadmap reassessment |
+| `replan` | Changes the current approach | Triggers slice replan with capture context |
+| `note` | Informational, no action needed | Acknowledged, no plan changes |
+
+### Automatic Triage
+
+Triage fires automatically between tasks during auto-mode. The triage prompt receives:
+- All pending captures
+- The current slice plan
+- The active roadmap
+
+The LLM classifies each capture and proposes a resolution. Plan-modifying resolutions (inject, replan) require user confirmation.
+
+### Manual Triage
+
+Trigger triage manually at any time:
+
+```
+/gsd triage
+```
+
+This is useful when you've accumulated several captures and want to process them before the next natural seam.
+
+## Dashboard Integration
+
+A pending-capture count badge appears when captures are waiting for triage. It is visible in both the `Ctrl+Alt+G` dashboard and the auto-mode progress widget.
+
+## Context Injection
+
+Capture context is automatically injected into:
+- **Replan-slice prompts** — so the replan knows what triggered it
+- **Reassess-roadmap prompts** — so deferred captures influence roadmap decisions
+
+## Worktree Awareness
+
+Captures always resolve to the **original project root's** `.gsd/CAPTURES.md`, not the worktree's local copy. This ensures captures from a steering terminal are visible to the auto-mode session running in a worktree.
+
+## Commands
+
+| Command | Description |
+|---------|-------------|
+| `/gsd capture "text"` | Capture a thought (quotes optional for single words) |
+| `/gsd triage` | Manually trigger triage of pending captures |
diff --git a/docs/commands.md b/docs/commands.md
index 5414ea16e..a026e5803 100644
--- a/docs/commands.md
+++ b/docs/commands.md
@@ -11,7 +11,11 @@
 | `/gsd steer` | Hard-steer plan documents during execution |
 | `/gsd discuss` | Discuss architecture and decisions (works alongside auto mode) |
 | `/gsd status` | Progress dashboard |
-| `/gsd queue` | Queue future milestones (safe during auto mode) |
+| `/gsd queue` | Queue and reorder future milestones (safe during auto mode) |
+| `/gsd capture` | Fire-and-forget thought capture (works during auto mode) |
+| `/gsd triage` | Manually trigger triage of pending captures |
+| `/gsd visualize` | Open workflow visualizer (progress, deps, metrics, timeline) |
+| `/gsd knowledge` | Add persistent project knowledge (rule, pattern, or lesson) |
 | `/gsd prefs` | Model selection, timeouts, budget ceiling |
 | `/gsd migrate` | Migrate a v1 `.planning` directory to `.gsd` format |
 | `/gsd doctor` | Validate `.gsd/` integrity, find and fix issues |
diff --git a/docs/configuration.md b/docs/configuration.md
index 8b74333d1..d05ce6dc1 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -334,7 +334,33 @@ custom_instructions:
   - "Prefer functional patterns over classes"
 ```
 
-For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically.
+For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically. Add entries with `/gsd knowledge rule|pattern|lesson <description>`.
+
+### `dynamic_routing`
+
+Complexity-based model routing. See [Dynamic Model Routing](./dynamic-model-routing.md).
+
+```yaml
+dynamic_routing:
+  enabled: true
+  tier_models:
+    light: claude-haiku-4-5
+    standard: claude-sonnet-4-6
+    heavy: claude-opus-4-6
+  escalate_on_failure: true
+  budget_pressure: true
+  cross_provider: true
+```
+
+### `auto_visualize`
+
+Show the workflow visualizer automatically after milestone completion:
+
+```yaml
+auto_visualize: true
+```
+
+See [Workflow Visualizer](./visualizer.md).
 
 ## Full Example
 
@@ -356,6 +382,12 @@ models:
 # Token optimization
 token_profile: balanced
 
+# Dynamic model routing
+dynamic_routing:
+  enabled: true
+  escalate_on_failure: true
+  budget_pressure: true
+
 # Budget
 budget_ceiling: 25.00
 budget_enforcement: pause
@@ -387,6 +419,9 @@ notifications:
   on_milestone: true
   on_attention: true
 
+# Visualizer
+auto_visualize: true
+
 # Hooks
 post_unit_hooks:
   - name: code-review
diff --git a/docs/cost-management.md b/docs/cost-management.md
index efd3398e6..06214590d 100644
--- a/docs/cost-management.md
+++ b/docs/cost-management.md
@@ -89,3 +89,5 @@ See [Token Optimization](./token-optimization.md) for details.
 - Switch to `budget` profile for well-understood, repetitive work
 - Use `quality` only when architectural decisions are being made
 - Per-phase model selection lets you use Opus only for planning while keeping execution on Sonnet
+- Enable `dynamic_routing` for automatic model downgrading on simple tasks — see [Dynamic Model Routing](./dynamic-model-routing.md)
+- Use `/gsd visualize` → Metrics tab to see where your budget is going
diff --git a/docs/dynamic-model-routing.md b/docs/dynamic-model-routing.md
new file mode 100644
index 000000000..9d0d5525e
--- /dev/null
+++ b/docs/dynamic-model-routing.md
@@ -0,0 +1,127 @@
+# Dynamic Model Routing
+
+*Introduced in v2.19.0*
+
+Dynamic model routing automatically selects cheaper models for simple work and reserves expensive models for complex tasks. This can reduce token consumption by 20-50% on capped plans without sacrificing quality where it matters.
+
+## How It Works
+
+Each unit dispatched by auto-mode is classified into a complexity tier:
+
+| Tier | Typical Work | Default Model Level |
+|------|-------------|-------------------|
+| **Light** | Slice completion, UAT, hooks | Haiku-class |
+| **Standard** | Research, planning, execution, milestone completion | Sonnet-class |
+| **Heavy** | Replanning, roadmap reassessment, complex execution | Opus-class |
+
+The router then selects a model for that tier. The key rule: **downgrade-only semantics**. The user's configured model is always the ceiling — routing never upgrades beyond what you've configured.
+
+## Enabling
+
+Dynamic routing is off by default. Enable it in preferences:
+
+```yaml
+---
+version: 1
+dynamic_routing:
+  enabled: true
+---
+```
+
+## Configuration
+
+```yaml
+dynamic_routing:
+  enabled: true
+  tier_models:                    # explicit model per tier (optional)
+    light: claude-haiku-4-5
+    standard: claude-sonnet-4-6
+    heavy: claude-opus-4-6
+  escalate_on_failure: true       # bump tier on task failure (default: true)
+  budget_pressure: true           # auto-downgrade when approaching budget ceiling (default: true)
+  cross_provider: true            # consider models from other providers (default: true)
+  hooks: true                     # apply routing to post-unit hooks (default: true)
+```
+
+### `tier_models`
+
+Override which model is used for each tier. When omitted, the router uses a built-in capability mapping that knows common model families:
+
+- **Light:** `claude-haiku-4-5`, `gpt-4o-mini`, `gemini-2.0-flash`
+- **Standard:** `claude-sonnet-4-6`, `gpt-4o`, `gemini-2.5-pro`
+- **Heavy:** `claude-opus-4-6`, `gpt-4.5-preview`, `gemini-2.5-pro`
+
+### `escalate_on_failure`
+
+When a task fails at a given tier, the router escalates to the next tier on retry. Light → Standard → Heavy. This prevents cheap models from burning retries on work that needs more reasoning.
+
+### `budget_pressure`
+
+When approaching the budget ceiling, the router progressively downgrades:
+
+| Budget Used | Effect |
+|------------|--------|
+| < 50% | No adjustment |
+| 50-75% | Standard → Light |
+| 75-90% | More aggressive downgrading |
+| > 90% | Nearly everything → Light; only Heavy units remain above Light, downgraded to Standard |
+
+### `cross_provider`
+
+When enabled, the router may select models from providers other than your primary. This uses the built-in cost table to find the cheapest model at each tier. Requires the target provider to be configured.
+
+## Complexity Classification
+
+Units are classified using pure heuristics, with no LLM calls and sub-millisecond latency:
+
+### Unit Type Defaults
+
+| Unit Type | Default Tier |
+|-----------|-------------|
+| `complete-slice`, `run-uat` | Light |
+| `research-*`, `plan-*`, `complete-milestone` | Standard |
+| `execute-task` | Standard (adjusted up or down by task analysis) |
+| `replan-slice`, `reassess-roadmap` | Heavy |
+| `hook/*` | Light |
+
+### Task Plan Analysis
+
+For `execute-task` units, the classifier analyzes the task plan:
+
+| Signal | Simple → Light | Complex → Heavy |
+|--------|---------------|----------------|
+| Step count | ≤ 3 | ≥ 8 |
+| File count | ≤ 3 | ≥ 8 |
+| Description length | < 500 chars | > 2000 chars |
+| Code blocks | — | ≥ 5 |
+| Complexity keywords | None | Present |
+
+**Complexity keywords:** `research`, `investigate`, `refactor`, `migrate`, `integrate`, `complex`, `architect`, `redesign`, `security`, `performance`, `concurrent`, `parallel`, `distributed`, `backward compat`
+
+### Adaptive Learning
+
+The routing history (`.gsd/routing-history.json`) tracks success/failure per tier per unit type. If a tier's failure rate exceeds 20% for a given pattern, future classifications are bumped up. User feedback (`over`/`under`/`ok`) is weighted 2× vs automatic outcomes.
+
+## Interaction with Token Profiles
+
+Dynamic routing and token profiles are complementary:
+
+- **Token profiles** (`budget`/`balanced`/`quality`) control phase skipping and context compression
+- **Dynamic routing** controls per-unit model selection within the configured phase model
+
+When both are active, token profiles set the baseline models and dynamic routing further optimizes within those baselines. The `budget` token profile + dynamic routing provides maximum cost savings.
+
+## Cost Table
+
+The router includes a built-in cost table for common models, used for cross-provider cost comparison. Costs are in USD per million tokens, listed separately for input and output:
+
+| Model | Input | Output |
+|-------|-------|--------|
+| claude-haiku-4-5 | $0.80 | $4.00 |
+| claude-sonnet-4-6 | $3.00 | $15.00 |
+| claude-opus-4-6 | $15.00 | $75.00 |
+| gpt-4o-mini | $0.15 | $0.60 |
+| gpt-4o | $2.50 | $10.00 |
+| gemini-2.0-flash | $0.10 | $0.40 |
+
+The cost table is used for comparison only — actual billing comes from your provider.
diff --git a/docs/visualizer.md b/docs/visualizer.md
new file mode 100644
index 000000000..6aa8e6747
--- /dev/null
+++ b/docs/visualizer.md
@@ -0,0 +1,92 @@
+# Workflow Visualizer
+
+*Introduced in v2.19.0*
+
+The workflow visualizer is a full-screen TUI overlay that shows project progress, dependencies, cost metrics, and execution timeline in an interactive four-tab view.
+
+## Opening the Visualizer
+
+```
+/gsd visualize
+```
+
+Or configure automatic display after milestone completion:
+
+```yaml
+auto_visualize: true
+```
+
+## Tabs
+
+Switch tabs with `Tab`, `1`-`4`, or arrow keys.
+
+### 1. Progress
+
+A tree view of milestones, slices, and tasks with completion status:
+
+```
+M001: User Management
+  ✅ S01: Auth module
+    ✅ T01: Core types
+    ✅ T02: JWT middleware
+    ✅ T03: Login flow
+  ⏳ S02: User dashboard
+    ✅ T01: Layout component
+    ⬜ T02: Profile page
+  ⬜ S03: Admin panel
+```
+
+Completed items show a checkmark, in-progress items a spinner, and pending items an empty box.
+
+### 2. Dependencies
+
+An ASCII dependency graph showing slice relationships:
+
+```
+S01 ──→ S02 ──→ S04
+  └───→ S03 ──↗
+```
+
+The graph visualizes the `depends:` field from the roadmap, making it easy to see which slices are blocked and which can proceed.
+
+### 3. Metrics
+
+Bar charts showing cost and token usage breakdowns:
+
+- **By phase** — research, planning, execution, completion, reassessment
+- **By slice** — cost per slice with running totals
+- **By model** — which models consumed the most budget
+
+Uses data from `.gsd/metrics.json`.
+
+### 4. Timeline
+
+Chronological execution history showing:
+
+- Unit type and ID
+- Start/end timestamps
+- Duration
+- Model used
+- Token counts
+
+Ordered by execution time, showing the full history of auto-mode dispatches.
+
+## Controls
+
+| Key | Action |
+|-----|--------|
+| `Tab` | Next tab |
+| `Shift+Tab` | Previous tab |
+| `1`-`4` | Jump to tab |
+| `↑`/`↓` | Scroll within tab |
+| `Escape` / `q` | Close visualizer |
+
+## Auto-Refresh
+
+The visualizer refreshes data from disk every 2 seconds, so it stays current if opened alongside a running auto-mode session.
+
+## Configuration
+
+```yaml
+auto_visualize: true    # show visualizer after milestone completion
+```