docs: add v2.18/v2.19 feature documentation (#631)

New docs:
- dynamic-model-routing.md — complexity classification, tier models,
  escalation, budget pressure, cost table, adaptive learning
- captures-triage.md — fire-and-forget capture, triage pipeline,
  classification types, dashboard integration, worktree awareness
- visualizer.md — four-tab TUI overlay (progress, deps, metrics,
  timeline), controls, auto-refresh, auto_visualize preference

Updated docs:
- README.md — added links to three new docs
- commands.md — added capture, triage, visualize, knowledge, queue reorder
- configuration.md — added dynamic_routing and auto_visualize settings,
  updated full example with new config options
- auto-mode.md — added capture, visualize sections, dashboard badge,
  dynamic model routing reference
- architecture.md — updated dispatch pipeline (routing + captures steps),
  added key modules table for v2.19
- cost-management.md — added dynamic routing and visualizer tips
Tom Boucher 2026-03-16 11:00:58 -04:00 committed by GitHub
parent 370897df81
commit 330e5200bc
9 changed files with 403 additions and 13 deletions


@ -12,6 +12,9 @@ Welcome to the GSD documentation. This covers everything from getting started to
| [Remote Questions](./remote-questions.md) | Discord and Slack integration for headless auto-mode |
| [Configuration](./configuration.md) | Preferences, model selection, git settings, and token profiles |
| [Token Optimization](./token-optimization.md) | Token profiles, context compression, complexity routing, and adaptive learning (v2.17) |
| [Dynamic Model Routing](./dynamic-model-routing.md) | Complexity-based model selection, cost tables, escalation, and budget pressure (v2.19) |
| [Captures & Triage](./captures-triage.md) | Fire-and-forget thought capture during auto-mode with automated triage (v2.19) |
| [Workflow Visualizer](./visualizer.md) | Interactive TUI overlay for progress, dependencies, metrics, and timeline (v2.19) |
| [Cost Management](./cost-management.md) | Budget ceilings, cost tracking, projections, and enforcement modes |
| [Git Strategy](./git-strategy.md) | Worktree isolation, branching model, and merge behavior |
| [Working in Teams](./working-in-teams.md) | Unique milestone IDs, `.gitignore` setup, and shared planning artifacts |


@ -92,17 +92,41 @@ Performance-critical operations use a Rust N-API engine:
The auto mode dispatch pipeline:
```
1. Read disk state (STATE.md, roadmap, plans)
2. Determine next unit type and ID
3. Classify complexity → select model tier
4. Apply budget pressure adjustments
5. Check routing history for adaptive adjustments
6. Resolve effective model (with fallbacks)
7. Build dispatch prompt (applying inline level compression)
8. Create fresh agent session
9. Inject prompt and let LLM execute
10. On completion: snapshot metrics, verify artifacts, persist state
11. Loop to step 1
1. Read disk state (STATE.md, roadmap, plans)
2. Determine next unit type and ID
3. Classify complexity → select model tier
4. Apply budget pressure adjustments
5. Check routing history for adaptive adjustments
6. Dynamic model routing (if enabled) → select cheapest model for tier
7. Resolve effective model (with fallbacks)
8. Check pending captures → triage if needed
9. Build dispatch prompt (applying inline level compression)
10. Create fresh agent session
11. Inject prompt and let LLM execute
12. On completion: snapshot metrics, verify artifacts, persist state
13. Loop to step 1
```
Phase skipping (from token profile) gates steps 2-3: if a phase is skipped, the corresponding unit type is never dispatched.
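The tier-selection part of the 13-step list (steps 3 to 5) can be read as a chain of adjustments applied to the classified tier in a fixed order. The sketch below is illustrative only: `TierAdjuster`, `resolveTier`, and the two sample adjusters are hypothetical names with assumed behaviour, not the actual contents of `auto.ts` or `model-router.ts`.

```typescript
// Hypothetical sketch: classify first, then apply adjustments in pipeline order.
type Tier = "light" | "standard" | "heavy";

// An adjuster receives the current tier and returns a (possibly changed) tier.
type TierAdjuster = (tier: Tier) => Tier;

function resolveTier(classified: Tier, adjusters: TierAdjuster[]): Tier {
  // Each adjuster sees the output of the previous one.
  return adjusters.reduce((tier, adjust) => adjust(tier), classified);
}

// Example adjusters (assumed behaviour, for illustration only).
const budgetPressure: TierAdjuster = (tier) => (tier === "standard" ? "light" : tier);
const routingHistory: TierAdjuster = (tier) => tier; // no change in this example

const finalTier = resolveTier("standard", [budgetPressure, routingHistory]);
console.log(finalTier); // "light"
```

Keeping each adjustment as a separate function makes the order of steps 4 and 5 explicit and easy to test in isolation.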
## Key Modules (v2.19)
| Module | Purpose |
|--------|---------|
| `auto.ts` | Auto-mode state machine and orchestration |
| `auto-dispatch.ts` | Declarative dispatch table (phase → unit mapping) |
| `auto-prompts.ts` | Prompt builders with inline level compression |
| `auto-worktree.ts` | Worktree lifecycle (create, enter, merge, teardown) |
| `complexity-classifier.ts` | Unit complexity classification (light/standard/heavy) |
| `model-router.ts` | Dynamic model routing with cost-aware selection |
| `model-cost-table.ts` | Built-in per-model cost data for cross-provider comparison |
| `routing-history.ts` | Adaptive learning from routing outcomes |
| `captures.ts` | Fire-and-forget thought capture and triage classification |
| `triage-resolution.ts` | Capture resolution (inject, defer, replan, quick-task) |
| `visualizer-overlay.ts` | Workflow visualizer TUI overlay |
| `visualizer-data.ts` | Data loading for visualizer tabs |
| `visualizer-views.ts` | Tab renderers (progress, deps, metrics, timeline) |
| `metrics.ts` | Token and cost tracking ledger |
| `state.ts` | State derivation from disk |
| `preferences.ts` | Preference loading, merging, validation |
| `queue-order.ts` | Milestone queue ordering |


@ -120,6 +120,22 @@ Stops auto mode gracefully. Can be run from a different terminal.
Hard-steer plan documents during execution without stopping the pipeline. Changes are picked up at the next phase boundary.
### Capture
```
/gsd capture "add rate limiting to API endpoints"
```
Fire-and-forget thought capture. Captures are triaged automatically between tasks. See [Captures & Triage](./captures-triage.md).
### Visualize
```
/gsd visualize
```
Open the workflow visualizer — interactive tabs for progress, dependencies, metrics, and timeline. See [Workflow Visualizer](./visualizer.md).
## Dashboard
`Ctrl+Alt+G` or `/gsd status` shows real-time progress:
@ -129,6 +145,7 @@ Hard-steer plan documents during execution without stopping the pipeline. Change
- Per-unit cost and token breakdown
- Cost projections
- Completed and in-progress units
- Pending capture count (when captures are awaiting triage)
## Phase Skipping
@ -141,3 +158,7 @@ Token profiles can skip certain phases to reduce cost:
| Reassess Roadmap | Skipped | Runs | Runs |
See [Token Optimization](./token-optimization.md) for details.
## Dynamic Model Routing
When enabled, auto-mode automatically selects cheaper models for simple units (slice completion, UAT) and reserves expensive models for complex work (replanning, architectural tasks). See [Dynamic Model Routing](./dynamic-model-routing.md).

docs/captures-triage.md

@ -0,0 +1,82 @@
# Captures & Triage
*Introduced in v2.19.0*
Captures let you fire-and-forget thoughts during auto-mode execution. Instead of pausing auto-mode to steer, you can capture ideas, bugs, or scope changes and let GSD triage them at natural seams between tasks.
## Quick Start
While auto-mode is running (or any time):
```
/gsd capture "add rate limiting to the API endpoints"
/gsd capture "the auth flow should support OAuth, not just JWT"
```
Captures are appended to `.gsd/CAPTURES.md` and triaged automatically between tasks.
## How It Works
### Pipeline
```
capture → triage → confirm → resolve → resume
```
1. **Capture** — `/gsd capture "thought"` appends to `.gsd/CAPTURES.md` with a timestamp and unique ID
2. **Triage** — at natural seams between tasks (in `handleAgentEnd`), GSD detects pending captures and classifies them
3. **Confirm** — the user is shown the proposed resolution and confirms or adjusts
4. **Resolve** — the resolution is applied (task injection, replan trigger, deferral, etc.)
5. **Resume** — auto-mode continues
### Classification Types
Each capture is classified into one of five types:
| Type | Meaning | Resolution |
|------|---------|------------|
| `quick-task` | Small, self-contained fix | Inline quick task executed immediately |
| `inject` | New task needed in current slice | Task injected into the active slice plan |
| `defer` | Important but not urgent | Deferred to roadmap reassessment |
| `replan` | Changes the current approach | Triggers slice replan with capture context |
| `note` | Informational, no action needed | Acknowledged, no plan changes |
### Automatic Triage
Triage fires automatically between tasks during auto-mode. The triage prompt receives:
- All pending captures
- The current slice plan
- The active roadmap
The LLM classifies each capture and proposes a resolution. Plan-modifying resolutions (inject, replan) require user confirmation.
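As a rough illustration, the five classification types and the confirmation rule could be modelled as follows. This is a sketch under assumptions: `CaptureType`, `RESOLUTIONS`, `describeResolution`, and `requiresConfirmation` are hypothetical names, and the resolution strings paraphrase the table rather than quote `captures.ts` or `triage-resolution.ts`.

```typescript
// The five classification types from the Classification Types table.
type CaptureType = "quick-task" | "inject" | "defer" | "replan" | "note";

// Resolution summaries, paraphrased from the table (not actual source strings).
const RESOLUTIONS: Record<CaptureType, string> = {
  "quick-task": "inline quick task executed immediately",
  inject: "task injected into the active slice plan",
  defer: "deferred to roadmap reassessment",
  replan: "slice replan triggered with capture context",
  note: "acknowledged, no plan changes",
};

function describeResolution(type: CaptureType): string {
  return RESOLUTIONS[type];
}

// Plan-modifying resolutions (inject, replan) require user confirmation.
function requiresConfirmation(type: CaptureType): boolean {
  return type === "inject" || type === "replan";
}

console.log(requiresConfirmation("replan")); // true
console.log(requiresConfirmation("note")); // false
```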
### Manual Triage
Trigger triage manually at any time:
```
/gsd triage
```
This is useful when you've accumulated several captures and want to process them before the next natural seam.
## Dashboard Integration
The progress widget shows a pending capture count badge when captures are waiting for triage. This is visible in both the `Ctrl+Alt+G` dashboard and the auto-mode progress widget.
## Context Injection
Capture context is automatically injected into:
- **Replan-slice prompts** — so the replan knows what triggered it
- **Reassess-roadmap prompts** — so deferred captures influence roadmap decisions
## Worktree Awareness
Captures always resolve to the **original project root's** `.gsd/CAPTURES.md`, not the worktree's local copy. This ensures captures from a steering terminal are visible to the auto-mode session running in a worktree.
## Commands
| Command | Description |
|---------|-------------|
| `/gsd capture "text"` | Capture a thought (quotes optional for single words) |
| `/gsd triage` | Manually trigger triage of pending captures |


@ -11,7 +11,11 @@
| `/gsd steer` | Hard-steer plan documents during execution |
| `/gsd discuss` | Discuss architecture and decisions (works alongside auto mode) |
| `/gsd status` | Progress dashboard |
| `/gsd queue` | Queue future milestones (safe during auto mode) |
| `/gsd queue` | Queue and reorder future milestones (safe during auto mode) |
| `/gsd capture` | Fire-and-forget thought capture (works during auto mode) |
| `/gsd triage` | Manually trigger triage of pending captures |
| `/gsd visualize` | Open workflow visualizer (progress, deps, metrics, timeline) |
| `/gsd knowledge` | Add persistent project knowledge (rule, pattern, or lesson) |
| `/gsd prefs` | Model selection, timeouts, budget ceiling |
| `/gsd migrate` | Migrate a v1 `.planning` directory to `.gsd` format |
| `/gsd doctor` | Validate `.gsd/` integrity, find and fix issues |


@ -334,7 +334,33 @@ custom_instructions:
- "Prefer functional patterns over classes"
```
For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically.
For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically. Add entries with `/gsd knowledge rule|pattern|lesson <description>`.
### `dynamic_routing`
Complexity-based model routing. See [Dynamic Model Routing](./dynamic-model-routing.md).
```yaml
dynamic_routing:
enabled: true
tier_models:
light: claude-haiku-4-5
standard: claude-sonnet-4-6
heavy: claude-opus-4-6
escalate_on_failure: true
budget_pressure: true
cross_provider: true
```
### `auto_visualize`
Show the workflow visualizer automatically after milestone completion:
```yaml
auto_visualize: true
```
See [Workflow Visualizer](./visualizer.md).
## Full Example
@ -356,6 +382,12 @@ models:
# Token optimization
token_profile: balanced
# Dynamic model routing
dynamic_routing:
enabled: true
escalate_on_failure: true
budget_pressure: true
# Budget
budget_ceiling: 25.00
budget_enforcement: pause
@ -387,6 +419,9 @@ notifications:
on_milestone: true
on_attention: true
# Visualizer
auto_visualize: true
# Hooks
post_unit_hooks:
- name: code-review


@ -89,3 +89,5 @@ See [Token Optimization](./token-optimization.md) for details.
- Switch to `budget` profile for well-understood, repetitive work
- Use `quality` only when architectural decisions are being made
- Per-phase model selection lets you use Opus only for planning while keeping execution on Sonnet
- Enable `dynamic_routing` for automatic model downgrading on simple tasks — see [Dynamic Model Routing](./dynamic-model-routing.md)
- Use `/gsd visualize` → Metrics tab to see where your budget is going


@ -0,0 +1,127 @@
# Dynamic Model Routing
*Introduced in v2.19.0*
Dynamic model routing automatically selects cheaper models for simple work and reserves expensive models for complex tasks. This reduces token consumption by 20-50% on capped plans without sacrificing quality where it matters.
## How It Works
Each unit dispatched by auto-mode is classified into a complexity tier:
| Tier | Typical Work | Default Model Level |
|------|-------------|-------------------|
| **Light** | Slice completion, UAT, hooks | Haiku-class |
| **Standard** | Research, planning, execution, milestone completion | Sonnet-class |
| **Heavy** | Replanning, roadmap reassessment, complex execution | Opus-class |
The router then selects a model for that tier. The key rule: **downgrade-only semantics**. The user's configured model is always the ceiling — routing never upgrades beyond what you've configured.
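Downgrade-only semantics amount to clamping the classified tier to a ceiling. The sketch below assumes the user's configured model can be mapped to a ceiling tier; that mapping and the name `clampToCeiling` are hypothetical and not taken from `model-router.ts`.

```typescript
type Tier = "light" | "standard" | "heavy";

// Tiers ordered from cheapest to most capable.
const TIER_ORDER: Tier[] = ["light", "standard", "heavy"];

// Return the requested tier unless it exceeds the ceiling, in which case
// return the ceiling. The result is never above the ceiling.
function clampToCeiling(requested: Tier, ceiling: Tier): Tier {
  const requestedRank = TIER_ORDER.indexOf(requested);
  const ceilingRank = TIER_ORDER.indexOf(ceiling);
  return requestedRank <= ceilingRank ? requested : ceiling;
}

console.log(clampToCeiling("heavy", "standard")); // "standard"
console.log(clampToCeiling("light", "heavy")); // "light"
```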
## Enabling
Dynamic routing is off by default. Enable it in preferences:
```yaml
---
version: 1
dynamic_routing:
enabled: true
---
```
## Configuration
```yaml
dynamic_routing:
enabled: true
tier_models: # explicit model per tier (optional)
light: claude-haiku-4-5
standard: claude-sonnet-4-6
heavy: claude-opus-4-6
escalate_on_failure: true # bump tier on task failure (default: true)
budget_pressure: true # auto-downgrade when approaching budget ceiling (default: true)
cross_provider: true # consider models from other providers (default: true)
hooks: true # apply routing to post-unit hooks (default: true)
```
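The defaults noted in the comments above can be summarized as a typed object. This is a sketch only: the interface and `withDefaults` are hypothetical names, and how `preferences.ts` actually merges settings is not documented here.

```typescript
// Shape of the dynamic_routing block, using the documented key names.
interface DynamicRoutingConfig {
  enabled: boolean;
  tier_models?: { light?: string; standard?: string; heavy?: string };
  escalate_on_failure: boolean;
  budget_pressure: boolean;
  cross_provider: boolean;
  hooks: boolean;
}

// Defaults as stated in this document: routing is off by default,
// and every other flag defaults to true.
const DEFAULTS: DynamicRoutingConfig = {
  enabled: false,
  escalate_on_failure: true,
  budget_pressure: true,
  cross_provider: true,
  hooks: true,
};

// Shallow merge: user-supplied keys override the defaults.
function withDefaults(user: Partial<DynamicRoutingConfig>): DynamicRoutingConfig {
  return { ...DEFAULTS, ...user };
}

console.log(withDefaults({ enabled: true }).budget_pressure); // true
```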
### `tier_models`
Override which model is used for each tier. When omitted, the router uses a built-in capability mapping that knows common model families:
- **Light:** `claude-haiku-4-5`, `gpt-4o-mini`, `gemini-2.0-flash`
- **Standard:** `claude-sonnet-4-6`, `gpt-4o`, `gemini-2.5-pro`
- **Heavy:** `claude-opus-4-6`, `gpt-4.5-preview`, `gemini-2.5-pro`
### `escalate_on_failure`
When a task fails at a given tier, the router escalates to the next tier on retry. Light → Standard → Heavy. This prevents cheap models from burning retries on work that needs more reasoning.
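A minimal sketch of the escalation order, assuming a single-step bump per retry. `escalateTier` is a hypothetical name, and this sketch does not model how escalation interacts with the configured model ceiling.

```typescript
type Tier = "light" | "standard" | "heavy";

// Move one tier up on failure: Light -> Standard -> Heavy.
function escalateTier(tier: Tier): Tier {
  switch (tier) {
    case "light":
      return "standard";
    case "standard":
      return "heavy";
    case "heavy":
      return "heavy"; // already at the top tier
  }
}

console.log(escalateTier("light")); // "standard"
```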
### `budget_pressure`
When approaching the budget ceiling, the router progressively downgrades:
| Budget Used | Effect |
|------------|--------|
| < 50% | No adjustment |
| 50-75% | Standard → Light |
| 75-90% | More aggressive downgrading |
| > 90% | Nearly everything → Light; only Heavy stays at Standard |
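
The table can be approximated in code as shown below. Two points are assumptions rather than documented behaviour: the exact boundary handling (for example whether exactly 50% counts as "50-75%"), and the 75-90% band, which the table describes only as "more aggressive downgrading". This sketch applies the 50-75% rule there as a placeholder. `applyBudgetPressure` is a hypothetical name.

```typescript
type Tier = "light" | "standard" | "heavy";

// usedFraction is budget spent divided by the budget ceiling, e.g. 0.6 for 60%.
function applyBudgetPressure(tier: Tier, usedFraction: number): Tier {
  // Below 50%: no adjustment.
  if (usedFraction < 0.5) return tier;

  // Above 90%: Heavy drops to Standard, everything else drops to Light.
  if (usedFraction > 0.9) return tier === "heavy" ? "standard" : "light";

  // 50-90%: Standard -> Light.
  // ASSUMPTION: the 75-90% rule is unspecified, so it is treated like 50-75%.
  return tier === "standard" ? "light" : tier;
}

console.log(applyBudgetPressure("standard", 0.6)); // "light"
console.log(applyBudgetPressure("heavy", 0.95)); // "standard"
```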
### `cross_provider`
When enabled, the router may select models from providers other than your primary. This uses the built-in cost table to find the cheapest model at each tier. Requires the target provider to be configured.
## Complexity Classification
Units are classified using pure heuristics — no LLM calls, sub-millisecond:
### Unit Type Defaults
| Unit Type | Default Tier |
|-----------|-------------|
| `complete-slice`, `run-uat` | Light |
| `research-*`, `plan-*`, `complete-milestone` | Standard |
| `execute-task` | Standard (adjusted up or down by task analysis) |
| `replan-slice`, `reassess-roadmap` | Heavy |
| `hook/*` | Light |
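
The defaults table translates into a simple lookup. In this sketch, `defaultTier` is a hypothetical name, and falling back to Standard for unit types not listed in the table is an assumption.

```typescript
type Tier = "light" | "standard" | "heavy";

// Map a unit type to its default tier, following the table above.
function defaultTier(unitType: string): Tier {
  if (unitType === "complete-slice" || unitType === "run-uat") return "light";
  if (unitType.startsWith("hook/")) return "light";
  if (unitType === "replan-slice" || unitType === "reassess-roadmap") return "heavy";
  // research-*, plan-*, complete-milestone, and execute-task start at Standard.
  // ASSUMPTION: unknown unit types also fall back to Standard.
  return "standard";
}

console.log(defaultTier("run-uat")); // "light"
console.log(defaultTier("replan-slice")); // "heavy"
```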
### Task Plan Analysis
For `execute-task` units, the classifier analyzes the task plan:
| Signal | Simple → Light | Complex → Heavy |
|--------|---------------|----------------|
| Step count | ≤ 3 | ≥ 8 |
| File count | ≤ 3 | ≥ 8 |
| Description length | < 500 chars | > 2000 chars |
| Code blocks | — | ≥ 5 |
| Complexity keywords | None | Present |
**Complexity keywords:** `research`, `investigate`, `refactor`, `migrate`, `integrate`, `complex`, `architect`, `redesign`, `security`, `performance`, `concurrent`, `parallel`, `distributed`, `backward compat`
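The signals in the table could be combined along the following lines. How the real classifier weighs the signals is not stated, so this sketch makes explicit assumptions: any single complex signal yields Heavy, all simple signals together yield Light, and keywords are matched as case-insensitive substrings. `TaskPlan` and `classifyTask` are hypothetical names.

```typescript
type Tier = "light" | "standard" | "heavy";

const COMPLEXITY_KEYWORDS = [
  "research", "investigate", "refactor", "migrate", "integrate", "complex",
  "architect", "redesign", "security", "performance", "concurrent",
  "parallel", "distributed", "backward compat",
];

// Hypothetical summary of a task plan; field names are illustrative.
interface TaskPlan {
  steps: number;
  files: number;
  codeBlocks: number;
  description: string;
}

function classifyTask(plan: TaskPlan): Tier {
  const text = plan.description.toLowerCase();
  const hasKeyword = COMPLEXITY_KEYWORDS.some((k) => text.includes(k));

  // ASSUMPTION: one complex signal is enough to classify as Heavy.
  const complex =
    plan.steps >= 8 || plan.files >= 8 || text.length > 2000 ||
    plan.codeBlocks >= 5 || hasKeyword;
  if (complex) return "heavy";

  // ASSUMPTION: every simple threshold must hold to classify as Light.
  const simple = plan.steps <= 3 && plan.files <= 3 && text.length < 500;
  return simple ? "light" : "standard";
}

console.log(classifyTask({ steps: 2, files: 1, codeBlocks: 0, description: "Rename a constant" })); // "light"
```

Because the checks are plain comparisons and substring tests, classification needs no LLM call, which is consistent with the pure-heuristic design described above.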
### Adaptive Learning
The routing history (`.gsd/routing-history.json`) tracks success/failure per tier per unit type. If a tier's failure rate exceeds 20% for a given pattern, future classifications are bumped up. User feedback (`over`/`under`/`ok`) is weighted 2× vs automatic outcomes.
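A weighted failure rate along these lines would satisfy the description. The record shape is hypothetical (the format of `.gsd/routing-history.json` is not shown here), and treating user feedback as a failed or non-failed outcome is an assumption, as are the names `Outcome`, `weightedFailureRate`, and `shouldBumpTier`.

```typescript
// Hypothetical outcome record; not the actual routing-history.json schema.
interface Outcome {
  failed: boolean;
  source: "auto" | "user"; // user feedback counts double
}

function weightedFailureRate(outcomes: Outcome[]): number {
  let failures = 0;
  let total = 0;
  for (const o of outcomes) {
    const weight = o.source === "user" ? 2 : 1;
    total += weight;
    if (o.failed) failures += weight;
  }
  return total === 0 ? 0 : failures / total;
}

// Bump future classifications when the failure rate exceeds 20%.
function shouldBumpTier(outcomes: Outcome[]): boolean {
  return weightedFailureRate(outcomes) > 0.2;
}

const ok: Outcome = { failed: false, source: "auto" };
console.log(shouldBumpTier([ok, ok, ok, ok, { failed: true, source: "auto" }])); // false (rate is exactly 20%)
console.log(shouldBumpTier([ok, ok, ok, ok, { failed: true, source: "user" }])); // true (2 of 6)
```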
## Interaction with Token Profiles
Dynamic routing and token profiles are complementary:
- **Token profiles** (`budget`/`balanced`/`quality`) control phase skipping and context compression
- **Dynamic routing** controls per-unit model selection within the configured phase model
When both are active, token profiles set the baseline models and dynamic routing further optimizes within those baselines. The `budget` token profile + dynamic routing provides maximum cost savings.
## Cost Table
The router includes a built-in cost table for common models, used for cross-provider cost comparison. Costs are per-million tokens (input/output):
| Model | Input | Output |
|-------|-------|--------|
| claude-haiku-4-5 | $0.80 | $4.00 |
| claude-sonnet-4-6 | $3.00 | $15.00 |
| claude-opus-4-6 | $15.00 | $75.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4o | $2.50 | $10.00 |
| gemini-2.0-flash | $0.10 | $0.40 |
The cost table is used for comparison only — actual billing comes from your provider.
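For illustration, a comparison over these figures might look like the sketch below. The prices are copied from the table; the comparison method (estimated cost for a given input and output token count) and the names `estimateCost` and `cheapestModel` are assumptions, not the behaviour of `model-cost-table.ts`.

```typescript
// USD per million tokens, copied from the cost table above.
interface ModelCost {
  input: number;
  output: number;
}

const COST_TABLE: Record<string, ModelCost> = {
  "claude-haiku-4-5": { input: 0.8, output: 4.0 },
  "claude-sonnet-4-6": { input: 3.0, output: 15.0 },
  "claude-opus-4-6": { input: 15.0, output: 75.0 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
  "gpt-4o": { input: 2.5, output: 10.0 },
  "gemini-2.0-flash": { input: 0.1, output: 0.4 },
};

// Estimated cost in USD, or undefined when the model is not in the table.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number | undefined {
  const cost = COST_TABLE[model];
  if (!cost) return undefined;
  return (inputTokens / 1e6) * cost.input + (outputTokens / 1e6) * cost.output;
}

// Pick the candidate with the lowest estimated cost; unknown models are skipped.
function cheapestModel(candidates: string[], inputTokens: number, outputTokens: number): string | undefined {
  let best: string | undefined;
  let bestCost = Infinity;
  for (const model of candidates) {
    const cost = estimateCost(model, inputTokens, outputTokens);
    if (cost !== undefined && cost < bestCost) {
      best = model;
      bestCost = cost;
    }
  }
  return best;
}

console.log(estimateCost("claude-sonnet-4-6", 1e6, 1e6)); // 18
console.log(cheapestModel(["claude-haiku-4-5", "gpt-4o-mini", "gemini-2.0-flash"], 1e6, 1e6)); // "gemini-2.0-flash"
```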

docs/visualizer.md

@ -0,0 +1,92 @@
# Workflow Visualizer
*Introduced in v2.19.0*
The workflow visualizer is a full-screen TUI overlay that shows project progress, dependencies, cost metrics, and execution timeline in an interactive four-tab view.
## Opening the Visualizer
```
/gsd visualize
```
Or configure automatic display after milestone completion:
```yaml
auto_visualize: true
```
## Tabs
Switch tabs with `Tab`, `1`-`4`, or arrow keys.
### 1. Progress
A tree view of milestones, slices, and tasks with completion status:
```
M001: User Management
✅ S01: Auth module
✅ T01: Core types
✅ T02: JWT middleware
✅ T03: Login flow
⏳ S02: User dashboard
✅ T01: Layout component
⬜ T02: Profile page
⬜ S03: Admin panel
```
Shows checkmarks (✅) for completed items, an hourglass (⏳) for in-progress items, and empty boxes (⬜) for pending items.
### 2. Dependencies
An ASCII dependency graph showing slice relationships:
```
S01 ──→ S02 ──→ S04
└───→ S03 ──↗
```
Visualizes the `depends:` field from the roadmap, making it easy to see which slices are blocked and which can proceed.
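The blocked-versus-ready distinction can be derived from the dependency data with a small check. This is a sketch under assumptions: the `Slice` shape and `readySlices` are hypothetical, and the real roadmap parsing in `visualizer-data.ts` is not shown here.

```typescript
// Hypothetical slice record built from the roadmap's depends field.
interface Slice {
  id: string;
  depends: string[];
  done: boolean;
}

// A slice can proceed when it is not done and all of its dependencies are done.
function readySlices(slices: Slice[]): string[] {
  const done = new Set(slices.filter((s) => s.done).map((s) => s.id));
  return slices
    .filter((s) => !s.done && s.depends.every((d) => done.has(d)))
    .map((s) => s.id);
}

// The graph from the example above, with only S01 completed.
const roadmap: Slice[] = [
  { id: "S01", depends: [], done: true },
  { id: "S02", depends: ["S01"], done: false },
  { id: "S03", depends: ["S01"], done: false },
  { id: "S04", depends: ["S02", "S03"], done: false },
];

console.log(readySlices(roadmap)); // [ 'S02', 'S03' ]
```

Here S04 stays blocked until both S02 and S03 are complete.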
### 3. Metrics
Bar charts showing cost and token usage breakdowns:
- **By phase** — research, planning, execution, completion, reassessment
- **By slice** — cost per slice with running totals
- **By model** — which models consumed the most budget
Uses data from `.gsd/metrics.json`.
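The three breakdowns are the same aggregation grouped by a different key. The sketch below assumes a flat list of per-unit records; the schema of `.gsd/metrics.json` is not documented here, so `UnitMetric`, its field names, and `totalBy` are hypothetical.

```typescript
// Hypothetical per-unit record; the real metrics.json schema may differ.
interface UnitMetric {
  phase: string;
  slice: string;
  model: string;
  cost: number; // USD
}

// Sum cost per distinct value of the chosen key.
function totalBy(records: UnitMetric[], key: "phase" | "slice" | "model"): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const record of records) {
    const group = record[key];
    totals[group] = (totals[group] ?? 0) + record.cost;
  }
  return totals;
}

const sample: UnitMetric[] = [
  { phase: "planning", slice: "S01", model: "claude-opus-4-6", cost: 0.5 },
  { phase: "execution", slice: "S01", model: "claude-sonnet-4-6", cost: 0.25 },
  { phase: "execution", slice: "S02", model: "claude-sonnet-4-6", cost: 0.25 },
];

console.log(totalBy(sample, "phase")); // { planning: 0.5, execution: 0.5 }
console.log(totalBy(sample, "slice")); // { S01: 0.75, S02: 0.25 }
```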
### 4. Timeline
Chronological execution history showing:
- Unit type and ID
- Start/end timestamps
- Duration
- Model used
- Token counts
Ordered by execution time, showing the full history of auto-mode dispatches.
## Controls
| Key | Action |
|-----|--------|
| `Tab` | Next tab |
| `Shift+Tab` | Previous tab |
| `1`-`4` | Jump to tab |
| `↑`/`↓` | Scroll within tab |
| `Escape` / `q` | Close visualizer |
## Auto-Refresh
The visualizer refreshes data from disk every 2 seconds, so it stays current if opened alongside a running auto-mode session.
## Configuration
```yaml
auto_visualize: true # show visualizer after milestone completion
```