From 330e5200bc3e5dd2757cba51f6aaff9edbc510cf Mon Sep 17 00:00:00 2001 From: Tom Boucher Date: Mon, 16 Mar 2026 11:00:58 -0400 Subject: [PATCH] docs: add v2.18/v2.19 feature documentation (#631) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit New docs: - dynamic-model-routing.md — complexity classification, tier models, escalation, budget pressure, cost table, adaptive learning - captures-triage.md — fire-and-forget capture, triage pipeline, classification types, dashboard integration, worktree awareness - visualizer.md — four-tab TUI overlay (progress, deps, metrics, timeline), controls, auto-refresh, auto_visualize preference Updated docs: - README.md — added links to three new docs - commands.md — added capture, triage, visualize, knowledge, queue reorder - configuration.md — added dynamic_routing and auto_visualize settings, updated full example with new config options - auto-mode.md — added capture, visualize sections, dashboard badge, dynamic model routing reference - architecture.md — updated dispatch pipeline (routing + captures steps), added key modules table for v2.19 - cost-management.md — added dynamic routing and visualizer tips --- docs/README.md | 3 + docs/architecture.md | 46 +++++++++--- docs/auto-mode.md | 21 ++++++ docs/captures-triage.md | 82 ++++++++++++++++++++++ docs/commands.md | 6 +- docs/configuration.md | 37 +++++++++- docs/cost-management.md | 2 + docs/dynamic-model-routing.md | 127 ++++++++++++++++++++++++++++++++++ docs/visualizer.md | 92 ++++++++++++++++++++++++ 9 files changed, 403 insertions(+), 13 deletions(-) create mode 100644 docs/captures-triage.md create mode 100644 docs/dynamic-model-routing.md create mode 100644 docs/visualizer.md diff --git a/docs/README.md b/docs/README.md index ce50fd528..0bba640de 100644 --- a/docs/README.md +++ b/docs/README.md @@ -12,6 +12,9 @@ Welcome to the GSD documentation. 
This covers everything from getting started to | [Remote Questions](./remote-questions.md) | Discord and Slack integration for headless auto-mode | | [Configuration](./configuration.md) | Preferences, model selection, git settings, and token profiles | | [Token Optimization](./token-optimization.md) | Token profiles, context compression, complexity routing, and adaptive learning (v2.17) | +| [Dynamic Model Routing](./dynamic-model-routing.md) | Complexity-based model selection, cost tables, escalation, and budget pressure (v2.19) | +| [Captures & Triage](./captures-triage.md) | Fire-and-forget thought capture during auto-mode with automated triage (v2.19) | +| [Workflow Visualizer](./visualizer.md) | Interactive TUI overlay for progress, dependencies, metrics, and timeline (v2.19) | | [Cost Management](./cost-management.md) | Budget ceilings, cost tracking, projections, and enforcement modes | | [Git Strategy](./git-strategy.md) | Worktree isolation, branching model, and merge behavior | | [Working in Teams](./working-in-teams.md) | Unique milestone IDs, `.gitignore` setup, and shared planning artifacts | diff --git a/docs/architecture.md b/docs/architecture.md index 38ec524a2..3fc29d2ca 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -92,17 +92,41 @@ Performance-critical operations use a Rust N-API engine: The auto mode dispatch pipeline: ``` -1. Read disk state (STATE.md, roadmap, plans) -2. Determine next unit type and ID -3. Classify complexity → select model tier -4. Apply budget pressure adjustments -5. Check routing history for adaptive adjustments -6. Resolve effective model (with fallbacks) -7. Build dispatch prompt (applying inline level compression) -8. Create fresh agent session -9. Inject prompt and let LLM execute -10. On completion: snapshot metrics, verify artifacts, persist state -11. Loop to step 1 +1. Read disk state (STATE.md, roadmap, plans) +2. Determine next unit type and ID +3. Classify complexity → select model tier +4. 
Apply budget pressure adjustments +5. Check routing history for adaptive adjustments +6. Dynamic model routing (if enabled) → select cheapest model for tier +7. Resolve effective model (with fallbacks) +8. Check pending captures → triage if needed +9. Build dispatch prompt (applying inline level compression) +10. Create fresh agent session +11. Inject prompt and let LLM execute +12. On completion: snapshot metrics, verify artifacts, persist state +13. Loop to step 1 ``` Phase skipping (from token profile) gates steps 2-3: if a phase is skipped, the corresponding unit type is never dispatched. + +## Key Modules (v2.19) + +| Module | Purpose | +|--------|---------| +| `auto.ts` | Auto-mode state machine and orchestration | +| `auto-dispatch.ts` | Declarative dispatch table (phase → unit mapping) | +| `auto-prompts.ts` | Prompt builders with inline level compression | +| `auto-worktree.ts` | Worktree lifecycle (create, enter, merge, teardown) | +| `complexity-classifier.ts` | Unit complexity classification (light/standard/heavy) | +| `model-router.ts` | Dynamic model routing with cost-aware selection | +| `model-cost-table.ts` | Built-in per-model cost data for cross-provider comparison | +| `routing-history.ts` | Adaptive learning from routing outcomes | +| `captures.ts` | Fire-and-forget thought capture and triage classification | +| `triage-resolution.ts` | Capture resolution (inject, defer, replan, quick-task) | +| `visualizer-overlay.ts` | Workflow visualizer TUI overlay | +| `visualizer-data.ts` | Data loading for visualizer tabs | +| `visualizer-views.ts` | Tab renderers (progress, deps, metrics, timeline) | +| `metrics.ts` | Token and cost tracking ledger | +| `state.ts` | State derivation from disk | +| `preferences.ts` | Preference loading, merging, validation | +| `queue-order.ts` | Milestone queue ordering | diff --git a/docs/auto-mode.md b/docs/auto-mode.md index f930cee55..6b548e127 100644 --- a/docs/auto-mode.md +++ b/docs/auto-mode.md @@ -120,6 +120,22 
@@ Stops auto mode gracefully. Can be run from a different terminal. Hard-steer plan documents during execution without stopping the pipeline. Changes are picked up at the next phase boundary. +### Capture + +``` +/gsd capture "add rate limiting to API endpoints" +``` + +Fire-and-forget thought capture. Captures are triaged automatically between tasks. See [Captures & Triage](./captures-triage.md). + +### Visualize + +``` +/gsd visualize +``` + +Open the workflow visualizer — interactive tabs for progress, dependencies, metrics, and timeline. See [Workflow Visualizer](./visualizer.md). + ## Dashboard `Ctrl+Alt+G` or `/gsd status` shows real-time progress: @@ -129,6 +145,7 @@ Hard-steer plan documents during execution without stopping the pipeline. Change - Per-unit cost and token breakdown - Cost projections - Completed and in-progress units +- Pending capture count (when captures are awaiting triage) ## Phase Skipping @@ -141,3 +158,7 @@ Token profiles can skip certain phases to reduce cost: | Reassess Roadmap | Skipped | Runs | Runs | See [Token Optimization](./token-optimization.md) for details. + +## Dynamic Model Routing + +When enabled, auto-mode automatically selects cheaper models for simple units (slice completion, UAT) and reserves expensive models for complex work (replanning, architectural tasks). See [Dynamic Model Routing](./dynamic-model-routing.md). diff --git a/docs/captures-triage.md b/docs/captures-triage.md new file mode 100644 index 000000000..1c5f7e3f7 --- /dev/null +++ b/docs/captures-triage.md @@ -0,0 +1,82 @@ +# Captures & Triage + +*Introduced in v2.19.0* + +Captures let you fire-and-forget thoughts during auto-mode execution. Instead of pausing auto-mode to steer, you can capture ideas, bugs, or scope changes and let GSD triage them at natural seams between tasks. 
+ +## Quick Start + +While auto-mode is running (or any time): + +``` +/gsd capture "add rate limiting to the API endpoints" +/gsd capture "the auth flow should support OAuth, not just JWT" +``` + +Captures are appended to `.gsd/CAPTURES.md` and triaged automatically between tasks. + +## How It Works + +### Pipeline + +``` +capture → triage → confirm → resolve → resume +``` + +1. **Capture** — `/gsd capture "thought"` appends to `.gsd/CAPTURES.md` with a timestamp and unique ID +2. **Triage** — at natural seams between tasks (in `handleAgentEnd`), GSD detects pending captures and classifies them +3. **Confirm** — the user is shown the proposed resolution and confirms or adjusts +4. **Resolve** — the resolution is applied (task injection, replan trigger, deferral, etc.) +5. **Resume** — auto-mode continues + +### Classification Types + +Each capture is classified into one of five types: + +| Type | Meaning | Resolution | +|------|---------|------------| +| `quick-task` | Small, self-contained fix | Inline quick task executed immediately | +| `inject` | New task needed in current slice | Task injected into the active slice plan | +| `defer` | Important but not urgent | Deferred to roadmap reassessment | +| `replan` | Changes the current approach | Triggers slice replan with capture context | +| `note` | Informational, no action needed | Acknowledged, no plan changes | + +### Automatic Triage + +Triage fires automatically between tasks during auto-mode. The triage prompt receives: +- All pending captures +- The current slice plan +- The active roadmap + +The LLM classifies each capture and proposes a resolution. Plan-modifying resolutions (inject, replan) require user confirmation. + +### Manual Triage + +Trigger triage manually at any time: + +``` +/gsd triage +``` + +This is useful when you've accumulated several captures and want to process them before the next natural seam. 
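The classification table and the confirmation rule described above can be sketched as a small lookup. This is an illustration only: the names `CaptureType`, `RESOLUTIONS`, and `requiresConfirmation` are hypothetical and are not taken from `captures.ts` or `triage-resolution.ts`.

```typescript
// Hypothetical sketch; names are illustrative, not the actual GSD API.
type CaptureType = "quick-task" | "inject" | "defer" | "replan" | "note";

// Resolution per classification type, mirroring the table in this document.
const RESOLUTIONS: Record<CaptureType, string> = {
  "quick-task": "Inline quick task executed immediately",
  inject: "Task injected into the active slice plan",
  defer: "Deferred to roadmap reassessment",
  replan: "Triggers slice replan with capture context",
  note: "Acknowledged, no plan changes",
};

// Plan-modifying resolutions (inject, replan) require user confirmation.
function requiresConfirmation(type: CaptureType): boolean {
  return type === "inject" || type === "replan";
}

// Usage: decide whether a proposed resolution must be confirmed first.
const proposed: CaptureType = "replan";
console.log(`${proposed}: ${RESOLUTIONS[proposed]}`);
console.log(`confirmation required: ${requiresConfirmation(proposed)}`);
```

Keeping the type as a closed union means a new classification cannot be added without also giving it a resolution entry, which is one way to keep the table and the behavior in sync.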
+ +## Dashboard Integration + +The progress widget shows a pending capture count badge when captures are waiting for triage. This is visible in both the `Ctrl+Alt+G` dashboard and the auto-mode progress widget. + +## Context Injection + +Capture context is automatically injected into: +- **Replan-slice prompts** — so the replan knows what triggered it +- **Reassess-roadmap prompts** — so deferred captures influence roadmap decisions + +## Worktree Awareness + +Captures always resolve to the **original project root's** `.gsd/CAPTURES.md`, not the worktree's local copy. This ensures captures from a steering terminal are visible to the auto-mode session running in a worktree. + +## Commands + +| Command | Description | +|---------|-------------| +| `/gsd capture "text"` | Capture a thought (quotes optional for single words) | +| `/gsd triage` | Manually trigger triage of pending captures | diff --git a/docs/commands.md b/docs/commands.md index 5414ea16e..a026e5803 100644 --- a/docs/commands.md +++ b/docs/commands.md @@ -11,7 +11,11 @@ | `/gsd steer` | Hard-steer plan documents during execution | | `/gsd discuss` | Discuss architecture and decisions (works alongside auto mode) | | `/gsd status` | Progress dashboard | -| `/gsd queue` | Queue future milestones (safe during auto mode) | +| `/gsd queue` | Queue and reorder future milestones (safe during auto mode) | +| `/gsd capture` | Fire-and-forget thought capture (works during auto mode) | +| `/gsd triage` | Manually trigger triage of pending captures | +| `/gsd visualize` | Open workflow visualizer (progress, deps, metrics, timeline) | +| `/gsd knowledge` | Add persistent project knowledge (rule, pattern, or lesson) | | `/gsd prefs` | Model selection, timeouts, budget ceiling | | `/gsd migrate` | Migrate a v1 `.planning` directory to `.gsd` format | | `/gsd doctor` | Validate `.gsd/` integrity, find and fix issues | diff --git a/docs/configuration.md b/docs/configuration.md index 8b74333d1..d05ce6dc1 100644 --- 
a/docs/configuration.md +++ b/docs/configuration.md @@ -334,7 +334,33 @@ custom_instructions: - "Prefer functional patterns over classes" ``` -For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically. +For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead — it's injected into every agent prompt automatically. Add entries with `/gsd knowledge rule|pattern|lesson `. + +### `dynamic_routing` + +Complexity-based model routing. See [Dynamic Model Routing](./dynamic-model-routing.md). + +```yaml +dynamic_routing: + enabled: true + tier_models: + light: claude-haiku-4-5 + standard: claude-sonnet-4-6 + heavy: claude-opus-4-6 + escalate_on_failure: true + budget_pressure: true + cross_provider: true +``` + +### `auto_visualize` + +Show the workflow visualizer automatically after milestone completion: + +```yaml +auto_visualize: true +``` + +See [Workflow Visualizer](./visualizer.md). ## Full Example @@ -356,6 +382,12 @@ models: # Token optimization token_profile: balanced +# Dynamic model routing +dynamic_routing: + enabled: true + escalate_on_failure: true + budget_pressure: true + # Budget budget_ceiling: 25.00 budget_enforcement: pause @@ -387,6 +419,9 @@ notifications: on_milestone: true on_attention: true +# Visualizer +auto_visualize: true + # Hooks post_unit_hooks: - name: code-review diff --git a/docs/cost-management.md b/docs/cost-management.md index efd3398e6..06214590d 100644 --- a/docs/cost-management.md +++ b/docs/cost-management.md @@ -89,3 +89,5 @@ See [Token Optimization](./token-optimization.md) for details. 
- Switch to `budget` profile for well-understood, repetitive work - Use `quality` only when architectural decisions are being made - Per-phase model selection lets you use Opus only for planning while keeping execution on Sonnet +- Enable `dynamic_routing` for automatic model downgrading on simple tasks — see [Dynamic Model Routing](./dynamic-model-routing.md) +- Use `/gsd visualize` → Metrics tab to see where your budget is going diff --git a/docs/dynamic-model-routing.md b/docs/dynamic-model-routing.md new file mode 100644 index 000000000..9d0d5525e --- /dev/null +++ b/docs/dynamic-model-routing.md @@ -0,0 +1,127 @@ +# Dynamic Model Routing + +*Introduced in v2.19.0* + +Dynamic model routing automatically selects cheaper models for simple work and reserves expensive models for complex tasks. This reduces token consumption by 20-50% on capped plans without sacrificing quality where it matters. + +## How It Works + +Each unit dispatched by auto-mode is classified into a complexity tier: + +| Tier | Typical Work | Default Model Level | +|------|-------------|-------------------| +| **Light** | Slice completion, UAT, hooks | Haiku-class | +| **Standard** | Research, planning, execution, milestone completion | Sonnet-class | +| **Heavy** | Replanning, roadmap reassessment, complex execution | Opus-class | + +The router then selects a model for that tier. The key rule: **downgrade-only semantics**. The user's configured model is always the ceiling — routing never upgrades beyond what you've configured. + +## Enabling + +Dynamic routing is off by default. 
Enable it in preferences: + +```yaml +--- +version: 1 +dynamic_routing: + enabled: true +--- +``` + +## Configuration + +```yaml +dynamic_routing: + enabled: true + tier_models: # explicit model per tier (optional) + light: claude-haiku-4-5 + standard: claude-sonnet-4-6 + heavy: claude-opus-4-6 + escalate_on_failure: true # bump tier on task failure (default: true) + budget_pressure: true # auto-downgrade when approaching budget ceiling (default: true) + cross_provider: true # consider models from other providers (default: true) + hooks: true # apply routing to post-unit hooks (default: true) +``` + +### `tier_models` + +Override which model is used for each tier. When omitted, the router uses a built-in capability mapping that knows common model families: + +- **Light:** `claude-haiku-4-5`, `gpt-4o-mini`, `gemini-2.0-flash` +- **Standard:** `claude-sonnet-4-6`, `gpt-4o`, `gemini-2.5-pro` +- **Heavy:** `claude-opus-4-6`, `gpt-4.5-preview`, `gemini-2.5-pro` + +### `escalate_on_failure` + +When a task fails at a given tier, the router escalates to the next tier on retry. Light → Standard → Heavy. This prevents cheap models from burning retries on work that needs more reasoning. + +### `budget_pressure` + +When approaching the budget ceiling, the router progressively downgrades: + +| Budget Used | Effect | +|------------|--------| +| < 50% | No adjustment | +| 50-75% | Standard → Light | +| 75-90% | More aggressive downgrading | +| > 90% | Nearly everything → Light; only Heavy stays at Standard | + +### `cross_provider` + +When enabled, the router may select models from providers other than your primary. This uses the built-in cost table to find the cheapest model at each tier. Requires the target provider to be configured. 
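The escalation and budget-pressure rules above can be sketched as two pure functions. This is a minimal sketch under stated assumptions, not the code in `model-router.ts`: the function names are hypothetical, band boundaries are treated as lower-inclusive, and because the table does not spell out what "more aggressive downgrading" means for the 75-90% band, the sketch simply reuses the 50-75% rule there.

```typescript
// Hypothetical sketch; names and boundary handling are assumptions.
type Tier = "light" | "standard" | "heavy";

// escalate_on_failure: Light → Standard → Heavy; Heavy has nowhere higher to go.
function escalate(tier: Tier): Tier {
  switch (tier) {
    case "light":
      return "standard";
    case "standard":
      return "heavy";
    case "heavy":
      return "heavy";
  }
}

// budget_pressure: budgetUsed is the fraction of the ceiling spent (0.0 to 1.0).
function applyBudgetPressure(tier: Tier, budgetUsed: number): Tier {
  if (budgetUsed < 0.5) {
    return tier; // < 50%: no adjustment
  }
  if (budgetUsed <= 0.9) {
    // 50-75%: Standard → Light.
    // 75-90%: ASSUMPTION, the source does not specify this band, so the
    // sketch applies the same rule as 50-75%.
    return tier === "standard" ? "light" : tier;
  }
  // > 90%: nearly everything → Light; only Heavy stays at Standard.
  return tier === "heavy" ? "standard" : "light";
}

// Usage: a standard-tier unit at 60% of budget is routed to the light tier.
console.log(applyBudgetPressure("standard", 0.6));
// Usage: a failed light-tier task is retried one tier up.
console.log(escalate("light"));
```

How the two rules interact on a retry (for example, whether an escalated tier is then downgraded again by budget pressure) is not stated in this document, so the sketch keeps them as independent functions.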
+ +## Complexity Classification + +Units are classified using pure heuristics — no LLM calls, sub-millisecond: + +### Unit Type Defaults + +| Unit Type | Default Tier | +|-----------|-------------| +| `complete-slice`, `run-uat` | Light | +| `research-*`, `plan-*`, `complete-milestone` | Standard | +| `execute-task` | Standard (upgraded by task analysis) | +| `replan-slice`, `reassess-roadmap` | Heavy | +| `hook/*` | Light | + +### Task Plan Analysis + +For `execute-task` units, the classifier analyzes the task plan: + +| Signal | Simple → Light | Complex → Heavy | +|--------|---------------|----------------| +| Step count | ≤ 3 | ≥ 8 | +| File count | ≤ 3 | ≥ 8 | +| Description length | < 500 chars | > 2000 chars | +| Code blocks | — | ≥ 5 | +| Complexity keywords | None | Present | + +**Complexity keywords:** `research`, `investigate`, `refactor`, `migrate`, `integrate`, `complex`, `architect`, `redesign`, `security`, `performance`, `concurrent`, `parallel`, `distributed`, `backward compat` + +### Adaptive Learning + +The routing history (`.gsd/routing-history.json`) tracks success/failure per tier per unit type. If a tier's failure rate exceeds 20% for a given pattern, future classifications are bumped up. User feedback (`over`/`under`/`ok`) is weighted 2× vs automatic outcomes. + +## Interaction with Token Profiles + +Dynamic routing and token profiles are complementary: + +- **Token profiles** (`budget`/`balanced`/`quality`) control phase skipping and context compression +- **Dynamic routing** controls per-unit model selection within the configured phase model + +When both are active, token profiles set the baseline models and dynamic routing further optimizes within those baselines. The `budget` token profile + dynamic routing provides maximum cost savings. + +## Cost Table + +The router includes a built-in cost table for common models, used for cross-provider cost comparison. 
Costs are per-million tokens (input/output): + +| Model | Input | Output | +|-------|-------|--------| +| claude-haiku-4-5 | $0.80 | $4.00 | +| claude-sonnet-4-6 | $3.00 | $15.00 | +| claude-opus-4-6 | $15.00 | $75.00 | +| gpt-4o-mini | $0.15 | $0.60 | +| gpt-4o | $2.50 | $10.00 | +| gemini-2.0-flash | $0.10 | $0.40 | + +The cost table is used for comparison only — actual billing comes from your provider. diff --git a/docs/visualizer.md b/docs/visualizer.md new file mode 100644 index 000000000..6aa8e6747 --- /dev/null +++ b/docs/visualizer.md @@ -0,0 +1,92 @@ +# Workflow Visualizer + +*Introduced in v2.19.0* + +The workflow visualizer is a full-screen TUI overlay that shows project progress, dependencies, cost metrics, and execution timeline in an interactive four-tab view. + +## Opening the Visualizer + +``` +/gsd visualize +``` + +Or configure automatic display after milestone completion: + +```yaml +auto_visualize: true +``` + +## Tabs + +Switch tabs with `Tab`, `1`-`4`, or arrow keys. + +### 1. Progress + +A tree view of milestones, slices, and tasks with completion status: + +``` +M001: User Management + ✅ S01: Auth module + ✅ T01: Core types + ✅ T02: JWT middleware + ✅ T03: Login flow + ⏳ S02: User dashboard + ✅ T01: Layout component + ⬜ T02: Profile page + ⬜ S03: Admin panel +``` + +Shows checkmarks for completed items, spinners for in-progress, and empty boxes for pending. + +### 2. Dependencies + +An ASCII dependency graph showing slice relationships: + +``` +S01 ──→ S02 ──→ S04 + └───→ S03 ──↗ +``` + +Visualizes the `depends:` field from the roadmap, making it easy to see which slices are blocked and which can proceed. + +### 3. Metrics + +Bar charts showing cost and token usage breakdowns: + +- **By phase** — research, planning, execution, completion, reassessment +- **By slice** — cost per slice with running totals +- **By model** — which models consumed the most budget + +Uses data from `.gsd/metrics.json`. + +### 4. 
Timeline + +Chronological execution history showing: + +- Unit type and ID +- Start/end timestamps +- Duration +- Model used +- Token counts + +Ordered by execution time, showing the full history of auto-mode dispatches. + +## Controls + +| Key | Action | +|-----|--------| +| `Tab` | Next tab | +| `Shift+Tab` | Previous tab | +| `1`-`4` | Jump to tab | +| `↑`/`↓` | Scroll within tab | +| `Escape` / `q` | Close visualizer | + +## Auto-Refresh + +The visualizer refreshes data from disk every 2 seconds, so it stays current if opened alongside a running auto-mode session. + +## Configuration + +```yaml +auto_visualize: true # show visualizer after milestone completion +```
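The "By model" breakdown on the Metrics tab amounts to grouping per-unit records by model and summing their cost. The sketch below shows that aggregation under loud assumptions: the `UnitMetric` shape and its field names are hypothetical, and the actual schema of `.gsd/metrics.json` is not described in this document and may differ.

```typescript
// Hypothetical record shape; the real .gsd/metrics.json schema may differ.
interface UnitMetric {
  unitType: string;
  unitId: string;
  model: string;
  costUsd: number;
}

// Sum cost per model, preserving first-seen order of models.
function costByModel(entries: UnitMetric[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const entry of entries) {
    totals.set(entry.model, (totals.get(entry.model) ?? 0) + entry.costUsd);
  }
  return totals;
}

// Usage with made-up sample data (not real measurements).
const sample: UnitMetric[] = [
  { unitType: "execute-task", unitId: "T01", model: "claude-sonnet-4-6", costUsd: 0.5 },
  { unitType: "complete-slice", unitId: "S01", model: "claude-haiku-4-5", costUsd: 0.25 },
  { unitType: "execute-task", unitId: "T02", model: "claude-sonnet-4-6", costUsd: 1 },
];
for (const [model, total] of costByModel(sample)) {
  console.log(`${model}: $${total.toFixed(2)}`);
}
```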