docs: add v2.18/v2.19 feature documentation (#631)

New docs:
- dynamic-model-routing.md: complexity classification, tier models, escalation, budget pressure, cost table, adaptive learning
- captures-triage.md: fire-and-forget capture, triage pipeline, classification types, dashboard integration, worktree awareness
- visualizer.md: four-tab TUI overlay (progress, deps, metrics, timeline), controls, auto-refresh, auto_visualize preference

Updated docs:
- README.md: added links to three new docs
- commands.md: added capture, triage, visualize, knowledge, queue reorder
- configuration.md: added dynamic_routing and auto_visualize settings, updated full example with new config options
- auto-mode.md: added capture, visualize sections, dashboard badge, dynamic model routing reference
- architecture.md: updated dispatch pipeline (routing + captures steps), added key modules table for v2.19
- cost-management.md: added dynamic routing and visualizer tips

parent 370897df81
commit 330e5200bc
9 changed files with 403 additions and 13 deletions

@@ -12,6 +12,9 @@ Welcome to the GSD documentation. This covers everything from getting started to
| [Remote Questions](./remote-questions.md) | Discord and Slack integration for headless auto-mode |
| [Configuration](./configuration.md) | Preferences, model selection, git settings, and token profiles |
| [Token Optimization](./token-optimization.md) | Token profiles, context compression, complexity routing, and adaptive learning (v2.17) |
| [Dynamic Model Routing](./dynamic-model-routing.md) | Complexity-based model selection, cost tables, escalation, and budget pressure (v2.19) |
| [Captures & Triage](./captures-triage.md) | Fire-and-forget thought capture during auto-mode with automated triage (v2.19) |
| [Workflow Visualizer](./visualizer.md) | Interactive TUI overlay for progress, dependencies, metrics, and timeline (v2.19) |
| [Cost Management](./cost-management.md) | Budget ceilings, cost tracking, projections, and enforcement modes |
| [Git Strategy](./git-strategy.md) | Worktree isolation, branching model, and merge behavior |
| [Working in Teams](./working-in-teams.md) | Unique milestone IDs, `.gitignore` setup, and shared planning artifacts |

@@ -92,17 +92,41 @@ Performance-critical operations use a Rust N-API engine:
The auto mode dispatch pipeline:

```
-1. Read disk state (STATE.md, roadmap, plans)
-2. Determine next unit type and ID
-3. Classify complexity → select model tier
-4. Apply budget pressure adjustments
-5. Check routing history for adaptive adjustments
-6. Resolve effective model (with fallbacks)
-7. Build dispatch prompt (applying inline level compression)
-8. Create fresh agent session
-9. Inject prompt and let LLM execute
-10. On completion: snapshot metrics, verify artifacts, persist state
-11. Loop to step 1
+1. Read disk state (STATE.md, roadmap, plans)
+2. Determine next unit type and ID
+3. Classify complexity → select model tier
+4. Apply budget pressure adjustments
+5. Check routing history for adaptive adjustments
+6. Dynamic model routing (if enabled) → select cheapest model for tier
+7. Resolve effective model (with fallbacks)
+8. Check pending captures → triage if needed
+9. Build dispatch prompt (applying inline level compression)
+10. Create fresh agent session
+11. Inject prompt and let LLM execute
+12. On completion: snapshot metrics, verify artifacts, persist state
+13. Loop to step 1
```

Phase skipping (from token profile) gates steps 2-3: if a phase is skipped, the corresponding unit type is never dispatched.

## Key Modules (v2.19)

| Module | Purpose |
|--------|---------|
| `auto.ts` | Auto-mode state machine and orchestration |
| `auto-dispatch.ts` | Declarative dispatch table (phase → unit mapping) |
| `auto-prompts.ts` | Prompt builders with inline level compression |
| `auto-worktree.ts` | Worktree lifecycle (create, enter, merge, teardown) |
| `complexity-classifier.ts` | Unit complexity classification (light/standard/heavy) |
| `model-router.ts` | Dynamic model routing with cost-aware selection |
| `model-cost-table.ts` | Built-in per-model cost data for cross-provider comparison |
| `routing-history.ts` | Adaptive learning from routing outcomes |
| `captures.ts` | Fire-and-forget thought capture and triage classification |
| `triage-resolution.ts` | Capture resolution (inject, defer, replan, quick-task) |
| `visualizer-overlay.ts` | Workflow visualizer TUI overlay |
| `visualizer-data.ts` | Data loading for visualizer tabs |
| `visualizer-views.ts` | Tab renderers (progress, deps, metrics, timeline) |
| `metrics.ts` | Token and cost tracking ledger |
| `state.ts` | State derivation from disk |
| `preferences.ts` | Preference loading, merging, validation |
| `queue-order.ts` | Milestone queue ordering |

@@ -120,6 +120,22 @@ Stops auto mode gracefully. Can be run from a different terminal.

Hard-steer plan documents during execution without stopping the pipeline. Changes are picked up at the next phase boundary.

### Capture

```
/gsd capture "add rate limiting to API endpoints"
```

Fire-and-forget thought capture. Captures are triaged automatically between tasks. See [Captures & Triage](./captures-triage.md).

### Visualize

```
/gsd visualize
```

Open the workflow visualizer, with interactive tabs for progress, dependencies, metrics, and timeline. See [Workflow Visualizer](./visualizer.md).

## Dashboard

`Ctrl+Alt+G` or `/gsd status` shows real-time progress:

@@ -129,6 +145,7 @@ Hard-steer plan documents during execution without stopping the pipeline. Change
- Per-unit cost and token breakdown
- Cost projections
- Completed and in-progress units
- Pending capture count (when captures are awaiting triage)

## Phase Skipping

@@ -141,3 +158,7 @@ Token profiles can skip certain phases to reduce cost:
| Reassess Roadmap | Skipped | Runs | Runs |

See [Token Optimization](./token-optimization.md) for details.

## Dynamic Model Routing

When enabled, auto-mode automatically selects cheaper models for simple units (slice completion, UAT) and reserves expensive models for complex work (replanning, architectural tasks). See [Dynamic Model Routing](./dynamic-model-routing.md).

docs/captures-triage.md (new file, 82 lines)
@@ -0,0 +1,82 @@
# Captures & Triage

*Introduced in v2.19.0*

Captures let you fire-and-forget thoughts during auto-mode execution. Instead of pausing auto-mode to steer, you can capture ideas, bugs, or scope changes and let GSD triage them at natural seams between tasks.

## Quick Start

While auto-mode is running (or any time):

```
/gsd capture "add rate limiting to the API endpoints"
/gsd capture "the auth flow should support OAuth, not just JWT"
```

Captures are appended to `.gsd/CAPTURES.md` and triaged automatically between tasks.

## How It Works

### Pipeline

```
capture → triage → confirm → resolve → resume
```

1. **Capture**: `/gsd capture "thought"` appends to `.gsd/CAPTURES.md` with a timestamp and unique ID
2. **Triage**: at natural seams between tasks (in `handleAgentEnd`), GSD detects pending captures and classifies them
3. **Confirm**: the user is shown the proposed resolution and confirms or adjusts
4. **Resolve**: the resolution is applied (task injection, replan trigger, deferral, etc.)
5. **Resume**: auto-mode continues

### Classification Types

Each capture is classified into one of five types:

| Type | Meaning | Resolution |
|------|---------|------------|
| `quick-task` | Small, self-contained fix | Inline quick task executed immediately |
| `inject` | New task needed in current slice | Task injected into the active slice plan |
| `defer` | Important but not urgent | Deferred to roadmap reassessment |
| `replan` | Changes the current approach | Triggers slice replan with capture context |
| `note` | Informational, no action needed | Acknowledged, no plan changes |

### Automatic Triage

Triage fires automatically between tasks during auto-mode. The triage prompt receives:

- All pending captures
- The current slice plan
- The active roadmap

The LLM classifies each capture and proposes a resolution. Plan-modifying resolutions (inject, replan) require user confirmation.
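
As an illustration of the classification table and the confirmation rule, here is a minimal TypeScript sketch. The names `CaptureType`, `RESOLUTIONS`, and `requiresConfirmation` are hypothetical and are not taken from `captures.ts` or `triage-resolution.ts`; only the five type names and the "inject and replan need confirmation" rule come from the text above.

```typescript
// Hypothetical types for illustration; the real module layout is not shown in this document.
type CaptureType = "quick-task" | "inject" | "defer" | "replan" | "note";

// Resolution text copied from the classification table.
const RESOLUTIONS: Record<CaptureType, string> = {
  "quick-task": "Inline quick task executed immediately",
  inject: "Task injected into the active slice plan",
  defer: "Deferred to roadmap reassessment",
  replan: "Triggers slice replan with capture context",
  note: "Acknowledged, no plan changes",
};

// Plan-modifying resolutions (inject, replan) require user confirmation.
const PLAN_MODIFYING: readonly CaptureType[] = ["inject", "replan"];

function requiresConfirmation(type: CaptureType): boolean {
  return PLAN_MODIFYING.indexOf(type) !== -1;
}
```

A caller could use `requiresConfirmation` to decide whether to pause for the user before applying `RESOLUTIONS[type]`.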

### Manual Triage

Trigger triage manually at any time:

```
/gsd triage
```

This is useful when you've accumulated several captures and want to process them before the next natural seam.

## Dashboard Integration

The progress widget shows a pending capture count badge when captures are waiting for triage. This is visible in both the `Ctrl+Alt+G` dashboard and the auto-mode progress widget.

## Context Injection

Capture context is automatically injected into:

- **Replan-slice prompts**, so the replan knows what triggered it
- **Reassess-roadmap prompts**, so deferred captures influence roadmap decisions

## Worktree Awareness

Captures always resolve to the **original project root's** `.gsd/CAPTURES.md`, not the worktree's local copy. This ensures captures from a steering terminal are visible to the auto-mode session running in a worktree.

## Commands

| Command | Description |
|---------|-------------|
| `/gsd capture "text"` | Capture a thought (quotes optional for single words) |
| `/gsd triage` | Manually trigger triage of pending captures |

@@ -11,7 +11,11 @@
| `/gsd steer` | Hard-steer plan documents during execution |
| `/gsd discuss` | Discuss architecture and decisions (works alongside auto mode) |
| `/gsd status` | Progress dashboard |
-| `/gsd queue` | Queue future milestones (safe during auto mode) |
+| `/gsd queue` | Queue and reorder future milestones (safe during auto mode) |
+| `/gsd capture` | Fire-and-forget thought capture (works during auto mode) |
+| `/gsd triage` | Manually trigger triage of pending captures |
+| `/gsd visualize` | Open workflow visualizer (progress, deps, metrics, timeline) |
+| `/gsd knowledge` | Add persistent project knowledge (rule, pattern, or lesson) |
| `/gsd prefs` | Model selection, timeouts, budget ceiling |
| `/gsd migrate` | Migrate a v1 `.planning` directory to `.gsd` format |
| `/gsd doctor` | Validate `.gsd/` integrity, find and fix issues |

@@ -334,7 +334,33 @@ custom_instructions:
  - "Prefer functional patterns over classes"
```

-For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead; it's injected into every agent prompt automatically.
+For project-specific knowledge (patterns, gotchas, lessons learned), use `.gsd/KNOWLEDGE.md` instead; it's injected into every agent prompt automatically. Add entries with `/gsd knowledge rule|pattern|lesson <description>`.

### `dynamic_routing`

Complexity-based model routing. See [Dynamic Model Routing](./dynamic-model-routing.md).

```yaml
dynamic_routing:
  enabled: true
  tier_models:
    light: claude-haiku-4-5
    standard: claude-sonnet-4-6
    heavy: claude-opus-4-6
  escalate_on_failure: true
  budget_pressure: true
  cross_provider: true
```

### `auto_visualize`

Show the workflow visualizer automatically after milestone completion:

```yaml
auto_visualize: true
```

See [Workflow Visualizer](./visualizer.md).

## Full Example

@@ -356,6 +382,12 @@ models:
# Token optimization
token_profile: balanced

# Dynamic model routing
dynamic_routing:
  enabled: true
  escalate_on_failure: true
  budget_pressure: true

# Budget
budget_ceiling: 25.00
budget_enforcement: pause

@@ -387,6 +419,9 @@ notifications:
  on_milestone: true
  on_attention: true

# Visualizer
auto_visualize: true

# Hooks
post_unit_hooks:
  - name: code-review

@@ -89,3 +89,5 @@ See [Token Optimization](./token-optimization.md) for details.
- Switch to `budget` profile for well-understood, repetitive work
- Use `quality` only when architectural decisions are being made
- Per-phase model selection lets you use Opus only for planning while keeping execution on Sonnet
- Enable `dynamic_routing` for automatic model downgrading on simple tasks; see [Dynamic Model Routing](./dynamic-model-routing.md)
- Use `/gsd visualize` → Metrics tab to see where your budget is going

docs/dynamic-model-routing.md (new file, 127 lines)
@@ -0,0 +1,127 @@
# Dynamic Model Routing

*Introduced in v2.19.0*

Dynamic model routing automatically selects cheaper models for simple work and reserves expensive models for complex tasks. This reduces token consumption by 20-50% on capped plans without sacrificing quality where it matters.

## How It Works

Each unit dispatched by auto-mode is classified into a complexity tier:

| Tier | Typical Work | Default Model Level |
|------|-------------|-------------------|
| **Light** | Slice completion, UAT, hooks | Haiku-class |
| **Standard** | Research, planning, execution, milestone completion | Sonnet-class |
| **Heavy** | Replanning, roadmap reassessment, complex execution | Opus-class |

The router then selects a model for that tier. The key rule is **downgrade-only semantics**: the user's configured model is always the ceiling, and routing never upgrades beyond what you've configured.
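
The downgrade-only rule can be read as a clamp on the classified tier. The sketch below is one way to express it, assuming the configured model can itself be mapped to a tier; `clampToCeiling` and `tierRank` are hypothetical names, not functions from `model-router.ts`.

```typescript
type Tier = "light" | "standard" | "heavy";

// Order tiers from cheapest to most capable.
function tierRank(tier: Tier): number {
  if (tier === "light") return 0;
  if (tier === "standard") return 1;
  return 2;
}

// Assumption: `ceiling` is the tier of the user's configured model.
// Routing may pick a lower tier but never a higher one.
function clampToCeiling(classified: Tier, ceiling: Tier): Tier {
  return tierRank(classified) <= tierRank(ceiling) ? classified : ceiling;
}
```

With a Sonnet-class ceiling, a unit classified as heavy would stay at standard, while a unit classified as light would still be downgraded.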

## Enabling

Dynamic routing is off by default. Enable it in preferences:

```yaml
---
version: 1
dynamic_routing:
  enabled: true
---
```

## Configuration

```yaml
dynamic_routing:
  enabled: true
  tier_models:                    # explicit model per tier (optional)
    light: claude-haiku-4-5
    standard: claude-sonnet-4-6
    heavy: claude-opus-4-6
  escalate_on_failure: true       # bump tier on task failure (default: true)
  budget_pressure: true           # auto-downgrade when approaching budget ceiling (default: true)
  cross_provider: true            # consider models from other providers (default: true)
  hooks: true                     # apply routing to post-unit hooks (default: true)
```
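
For readers who think in types, the same settings can be sketched as a TypeScript shape with the documented defaults. This is an illustration only: `DynamicRoutingConfig`, `ROUTING_DEFAULTS`, and `resolveRoutingConfig` are hypothetical names, and the actual merging logic in `preferences.ts` is not shown in this document.

```typescript
interface TierModels {
  light?: string;
  standard?: string;
  heavy?: string;
}

interface DynamicRoutingConfig {
  enabled: boolean;
  tier_models?: TierModels;      // optional explicit model per tier
  escalate_on_failure: boolean;
  budget_pressure: boolean;
  cross_provider: boolean;
  hooks: boolean;
}

// Defaults as stated in the comments above; routing itself is off by default.
const ROUTING_DEFAULTS: DynamicRoutingConfig = {
  enabled: false,
  escalate_on_failure: true,
  budget_pressure: true,
  cross_provider: true,
  hooks: true,
};

// Overlay user-supplied keys on top of the defaults.
function resolveRoutingConfig(
  user: Partial<DynamicRoutingConfig> = {},
): DynamicRoutingConfig {
  return { ...ROUTING_DEFAULTS, ...user };
}
```

Calling `resolveRoutingConfig({ enabled: true })` would correspond to the minimal "Enabling" example, with every other option left at its default.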

### `tier_models`

Override which model is used for each tier. When omitted, the router uses a built-in capability mapping that knows common model families:

- **Light:** `claude-haiku-4-5`, `gpt-4o-mini`, `gemini-2.0-flash`
- **Standard:** `claude-sonnet-4-6`, `gpt-4o`, `gemini-2.5-pro`
- **Heavy:** `claude-opus-4-6`, `gpt-4.5-preview`, `gemini-2.5-pro`

### `escalate_on_failure`

When a task fails at a given tier, the router escalates to the next tier on retry: Light → Standard → Heavy. This prevents cheap models from burning retries on work that needs more reasoning.
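
A minimal sketch of the escalation ladder follows. The function names are hypothetical, and two points are assumptions rather than documented behavior: a failure at Heavy stays at Heavy (there is no higher tier), and each failed attempt bumps the tier by one step.

```typescript
type Tier = "light" | "standard" | "heavy";

// Light → Standard → Heavy; Heavy has nowhere further to go (assumption).
function nextTier(tier: Tier): Tier {
  if (tier === "light") return "standard";
  return "heavy";
}

// Tier to use after `failedAttempts` failures, honoring escalate_on_failure.
function tierForRetry(
  initial: Tier,
  failedAttempts: number,
  escalateOnFailure: boolean,
): Tier {
  if (!escalateOnFailure) return initial;
  let tier = initial;
  for (let i = 0; i < failedAttempts; i++) {
    tier = nextTier(tier);
  }
  return tier;
}
```

How escalation interacts with the downgrade-only ceiling is not described here, so this sketch leaves the ceiling out.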

### `budget_pressure`

When approaching the budget ceiling, the router progressively downgrades:

| Budget Used | Effect |
|------------|--------|
| < 50% | No adjustment |
| 50-75% | Standard → Light |
| 75-90% | More aggressive downgrading |
| > 90% | Nearly everything → Light; only Heavy stays at Standard |
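
The bands in the table can be sketched as a pure function of the fraction of budget used. This is a hedged illustration with a hypothetical name, `applyBudgetPressure`. The table does not spell out the 75-90% behavior or which band the exact boundaries fall into, so the sketch treats 75-90% like 50-75% as a placeholder and assigns the boundaries arbitrarily.

```typescript
type Tier = "light" | "standard" | "heavy";

// `budgetUsed` is a fraction of the ceiling, e.g. 0.6 for 60%.
function applyBudgetPressure(tier: Tier, budgetUsed: number): Tier {
  if (budgetUsed < 0.5) {
    return tier; // no adjustment
  }
  if (budgetUsed <= 0.9) {
    // 50-75%: Standard → Light.
    // 75-90%: "more aggressive" is unspecified; this placeholder reuses the same rule.
    return tier === "standard" ? "light" : tier;
  }
  // > 90%: only Heavy stays at Standard, everything else drops to Light.
  return tier === "heavy" ? "standard" : "light";
}
```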

### `cross_provider`

When enabled, the router may select models from providers other than your primary. This uses the built-in cost table to find the cheapest model at each tier. Requires the target provider to be configured.

## Complexity Classification

Units are classified using pure heuristics, with no LLM calls and sub-millisecond latency:

### Unit Type Defaults

| Unit Type | Default Tier |
|-----------|-------------|
| `complete-slice`, `run-uat` | Light |
| `research-*`, `plan-*`, `complete-milestone` | Standard |
| `execute-task` | Standard (upgraded by task analysis) |
| `replan-slice`, `reassess-roadmap` | Heavy |
| `hook/*` | Light |
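
The default-tier table maps directly onto a small lookup. The sketch below uses a hypothetical `defaultTier` function; treating unknown unit types as Standard is an assumption, since the table does not say what happens to types it does not list.

```typescript
type Tier = "light" | "standard" | "heavy";

function defaultTier(unitType: string): Tier {
  if (unitType === "complete-slice" || unitType === "run-uat") return "light";
  if (unitType.startsWith("hook/")) return "light";
  if (unitType === "replan-slice" || unitType === "reassess-roadmap") return "heavy";
  if (
    unitType.startsWith("research-") ||
    unitType.startsWith("plan-") ||
    unitType === "complete-milestone" ||
    unitType === "execute-task"
  ) {
    return "standard";
  }
  return "standard"; // assumption: unlisted unit types fall back to Standard
}
```

Note that `replan-slice` does not match the `plan-*` pattern because the match is on the prefix, so it correctly lands in Heavy.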

### Task Plan Analysis

For `execute-task` units, the classifier analyzes the task plan:

| Signal | Simple → Light | Complex → Heavy |
|--------|---------------|----------------|
| Step count | ≤ 3 | ≥ 8 |
| File count | ≤ 3 | ≥ 8 |
| Description length | < 500 chars | > 2000 chars |
| Code blocks | None specified | ≥ 5 |
| Complexity keywords | None | Present |

**Complexity keywords:** `research`, `investigate`, `refactor`, `migrate`, `integrate`, `complex`, `architect`, `redesign`, `security`, `performance`, `concurrent`, `parallel`, `distributed`, `backward compat`
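
The signals above can be combined in several ways, and the table does not say how the real classifier weighs them. The sketch below makes one explicit assumption: any single "complex" signal yields Heavy, all "simple" signals together yield Light, and everything else stays Standard. `TaskPlan` and `classifyTask` are hypothetical names, and keyword matching is done as a case-insensitive substring search.

```typescript
type Tier = "light" | "standard" | "heavy";

interface TaskPlan {
  steps: number;       // number of steps in the plan
  files: number;       // number of files touched
  description: string; // task description text
  codeBlocks: number;  // number of fenced code blocks in the plan
}

const COMPLEXITY_KEYWORDS: readonly string[] = [
  "research", "investigate", "refactor", "migrate", "integrate", "complex",
  "architect", "redesign", "security", "performance", "concurrent",
  "parallel", "distributed", "backward compat",
];

function classifyTask(plan: TaskPlan): Tier {
  const text = plan.description.toLowerCase();
  const hasKeyword = COMPLEXITY_KEYWORDS.some((k) => text.indexOf(k) !== -1);

  // Assumption: one complex signal is enough to classify as Heavy.
  const isComplex =
    plan.steps >= 8 ||
    plan.files >= 8 ||
    plan.description.length > 2000 ||
    plan.codeBlocks >= 5 ||
    hasKeyword;
  if (isComplex) return "heavy";

  // Assumption: every simple signal must hold to classify as Light.
  const isSimple =
    plan.steps <= 3 && plan.files <= 3 && plan.description.length < 500;
  return isSimple ? "light" : "standard";
}
```

Because the check is a plain string scan with a handful of comparisons, it fits the "no LLM calls" description of the classifier.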

### Adaptive Learning

The routing history (`.gsd/routing-history.json`) tracks success/failure per tier per unit type. If a tier's failure rate exceeds 20% for a given pattern, future classifications are bumped up. User feedback (`over`/`under`/`ok`) is weighted 2× vs automatic outcomes.
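
One way to read the 20% threshold and the 2× feedback weight is as a weighted failure rate. The sketch below is illustrative only: the record shape is hypothetical and does not reflect the actual schema of `.gsd/routing-history.json`, and counting `under` feedback as a failure signal (with `ok` and `over` counted as non-failures) is an assumption.

```typescript
// Hypothetical record shape for one routing outcome at a given tier and unit type.
type RoutingRecord =
  | { kind: "auto"; success: boolean }
  | { kind: "feedback"; rating: "over" | "under" | "ok" };

function weightedFailureRate(records: readonly RoutingRecord[]): number {
  let total = 0;
  let failed = 0;
  for (const r of records) {
    // User feedback counts twice as much as an automatic outcome.
    const weight = r.kind === "feedback" ? 2 : 1;
    // Assumption: "under" means the tier was too weak, so it counts as a failure.
    const isFailure = r.kind === "feedback" ? r.rating === "under" : !r.success;
    total += weight;
    if (isFailure) failed += weight;
  }
  return total === 0 ? 0 : failed / total;
}

// Bump future classifications when the failure rate exceeds 20%.
function shouldBumpTier(records: readonly RoutingRecord[]): boolean {
  return weightedFailureRate(records) > 0.2;
}
```

For example, eight successes and one failure give a rate of 1/9, below the threshold; adding a single `under` rating raises it to 3/11, which crosses it.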

## Interaction with Token Profiles

Dynamic routing and token profiles are complementary:

- **Token profiles** (`budget`/`balanced`/`quality`) control phase skipping and context compression
- **Dynamic routing** controls per-unit model selection within the configured phase model

When both are active, token profiles set the baseline models and dynamic routing further optimizes within those baselines. The `budget` token profile + dynamic routing provides maximum cost savings.

## Cost Table

The router includes a built-in cost table for common models, used for cross-provider cost comparison. Costs are per million tokens:

| Model | Input (USD per 1M tokens) | Output (USD per 1M tokens) |
|-------|-------|--------|
| claude-haiku-4-5 | $0.80 | $4.00 |
| claude-sonnet-4-6 | $3.00 | $15.00 |
| claude-opus-4-6 | $15.00 | $75.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4o | $2.50 | $10.00 |
| gemini-2.0-flash | $0.10 | $0.40 |

The cost table is used for comparison only; actual billing comes from your provider.
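
To make the cross-provider comparison concrete, here is a small sketch that picks the cheapest available model for a tier using the figures in the table. The names are hypothetical and not taken from `model-cost-table.ts`. Two assumptions: "cheapest" is measured as input price plus output price, and candidates without a cost entry (such as `gemini-2.5-pro` and `gpt-4.5-preview`, which the table does not list) are skipped.

```typescript
type Tier = "light" | "standard" | "heavy";

interface ModelCost {
  model: string;
  input: number;  // USD per 1M input tokens
  output: number; // USD per 1M output tokens
}

// Values copied from the cost table above.
const COST_TABLE: readonly ModelCost[] = [
  { model: "claude-haiku-4-5", input: 0.8, output: 4.0 },
  { model: "claude-sonnet-4-6", input: 3.0, output: 15.0 },
  { model: "claude-opus-4-6", input: 15.0, output: 75.0 },
  { model: "gpt-4o-mini", input: 0.15, output: 0.6 },
  { model: "gpt-4o", input: 2.5, output: 10.0 },
  { model: "gemini-2.0-flash", input: 0.1, output: 0.4 },
];

// Candidates per tier, from the built-in capability mapping listed under `tier_models`.
const TIER_CANDIDATES: Record<Tier, readonly string[]> = {
  light: ["claude-haiku-4-5", "gpt-4o-mini", "gemini-2.0-flash"],
  standard: ["claude-sonnet-4-6", "gpt-4o", "gemini-2.5-pro"],
  heavy: ["claude-opus-4-6", "gpt-4.5-preview", "gemini-2.5-pro"],
};

// `available` lists models whose provider is configured.
function cheapestForTier(tier: Tier, available: readonly string[]): string | undefined {
  let best: ModelCost | undefined;
  for (const name of TIER_CANDIDATES[tier]) {
    if (available.indexOf(name) === -1) continue;
    const cost = COST_TABLE.find((c) => c.model === name);
    if (!cost) continue; // assumption: no cost data means not comparable
    if (!best || cost.input + cost.output < best.input + best.output) {
      best = cost;
    }
  }
  return best ? best.model : undefined;
}
```

A different cost metric, such as weighting output tokens more heavily, could change the ranking; the sum is only the simplest choice.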

docs/visualizer.md (new file, 92 lines)
@@ -0,0 +1,92 @@
# Workflow Visualizer

*Introduced in v2.19.0*

The workflow visualizer is a full-screen TUI overlay that shows project progress, dependencies, cost metrics, and execution timeline in an interactive four-tab view.

## Opening the Visualizer

```
/gsd visualize
```

Or configure automatic display after milestone completion:

```yaml
auto_visualize: true
```

## Tabs

Switch tabs with `Tab`, `1`-`4`, or arrow keys.

### 1. Progress

A tree view of milestones, slices, and tasks with completion status:

```
M001: User Management
  ✅ S01: Auth module
    ✅ T01: Core types
    ✅ T02: JWT middleware
    ✅ T03: Login flow
  ⏳ S02: User dashboard
    ✅ T01: Layout component
    ⬜ T02: Profile page
  ⬜ S03: Admin panel
```

Shows checkmarks for completed items, spinners for in-progress, and empty boxes for pending.

### 2. Dependencies

An ASCII dependency graph showing slice relationships:

```
S01 ──→ S02 ──→ S04
  └───→ S03 ──↗
```

Visualizes the `depends:` field from the roadmap, making it easy to see which slices are blocked and which can proceed.
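
The blocked/ready distinction can be computed from the `depends:` data alone. The following sketch is an illustration with a hypothetical `Slice` shape; it is not the data model used by `visualizer-data.ts`. A slice is treated as ready when it is not done and all of its dependencies are done.

```typescript
interface Slice {
  id: string;
  depends: readonly string[]; // IDs of slices this one depends on
  done: boolean;
}

// Slices that can proceed: not done, and every dependency is done.
function readySlices(slices: readonly Slice[]): string[] {
  const done = new Set(slices.filter((s) => s.done).map((s) => s.id));
  return slices
    .filter((s) => !s.done && s.depends.every((d) => done.has(d)))
    .map((s) => s.id);
}

// Slices that are blocked: not done, with at least one unfinished dependency.
function blockedSlices(slices: readonly Slice[]): string[] {
  const done = new Set(slices.filter((s) => s.done).map((s) => s.id));
  return slices
    .filter((s) => !s.done && s.depends.some((d) => !done.has(d)))
    .map((s) => s.id);
}
```

For the graph above with only S01 complete, S02 and S03 would be ready and S04 would be blocked until both finish.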

### 3. Metrics

Bar charts showing cost and token usage breakdowns:

- **By phase**: research, planning, execution, completion, reassessment
- **By slice**: cost per slice with running totals
- **By model**: which models consumed the most budget

Uses data from `.gsd/metrics.json`.

### 4. Timeline

Chronological execution history showing:

- Unit type and ID
- Start/end timestamps
- Duration
- Model used
- Token counts

Ordered by execution time, showing the full history of auto-mode dispatches.

## Controls

| Key | Action |
|-----|--------|
| `Tab` | Next tab |
| `Shift+Tab` | Previous tab |
| `1`-`4` | Jump to tab |
| `↑`/`↓` | Scroll within tab |
| `Escape` / `q` | Close visualizer |
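
The tab-switching keys in the table can be modeled as a pure function from the current tab index and a key name to the next index. This is a sketch under stated assumptions: `nextTabIndex` is a hypothetical name, the key strings are illustrative rather than the TUI's actual key codes, and wrapping around at either end is an assumption the table does not confirm.

```typescript
const TABS = ["progress", "deps", "metrics", "timeline"] as const;

function nextTabIndex(current: number, key: string): number {
  if (key === "Tab") {
    return (current + 1) % TABS.length; // assumption: wraps to the first tab
  }
  if (key === "Shift+Tab") {
    return (current + TABS.length - 1) % TABS.length; // assumption: wraps to the last tab
  }
  if (key.length === 1) {
    const jump = "1234".indexOf(key); // `1`-`4` jump directly to a tab
    if (jump !== -1) return jump;
  }
  return current; // other keys (scroll, close) do not change the tab
}
```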

## Auto-Refresh

The visualizer refreshes data from disk every 2 seconds, so it stays current if opened alongside a running auto-mode session.

## Configuration

```yaml
auto_visualize: true    # show visualizer after milestone completion
```