Reduces auto-mode token consumption by 40-60% through coordinated optimizations driven by a single token_profile preference. Profile presets (budget/balanced/quality): - One preference key coordinates model selection, phase skipping, context compression, and subagent routing - Balanced is the default for new projects (D046) - Explicit user preferences always override profile defaults Phase skipping: - Guard clauses on research-milestone, research-slice, and reassess-roadmap dispatch rules - Skipped phases return null (fall-through), preserving state machine - Budget profile skips all research + reassess; balanced skips slice research only Context compression: - inlineLevel parameter (full/standard/minimal) on 6 prompt builders - Minimal: only output template + essential context (≥30% reduction) - Standard: skip redundant templates - Full: current behavior unchanged Complexity routing: - classifyTaskComplexity() for task plans (step/file/signal heuristics) - classifyUnitComplexity() for unit types with budget pressure thresholds at 50/75/90% (from #579) - execution_simple model config for cheap simple-task routing - escalateTier() for failure recovery (light→standard→heavy) Adaptive learning (from #579): - routing-history.ts tracks success/failure per tier per pattern - Rolling 50-entry window, 20% failure threshold auto-bumps tier - User feedback weighted 2x vs automatic detection - Persists to .gsd/routing-history.json Budget prediction: - getAverageCostPerUnitType() + predictRemainingCost() in metrics - projectedRemainingCost + profileDowngraded in AutoDashboardData - One-way auto-downgrade within a milestone (D048) Addresses #575 95 tests across 5 test files, all passing. |
||
|---|---|---|
| .. | ||
| agents | ||
| extensions | ||
| skills | ||
| GSD-WORKFLOW.md | ||