* feat: dynamic model routing for token consumption optimization (#575)
Add complexity-based model routing that classifies units into light/standard/heavy
tiers and routes to cheaper models when appropriate. Reduces token consumption
by 20-50% for users on capped plans.
- Complexity classifier with heuristic-based tier assignment (no LLM call)
- Model router with downgrade-only semantics (user's config is ceiling)
- Budget-pressure-aware routing (more aggressive as budget fills)
- Cross-provider cost comparison via bundled cost table
- Hook classification support
- Escalation on failure (light → standard → heavy)
- Full preference validation and merge support
- Metrics tracking with tier and downgrade fields
- 40 new tests (classifier, router, cost table)
Closes #575
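The downgrade-only routing above can be sketched as follows. This is a minimal illustration, not the actual GSD code: `Tier`, `routeModel`, and the model ids are hypothetical names, and the 0.8 budget-pressure cutoff is an assumed threshold.

```typescript
// Downgrade-only routing sketch: the user's configured model is a
// ceiling that is never exceeded, and budget pressure (0..1) makes
// routing more aggressive as the budget fills. Names are illustrative.
type Tier = "light" | "standard" | "heavy";

const TIER_RANK: Record<Tier, number> = { light: 0, standard: 1, heavy: 2 };

// Hypothetical model id per tier (stand-ins for real model names).
const TIER_MODEL: Record<Tier, string> = {
  light: "small-model",
  standard: "mid-model",
  heavy: "large-model",
};

function routeModel(classified: Tier, ceiling: Tier, budgetPressure: number): string {
  // Never route above the user's configured ceiling (downgrade-only).
  let rank = Math.min(TIER_RANK[classified], TIER_RANK[ceiling]);
  // Assumed policy: under heavy budget pressure, drop one extra tier.
  if (budgetPressure > 0.8 && rank > 0) rank -= 1;
  const tier = (Object.keys(TIER_RANK) as Tier[]).find((t) => TIER_RANK[t] === rank)!;
  return TIER_MODEL[tier];
}
```

Because the ceiling is applied with `Math.min`, a unit classified `heavy` under a `standard` ceiling still gets the mid model, never the large one.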
* feat: phases 2-4 — dashboard, adaptive learning, task introspection
Phase 2 — Observability & Dashboard:
- Tier badge [L]/[S]/[H] displayed in progress widget next to phase label
- Dynamic routing savings summary shown in footer when units have been downgraded
- Tier and modelDowngraded fields passed through snapshotUnitMetrics
Phase 3 — Adaptive Learning:
- New routing-history.ts: tracks success/failure per tier per unit-type pattern
- Rolling window of 50 entries per pattern to prevent stale data
- User feedback support (over/under/ok) with 2x weight vs automatic
- Failure rate >20% auto-bumps tier for that pattern
- Tag-specific patterns (e.g. execute-task:docs) for granular learning
- History persists to .gsd/routing-history.json
- Classifier consults adaptive history before finalizing tier
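The rolling-window and failure-rate-bump mechanics above can be sketched like this. `RoutingHistory`, `WINDOW`, and `FAILURE_THRESHOLD` are illustrative names under assumed semantics (50-entry window, >20% failure bumps one tier), not the real `routing-history.ts` API.

```typescript
// Adaptive-history sketch: per-pattern rolling window of outcomes;
// a failure rate above 20% bumps the tier for that pattern.
type Tier = "light" | "standard" | "heavy";
const ORDER: Tier[] = ["light", "standard", "heavy"];
const WINDOW = 50;            // rolling window size per pattern
const FAILURE_THRESHOLD = 0.2; // >20% failures triggers a bump

class RoutingHistory {
  private entries = new Map<string, boolean[]>(); // pattern -> success flags

  record(pattern: string, success: boolean): void {
    const list = this.entries.get(pattern) ?? [];
    list.push(success);
    if (list.length > WINDOW) list.shift(); // drop oldest, prevent stale data
    this.entries.set(pattern, list);
  }

  // Bump the classified tier by one step when this pattern's observed
  // failure rate exceeds the threshold; otherwise keep the tier.
  adjust(pattern: string, tier: Tier): Tier {
    const list = this.entries.get(pattern) ?? [];
    if (list.length === 0) return tier;
    const failureRate = list.filter((ok) => !ok).length / list.length;
    const i = ORDER.indexOf(tier);
    return failureRate > FAILURE_THRESHOLD && i < ORDER.length - 1 ? ORDER[i + 1] : tier;
  }
}
```

A tag-specific pattern such as `execute-task:docs` gets its own window, so a noisy unit type cannot skew tiers for unrelated work.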
Phase 4 — Task Plan Introspection:
- Code block counting in task plans (5+ blocks → heavy)
- Complexity keyword detection: migration, architecture, security,
performance, concurrency, compatibility
- Multiple complexity keywords (2+) → heavy, single → standard
- New codeBlockCount and complexityKeywords fields in TaskMetadata
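The Phase 4 heuristics can be sketched as a single classifier over the plan text. `classifyPlan` is a hypothetical name; the keyword list and thresholds are the ones stated above.

```typescript
// Task-plan introspection sketch: count fenced code blocks and
// complexity keywords, then map to a tier (5+ blocks or 2+ keywords
// -> heavy, exactly one keyword -> standard, else light).
type Tier = "light" | "standard" | "heavy";

const KEYWORDS = [
  "migration", "architecture", "security",
  "performance", "concurrency", "compatibility",
];

function classifyPlan(plan: string): Tier {
  // Each fenced block contributes an opening and a closing fence line.
  const fenceRe = new RegExp("^`{3}", "gm");
  const codeBlockCount = Math.floor((plan.match(fenceRe) ?? []).length / 2);
  const lower = plan.toLowerCase();
  const keywordHits = KEYWORDS.filter((k) => lower.includes(k)).length;
  if (codeBlockCount >= 5 || keywordHits >= 2) return "heavy";
  if (keywordHits === 1) return "standard";
  return "light";
}
```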
Tests: 16 new tests (routing history + introspection), 419 total passing
* docs: add startup performance analysis and optimization plan
Profiled GSD CLI startup, finding 2.2s for --version and ~3.8s for
interactive mode. Identified 5 root causes with measured timings and
created a phased optimization plan targeting <0.2s for --version
and ~0.8s for interactive startup.
* perf: speed up GSD startup with lazy loading and fast paths
- Fast-path --version/-v and --help/-h in loader.ts before importing
any heavy dependencies (2.2s → 0.15s, 14x faster)
- Lazy-load undici (~200ms) only when HTTP_PROXY env vars are set
- Skip initResources cpSync when managed-resources.json version
matches current GSD version (~128ms saved per launch)
- Lazy-load Mistral SDK (~369ms) on first API call instead of startup
- Lazy-load Google GenAI SDK (~186ms) on first API call instead of
startup
- Parallelize extension loading with Promise.all() instead of a
sequential for-loop
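Two of the techniques above, the --version fast path and memoized lazy loading, can be sketched together. `node:crypto` stands in for a genuinely heavy SDK, and `main`/`getHeavyDep` are illustrative names, not the loader.ts API.

```typescript
// Startup sketch: (1) answer --version/-v before touching any heavy
// module; (2) lazy-load the heavy dependency once, on first use,
// instead of at process start.
let heavyDep: typeof import("node:crypto") | null = null;

async function getHeavyDep() {
  // Dynamic import defers module cost until the first call; the
  // result is memoized so later calls are free.
  if (!heavyDep) heavyDep = await import("node:crypto");
  return heavyDep;
}

async function main(argv: string[]): Promise<string> {
  // Fast path: no heavy imports are reachable from this branch.
  if (argv.includes("--version") || argv.includes("-v")) return "gsd 1.0.0";
  const crypto = await getHeavyDep(); // stand-in for Mistral/GenAI SDKs
  return crypto.randomUUID();
}
```

The key property is that the fast path returns before any `import()` of a heavy module executes, which is what turns a 2.2s `--version` into a sub-second one.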
---------
Co-authored-by: TÂCHES <afromanguy@me.com>
When all credentials for a provider are exhausted, the system now
automatically falls back to the next available provider in a
user-configured fallback chain. Higher-priority providers are
restored automatically when their backoff expires.
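The fallback-chain behavior described above can be sketched as a priority scan over per-provider backoff deadlines. `Provider` and `pickProvider` are hypothetical names for illustration.

```typescript
// Fallback-chain sketch: providers are ordered by user-configured
// priority; an exhausted provider is skipped while its backoff is
// active and becomes eligible again automatically once it expires.
interface Provider {
  name: string;
  backoffUntil: number; // epoch ms; 0 means available now
}

function pickProvider(chain: Provider[], now: number): string | null {
  // First provider whose backoff has expired wins; earlier entries
  // in the chain have higher priority, so recovery is automatic.
  const available = chain.find((p) => p.backoffUntil <= now);
  return available ? available.name : null;
}
```

No explicit "restore" step is needed: once `now` passes a higher-priority provider's `backoffUntil`, the scan selects it again.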
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>