Jeremy 872b0adb48 docs: reorganize into user-docs/ and dev/ subdirectories

Split flat docs/ into user-docs/ (guides, config, troubleshooting) and
dev/ (ADRs, architecture, extension guides, proposals). Updated
docs/README.md index to reflect new paths.

2026-04-10 09:25:31 -05:00

1.5 KiB

Raw Blame History

Cost-Quality Tradeoff & Model Routing

The Key Insight

Quality requirements vary enormously across task types, but most systems use the same model for everything.

The Optimal Model Routing Strategy (All 4 Agree)

Task Type	Model Tier	Rationale
Planning, architecture, critique	Frontier (always)	Planning errors cascade through every downstream task
Ambiguity resolution	Frontier	Wrong interpretation = wasted execution
Well-specified implementation (CRUD, standard UI, utilities)	Mid-tier / capable but cheaper	Task is well-defined, patterns established
Code review, test generation	Mid-tier	Evaluating against known criteria, not generating novel solutions
Summarization (task records, manifest updates)	Lightest viable	Language competence, minimal reasoning depth
Boilerplate	Small/fast model	Predictable output, low reasoning requirements

The Non-Obvious Cost Optimization

Reducing wasted tokens is higher leverage than reducing token price. A bloated context window costs money on every single call. Trimming 500 unnecessary tokens from context assembly saves more over a project than switching to a model that's 10% cheaper.

Measurement

Track cost-per-successful-task, not cost-per-task. If the cheaper model requires twice as many iterations, it's not actually cheaper. Grok reports 60-70% cost reduction with zero quality loss when routing is done at the orchestrator level.

1.5 KiB Raw Blame History