Commit graph

1 commit

commit 39b3daee6f
Author: Jeremy McSpadden
Date:   2026-03-17 22:02:27 -05:00

feat: add token optimization suite for prompt caching, compression, and smart context selection
Introduces six new modules that work together to reduce token usage across
the dispatch pipeline while preserving the semantic quality of the content:

- Provider-aware token counting with per-provider char/token ratios (see the sketch after this list)
- Prompt cache optimizer for maximizing Anthropic/OpenAI cache hit rates
- Structured data formatter (compact notation for decisions/requirements/tasks)
- Deterministic prompt compressor (light/moderate/aggressive levels)
- Semantic chunker with TF-IDF relevance scoring for context selection
- Summary distiller for condensed dependency summaries
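
As a rough illustration of the provider-aware counting in the first item, here is a minimal sketch; the provider list, the ratio values, and the `estimateTokens` name are illustrative assumptions, not the module's actual API:

```typescript
// Minimal sketch of provider-aware token estimation. Providers, ratios,
// and the function name are assumptions for illustration only.
type Provider = "anthropic" | "openai";

// Assumed average characters per token for each provider's tokenizer.
const CHARS_PER_TOKEN: Record<Provider, number> = {
  anthropic: 3.5,
  openai: 4.0,
};

export function estimateTokens(text: string, provider: Provider): number {
  // Character-count heuristic: cheap and deterministic, trading some
  // accuracy for not having to run each vendor's real tokenizer.
  return Math.ceil(text.length / CHARS_PER_TOKEN[provider]);
}
```

A heuristic like this lets the budget math stay provider-aware without shipping every vendor's tokenizer.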

Integration points:
- inlineDependencySummaries applies distillation before truncation when there are 3+ dependencies
- inlineDecisionsFromDb/inlineRequirementsFromDb use the compact format at non-full detail levels
- buildExecuteTaskPrompt compresses the carry-forward context when it exceeds 40% of the prompt budget (sketched after this list)
- context-budget.reduceToFit combines compression with section-boundary truncation
- computeBudgets accepts an optional provider for accurate char/token ratios
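
To make the 40% carry-forward rule concrete, here is a minimal sketch under assumed names; `estimateTokens`, `compress`, and `fitCarryForward`, along with the whitespace-collapsing compression, are hypothetical stand-ins for the real modules:

```typescript
// Crude token estimate (~4 chars/token); stands in for the real counter.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Toy deterministic "compression": collapse runs of spaces/tabs, and at
// the aggressive level also squeeze out blank lines.
function compress(text: string, level: "moderate" | "aggressive"): string {
  const squeezed = text.replace(/[ \t]+/g, " ");
  return level === "aggressive"
    ? squeezed.replace(/\n\s*\n+/g, "\n").trim()
    : squeezed.trim();
}

// Compress the carry-forward only when it would consume more than 40% of
// the prompt's token budget, escalating the level if still over.
function fitCarryForward(carryForward: string, budgetTokens: number): string {
  const limit = budgetTokens * 0.4;
  if (estimateTokens(carryForward) <= limit) return carryForward;

  const moderate = compress(carryForward, "moderate");
  return estimateTokens(moderate) <= limit
    ? moderate
    : compress(moderate, "aggressive");
}
```

Escalating from moderate to aggressive only when needed mirrors the commit's light/moderate/aggressive compression levels while staying deterministic.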

All 1,475 existing unit tests and 30 integration tests pass with zero regressions;
157 new tests cover the optimization modules.