test: add end-to-end token optimization benchmark

Benchmark validates all optimization modules with realistic GSD content:

- Structured data: 20% decisions savings, 7% requirements savings
- Prompt compression: 5-17% across light/moderate/aggressive levels
- Semantic chunking: 73% content reduction via TF-IDF selection
- Summary distillation: 73% savings preserving structured fields
- Combined pipeline: 43% total savings on realistic dispatch prompt
- Cache efficiency: 94% cacheable prefix, 85% estimated Anthropic savings
- Provider-aware: 14% budget accuracy improvement for Anthropic vs OpenAI
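
As a rough illustration of how a savings figure like those listed might be derived, here is a minimal sketch. The four-characters-per-token heuristic and the names `estimate_tokens` and `savings_percent` are assumptions made for illustration; they are not taken from the benchmark itself, which may count tokens differently.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Hypothetical heuristic: approximate token count from character length."""
    return math.ceil(len(text) / chars_per_token)


def savings_percent(baseline: str, optimized: str) -> float:
    """Percentage of estimated tokens saved relative to the baseline text."""
    base = estimate_tokens(baseline)
    if base == 0:
        # An empty baseline has nothing to save; avoid dividing by zero.
        return 0.0
    return 100.0 * (base - estimate_tokens(optimized)) / base
```

Under these assumptions, a 400-character baseline reduced to 228 characters is estimated at 100 and 57 tokens respectively, so `savings_percent` reports 43.0.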
parent
d65da6c927
commit
4e7b3d486f
1 changed file with 1272 additions and 0 deletions