ace-pm 35dc87ef53 chore: sync workspace state after rebrand

- Rebrand commits already in history (gsd → forge)
- Sync pre-existing doc, docker, and CI config updates
- All rebrand artifacts verified in place:
  * Native crates: forge-engine, forge-ast, forge-grep
  * Log prefixes: [forge] across 22+ files
  * Binary: ~/bin/sf-run
  * Workspace scopes: @sf-run/*, @singularity-forge/*
  * Nix flake: Rust toolchain ready

System ready for: nix develop && bun run build:native

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-15 14:54:20 +02:00


Token Optimization

SF's token optimization system can cut token usage by 40-60% at its most aggressive setting (the `budget` profile) without sacrificing output quality. It has three pillars: token profiles, context compression, and complexity-based task routing.

Token Profiles

A token profile coordinates model selection, phase skipping, and context compression with a single setting:

token_profile: balanced

`budget` — Maximum Savings (40-60%)

Setting	Value
Planning model	Sonnet
Execution model	Sonnet
Simple task model	Haiku
Milestone research	Skipped
Slice research	Skipped
Roadmap reassessment	Skipped
Context level	Minimal

Best for: prototyping, small projects, well-understood codebases.

`balanced` — Smart Defaults (default)

Setting	Value
All models	User's configured defaults
Milestone research	Runs
Slice research	Skipped
Roadmap reassessment	Runs
Context level	Standard

Best for: most projects, day-to-day development.

`quality` — Full Context

Setting	Value
All models	User's configured defaults
All phases	Run
Context level	Full

Best for: complex architectures, greenfield projects, critical work.
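
The three profile tables above can be read as one lookup from profile name to defaults. The following is a minimal sketch of that idea, not SF's actual implementation: the `ProfileDefaults` type, its field names, and the lowercase model labels are hypothetical, and `None` stands in for "use the user's configured default".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ProfileDefaults:
    # Field names are illustrative; None means "user's configured default".
    planning_model: Optional[str]
    execution_model: Optional[str]
    simple_task_model: Optional[str]
    milestone_research: bool
    slice_research: bool
    roadmap_reassessment: bool
    context_level: str


# Values transcribed from the profile tables; model labels are placeholders,
# not real model identifiers.
PROFILES: Dict[str, ProfileDefaults] = {
    "budget": ProfileDefaults("sonnet", "sonnet", "haiku", False, False, False, "minimal"),
    "balanced": ProfileDefaults(None, None, None, True, False, True, "standard"),
    "quality": ProfileDefaults(None, None, None, True, True, True, "full"),
}


def resolve_profile(name: str = "balanced") -> ProfileDefaults:
    """Return the defaults for a profile; `balanced` is the documented default."""
    try:
        return PROFILES[name]
    except KeyError:
        raise ValueError(f"unknown token_profile: {name!r}") from None
```

Keeping the defaults in one immutable table makes it easy to see at a glance which phases each profile skips.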

Context Compression

Each profile controls how much context is pre-loaded into AI prompts:

Profile	What's Included
`budget`	Task plan and essential prior summaries only
`balanced`	Task plan, summaries, slice plan, roadmap excerpt
`quality`	Everything — all plans, summaries, decisions, requirements
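
As an illustration of the table above, the sketch below assembles a prompt context from whichever sections a profile allows. The section keys (`task_plan`, `essential_summaries`, and so on) and the `build_context` helper are assumptions made for this example; SF's real section names and assembly logic are not documented here.

```python
from typing import Dict, List, Optional

# Hypothetical section keys per profile. None means "include everything",
# matching the `quality` row of the table.
CONTEXT_SECTIONS: Dict[str, Optional[List[str]]] = {
    "budget": ["task_plan", "essential_summaries"],
    "balanced": ["task_plan", "summaries", "slice_plan", "roadmap_excerpt"],
    "quality": None,
}


def build_context(profile: str, available: Dict[str, str]) -> str:
    """Join the sections allowed by `profile`, skipping any that are missing."""
    wanted = CONTEXT_SECTIONS[profile]
    if wanted is None:
        names = list(available)  # keep the caller's ordering
    else:
        names = [name for name in wanted if name in available]
    return "\n\n".join(available[name] for name in names)
```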

Complexity-Based Task Routing

SF classifies each task by complexity and routes it to an appropriate model:

Complexity	Indicators	Model Level
Simple	≤3 steps, ≤3 files, short description	Haiku-class
Standard	4-7 steps, 4-7 files	Sonnet-class
Complex	≥8 steps, ≥8 files, complexity keywords	Opus-class

Complexity keywords that prevent simple classification: refactor, migrate, integrate, architect, security, performance, concurrent, distributed, and others.

{% hint style="info" %} Dynamic routing requires models configured in your preferences and `dynamic_routing.enabled: true`. See Dynamic Model Routing. {% endhint %}
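
The classification rules above can be sketched as a small function. Several details are assumptions, since the table does not spell them out: when step and file counts disagree, this sketch takes the higher tier; "short description" is modelled with a hypothetical 200-character limit; keywords are matched as word prefixes; and a keyword only lifts a task out of `simple` (the text says keywords prevent simple classification, and whether they force `complex` is not stated). Only the keywords listed explicitly are included.

```python
import re

# Keywords named in the text; the real list contains "others" not shown here.
COMPLEXITY_KEYWORDS = (
    "refactor", "migrate", "integrate", "architect",
    "security", "performance", "concurrent", "distributed",
)
SHORT_DESCRIPTION_CHARS = 200  # hypothetical threshold for "short description"
TIERS = ["simple", "standard", "complex"]
MODEL_CLASS = {"simple": "haiku-class", "standard": "sonnet-class", "complex": "opus-class"}


def _tier_index(count: int) -> int:
    """Map a step or file count to 0 (<=3), 1 (4-7), or 2 (>=8)."""
    if count <= 3:
        return 0
    if count <= 7:
        return 1
    return 2


def classify_task(steps: int, files: int, description: str) -> str:
    # Assumption: the more demanding of the two counts decides the tier.
    level = max(_tier_index(steps), _tier_index(files))
    words = re.findall(r"[a-z]+", description.lower())
    # Prefix match, so "refactoring" counts as a hit for "refactor".
    has_keyword = any(word.startswith(COMPLEXITY_KEYWORDS) for word in words)
    too_long = len(description) > SHORT_DESCRIPTION_CHARS
    if level == 0 and (has_keyword or too_long):
        level = 1  # keywords or long descriptions prevent "simple"
    return TIERS[level]
```

For example, `MODEL_CLASS[classify_task(2, 1, "Fix typo in README")]` selects the Haiku-class model, while the same counts with "Refactor the auth module" no longer qualify as simple.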

Overriding Profile Defaults

`token_profile` sets defaults, but explicit preferences always take precedence:

token_profile: budget
phases:
  skip_research: false        # override: keep research
models:
  planning: claude-opus-4-6   # override: use Opus for planning
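
One way to picture "explicit preferences always win" is a recursive merge in which user-supplied keys overwrite profile defaults and everything else falls through. This is a sketch under assumptions: the `deep_merge` helper, the `execution` key, and the placeholder `sonnet` values are hypothetical; only `skip_research`, `planning`, and `claude-opus-4-6` come from the example above.

```python
import copy
from typing import Any, Dict


def deep_merge(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where keys in `overrides` replace those in `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested sections
        else:
            merged[key] = copy.deepcopy(value)  # explicit preference wins
    return merged


# Hypothetical expansion of the `budget` profile into concrete settings.
BUDGET_DEFAULTS = {
    "phases": {"skip_research": True},
    "models": {"planning": "sonnet", "execution": "sonnet"},
}

# The explicit preferences from the example above.
USER_PREFERENCES = {
    "phases": {"skip_research": False},
    "models": {"planning": "claude-opus-4-6"},
}

effective = deep_merge(BUDGET_DEFAULTS, USER_PREFERENCES)
```

Here `effective` keeps research enabled and plans with Opus, while the execution model still comes from the profile.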

Adaptive Learning

SF tracks success and failure of tier assignments over time. If a model tier's failure rate exceeds 20% for a given task type, future tasks of that type are bumped to a higher tier.
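
The 20% rule can be sketched as a per-(task type, tier) counter. This is an illustration, not SF's code: the `TierTracker` class, the tier labels, the one-step bump, and the `MIN_SAMPLES` guard (which avoids reacting to a single early failure) are all assumptions; only the 20% threshold comes from the text.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

TIERS = ["haiku", "sonnet", "opus"]  # placeholder tier labels, cheapest first
FAILURE_THRESHOLD = 0.20             # from the text: bump when exceeded
MIN_SAMPLES = 5                      # hypothetical: ignore tiny sample sizes


class TierTracker:
    def __init__(self) -> None:
        # (task_type, tier) -> [failures, total]
        self._stats: DefaultDict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])

    def record(self, task_type: str, tier: str, success: bool) -> None:
        entry = self._stats[(task_type, tier)]
        entry[1] += 1
        if not success:
            entry[0] += 1

    def failure_rate(self, task_type: str, tier: str) -> float:
        failures, total = self._stats.get((task_type, tier), (0, 0))
        return failures / total if total else 0.0

    def choose_tier(self, task_type: str, base_tier: str) -> str:
        """Return `base_tier`, or the next tier up if it fails too often."""
        _, total = self._stats.get((task_type, base_tier), (0, 0))
        index = TIERS.index(base_tier)
        too_many_failures = self.failure_rate(task_type, base_tier) > FAILURE_THRESHOLD
        if total >= MIN_SAMPLES and too_many_failures and index < len(TIERS) - 1:
            return TIERS[index + 1]
        return base_tier
```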

Submit manual feedback with:

/gsd rate over    # model was overpowered — use cheaper next time
/gsd rate ok      # model was appropriate
/gsd rate under   # model was too weak — use stronger next time

Observation Masking

During auto mode, old tool results are replaced with lightweight placeholders before each AI call. This reduces token usage between compactions with zero overhead.

context_management:
  observation_masking: true     # default: true
  observation_mask_turns: 8     # keep results from last 8 turns
  tool_result_max_chars: 800    # truncate large tool outputs
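
To make the three settings concrete, here is a minimal sketch of masking over a message history. The message shape (`role`/`content` dicts), the placeholder text, and the definition of a "turn" as one assistant message plus the tool results that follow it are assumptions for illustration; SF's internal representation may differ.

```python
from typing import Dict, List

PLACEHOLDER = "[tool result masked]"  # hypothetical placeholder text


def mask_observations(
    messages: List[Dict[str, str]],
    keep_turns: int = 8,    # mirrors observation_mask_turns
    max_chars: int = 800,   # mirrors tool_result_max_chars
) -> List[Dict[str, str]]:
    """Return a copy of `messages` with old tool results replaced by a placeholder."""
    masked: List[Dict[str, str]] = []
    turns_seen = 0  # assistant messages encountered so far, counting from the end
    for message in reversed(messages):
        if message["role"] == "assistant":
            turns_seen += 1
            masked.append(message)
        elif message["role"] == "tool":
            if turns_seen >= keep_turns:
                content = PLACEHOLDER                     # older than the window
            else:
                content = message["content"][:max_chars]  # recent: only truncate
            masked.append({**message, "content": content})
        else:
            masked.append(message)  # user and system messages pass through
    masked.reverse()
    return masked
```

The function builds new dicts for tool messages rather than editing them in place, so the full history stays available if it is needed later.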
