Jeremy 872b0adb48 docs: reorganize into user-docs/ and dev/ subdirectories

Split flat docs/ into user-docs/ (guides, config, troubleshooting) and
dev/ (ADRs, architecture, extension guides, proposals). Updated
docs/README.md index to reflect new paths.

2026-04-10 09:25:31 -05:00

2.7 KiB

Raw Blame History

Speed Optimization

The #1 Speed Principle

The fastest possible operation is the one you don't perform. Before optimizing any step, ask: does this step need to exist at all?

Speed Levers (Ranked by Impact)

1. Minimize LLM Calls

Batch intent into single calls. Don't generate code, then tests, then docs separately. One call: "implement, test, and document." TypeScript splits the output.
Deterministic fast paths. Missing import? Syntax error? Fix without an LLM call if the fix is mechanical.
Audit call chains ruthlessly — most systems have 50%+ unnecessary sequential calls.

2. Make Feedback Loops Instantaneous

Use test watch mode (no cold start)
Run only relevant test subsets (track which files affect which tests)
Incremental builds (hot module reloading)
Async, non-blocking file writes

3. Precompute Context

Predict what the agent will need based on task definition
Pre-load into the prompt — no tool calls needed mid-generation
Speculative pre-fetching (like CPU cache prefetching)

4. Parallelize Independent Work

Minimize startup cost for new parallel agents (pre-built templates, warm connections)
Use the dependency graph to identify independent work automatically

5. Stream Everything, Block on Nothing

Process tokens as they arrive
Pipeline parallelism: start formatting code while commit message is still generating

6. Cache Aggressively

In-memory cache of everything agent might need
Cross-task caching for unchanged files
Cache LLM results for deterministic inputs (boilerplate, type definitions)

7. Minimize Token Waste

Dense context, not verbose context
Structured formats for structured data
Minify reference code that's informational, not for modification

Anti-Patterns That Murder Speed

Anti-Pattern	Fix
Re-verifying things that can't have changed	Dependency-aware selective re-verification
Excessive self-reflection on simple tasks	Complexity-based workflow routing
Over-summarization between micro-steps	Only full context reset at task boundaries
Waiting for human approval on auto-verifiable work	Human checkpoints at milestones, not tasks
Quadratic history growth	Aggressive compression at every transition
Synchronous blocking tools	Async everything, pipeline parallelism

The Speed Multiplier Nobody Talks About

Failure prediction. Track patterns across tasks. If certain task types fail on first attempt, pre-load extra guidance. Preventing a failed iteration is faster than executing one.

The magical feeling of speed comes from only doing things that matter, and then doing those things as fast as possible. The system should feel like the agent knew what to do and just did it.

2.7 KiB Raw Blame History