Jeremy 872b0adb48 docs: reorganize into user-docs/ and dev/ subdirectories

Split flat docs/ into user-docs/ (guides, config, troubleshooting) and
dev/ (ADRs, architecture, extension guides, proposals). Updated
docs/README.md index to reflect new paths.

2026-04-10 09:25:31 -05:00

3.9 KiB

System Prompt & LLM vs Deterministic Split

The Core Separation Principle

If you could write an if-else statement that handles it correctly every time, it should not be in the LLM's context. Every token the model spends reasoning about something deterministic is wasted and introduces hallucination risk.

What the LLM Owns

Capability	Why LLM
Understanding intent	Interpretation, judgment
Architectural reasoning	Weighing tradeoffs
Code generation	Creative, context-dependent
Debugging & diagnosis	Abductive reasoning, hypothesis formation
Self-critique & quality assessment	Judgment calls

What TypeScript/Deterministic Code Owns

Capability	Why Deterministic
State machine transitions	Typed state object, no ambiguity
Context assembly	Predict and preload what the agent needs
File operations	Validate paths, handle encoding, manage permissions
Test execution & result parsing	Structured results, not raw terminal output
Build & environment management	Install deps, start servers, manage ports
Code formatting	Run Prettier automatically; never spend LLM tokens on formatting
Task scheduling & dependency resolution	Graph traversal is near-instant, versus roughly 5 seconds for an LLM call
Summarization triggers	The trigger is a mechanical workflow; the LLM only provides the content
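
To make the first row of the table concrete, here is a minimal sketch of a typed state machine whose transitions live entirely in code. The phase names borrow from the planning, execution, and debugging modes used in this document; the event names, the `TRANSITIONS` table, and the `transition` function are illustrative assumptions, not part of any actual orchestrator.

```typescript
// Hypothetical phases and events; the names are illustrative assumptions.
type Phase = "planning" | "execution" | "debugging" | "done";
type WorkflowEvent = "planApproved" | "testsFailed" | "fixApplied" | "testsPassed";

// Every legal transition is listed explicitly, so no LLM call is needed
// to decide which phase comes next.
const TRANSITIONS: Record<Phase, Partial<Record<WorkflowEvent, Phase>>> = {
  planning: { planApproved: "execution" },
  execution: { testsFailed: "debugging", testsPassed: "done" },
  debugging: { fixApplied: "execution" },
  done: {},
};

// Returns the next phase, or throws if the event is not legal in the current phase.
function transition(current: Phase, event: WorkflowEvent): Phase {
  const next = TRANSITIONS[current][event];
  if (next === undefined) {
    throw new Error(`Illegal transition: ${current} on ${event}`);
  }
  return next;
}
```

Because the table is exhaustive over `Phase`, the compiler flags a newly added phase that has no transitions defined, which is the kind of guarantee a prompt cannot give.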

Modular System Prompt Architecture

Base Layer (always present, ~500 tokens)
  → Identity, core behavioral rules, general approach
  
Phase-Specific Layer (swapped based on state)
  → Planning mode: decomposition, interfaces, risks
  → Execution mode: implementation, testing, iteration
  → Debugging mode: diagnosis, hypothesis testing, isolation

Task-Specific Layer (assembled fresh per task)
  → Current spec, acceptance criteria, relevant contracts, prior attempts

Tools Layer
  → Available tool definitions and parameters
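
The layering above could be assembled along these lines. This is a sketch under assumptions: the `TaskContext` shape, the placeholder layer text, and the `assembleSystemPrompt` function are hypothetical and only illustrate swapping the phase layer while the base layer stays fixed.

```typescript
type PromptPhase = "planning" | "execution" | "debugging";

// Hypothetical shape for the per-task layer.
interface TaskContext {
  spec: string;
  acceptanceCriteria: string[];
  priorAttempts: string[];
}

// Placeholder text; a real base layer would hold identity and core rules.
const BASE_LAYER = "You are a coding agent. Follow the core behavioral rules.";

// Exactly one of these is included, chosen by the current state.
const PHASE_LAYERS: Record<PromptPhase, string> = {
  planning: "Planning mode: focus on decomposition, interfaces, and risks.",
  execution: "Execution mode: focus on implementation, testing, and iteration.",
  debugging: "Debugging mode: focus on diagnosis, hypothesis testing, and isolation.",
};

// Built fresh for every task from the current spec and history.
function renderTaskLayer(task: TaskContext): string {
  const lines = [`Spec: ${task.spec}`, "Acceptance criteria:"];
  for (const criterion of task.acceptanceCriteria) lines.push(`- ${criterion}`);
  if (task.priorAttempts.length > 0) {
    lines.push("Prior attempts:");
    for (const attempt of task.priorAttempts) lines.push(`- ${attempt}`);
  }
  return lines.join("\n");
}

function assembleSystemPrompt(
  phase: PromptPhase,
  task: TaskContext,
  toolDescriptions: string[],
): string {
  return [
    BASE_LAYER,
    PHASE_LAYERS[phase],
    renderTaskLayer(task),
    "Tools:\n" + toolDescriptions.map((t) => `- ${t}`).join("\n"),
  ].join("\n\n");
}
```

Choosing the phase layer is a lookup keyed on state, so the decision about which instructions the model sees never passes through the model itself.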

Tool Design Philosophy

Each tool should do one thing, do it completely, and return structured results the LLM can immediately act on.

Bad: LLM calls readFile → parseJSON → runCommand (3 calls, 3 failure points)
Good: LLM calls runTests(filter) → gets structured pass/fail with locations (1 call, clean result)
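
As a sketch of what "structured pass/fail with locations" might look like, the types below describe one possible result shape. `RawTestCase` is a hypothetical stand-in for whatever per-test records the real test runner emits, and `summarizeTests` is an assumed helper, not an existing API.

```typescript
interface TestFailure {
  name: string;
  file: string;
  line: number;
  message: string;
}

interface TestRunResult {
  passed: number;
  failed: number;
  failures: TestFailure[];
}

// Hypothetical per-test record, standing in for the real runner's output.
interface RawTestCase {
  name: string;
  file: string;
  line: number;
  ok: boolean;
  message?: string;
}

// Collapses raw records into the single structured result the LLM receives.
// The optional filter keeps only tests whose name contains the given text.
function summarizeTests(cases: RawTestCase[], filter?: string): TestRunResult {
  const selected =
    filter === undefined ? cases : cases.filter((c) => c.name.indexOf(filter) !== -1);
  const failures = selected
    .filter((c) => !c.ok)
    .map((c): TestFailure => ({
      name: c.name,
      file: c.file,
      line: c.line,
      message: c.message ?? "no message",
    }));
  return {
    passed: selected.length - failures.length,
    failed: failures.length,
    failures,
  };
}
```

The model then sees counts plus one entry per failure, each with a file and line it can open directly, rather than raw terminal output it has to parse.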

Essential Tools

Tool	Returns
`runTests`	Structured results: pass count, fail count, per-failure details
`readFiles`	Batched file contents (array of paths, not one at a time)
`writeFile`	Auto-formats before writing
`searchCodebase`	Grep-like results with file paths and line numbers
`getProjectState`	Manifest + current task spec + related task statuses
`updateTaskStatus`	Handles downstream state updates automatically
`buildProject`	Structured errors with file paths and line numbers
`browserCheck`	Screenshot or structured description of rendered output
`commitChanges`	Enforces conventions, runs pre-commit hooks
`revertToCheckpoint`	Rolls back to last known good state
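
One way to read "handles downstream state updates automatically" for `updateTaskStatus` is sketched below: marking a task done unblocks any dependent whose prerequisites are all complete. The `Task` shape, the three status values, and the returned list of unblocked IDs are assumptions made for illustration.

```typescript
type TaskStatus = "blocked" | "ready" | "done";

// Hypothetical task record; dependsOn lists the IDs of prerequisite tasks.
interface Task {
  id: string;
  dependsOn: string[];
  status: TaskStatus;
}

// Sets the status of one task and, if it is now done, promotes every blocked
// task whose dependencies are all done. Returns the IDs that became ready.
function updateTaskStatus(
  tasks: Record<string, Task>,
  id: string,
  status: TaskStatus,
): string[] {
  const task = tasks[id];
  if (task === undefined) {
    throw new Error(`Unknown task: ${id}`);
  }
  task.status = status;

  const unblocked: string[] = [];
  if (status !== "done") return unblocked;

  for (const otherId of Object.keys(tasks)) {
    const other = tasks[otherId];
    if (other.status !== "blocked") continue;
    const allDone = other.dependsOn.every((dep) => {
      const prerequisite = tasks[dep];
      return prerequisite !== undefined && prerequisite.status === "done";
    });
    if (allDone) {
      other.status = "ready";
      unblocked.push(other.id);
    }
  }
  return unblocked;
}
```

The LLM makes a single call to report that a task is finished; the walk over dependents is plain graph bookkeeping and stays out of its context.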

Prompt Patterns That Maximize Agency

Tell it what it CAN do, not what it can't. "Full authority as long as acceptance criteria and tests pass."
Explicit permission to iterate. "First attempt doesn't need to be perfect. Write, run, observe, improve."
Clear exit conditions. Concrete, measurable, unambiguous definition of "done."
Built-in scratchpad. "Write reasoning in thinking blocks. Track attempts and outcomes."
Recovery protocol. "After 3 failed approaches, produce structured escalation."
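
The recovery protocol's counting can live in the orchestrator rather than in the prompt. The sketch below assumes a hypothetical `Attempt` record and `Escalation` payload; only the threshold of 3 failed approaches comes from the pattern above.

```typescript
interface Attempt {
  approach: string;
  outcome: string;
}

// Hypothetical escalation payload handed to a human or a supervising process.
interface Escalation {
  kind: "escalation";
  taskId: string;
  attempts: Attempt[];
  summary: string;
}

const MAX_FAILED_APPROACHES = 3;

// Appends a failed attempt to the history and reports whether the limit is hit.
// Returns the new history plus an escalation once the threshold is reached.
function recordFailure(
  taskId: string,
  history: Attempt[],
  attempt: Attempt,
): { history: Attempt[]; escalation: Escalation | null } {
  const updated = history.concat([attempt]);
  if (updated.length < MAX_FAILED_APPROACHES) {
    return { history: updated, escalation: null };
  }
  return {
    history: updated,
    escalation: {
      kind: "escalation",
      taskId,
      attempts: updated,
      summary: `${updated.length} approaches failed for task ${taskId}`,
    },
  };
}
```

The attempt history doubles as the "track attempts and outcomes" scratchpad: it can be rendered into the task-specific layer so the model sees what has already been tried.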

The Meta-Principle

Your TypeScript orchestrator is the deterministic skeleton — workflow, state, context, tools, coordination. The LLM is the reasoning muscle — understanding, creativity, judgment, problem-solving. Neither should do the other's job. When you get this right, the LLM becomes dramatically more capable because it's only doing what it's good at, with exactly the context it needs.
