# System Prompt & LLM vs Deterministic Split

### The Core Separation Principle

> If you could write an if-else statement that handles it correctly every time, **it should not be in the LLM's context**. Every token the model spends reasoning about something deterministic is wasted and introduces hallucination risk.

### What the LLM Owns

| Capability | Why LLM |
|-----------|---------|
| Understanding intent | Interpretation, judgment |
| Architectural reasoning | Weighing tradeoffs |
| Code generation | Creative, context-dependent |
| Debugging & diagnosis | Abductive reasoning, hypothesis formation |
| Self-critique & quality assessment | Judgment calls |

### What TypeScript/Deterministic Code Owns

| Capability | Why Deterministic |
|-----------|-------------------|
| State machine transitions | Typed state object, no ambiguity |
| Context assembly | Predict and pre-load what the agent needs |
| File operations | Validate paths, handle encoding, manage permissions |
| Test execution & result parsing | Structured results, not raw terminal output |
| Build & environment management | Install deps, start servers, manage ports |
| Code formatting | Run prettier automatically, never waste LLM tokens |
| Task scheduling & dependency resolution | Graph traversal: instant vs. a 5-second LLM call |
| Summarization triggers | Mechanical workflow; the LLM provides the content |

### Modular System Prompt Architecture

```
Base Layer (always present, ~500 tokens)
→ Identity, core behavioral rules, general approach

Phase-Specific Layer (swapped based on state)
→ Planning mode: decomposition, interfaces, risks
→ Execution mode: implementation, testing, iteration
→ Debugging mode: diagnosis, hypothesis testing, isolation

Task-Specific Layer (assembled fresh per task)
→ Current spec, acceptance criteria, relevant contracts, prior attempts

Tools Layer
→ Available tool definitions and parameters
```

### Tool Design Philosophy

> Each tool should do one thing, do it completely, and return structured results the LLM can immediately act on.

**Bad:** LLM calls `readFile` → `parseJSON` → `runCommand` (3 calls, 3 failure points)

**Good:** LLM calls `runTests(filter)` → gets structured pass/fail with locations (1 call, clean result)

### Essential Tools

| Tool | Returns |
|------|---------|
| `runTests` | Structured results: pass count, fail count, per-failure details |
| `readFiles` | Batched file contents (array of paths, not one at a time) |
| `writeFile` | Auto-formats before writing |
| `searchCodebase` | Grep-like results with file paths and line numbers |
| `getProjectState` | Manifest + current task spec + related task statuses |
| `updateTaskStatus` | Handles downstream state updates automatically |
| `buildProject` | Structured errors with file paths and line numbers |
| `browserCheck` | Screenshot or structured description of rendered output |
| `commitChanges` | Enforces conventions, runs pre-commit hooks |
| `revertToCheckpoint` | Rolls back to last known good state |

### Prompt Patterns That Maximize Agency

1. **Tell it what it CAN do, not what it can't.** "Full authority as long as acceptance criteria and tests pass."
2. **Explicit permission to iterate.** "First attempt doesn't need to be perfect. Write, run, observe, improve."
3. **Clear exit conditions.** Concrete, measurable, unambiguous definition of "done."
4. **Built-in scratchpad.** "Write reasoning in thinking blocks. Track attempts and outcomes."
5. **Recovery protocol.** "After 3 failed approaches, produce structured escalation."
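To make the Modular System Prompt Architecture described above concrete, here is a minimal TypeScript sketch of a deterministic prompt builder. All names (`Phase`, `TaskSpec`, `buildSystemPrompt`) and layer contents are illustrative assumptions, not an existing API; the point is that layer selection is orchestrator logic, never something the LLM reasons about.

```typescript
// Illustrative sketch: deterministic assembly of the four prompt layers.
type Phase = "planning" | "execution" | "debugging";

interface TaskSpec {
  spec: string;
  acceptanceCriteria: string[];
  contracts: string[];      // relevant interface contracts
  priorAttempts: string[];  // summaries of earlier attempts, if any
}

const BASE_LAYER = `You are an autonomous coding agent. ...`; // identity + core rules, ~500 tokens

const PHASE_LAYERS: Record<Phase, string> = {
  planning: "Decompose the task, define interfaces, list risks.",
  execution: "Implement, run the tests, iterate until they pass.",
  debugging: "Form hypotheses, test them in isolation, narrow down the cause.",
};

function buildSystemPrompt(phase: Phase, task: TaskSpec, toolDefs: string): string {
  // The orchestrator decides which layers go in; empty layers are dropped.
  return [
    BASE_LAYER,
    PHASE_LAYERS[phase],
    `## Current task\n${task.spec}`,
    `## Acceptance criteria\n${task.acceptanceCriteria.map(c => `- ${c}`).join("\n")}`,
    task.contracts.length ? `## Contracts\n${task.contracts.join("\n")}` : "",
    task.priorAttempts.length ? `## Prior attempts\n${task.priorAttempts.join("\n")}` : "",
    `## Tools\n${toolDefs}`,
  ].filter(Boolean).join("\n\n");
}
```

Swapping the phase layer is a pure function of the typed state object, so the same base identity stays stable while planning, execution, and debugging instructions rotate in and out.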
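And here is a hedged sketch of the `runTests` tool from the Essential Tools table, following the one-call, structured-result philosophy: the orchestrator runs the suite and parses the report, and the LLM only ever sees pass/fail counts plus per-failure details. The CLI flags assume a Vitest/Jest-style runner with a JSON reporter, and the report field names follow the Jest-compatible schema; adapt both to whatever runner the project actually uses.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const exec = promisify(execFile);

interface TestFailure {
  file: string;
  testName: string;
  message: string;
}

interface TestRunResult {
  passed: number;
  failed: number;
  failures: TestFailure[];
}

async function runTests(filter?: string): Promise<TestRunResult> {
  // One call from the LLM's point of view: run the suite with a JSON reporter
  // so the result stays structured instead of raw terminal output.
  const args = ["vitest", "run", "--reporter=json"];
  if (filter) args.push(filter);

  let raw: string;
  try {
    raw = (await exec("npx", args)).stdout;
  } catch (err: any) {
    // Test runners exit non-zero when tests fail; the JSON report is still on stdout.
    raw = err.stdout ?? "";
  }
  return parseReport(raw);
}

// Field names below follow the Jest-compatible JSON report; adjust for your runner.
function parseReport(raw: string): TestRunResult {
  const report = JSON.parse(raw);
  const failures: TestFailure[] = (report.testResults ?? []).flatMap((suite: any) =>
    (suite.assertionResults ?? [])
      .filter((a: any) => a.status === "failed")
      .map((a: any) => ({
        file: suite.name,
        testName: a.fullName ?? a.title,
        message: (a.failureMessages ?? []).join("\n"),
      }))
  );
  return {
    passed: report.numPassedTests ?? 0,
    failed: report.numFailedTests ?? 0,
    failures,
  };
}
```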
### The Meta-Principle

> Your TypeScript orchestrator is the deterministic skeleton — workflow, state, context, tools, coordination. The LLM is the reasoning muscle — understanding, creativity, judgment, problem-solving. **Neither should do the other's job.**

When you get this right, the LLM becomes dramatically more capable because it's only doing what it's good at, with exactly the context it needs.

---