# System Prompt & LLM vs Deterministic Split
### The Core Separation Principle
> If you could write an if-else statement that handles it correctly every time, **it should not be in the LLM's context**. Every token the model spends reasoning about something deterministic is wasted and introduces hallucination risk.
### What the LLM Owns
| Capability | Why LLM |
|-----------|---------|
| Understanding intent | Interpretation, judgment |
| Architectural reasoning | Weighing tradeoffs |
| Code generation | Creative, context-dependent |
| Debugging & diagnosis | Abductive reasoning, hypothesis formation |
| Self-critique & quality assessment | Judgment calls |
### What TypeScript/Deterministic Code Owns
| Capability | Why Deterministic |
|-----------|-------------------|
| State machine transitions | Typed state object, no ambiguity |
| Context assembly | Predict + pre-load what agent needs |
| File operations | Validate paths, handle encoding, manage permissions |
| Test execution & result parsing | Structured results, not raw terminal output |
| Build & environment management | Install deps, start servers, manage ports |
| Code formatting | Run prettier automatically, never waste LLM tokens |
| Task scheduling & dependency resolution | Graph traversal, instant vs 5-second LLM call |
| Summarization triggers | Mechanical workflow, LLM provides content |
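
As a rough illustration of the "State machine transitions" row, the sketch below encodes transitions as a typed lookup table. The phase and event names are hypothetical, chosen only to mirror the planning, execution, and debugging modes used elsewhere in this document; a real orchestrator would likely carry more state.

```typescript
// Hypothetical phases; not a fixed part of the architecture described here.
type Phase = "planning" | "execution" | "debugging" | "done";

// Hypothetical events the orchestrator can observe without asking the LLM.
type AgentEvent = "planApproved" | "testsFailed" | "testsPassed" | "fixApplied";

// Transition table: each legal (phase, event) pair maps to exactly one next phase.
const transitions: Record<Phase, Partial<Record<AgentEvent, Phase>>> = {
  planning: { planApproved: "execution" },
  execution: { testsFailed: "debugging", testsPassed: "done" },
  debugging: { fixApplied: "execution" },
  done: {},
};

// Returns the next phase, or throws on an illegal transition instead of guessing.
function nextPhase(current: Phase, event: AgentEvent): Phase {
  const next = transitions[current][event];
  if (next === undefined) {
    throw new Error(`Illegal transition: ${event} while in ${current}`);
  }
  return next;
}
```

Because the table is plain data, an illegal transition fails loudly in the orchestrator rather than being reasoned about (and possibly accepted) by the model.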
### Modular System Prompt Architecture
```
Base Layer (always present, ~500 tokens)
  → Identity, core behavioral rules, general approach

Phase-Specific Layer (swapped based on state)
  → Planning mode: decomposition, interfaces, risks
  → Execution mode: implementation, testing, iteration
  → Debugging mode: diagnosis, hypothesis testing, isolation

Task-Specific Layer (assembled fresh per task)
  → Current spec, acceptance criteria, relevant contracts, prior attempts

Tools Layer
  → Available tool definitions and parameters
```
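
One way the layering above could be assembled in code is sketched below. The type names, the placeholder layer text, and `assembleSystemPrompt` itself are illustrative assumptions rather than a prescribed API, and the "relevant contracts" field is left out for brevity.

```typescript
type PromptPhase = "planning" | "execution" | "debugging";

// Hypothetical shape of the per-task inputs.
interface TaskContext {
  spec: string;
  acceptanceCriteria: string[];
  priorAttempts: string[];
}

// Placeholder text; a real base layer would be roughly 500 tokens.
const BASE_LAYER = "You are a coding agent. Follow the core rules below.";

// Exactly one of these is included, chosen by the orchestrator's current state.
const PHASE_LAYERS: Record<PromptPhase, string> = {
  planning: "Planning mode: focus on decomposition, interfaces, and risks.",
  execution: "Execution mode: focus on implementation, testing, and iteration.",
  debugging: "Debugging mode: focus on diagnosis, hypothesis testing, and isolation.",
};

// Builds the task-specific layer fresh for each task.
function renderTaskLayer(task: TaskContext): string {
  const lines = [
    `Spec: ${task.spec}`,
    "Acceptance criteria:",
    ...task.acceptanceCriteria.map((c) => `- ${c}`),
  ];
  if (task.priorAttempts.length > 0) {
    lines.push("Prior attempts:", ...task.priorAttempts.map((a) => `- ${a}`));
  }
  return lines.join("\n");
}

// Concatenates base, phase, task, and tools layers in a fixed order.
function assembleSystemPrompt(
  phase: PromptPhase,
  task: TaskContext,
  toolDefinitions: string[],
): string {
  return [
    BASE_LAYER,
    PHASE_LAYERS[phase],
    renderTaskLayer(task),
    `Tools:\n${toolDefinitions.join("\n")}`,
  ].join("\n\n");
}
```

The model never chooses its own phase here; the orchestrator picks the layer from its state and the LLM only sees the result.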
### Tool Design Philosophy
> Each tool should do one thing, do it completely, and return structured results the LLM can immediately act on.

**Bad:** LLM calls `readFile` → `parseJSON` → `runCommand` (3 calls, 3 failure points)

**Good:** LLM calls `runTests(filter)` → gets structured pass/fail with locations (1 call, clean result)
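
To make the "structured pass/fail with locations" idea concrete, here is a minimal sketch of the result-shaping half of such a tool. It assumes the test runner can already emit per-test records (the `RawTestRecord` shape is hypothetical); actually launching the runner is omitted, and `summarizeTestRun` is an illustrative name, not an existing API.

```typescript
// What the LLM receives: counts plus actionable failure locations.
interface TestFailure {
  name: string;
  file: string;
  line: number;
  message: string;
}

interface TestRunResult {
  passed: number;
  failed: number;
  failures: TestFailure[];
}

// Hypothetical per-test record, e.g. parsed from a runner's JSON reporter.
interface RawTestRecord {
  name: string;
  status: "passed" | "failed";
  file: string;
  line: number;
  message?: string;
}

// Applies the optional name filter and collapses raw records into one structured result.
function summarizeTestRun(records: RawTestRecord[], filter?: string): TestRunResult {
  const needle = filter ?? ""; // an empty needle matches every test name
  const selected = records.filter((r) => r.name.indexOf(needle) !== -1);
  const failures = selected
    .filter((r) => r.status === "failed")
    .map((r) => ({ name: r.name, file: r.file, line: r.line, message: r.message ?? "" }));
  return {
    passed: selected.length - failures.length,
    failed: failures.length,
    failures,
  };
}
```

The model gets one object it can act on directly, instead of having to read and interpret raw terminal output.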
### Essential Tools
| Tool | Returns |
|------|---------|
| `runTests` | Structured results: pass count, fail count, per-failure details |
| `readFiles` | Batched file contents (array of paths, not one at a time) |
| `writeFile` | Auto-formats before writing |
| `searchCodebase` | Grep-like results with file paths and line numbers |
| `getProjectState` | Manifest + current task spec + related task statuses |
| `updateTaskStatus` | Handles downstream state updates automatically |
| `buildProject` | Structured errors with file paths and line numbers |
| `browserCheck` | Screenshot or structured description of rendered output |
| `commitChanges` | Enforces conventions, runs pre-commit hooks |
| `revertToCheckpoint` | Rolls back to last known good state |
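
As one possible reading of "handles downstream state updates automatically", the sketch below shows an `updateTaskStatus` that marks a task done and promotes any blocked task whose dependencies are now all complete. The `Task` shape, the three status values, and the return value are assumptions made for illustration.

```typescript
type TaskStatus = "blocked" | "ready" | "done";

// Hypothetical task record; dependsOn lists the ids that must finish first.
interface Task {
  id: string;
  dependsOn: string[];
  status: TaskStatus;
}

// Marks a task done, then unblocks every task whose dependencies are all done.
// Returns the ids that became ready so the orchestrator can schedule them.
function updateTaskStatus(tasks: Record<string, Task>, completedId: string): string[] {
  const completed = tasks[completedId];
  if (completed === undefined) {
    throw new Error(`Unknown task: ${completedId}`);
  }
  completed.status = "done";

  const unblocked: string[] = [];
  for (const id of Object.keys(tasks)) {
    const task = tasks[id];
    if (task === undefined || task.status !== "blocked") continue;
    const allDone = task.dependsOn.every((dep) => tasks[dep]?.status === "done");
    if (allDone) {
      task.status = "ready";
      unblocked.push(task.id);
    }
  }
  return unblocked;
}
```

The dependency check is a plain lookup over the task graph, so the LLM only reports that a task is finished and never has to work out what that unblocks.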
### Prompt Patterns That Maximize Agency
1. **Tell it what it CAN do, not what it can't.** "Full authority as long as acceptance criteria and tests pass."
2. **Explicit permission to iterate.** "First attempt doesn't need to be perfect. Write, run, observe, improve."
3. **Clear exit conditions.** Concrete, measurable, unambiguous definition of "done."
4. **Built-in scratchpad.** "Write reasoning in thinking blocks. Track attempts and outcomes."
5. **Recovery protocol.** "After 3 failed approaches, produce structured escalation."
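
The attempt tracking and recovery protocol in items 4 and 5 can be backed by a small amount of deterministic code. The sketch below is one possible shape; the `Escalation` fields and the `recordFailure` helper are hypothetical, and only the three-failure threshold comes from the pattern above.

```typescript
interface Attempt {
  approach: string;
  outcome: string;
}

// Hypothetical structured escalation handed back to the orchestrator.
interface Escalation {
  kind: "escalation";
  taskId: string;
  attempts: Attempt[];
  summary: string;
}

// Threshold taken from the recovery protocol: escalate after 3 failed approaches.
const MAX_FAILED_APPROACHES = 3;

// Appends a failed attempt to the log; returns an escalation once the limit
// is reached, otherwise null so the agent keeps iterating.
function recordFailure(taskId: string, log: Attempt[], attempt: Attempt): Escalation | null {
  log.push(attempt);
  if (log.length < MAX_FAILED_APPROACHES) {
    return null;
  }
  return {
    kind: "escalation",
    taskId,
    attempts: log.slice(), // copy so later edits to the log do not alter the report
    summary: `${log.length} approaches failed without meeting the exit conditions.`,
  };
}
```

Counting attempts in the orchestrator means the limit is enforced even if the model loses track of how many approaches it has already tried.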
### The Meta-Principle
> Your TypeScript orchestrator is the deterministic skeleton — workflow, state, context, tools, coordination. The LLM is the reasoning muscle — understanding, creativity, judgment, problem-solving. **Neither should do the other's job.** When you get this right, the LLM becomes dramatically more capable because it's only doing what it's good at, with exactly the context it needs.
---