System Prompt & LLM vs Deterministic Split

The Core Separation Principle

If an if-else statement could handle a decision correctly every time, that decision should not be in the LLM's context. Every token the model spends reasoning about something deterministic is wasted and introduces hallucination risk.
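
As one illustration of the if-else test, choosing the next workflow phase after a test run is a rule that can be written down once and never shown to the model. The sketch below is a minimal example under stated assumptions: the phase names, the event shapes, and `nextPhase` are all hypothetical and do not come from any particular framework.

```typescript
// Hypothetical phases and events; the names are illustrative only.
type Phase = "planning" | "executing" | "debugging" | "done";

type AgentEvent =
  | { kind: "planApproved" }
  | { kind: "testsFinished"; failed: number }
  | { kind: "fixApplied" };

// Pure transition function: the same inputs always give the same next phase,
// so no model call (and no tokens) is needed to decide it.
function nextPhase(current: Phase, event: AgentEvent): Phase {
  switch (current) {
    case "planning":
      return event.kind === "planApproved" ? "executing" : current;
    case "executing":
      if (event.kind === "testsFinished") {
        return event.failed === 0 ? "done" : "debugging";
      }
      return current;
    case "debugging":
      return event.kind === "fixApplied" ? "executing" : current;
    case "done":
      return current;
  }
}
```

Events that do not apply to the current phase leave the phase unchanged, which keeps the orchestrator's behavior predictable when something arrives out of order.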

What the LLM Owns

| Capability | Why LLM |
| --- | --- |
| Understanding intent | Interpretation, judgment |
| Architectural reasoning | Weighing tradeoffs |
| Code generation | Creative, context-dependent |
| Debugging & diagnosis | Abductive reasoning, hypothesis formation |
| Self-critique & quality assessment | Judgment calls |

What TypeScript/Deterministic Code Owns

| Capability | Why Deterministic |
| --- | --- |
| State machine transitions | Typed state object, no ambiguity |
| Context assembly | Predict + pre-load what agent needs |
| File operations | Validate paths, handle encoding, manage permissions |
| Test execution & result parsing | Structured results, not raw terminal output |
| Build & environment management | Install deps, start servers, manage ports |
| Code formatting | Run prettier automatically, never waste LLM tokens |
| Task scheduling & dependency resolution | Graph traversal, instant vs 5-second LLM call |
| Summarization triggers | Mechanical workflow, LLM provides content |
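
The task scheduling row is a good candidate for a concrete sketch, since dependency resolution is plain graph traversal. The example below uses Kahn's algorithm for a topological sort; the `Task` shape and the `scheduleTasks` name are assumptions made for illustration, not part of any existing orchestrator.

```typescript
// Hypothetical task shape: an id plus the ids it depends on.
interface Task {
  id: string;
  dependsOn: string[];
}

// Kahn's algorithm: returns task ids in an order where every task
// appears after all of its dependencies. Throws on cycles or unknown ids.
function scheduleTasks(tasks: Task[]): string[] {
  const remainingDeps = new Map<string, number>();
  const dependents = new Map<string, string[]>();

  for (const task of tasks) {
    remainingDeps.set(task.id, task.dependsOn.length);
    dependents.set(task.id, []);
  }
  for (const task of tasks) {
    for (const dep of task.dependsOn) {
      const list = dependents.get(dep);
      if (list === undefined) {
        throw new Error(`Task "${task.id}" depends on unknown task "${dep}"`);
      }
      list.push(task.id);
    }
  }

  // Start with tasks that have no unmet dependencies.
  const ready = tasks.filter((t) => t.dependsOn.length === 0).map((t) => t.id);
  const order: string[] = [];

  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = (remainingDeps.get(next) ?? 0) - 1;
      remainingDeps.set(next, left);
      if (left === 0) ready.push(next);
    }
  }

  // Any task never reached must be part of a cycle.
  if (order.length !== tasks.length) {
    throw new Error("Dependency cycle detected");
  }
  return order;
}
```

Because the function throws on cycles and unknown ids, a malformed plan surfaces as an orchestrator error instead of becoming something the model has to notice and reason about.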

Modular System Prompt Architecture

Base Layer (always present, ~500 tokens)
  → Identity, core behavioral rules, general approach
  
Phase-Specific Layer (swapped based on state)
  → Planning mode: decomposition, interfaces, risks
  → Execution mode: implementation, testing, iteration
  → Debugging mode: diagnosis, hypothesis testing, isolation

Task-Specific Layer (assembled fresh per task)
  → Current spec, acceptance criteria, relevant contracts, prior attempts

Tools Layer
  → Available tool definitions and parameters
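
The layering above could be assembled by a small deterministic function. This is a rough sketch, not a reference implementation: `Mode`, `TaskContext`, `buildSystemPrompt`, and the placeholder layer text are all invented for the example, and real layers would carry the project's actual instructions.

```typescript
type Mode = "planning" | "execution" | "debugging";

// Hypothetical per-task inputs for the task-specific layer.
interface TaskContext {
  spec: string;
  acceptanceCriteria: string[];
  priorAttempts: string[];
}

// Placeholder text standing in for the always-present base layer.
const BASE_LAYER = "You are a coding agent. Follow the project conventions.";

// One phase layer per mode; exactly one is included per prompt.
const PHASE_LAYERS: Record<Mode, string> = {
  planning: "Focus on decomposition, interfaces, and risks.",
  execution: "Focus on implementation, testing, and iteration.",
  debugging: "Focus on diagnosis, hypothesis testing, and isolation.",
};

function buildSystemPrompt(mode: Mode, task: TaskContext, toolDefs: string[]): string {
  // Task layer is assembled fresh on every call; prior attempts appear only if any exist.
  const taskLayer = [
    `Spec: ${task.spec}`,
    "Acceptance criteria:",
    ...task.acceptanceCriteria.map((c) => `- ${c}`),
    ...(task.priorAttempts.length > 0
      ? ["Prior attempts:", ...task.priorAttempts.map((a) => `- ${a}`)]
      : []),
  ].join("\n");

  const toolsLayer = ["Tools:", ...toolDefs.map((t) => `- ${t}`)].join("\n");

  // Fixed order: base, phase, task, tools.
  return [BASE_LAYER, PHASE_LAYERS[mode], taskLayer, toolsLayer].join("\n\n");
}
```

Keeping the phase layers in a `Record<Mode, string>` means the compiler flags a missing layer whenever a new mode is added to the union.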

Tool Design Philosophy

Each tool should do one thing, do it completely, and return structured results the LLM can immediately act on.

Bad: LLM calls readFile → parseJSON → runCommand (3 calls, 3 failure points)
Good: LLM calls runTests(filter) → gets structured pass/fail with locations (1 call, clean result)
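
To make the "structured pass/fail with locations" idea concrete, here is a sketch of the parsing step a `runTests` tool might perform before returning to the model. The raw line format is invented for this example (real test runners each have their own output or JSON reporters), and `parseTestOutput` and the result types are hypothetical names.

```typescript
interface TestFailure {
  name: string;
  file: string;
  line: number;
  message: string;
}

interface TestRunResult {
  passed: number;
  failed: number;
  failures: TestFailure[];
}

// Assumed raw format, one test per line (invented for this sketch):
//   PASS <test name>
//   FAIL <test name> @ <file>:<line> :: <message>
// Lines matching neither pattern are ignored as noise.
function parseTestOutput(raw: string): TestRunResult {
  const result: TestRunResult = { passed: 0, failed: 0, failures: [] };
  const failPattern = /^FAIL (.+) @ (.+):(\d+) :: (.*)$/;

  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.startsWith("PASS ")) {
      result.passed += 1;
      continue;
    }
    const match = failPattern.exec(trimmed);
    if (match !== null) {
      result.failed += 1;
      result.failures.push({
        name: match[1] ?? "",
        file: match[2] ?? "",
        line: Number(match[3]),
        message: match[4] ?? "",
      });
    }
  }
  return result;
}
```

The model then receives counts and exact failure locations in one object, instead of having to scan raw terminal output for them.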

Essential Tools

| Tool | Returns |
| --- | --- |
| runTests | Structured results: pass count, fail count, per-failure details |
| readFiles | Batched file contents (array of paths, not one at a time) |
| writeFile | Auto-formats before writing |
| searchCodebase | Grep-like results with file paths and line numbers |
| getProjectState | Manifest + current task spec + related task statuses |
| updateTaskStatus | Handles downstream state updates automatically |
| buildProject | Structured errors with file paths and line numbers |
| browserCheck | Screenshot or structured description of rendered output |
| commitChanges | Enforces conventions, runs pre-commit hooks |
| revertToCheckpoint | Rolls back to last known good state |
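
As an example of one of these tools, a batched `readFiles` might look roughly like the sketch below. The `FileResult` shape and the injected `readOne` callback are assumptions made so the example does not depend on any file-system API; in a real tool the callback might wrap something like Node's `fs.readFileSync`.

```typescript
// One entry per requested path, so a missing file does not fail the whole batch.
type FileResult =
  | { path: string; ok: true; content: string }
  | { path: string; ok: false; error: string };

// `readOne` is injected (hypothetical design choice) to keep the sketch
// self-contained and easy to test with an in-memory store.
function readFiles(paths: string[], readOne: (path: string) => string): FileResult[] {
  return paths.map((path): FileResult => {
    try {
      return { path, ok: true, content: readOne(path) };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      return { path, ok: false, error };
    }
  });
}

// Usage with an in-memory stand-in for the file system.
const fakeFiles: Record<string, string> = { "src/index.ts": "export {};" };
const batch = readFiles(["src/index.ts", "src/missing.ts"], (p) => {
  const content = fakeFiles[p];
  if (content === undefined) throw new Error(`not found: ${p}`);
  return content;
});
console.log(batch.map((r) => `${r.path}: ${r.ok ? "ok" : r.error}`).join("\n"));
```

Returning a per-path result lets the model see in a single call which files loaded and which did not, without a retry round-trip for each failure.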

Prompt Patterns That Maximize Agency

  1. Tell it what it CAN do, not what it can't. "Full authority as long as acceptance criteria and tests pass."
  2. Explicit permission to iterate. "First attempt doesn't need to be perfect. Write, run, observe, improve."
  3. Clear exit conditions. Concrete, measurable, unambiguous definition of "done."
  4. Built-in scratchpad. "Write reasoning in thinking blocks. Track attempts and outcomes."
  5. Recovery protocol. "After 3 failed approaches, produce structured escalation."
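
The recovery protocol in item 5 pairs naturally with a deterministic counter on the orchestrator side, so the model does not have to track its own failures. A minimal sketch, assuming hypothetical `AttemptTracker` and `Escalation` types (the threshold of 3 comes from the pattern above; everything else is illustrative):

```typescript
interface Attempt {
  approach: string;
  outcome: string;
}

// Hypothetical shape of the structured escalation handed to a human.
interface Escalation {
  taskId: string;
  attempts: Attempt[];
  summary: string;
}

const MAX_FAILED_APPROACHES = 3; // threshold taken from the recovery pattern

class AttemptTracker {
  private readonly taskId: string;
  private readonly failures: Attempt[] = [];

  constructor(taskId: string) {
    this.taskId = taskId;
  }

  // Records a failed approach. Returns an escalation once the limit is
  // reached, otherwise null so the agent keeps iterating.
  recordFailure(approach: string, outcome: string): Escalation | null {
    this.failures.push({ approach, outcome });
    if (this.failures.length < MAX_FAILED_APPROACHES) return null;
    return {
      taskId: this.taskId,
      attempts: [...this.failures],
      summary: `${this.failures.length} approaches failed; human input needed.`,
    };
  }
}
```

The escalation carries the full attempt history, which also covers the scratchpad pattern's "track attempts and outcomes" without spending model tokens on bookkeeping.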

The Meta-Principle

Your TypeScript orchestrator is the deterministic skeleton — workflow, state, context, tools, coordination. The LLM is the reasoning muscle — understanding, creativity, judgment, problem-solving. Neither should do the other's job. When you get this right, the LLM becomes dramatically more capable because it's only doing what it's good at, with exactly the context it needs.