# Optimal Storage for Project Context
### The Universal Answer: Plain Text Files in the Repo + Structured State Store
All four models converge on a hybrid approach. The key insight: **don't over-engineer with databases and vector stores, but don't under-engineer with a single massive file either.**
### The Optimal Stack
| Storage | What Lives Here | Why |
|---------|----------------|-----|
| **Project Manifest** (`PROJECT.md`) | Vision, principles, architecture overview, component status | Always loaded, <1,000 tokens, single source of truth |
| **Structured State** (JSON/SQLite/Postgres) | Task status, phase, dependencies, verification results | Machine-parseable, drives state machine transitions |
| **Context Directory** (`.context/` or `.ai/`) | Architecture docs, task specs, decision records | Organized for retrieval, not human browsing |
| **Git Repository** | Actual source code, test results | Ultimate ground truth, never duplicated |
| **Knowledge Graph** (optional at scale) | File → function → dependency relationships | Enables "what breaks if I change this?" queries |
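
As a rough illustration of the "Structured State" row, the sketch below keeps task status and dependencies in SQLite and refuses transitions that the state machine does not allow. The state names (other than `in_progress`), the table layout, and the helper names `open_store`, `add_task`, and `transition` are assumptions made for this example, not part of any described implementation.

```python
import sqlite3

# Hypothetical states and allowed transitions (an assumption for this sketch).
TRANSITIONS = {
    "pending": {"in_progress"},
    "in_progress": {"verifying", "blocked"},
    "blocked": {"in_progress"},
    "verifying": {"done", "in_progress"},
    "done": set(),
}


def open_store(path=":memory:"):
    """Create the two tables this sketch needs and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deps ("
        "task_id TEXT NOT NULL, depends_on TEXT NOT NULL)"
    )
    return conn


def add_task(conn, task_id, depends_on=()):
    """Insert a task in the 'pending' state along with its dependencies."""
    conn.execute("INSERT INTO tasks (id) VALUES (?)", (task_id,))
    conn.executemany(
        "INSERT INTO deps (task_id, depends_on) VALUES (?, ?)",
        [(task_id, dep) for dep in depends_on],
    )


def transition(conn, task_id, new_status):
    """Move a task to new_status if the state machine and its dependencies allow it."""
    row = conn.execute(
        "SELECT status FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    if row is None:
        raise KeyError(task_id)
    current = row[0]
    if new_status not in TRANSITIONS[current]:
        raise ValueError(f"{task_id}: {current} -> {new_status} is not allowed")
    if new_status == "in_progress":
        # A task may only start once every dependency exists and is done.
        unfinished = conn.execute(
            "SELECT d.depends_on FROM deps d "
            "LEFT JOIN tasks t ON t.id = d.depends_on "
            "WHERE d.task_id = ? AND (t.status IS NULL OR t.status != 'done')",
            (task_id,),
        ).fetchall()
        if unfinished:
            names = ", ".join(name for (name,) in unfinished)
            raise ValueError(f"{task_id} is waiting on: {names}")
    conn.execute("UPDATE tasks SET status = ? WHERE id = ?", (new_status, task_id))
```

An agent loop would call `transition` instead of editing status by hand, so an out-of-order step fails with an error rather than silently corrupting state. The same shape would work with a JSON file for small projects.
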
### Why Plain Files Win
- AI reads files directly — no query language, no ORM, no API calls
- Version control comes free via git
- Humans can read and edit them with any text editor
- Survives tooling changes — not locked into any system
### Why NOT Vector Stores (as primary)
- Project context is **structured** — you know where things are
- Vector stores return **approximately relevant** results — approximate is often wrong in codebases
- They can't represent state, relationships, or task progress
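
To make the contrast concrete, here is a minimal sketch of an exact relationship query of the kind the optional knowledge graph is meant to answer ("what breaks if I change this?"). The edge list, file names, and function names are hypothetical; a real graph would be extracted from the codebase.

```python
from collections import defaultdict, deque


def build_reverse_index(edges):
    """Map each target to the set of nodes that depend on it directly.

    Each edge is (source, target), meaning "source depends on target".
    """
    dependents = defaultdict(set)
    for source, target in edges:
        dependents[target].add(source)
    return dependents


def impacted_by(dependents, changed):
    """Return every node that depends on `changed`, directly or transitively."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(changed)  # guard against cycles that lead back to the start
    return seen


# Hypothetical dependency edges, for illustration only.
EDGES = [
    ("api/reset.py", "auth/tokens.py"),
    ("auth/tokens.py", "db/users.py"),
    ("ui/settings.py", "db/users.py"),
]

index = build_reverse_index(EDGES)
affected = impacted_by(index, "db/users.py")
```

The answer is a complete, deterministic set rather than a ranked list of similar-looking chunks, which is the property similarity search cannot offer.
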
### The Hybrid Format
Individual files use **YAML frontmatter + Markdown body**:
```markdown
---
status: in_progress
dependencies: [AUTH-01, DB-02]
acceptance_criteria:
- User can reset password via email
- Token expires after 30 minutes
---
## Task: Password Reset Flow
[Rich narrative description and context here]
```
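
One way to read such a file is sketched below using only the standard library. The parser is deliberately limited to the shapes used in the example above (scalar values, inline `[a, b]` lists, and `- item` lists); it is not a YAML implementation, and a real project would more likely use a full YAML library. The function names are made up for this sketch.

```python
def split_frontmatter(text):
    """Return (frontmatter, body); frontmatter is '' when the file has none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("frontmatter is not terminated by '---'")


def parse_simple_frontmatter(block):
    """Parse a small YAML subset: 'key: value', 'key: [a, b]', and '- item' lists."""
    data = {}
    list_key = None  # key currently collecting '- item' lines, if any
    for raw in block.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("- "):
            if list_key is None:
                raise ValueError(f"list item without a key: {raw!r}")
            data[list_key].append(line[2:].strip())
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got: {raw!r}")
        key, value = key.strip(), value.strip()
        if value == "":
            data[key] = []  # a block list follows on the next lines
            list_key = key
        elif value.startswith("[") and value.endswith("]"):
            data[key] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
            list_key = None
        else:
            data[key] = value
            list_key = None
    return data


TASK_FILE = """\
---
status: in_progress
dependencies: [AUTH-01, DB-02]
acceptance_criteria:
- User can reset password via email
- Token expires after 30 minutes
---
## Task: Password Reset Flow
[Rich narrative description and context here]
"""

frontmatter, body = split_frontmatter(TASK_FILE)
meta = parse_simple_frontmatter(frontmatter)
```

The split keeps the machine-readable fields (`meta`) separate from the narrative (`body`), so an agent can check status and dependencies without loading the prose.
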
### Size Discipline
| File | Target Size |
|------|------------|
| Project Manifest | <1,000 tokens |
| Individual task files (completed) | <500 tokens |
| Architecture doc | <2,000 tokens |
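
These targets can be enforced mechanically. The sketch below uses a crude estimate of roughly four characters per token, which is only an assumed heuristic; actual counts depend on the tokenizer of the model in use. The `BUDGETS` keys and function names are likewise hypothetical.

```python
import math

# Token budgets from the table above, keyed by hypothetical file kinds.
BUDGETS = {
    "manifest": 1000,
    "task": 500,
    "architecture": 2000,
}


def estimate_tokens(text):
    """Rough estimate assuming about 4 characters per token (an assumption)."""
    return math.ceil(len(text) / 4)


def check_budget(kind, text):
    """Return (within_budget, estimated_tokens, limit) for a file of the given kind."""
    limit = BUDGETS[kind]
    used = estimate_tokens(text)
    return used < limit, used, limit
```

A check like this could run as a pre-commit hook or as a step in the agent loop, flagging files that need to be summarized before they exceed their budget.
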
> The context system isn't just storage — it's a **compression engine**. Its job is to maintain maximum useful understanding in minimum token footprint.
---