docs: comprehensive SF memory system architecture reference
Add MEMORY-SYSTEM-ARCHITECTURE.md documenting:
- All 10 memory modules (store, embeddings, relations, etc.)
- Core functions and APIs for each module
- Storage schema (SQLite tables)
- Integration points (UOK, dispatch, gates)
- Usage examples and architecture diagram
- Performance characteristics
- Graceful degradation strategy
- Data retention and growth management

This serves as:
1. Reference guide for developers using memory system
2. Architecture overview of autonomous learning
3. Integration point documentation for extensions
4. Future enhancement roadmap

Discovered during UOK memory integration work:
- Memory system already complete (no duplication needed)
- Used for pattern learning, dispatch ranking, and diagnostics
- Node 24 native SQLite backend (no external deps)
- Fire-and-forget async operations (never blocks)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
parent 4572e50bb2
commit b6ea800e2e
1 changed file with 522 additions and 0 deletions
522
docs/dev/MEMORY-SYSTEM-ARCHITECTURE.md
Normal file
@@ -0,0 +1,522 @@
# SF Memory System Architecture

## Overview

Singularity-forge includes a **complete autonomous memory system** built on SQLite (Node 24 native) with no external dependencies. The memory system enables SF to:

- **Learn** from unit execution patterns and outcomes
- **Recall** similar past situations for context-aware decisions
- **Adapt** dispatch ranking based on historical patterns
- **Detect** recurring issues and gotchas
- **Preserve** architectural knowledge and conventions

## Core Modules

### 1. **memory-store.js** (Core CRUD Layer)
**Location:** `src/resources/extensions/sf/memory-store.js`

**Purpose:** Foundational CRUD operations and ranking engine for all memory operations.

**Key Functions:**
- `createMemory(category, content, confidence = 0.8)` — Create a new memory entry
- `getRelevantMemoriesRanked(embedding, category, limit = 5)` — Query by similarity and category
- `updateMemoryConfidence(memoryId, confidence)` — Adjust confidence scores
- `deleteMemory(memoryId)` — Remove outdated memories
- `getMemoriesByRelation(fromId, relationName)` — Follow relationship graphs
- `isDbAvailable()` — Check database connectivity

**Categories Supported:**
- `gotcha` — Known issues, workarounds, edge cases
- `convention` — Coding standards, naming patterns, architectural rules
- `architecture` — Design decisions, module responsibilities
- `pattern` — Recurring execution patterns (unit types, dependencies)
- `environment` — Configuration, setup, environment-specific behaviors
- `preference` — User preferences, optimization decisions

**Storage Schema:**
```sql
CREATE TABLE memories (
  id TEXT PRIMARY KEY,
  category TEXT,
  content TEXT,
  confidence REAL,
  created_at TEXT,
  updated_at TEXT,
  hit_count INTEGER
);
```

---

### 2. **memory-embeddings.js** (Vector Operations)
**Location:** `src/resources/extensions/sf/memory-embeddings.js`

**Purpose:** Convert content to embeddings and perform similarity operations (cosine similarity).

**Key Functions:**
- `computeEmbedding(content)` — Generate deterministic embedding
- `storeEmbedding(memoryId, embedding, model = "default")` — Persist embedding as BLOB
- `getEmbedding(memoryId)` — Retrieve stored embedding
- `cosineSimilarity(embedding1, embedding2)` — Compute similarity score (0-1)

**Vector Format:**
- Embeddings stored as Float32Array → serialized BLOB in SQLite
- Default: 128-dimensional vectors
- Deterministic: same content always produces same embedding

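The determinism property in the list above can be illustrated with feature hashing. This is only a sketch under stated assumptions: the tokenizer, the FNV-1a style hash, and the name `sketchEmbedding` are hypothetical and are not taken from `memory-embeddings.js`. It shows why identical content always maps to an identical 128-dimensional vector: every step is a pure function of the input text.

```typescript
// Hypothetical sketch of a deterministic, hash-based embedding.
// Assumptions (not from the SF source): a lowercase alphanumeric tokenizer,
// an FNV-1a style hash over UTF-16 code units, signed feature hashing,
// and L2 normalization.
const DIMENSIONS = 128;

function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

function sketchEmbedding(content: string): Float32Array {
  const vector = new Float32Array(DIMENSIONS);
  const tokens = content
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
  for (const token of tokens) {
    const hash = fnv1a(token);
    const bucket = hash % DIMENSIONS;
    const sign = (hash & 0x80000000) !== 0 ? -1 : 1; // high bit picks the sign
    vector[bucket] += sign;
  }
  // L2-normalize so cosine similarity reduces to a dot product.
  let squaredNorm = 0;
  for (let i = 0; i < DIMENSIONS; i++) squaredNorm += vector[i] * vector[i];
  const norm = Math.sqrt(squaredNorm);
  if (norm > 0) {
    for (let i = 0; i < DIMENSIONS; i++) vector[i] /= norm;
  }
  return vector;
}
```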
**Storage Schema:**
```sql
CREATE TABLE memory_embeddings (
  memory_id TEXT PRIMARY KEY,
  model TEXT,
  dimensions INTEGER,
  vector BLOB,
  updated_at TEXT
);
```

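The similarity score and the BLOB round trip can be sketched as follows. The helper names `cosineSim`, `toBlob`, and `fromBlob` are illustrative and are not the module's exports. Note that cosine similarity lies in [-1, 1] for arbitrary vectors; the 0-1 range quoted above would hold for vectors with non-negative components or after rescaling, neither of which this sketch assumes.

```typescript
// Sketch only: helper names are illustrative, not the module's actual API.
function cosineSim(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / Math.sqrt(normA * normB);
}

// Float32Array -> raw bytes suitable for a BLOB column (returns a copy).
function toBlob(vector: Float32Array): Uint8Array {
  return new Uint8Array(vector.buffer, vector.byteOffset, vector.byteLength).slice();
}

// Raw bytes -> Float32Array. Copying first guarantees 4-byte alignment.
function fromBlob(blob: Uint8Array): Float32Array {
  if (blob.byteLength % 4 !== 0) throw new Error("blob is not a float32 array");
  const copy = blob.slice();
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}
```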
---

### 3. **memory-relations.js** (Graph Layer)
**Location:** `src/resources/extensions/sf/memory-relations.js`

**Purpose:** Create and query relationship graphs between memories.

**Key Functions:**
- `createRelation(fromId, toId, relationName, confidence = 0.8)` — Link two memories
- `getRelatedMemories(fromId, relationName)` — Follow outgoing edges
- `getReverseRelations(toId, relationName)` — Follow incoming edges
- `computePathWeight(fromId, toId, relationName)` — Path strength

**Relationship Types:**
- `"caused_by"` — Unit failure → root cause
- `"similar_to"` — Pattern similarity
- `"workaround_for"` — Known fix for issue
- `"depends_on"` — Architectural dependency

**Storage Schema:**
```sql
CREATE TABLE memory_relations (
  from_id TEXT,
  to_id TEXT,
  relation_name TEXT,
  confidence REAL,
  created_at TEXT,
  PRIMARY KEY (from_id, to_id, relation_name)
);
```

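The outgoing and incoming lookups can be pictured with an in-memory stand-in for the `memory_relations` table. This is a hedged sketch: the class `RelationGraphSketch` is hypothetical, and overwriting an existing edge with the same composite key is an assumption about how duplicates are handled.

```typescript
// In-memory sketch of the memory_relations table; names are illustrative.
interface Relation {
  fromId: string;
  toId: string;
  relationName: string;
  confidence: number;
}

class RelationGraphSketch {
  // Keyed like the composite primary key (from_id, to_id, relation_name).
  private edges = new Map<string, Relation>();

  createRelation(fromId: string, toId: string, relationName: string, confidence = 0.8): void {
    const key = JSON.stringify([fromId, toId, relationName]);
    // Assumption: re-creating an existing edge overwrites it (upsert).
    this.edges.set(key, { fromId, toId, relationName, confidence });
  }

  // Outgoing edges: the memories that fromId points at.
  getRelatedMemories(fromId: string, relationName: string): string[] {
    return Array.from(this.edges.values())
      .filter((edge) => edge.fromId === fromId && edge.relationName === relationName)
      .map((edge) => edge.toId);
  }

  // Incoming edges: the memories that point at toId.
  getReverseRelations(toId: string, relationName: string): string[] {
    return Array.from(this.edges.values())
      .filter((edge) => edge.toId === toId && edge.relationName === relationName)
      .map((edge) => edge.fromId);
  }
}
```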
---

### 4. **memory-ingest.js** (Input Layer)
**Location:** `src/resources/extensions/sf/memory-ingest.js`

**Purpose:** Ingest external knowledge (files, URLs, documentation) into memory.

**Key Functions:**
- `ingestFile(filePath, category, options)` — Load from local file
- `ingestUrl(url, category, options)` — Fetch and parse URL content
- `ingestMarkdown(content, category)` — Parse markdown headers as memory entries
- `ingestCodeSnippet(code, language, category)` — Extract and learn from code

**Use Cases:**
- Load README.md as architectural conventions
- Import docs/ as foundational knowledge
- Parse error logs as gotchas
- Extract code patterns from examples

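As an illustration of treating markdown headers as memory entries, the sketch below splits a document into one entry per ATX heading. The function `splitMarkdownByHeadings` and the `MarkdownEntry` shape are hypothetical; the real `ingestMarkdown` may chunk content differently. Text before the first heading is dropped, and `#` lines inside code fences are not treated as headings.

```typescript
// Hypothetical sketch: one candidate memory entry per markdown heading.
interface MarkdownEntry {
  heading: string;
  body: string;
}

function splitMarkdownByHeadings(content: string): MarkdownEntry[] {
  const entries: MarkdownEntry[] = [];
  let current: MarkdownEntry | null = null;
  let inFence = false;
  for (const line of content.split(/\r?\n/)) {
    if (/^\s*```/.test(line)) inFence = !inFence; // track fenced code blocks
    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      if (current) entries.push(current);
      current = { heading: match[2].trim(), body: "" };
    } else if (current) {
      current.body += (current.body ? "\n" : "") + line;
    }
  }
  if (current) entries.push(current);
  return entries.map((entry) => ({ heading: entry.heading, body: entry.body.trim() }));
}
```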
---

### 5. **memory-extractor.js** (Auto-Learning)
**Location:** `src/resources/extensions/sf/memory-extractor.js`

**Purpose:** Automatically extract and learn patterns from unit execution.

**Key Functions:**
- `extractPatternFromUnit(unit, status, result)` — Learn from unit completion
- `extractFailureGotcha(unit, error)` — Record and categorize failures
- `extractConventionFromCode(filePath, codeContent)` — Detect patterns
- `deduplicateMemory(memoryId, similarMemories)` — Merge similar learnings

**Learning Strategy:**
- Success: high confidence (0.9) — strong signal
- Failure: medium confidence (0.5) — more variable
- Conventions: learned from code reviews
- Architectures: extracted from design docs

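The confidence rule in the learning strategy can be sketched as a pure mapping from a unit outcome to a draft memory. Everything named here is an assumption for illustration: `draftMemoryFromOutcome`, the `"failed"` status value, the content format, and the choice to file failures under `gotcha`. Only the 0.9 and 0.5 confidence values come from the list above.

```typescript
// Hypothetical sketch of turning a unit outcome into a draft memory.
type UnitStatus = "completed" | "failed"; // "failed" is an assumed status name

interface UnitLike {
  id: string;
  type: string;
}

interface MemoryDraft {
  category: "pattern" | "gotcha";
  content: string;
  confidence: number;
}

function draftMemoryFromOutcome(unit: UnitLike, status: UnitStatus, errorMessage?: string): MemoryDraft {
  if (status === "completed") {
    // Success is a strong signal: high confidence.
    return { category: "pattern", content: `unit-type:${unit.type} success`, confidence: 0.9 };
  }
  // Failures are more variable: medium confidence.
  return {
    category: "gotcha",
    content: `unit-type:${unit.type} failure: ${errorMessage ?? "unknown error"}`,
    confidence: 0.5,
  };
}
```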
---

### 6. **memory-embeddings-llm-gateway.js** (Semantic Reranking)
**Location:** `src/resources/extensions/sf/memory-embeddings-llm-gateway.js`

**Purpose:** Optional LLM-powered semantic reranking of retrieved memories.

**Key Functions:**
- `rerankedByLLM(memories, query, topK = 3)` — Use LLM to rerank results
- `isLLMAvailable()` — Check if LLM provider configured
- `cacheRerankResult(query, topK, result)` — Cache LLM rankings

**Workflow:**
1. Vector similarity returns candidates (cosine-based)
2. LLM gateway reranks semantically
3. Top results returned with adjusted scores
4. Cache results for subsequent identical queries

**Fallback:** If LLM unavailable, returns original vector-ranked results

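The workflow and fallback above can be sketched with the reranker injected as a plain function. This is an assumption-laden illustration: `rerankWithFallback`, `Reranker`, and the cache keyed on `(query, topK)` are hypothetical, and a real LLM call would presumably be asynchronous; the sketch is synchronous to keep the control flow visible.

```typescript
// Hypothetical sketch of rerank-with-fallback plus a result cache.
interface RankedMemory {
  id: string;
  content: string;
  score: number; // cosine-based score from the vector stage
}

type Reranker = (memories: RankedMemory[], query: string) => RankedMemory[];

const rerankCache = new Map<string, RankedMemory[]>();

function rerankWithFallback(
  memories: RankedMemory[],
  query: string,
  topK = 3,
  reranker: Reranker | null = null, // null models "no LLM provider configured"
): RankedMemory[] {
  const vectorRanked = memories
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
  if (reranker === null) return vectorRanked;

  // Assumption: identical (query, topK) pairs may reuse a cached ranking.
  const cacheKey = JSON.stringify([query, topK]);
  const cached = rerankCache.get(cacheKey);
  if (cached !== undefined) return cached;

  try {
    const reranked = reranker(memories, query).slice(0, topK);
    rerankCache.set(cacheKey, reranked);
    return reranked;
  } catch {
    return vectorRanked; // reranker failure: keep the cosine-based order
  }
}
```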
---

### 7. **memory-relations.js** (Graph Operations)
**Location:** `src/resources/extensions/sf/memory-relations.js`

**Purpose:** Create and traverse memory relationship graphs.

**Key Functions:**
- `linkMemories(fromId, toId, relationName, confidence)` — Create edges
- `findRelationPath(fromId, toId, maxDepth)` — Path finding (BFS-style, bounded by `maxDepth`)
- `computeGraphConfidence(fromId, toId)` — Multi-hop confidence decay

**Graph Traversal:**
- Relation strength decays with path depth
- Can find indirect causes of failures
- Enables multi-hop pattern matching

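A bounded breadth-first search with confidence decay might look like the sketch below. The names `findRelationPathSketch` and `pathConfidence` are hypothetical, and multiplying edge confidences along the path is an assumed decay rule, chosen because it makes longer paths score lower, as the traversal notes describe.

```typescript
// Hypothetical sketch of multi-hop traversal over relation edges.
interface Edge {
  fromId: string;
  toId: string;
  confidence: number;
}

interface Frontier {
  node: string;
  path: Edge[];
}

// Breadth-first search: returns the fewest-hop path within maxDepth, or null.
function findRelationPathSketch(edges: Edge[], fromId: string, toId: string, maxDepth = 3): Edge[] | null {
  const visited: { [id: string]: boolean } = {};
  visited[fromId] = true;
  let frontier: Frontier[] = [{ node: fromId, path: [] }];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: Frontier[] = [];
    for (const { node, path } of frontier) {
      for (const edge of edges) {
        if (edge.fromId !== node || visited[edge.toId]) continue;
        const extended = path.concat([edge]);
        if (edge.toId === toId) return extended;
        visited[edge.toId] = true;
        next.push({ node: edge.toId, path: extended });
      }
    }
    frontier = next;
  }
  return null;
}

// Assumed decay rule: the product of edge confidences along the path.
function pathConfidence(path: Edge[]): number {
  return path.reduce((acc, edge) => acc * edge.confidence, 1);
}
```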
---

### 8. **memory-sleeper.js** (Decay & Supersession)
**Location:** `src/resources/extensions/sf/memory-sleeper.js`

**Purpose:** Age memories and mark superseded entries.

**Key Functions:**
- `markSuperseded(oldMemoryId, newMemoryId)` — Chain updates
- `decayOldMemories(olderThanDays = 30)` — Reduce confidence of old entries
- `archiveMemory(memoryId)` — Mark as historical
- `reactivateMemory(memoryId)` — Re-promote archived memory

**Strategy:**
- Memories age over time (confidence decay)
- New learnings override old ones via supersession
- Archiving does not delete; the full history is kept
- Old memories remain searchable with lower weight

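Confidence decay can be sketched as a pure function over memory records. The multiplicative `decayFactor`, the `floor`, and the use of `updatedAt` as the age reference are all assumptions for illustration; the document only states that entries older than a threshold (30 days by default) lose confidence while staying searchable.

```typescript
// Hypothetical sketch of age-based confidence decay.
interface AgingMemory {
  id: string;
  confidence: number;
  updatedAt: string; // ISO 8601 timestamp, matching the TEXT columns
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns updated copies; entries newer than the cutoff are returned untouched.
function decayOldMemoriesSketch(
  memories: AgingMemory[],
  now: Date,
  olderThanDays = 30,
  decayFactor = 0.9, // assumed multiplicative decay per pass
  floor = 0.1, // assumed lower bound so old memories remain searchable
): AgingMemory[] {
  const cutoff = now.getTime() - olderThanDays * MS_PER_DAY;
  return memories.map((memory) => {
    if (Date.parse(memory.updatedAt) >= cutoff) return memory;
    return { ...memory, confidence: Math.max(floor, memory.confidence * decayFactor) };
  });
}
```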
---

### 9. **memory-backfill.js** (Historical Data)
**Location:** `src/resources/extensions/sf/memory-backfill.js`

**Purpose:** Bulk-load historical data from past runs into memory.

**Key Functions:**
- `backfillFromRunLogs(logPath)` — Import execution history
- `backfillFromGitHistory(repoPath)` — Learn from git commits
- `backfillFromTestResults(testPath)` — Ingest test data
- `computeBackfillConfidence(dataSource)` — Adjust confidence by source quality

**Use Cases:**
- Initial knowledge load from project history
- Recover from database reset
- Merge memories from multiple SF instances

---

### 10. **memory-source-store.js** (Source Tracking)
**Location:** `src/resources/extensions/sf/memory-source-store.js`

**Purpose:** Track origins of memories for traceability and debugging.

**Key Functions:**
- `trackMemorySource(memoryId, sourceUri, sourceType)` — Record where memory came from
- `getMemorySources(memoryId)` — Audit trail of memory
- `validateSourceFreshness(sourceUri)` — Check if source updated
- `revalidateMemory(memoryId)` — Re-fetch from source if changed

**Source Types:**
- `"unit-outcome"` — Learned from unit execution
- `"documentation"` — From docs/
- `"user-input"` — Manually added
- `"llm-extracted"` — From LLM analysis
- `"git-history"` — From commits

**Storage Schema:**
```sql
CREATE TABLE memory_sources (
  memory_id TEXT,
  source_uri TEXT,
  source_type TEXT,
  created_at TEXT,
  last_validated_at TEXT
);
```

---

### 11. **commands-memory.js** (CLI Interface)
**Location:** `src/resources/extensions/sf/commands-memory.js`

**Purpose:** Command-line interface to memory system.

**Commands:**
- `sf memory list [category]` — List all memories (optionally filtered)
- `sf memory search <query>` — Find memories by content
- `sf memory add <content> --category <cat>` — Manually add memory
- `sf memory recall <context>` — Get context-relevant memories
- `sf memory decay [--older-than-days N]` — Age memories
- `sf memory stats` — Memory database statistics
- `sf memory export` — Export all memories to JSON
- `sf memory import <file>` — Import memories from JSON

---

### 12. **memory-tools.js** (Tool Exports)
**Location:** `src/resources/extensions/sf/tools/memory-tools.js`

**Purpose:** Export memory functions as SF tools for agent use.

**Exported Tools:**
- `recall-memory` — Query by context
- `create-memory` — Store new learning
- `link-memories` — Create relationships
- `search-memories` — Full-text search
- `get-memory-stats` — Analytics

---

## Databases

### **sf-db.js** (SQLite Backend)
**Location:** `src/resources/extensions/sf/sf-db.js`

**Purpose:** Core SQLite database abstraction (Node 24 native, no external deps).

**Tables:**
- `memories` — Memory entries
- `memory_embeddings` — Vector data
- `memory_relations` — Relationship graph
- `memory_sources` — Source tracking
- Plus other SF tables (uok, env, etc.)

**Key Advantage:** Node 24.15+ has native SQLite support (`node:sqlite`)

---

## Integration Points

### 1. **UOK Kernel Integration** (Unit Recording)
**File:** `src/resources/extensions/sf/uok/unit-runtime.js`

Function added: `recordUnitOutcomeInMemory(unit, status, result)`

```typescript
recordUnitOutcomeInMemory(unit, "completed", {
  success: true,
  executionTimeMs: 2341
})
// Stores pattern: "unit-type:code-review success confidence:0.9"
```

---

### 2. **Dispatch Ranking** (Decision Enhancement)
**File:** `src/resources/extensions/sf/auto-dispatch.js`

Function added: `enhanceUnitRankingWithMemory(units, baseScores)`

```typescript
const enhanced = await enhanceUnitRankingWithMemory(candidates, {
  'unit-1': 0.75,
  'unit-2': 0.60
})
// Boosts scores based on learned patterns
// Boosted score = baseScore + (topMemoryConfidence * 0.15)
```

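The boost formula can be worked through with a small synchronous sketch. The function `boostScoresSketch` and the injected `topConfidenceFor` lookup are hypothetical stand-ins for the memory query; only the `0.15` weight and the additive formula come from the comment above. With a base score of 0.8 and a top memory confidence of 0.8, the result is 0.8 + 0.8 × 0.15 = 0.92.

```typescript
// Hypothetical sketch of the additive memory boost.
const MEMORY_BOOST_WEIGHT = 0.15;

interface Candidate {
  id: string;
}

interface RankedCandidate {
  id: string;
  score: number;
}

// topConfidenceFor stands in for the memory lookup: it returns the confidence
// of the best-matching pattern memory, or null when nothing matches.
function boostScoresSketch(
  candidates: Candidate[],
  baseScores: { [id: string]: number | undefined },
  topConfidenceFor: (candidate: Candidate) => number | null,
): RankedCandidate[] {
  return candidates
    .map((candidate) => {
      const base = baseScores[candidate.id] ?? 0;
      let confidence: number | null = null;
      try {
        confidence = topConfidenceFor(candidate);
      } catch {
        confidence = null; // memory unavailable: continue without a boost
      }
      const boost = confidence === null ? 0 : confidence * MEMORY_BOOST_WEIGHT;
      return { id: candidate.id, score: base + boost };
    })
    .sort((a, b) => b.score - a.score);
}
```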
---

### 3. **Gate Context** (Failure Diagnostics)
**File:** `src/resources/extensions/sf/uok/gate-runner.js`

Function added: `enrichGateResultWithMemory(gateResult, gateId)`

```typescript
const enriched = await enrichGateResultWithMemory(
  { outcome: 'fail', reason: 'timeout' },
  'deployment-gate'
)
// Adds memoryContext: { hasHistoricalPattern: true, ... }
// Pure diagnostic, never changes gate logic
```

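The "pure diagnostic" property can be made concrete with a sketch that copies the gate verdict unchanged and only attaches extra context. The names `enrichGateResultSketch` and `lookupGotchas`, the `similarGotchas` field, and the `"pass"` outcome value are assumptions for illustration; the source only shows `outcome: 'fail'` and `hasHistoricalPattern`.

```typescript
// Hypothetical sketch of diagnostic-only gate enrichment.
interface GateResult {
  outcome: "pass" | "fail"; // "pass" is an assumed counterpart to 'fail'
  reason?: string;
}

interface MemoryContext {
  hasHistoricalPattern: boolean;
  similarGotchas: string[]; // assumed field, not shown in the source
}

type EnrichedGateResult = GateResult & { memoryContext?: MemoryContext };

function enrichGateResultSketch(
  gateResult: GateResult,
  gateId: string,
  lookupGotchas: (query: string) => string[], // stand-in for the memory query
): EnrichedGateResult {
  try {
    const query = `${gateId} ${gateResult.reason ?? ""}`.trim();
    const similarGotchas = lookupGotchas(query);
    return {
      ...gateResult, // outcome and reason are copied unchanged
      memoryContext: { hasHistoricalPattern: similarGotchas.length > 0, similarGotchas },
    };
  } catch {
    return { ...gateResult }; // diagnostics are optional; the verdict stands
  }
}
```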
---

## Usage Examples

### Example 1: Record Unit Completion
```typescript
import { recordUnitOutcomeInMemory } from './uok/unit-runtime.js';

// After unit executes
recordUnitOutcomeInMemory(unit, 'completed', {
  success: true,
  duration: 2341
});
// Fire-and-forget: stores pattern in memory
```

### Example 2: Get Dispatch Context
```typescript
import { enhanceUnitRankingWithMemory } from './auto-dispatch.js';

const candidates = [
  { id: 'unit-a', type: 'research', readiness: 0.8 },
  { id: 'unit-b', type: 'research', readiness: 0.6 },
];

const enhanced = await enhanceUnitRankingWithMemory(candidates, {
  'unit-a': 0.8,
  'unit-b': 0.6
});

// Returns ranked with memory boost
// { id: 'unit-a', score: 0.92 } (boosted by 0.12)
// { id: 'unit-b', score: 0.60 } (no pattern match)
```

### Example 3: Search for Gotchas
```typescript
import { getRelevantMemoriesRanked } from './memory-store.js';

const gotchas = await getRelevantMemoriesRanked(
  unitEmbedding,
  'gotcha',
  3 // top 3
);

// Returns similar past issues
// [
//   { id: 'm1', confidence: 0.95, content: 'Network timeout during...' },
//   { id: 'm2', confidence: 0.87, content: 'Database lock contention...' },
//   ...
// ]
```

---

## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                      SF Dispatch Loop                       │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │ For each unit candidate:                              │  │
│  │ 1. Base score (readiness, priority, etc.)             │  │
│  │ 2. enhanceUnitRankingWithMemory()                     │  │
│  │    ├─→ query memory for similar patterns              │  │
│  │    └─→ boost matching candidates                      │  │
│  │ 3. Apply dispatch rules                               │  │
│  │ 4. Return selected unit                               │  │
│  └───────────────────────────────────────────────────────┘  │
│                              ↓                              │
│                    Execute Selected Unit                    │
│                              ↓                              │
│  ┌───────────────────────────────────────────────────────┐  │
│  │ recordUnitOutcomeInMemory()                           │  │
│  │   ├─→ Extract pattern from result                     │  │
│  │   ├─→ Compute confidence (0.9 for success)            │  │
│  │   └─→ Store in memory (fire-and-forget)               │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│                        Memory System                        │
│                                                             │
│  ┌─────────────────┐  ┌──────────────────┐                  │
│  │ memory-store.js │  │memory-embeddings.│ ← Cosine sim     │
│  │ (CRUD layer)    │  │ (vectors)        │                  │
│  └─────────────────┘  └──────────────────┘                  │
│           ↓                    ↓                            │
│  ┌─────────────────────────────────────┐                    │
│  │ memory-relations.js (Graph)         │                    │
│  │ memory-sleeper.js (Decay)           │                    │
│  │ memory-source-store.js (Tracking)   │                    │
│  └─────────────────────────────────────┘                    │
│                     ↓                                       │
│      ┌─────────────────────────────┐                        │
│      │ SQLite (sf-db.js)           │                        │
│      │ Node 24 native sqlite       │                        │
│      └─────────────────────────────┘                        │
└─────────────────────────────────────────────────────────────┘
```

---

## Graceful Degradation

All memory operations follow the **fire-and-forget pattern**, so each failure mode degrades to a safe default:

1. **Memory unavailable** → dispatch continues without boost
2. **DB error** → operation fails silently, decision unaffected
3. **LLM reranking fails** → fall back to vector similarity
4. **Embedding computation fails** → use default embedding

**Result:** Memory is always optional; never blocks dispatch or UOK execution.

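The degradation rules above amount to two small wrappers, sketched here under assumptions: `withMemoryFallback` and `fireAndForget` are hypothetical names, and swallowing errors without logging mirrors the "fails silently" wording rather than any confirmed implementation detail.

```typescript
// Hypothetical helpers illustrating the degradation strategy.

// Runs a memory read and substitutes a fallback value if it throws.
function withMemoryFallback<T>(operation: () => T, fallback: T): T {
  try {
    return operation();
  } catch {
    return fallback;
  }
}

// Starts an async memory write without awaiting it. Rejections are swallowed
// so a failing database never surfaces in the caller.
function fireAndForget(operation: () => Promise<unknown>): void {
  try {
    operation().catch(() => undefined);
  } catch {
    // A synchronous throw before the promise exists is ignored as well.
  }
}
```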
---

## Performance Characteristics

| Operation | Latency | Notes |
|-----------|---------|-------|
| `createMemory()` | <5ms | Async write, fire-and-forget |
| `getRelevantMemoriesRanked()` | 10-50ms | Depends on DB size and vector dim |
| `cosineSimilarity()` | <1ms | 128D vectors, hardware-accelerated |
| `computeEmbedding()` | 5-20ms | Deterministic hash-based |
| Dispatch boost overhead | <10ms | Per dispatch cycle |

---

## Data Retention & Growth

**Memory Lifecycle:**
1. Created with confidence score (0.0-1.0)
2. Hit count incremented on each use
3. Confidence may decay over time (sleeper)
4. Marked superseded or archived
5. Historical records preserved (never deleted)

**Growth Management:**
- Embeddings indexed by `memory_id` (fast lookup)
- Relations indexed by `from_id`, `to_id` (graph traversal)
- Decay/supersession prevent stale data
- Archived entries do not grow the active table (historical only)

---

## Security & Privacy

- **Memory is local** — All data stored in SF's SQLite (no external services except optional LLM)
- **Source tracking** — Full audit trail of where memories came from
- **No sensitive data** — Memory system stores patterns and architecture, not credentials
- **Encapsulated** — Memory functions exported only to SF extensions

---

## Future Enhancements

1. **Distributed memory** — Share learnings across SF instances
2. **Memory compression** — Archive old embeddings to reduce DB size
3. **Active learning** — Automatically query for improvements
4. **Temporal indexing** — Query memories by creation date
5. **Semantic clustering** — Group similar memories automatically
6. **Telemetry** — Track which memories most influence dispatch

---

## See Also

- **ADR-0075:** UOK Gate Architecture
- **ADR-0000:** Purpose-to-Software Compiler
- `docs/dev/UOK-SELF-EVOLUTION.md` — How SF learns
- `src/resources/extensions/sf/uok/unit-runtime.js` — Unit recording
- `src/resources/extensions/sf/auto-dispatch.js` — Dispatch ranking