# SF Memory System Architecture

## Overview

Singularity-forge includes a **complete autonomous memory system** built on SQLite (Node 26 native `node:sqlite`) with no external dependencies. The memory system enables SF to:

- **Learn** from unit execution patterns and outcomes
- **Recall** similar past situations for context-aware decisions
- **Adapt** dispatch ranking based on historical patterns
- **Detect** recurring issues and gotchas
- **Preserve** architectural knowledge and conventions

## Core Modules

### 1. **memory-store.js** (Core CRUD Layer)
**Location:** `src/resources/extensions/sf/memory-store.js`

**Purpose:** Foundational CRUD operations and ranking engine for all memory operations.

**Key Functions:**
- `createMemory(category, content, confidence = 0.8)` — Create a new memory entry
- `getRelevantMemoriesRanked(embedding, category, limit = 5)` — Query by similarity and category
- `updateMemoryConfidence(memoryId, confidence)` — Adjust confidence scores
- `deleteMemory(memoryId)` — Remove outdated memories
- `getMemoriesByRelation(fromId, relationName)` — Follow relationship graphs
- `isDbAvailable()` — Check database connectivity

**Categories Supported:**
- `gotcha` — Known issues, workarounds, edge cases
- `convention` — Coding standards, naming patterns, architectural rules
- `architecture` — Design decisions, module responsibilities
- `pattern` — Recurring execution patterns (unit types, dependencies)
- `environment` — Configuration, setup, environment-specific behaviors
- `preference` — User preferences, optimization decisions

**Storage Schema:**
```sql
CREATE TABLE memories (
  id TEXT PRIMARY KEY,
  category TEXT,
  content TEXT,
  confidence REAL,
  created_at TEXT,
  updated_at TEXT,
  hit_count INTEGER
);
```

---

### 2. **memory-embeddings.js** (Vector Operations)
**Location:** `src/resources/extensions/sf/memory-embeddings.js`

**Purpose:** Convert content to embeddings and perform similarity operations (cosine similarity).

**Key Functions:**
- `computeEmbedding(content)` — Generate deterministic embedding
- `storeEmbedding(memoryId, embedding, model = "default")` — Persist embedding as BLOB
- `getEmbedding(memoryId)` — Retrieve stored embedding
- `cosineSimilarity(embedding1, embedding2)` — Compute similarity score (0-1)

**Vector Format:**
- Embeddings are held as a `Float32Array` and serialized to a BLOB in SQLite
- Default: 128-dimensional vectors
- Deterministic: the same content always produces the same embedding

**Storage Schema:**
```sql
CREATE TABLE memory_embeddings (
  memory_id TEXT PRIMARY KEY,
  model TEXT,
  dimensions INTEGER,
  vector BLOB,
  updated_at TEXT
);
```

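
The vector format above can be sketched in a few small functions. This is only an illustration under stated assumptions: `hashEmbeddingSketch`, `cosineSimilaritySketch`, `vectorToBlob`, and `blobToVector` are hypothetical names, and the token-hashing scheme is a guess at what "deterministic" could mean here, not the actual `computeEmbedding()` implementation. Note that cosine similarity stays within 0-1 only when vector components are non-negative, as they are in this sketch.

```typescript
// Hypothetical helpers for illustration; the real memory-embeddings.js may differ.

/** Deterministic bag-of-words embedding: hash each token into one of `dimensions` buckets. */
function hashEmbeddingSketch(content: string, dimensions = 128): Float32Array {
  const vector = new Float32Array(dimensions);
  const tokens = content.toLowerCase().split(/\W+/).filter((t) => t.length > 0);
  for (const token of tokens) {
    let hash = 0;
    for (let i = 0; i < token.length; i++) {
      // Polynomial rolling hash, kept in the unsigned 32-bit range.
      hash = (hash * 31 + token.charCodeAt(i)) >>> 0;
    }
    vector[hash % dimensions] += 1;
  }
  return vector;
}

/** Cosine similarity of two equal-length vectors; returns 0 if either has zero norm. */
function cosineSimilaritySketch(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

/** Copy the vector's raw bytes into a Uint8Array suitable for a BLOB column. */
function vectorToBlob(vector: Float32Array): Uint8Array {
  return new Uint8Array(vector.buffer, vector.byteOffset, vector.byteLength).slice();
}

/** Rebuild a Float32Array from BLOB bytes (copied first, so alignment is never an issue). */
function blobToVector(blob: Uint8Array): Float32Array {
  if (blob.byteLength % 4 !== 0) {
    throw new Error("BLOB length is not a multiple of 4 bytes");
  }
  const copy = blob.slice(); // fresh buffer starting at offset 0
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}
```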
---

### 3. **memory-relations.js** (Graph Layer)
**Location:** `src/resources/extensions/sf/memory-relations.js`

**Purpose:** Create and query relationship graphs between memories.

**Key Functions:**
- `createRelation(fromId, toId, relationName, confidence = 0.8)` — Link two memories
- `getRelatedMemories(fromId, relationName)` — Follow outgoing edges
- `getReverseRelations(toId, relationName)` — Follow incoming edges
- `computePathWeight(fromId, toId, relationName)` — Path strength

**Relationship Types:**
- `"caused_by"` — Unit failure → root cause
- `"similar_to"` — Pattern similarity
- `"workaround_for"` — Known fix for issue
- `"depends_on"` — Architectural dependency

**Storage Schema:**
```sql
CREATE TABLE memory_relations (
  from_id TEXT,
  to_id TEXT,
  relation_name TEXT,
  confidence REAL,
  created_at TEXT,
  PRIMARY KEY (from_id, to_id, relation_name)
);
```

---

### 4. **memory-ingest.js** (Input Layer)
**Location:** `src/resources/extensions/sf/memory-ingest.js`

**Purpose:** Ingest external knowledge (files, URLs, documentation) into memory.

**Key Functions:**
- `ingestFile(filePath, category, options)` — Load from local file
- `ingestUrl(url, category, options)` — Fetch and parse URL content
- `ingestMarkdown(content, category)` — Parse markdown headers as memory entries
- `ingestCodeSnippet(code, language, category)` — Extract and learn from code

**Use Cases:**
- Load README.md as architectural conventions
- Import docs/ as foundational knowledge
- Parse error logs as gotchas
- Extract code patterns from examples

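
To make "parse markdown headers as memory entries" concrete, here is a minimal sketch of how a document could be split into one draft entry per heading. The `MemoryDraft` shape and `splitMarkdownIntoDrafts` are hypothetical and are not taken from `memory-ingest.js`; the sketch assumes ATX (`#`) headings only and skips sections with no body text.

```typescript
// Hypothetical sketch; the real ingestMarkdown() may split content differently.
interface MemoryDraft {
  category: string;
  title: string;
  content: string;
}

function splitMarkdownIntoDrafts(markdown: string, category: string): MemoryDraft[] {
  const drafts: MemoryDraft[] = [];
  let current: MemoryDraft | null = null;
  let inFence = false;

  for (const line of markdown.split(/\r?\n/)) {
    // Track fenced code so that '# comment' lines inside code are not read as headings.
    if (/^\s*```/.test(line)) inFence = !inFence;

    const heading = inFence ? null : /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (heading) {
      if (current) drafts.push(current);
      current = { category, title: heading[2], content: "" };
    } else if (current) {
      current.content += line + "\n";
    }
  }
  if (current) drafts.push(current);

  // Trim bodies and drop headings that had no content under them.
  return drafts
    .map((draft) => ({ ...draft, content: draft.content.trim() }))
    .filter((draft) => draft.content.length > 0);
}
```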
---

### 5. **memory-extractor.js** (Auto-Learning)
**Location:** `src/resources/extensions/sf/memory-extractor.js`

**Purpose:** Automatically extract and learn patterns from unit execution.

**Key Functions:**
- `extractPatternFromUnit(unit, status, result)` — Learn from unit completion
- `extractFailureGotcha(unit, error)` — Record and categorize failures
- `extractConventionFromCode(filePath, codeContent)` — Detect patterns
- `deduplicateMemory(memoryId, similarMemories)` — Merge similar learnings

**Learning Strategy:**
- Success: high confidence (0.9) — strong signal
- Failure: medium confidence (0.5) — more variable
- Conventions: learned from code reviews
- Architectures: extracted from design docs

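
The success and failure confidences above could be applied roughly as follows. This is a sketch, not the extractor's actual code: `extractMemorySketch`, the `UnitLike` shape, the `"failed"` status string, and the exact content format are all assumptions made for illustration. Only the 0.9 and 0.5 values and the `pattern` and `gotcha` categories come from this document.

```typescript
// Hypothetical sketch of outcome-to-memory mapping.
type UnitStatus = "completed" | "failed"; // "failed" is an assumed status name

interface UnitLike {
  id: string;
  type: string;
}

interface ExtractedMemory {
  category: "pattern" | "gotcha";
  content: string;
  confidence: number;
}

const SUCCESS_CONFIDENCE = 0.9; // strong signal
const FAILURE_CONFIDENCE = 0.5; // failures are more variable

function extractMemorySketch(unit: UnitLike, status: UnitStatus, detail = ""): ExtractedMemory {
  if (status === "completed") {
    return {
      category: "pattern",
      content: `unit-type:${unit.type} success`,
      confidence: SUCCESS_CONFIDENCE,
    };
  }
  // Failures are recorded as gotchas, with the error detail appended when present.
  const suffix = detail ? `: ${detail}` : "";
  return {
    category: "gotcha",
    content: `unit-type:${unit.type} failure${suffix}`,
    confidence: FAILURE_CONFIDENCE,
  };
}
```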
---

### 6. **memory-embeddings-llm-gateway.js** (Semantic Reranking)
**Location:** `src/resources/extensions/sf/memory-embeddings-llm-gateway.js`

**Purpose:** Optional LLM-powered semantic reranking of retrieved memories.

**Key Functions:**
- `rerankedByLLM(memories, query, topK = 3)` — Use LLM to rerank results
- `isLLMAvailable()` — Check if LLM provider configured
- `cacheRerankResult(query, topK, result)` — Cache LLM rankings

**Workflow:**
1. Vector similarity returns candidates (cosine-based)
2. LLM gateway reranks semantically
3. Top results returned with adjusted scores
4. Cache results for subsequent identical queries

**Fallback:** If the LLM is unavailable, the original vector-ranked results are returned.

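
The workflow and fallback can be illustrated by the selection step that runs once the LLM call has settled. In this sketch `selectTopK` and `rerankCacheKey` are hypothetical helpers, and `llmOrder` stands in for whatever ordering the gateway returns; passing `null` models an unavailable or failed LLM, in which case the vector ranking is used unchanged. The cache key format is likewise an assumption.

```typescript
// Hypothetical sketch of the rerank-or-fall-back decision.
interface Candidate {
  id: string;
  content: string;
  similarity: number; // cosine score from the vector stage
}

/** Stable cache key for a (query, topK) pair. */
function rerankCacheKey(query: string, topK: number): string {
  return `${topK}:${query.trim().toLowerCase()}`;
}

/**
 * Pick the final top-K results.
 * `llmOrder` is the id ordering proposed by the reranker, or null when the
 * LLM is unavailable or the call failed.
 */
function selectTopK(candidates: Candidate[], llmOrder: string[] | null, topK = 3): Candidate[] {
  const byVector = [...candidates].sort((a, b) => b.similarity - a.similarity);
  if (llmOrder === null) return byVector.slice(0, topK); // fallback path

  const byId = new Map<string, Candidate>();
  for (const candidate of byVector) byId.set(candidate.id, candidate);

  const picked: Candidate[] = [];
  for (const id of llmOrder) {
    const candidate = byId.get(id); // ids the LLM invented are ignored
    if (candidate && picked.indexOf(candidate) === -1) picked.push(candidate);
  }
  // Candidates the LLM omitted keep their vector order behind the reranked ones.
  for (const candidate of byVector) {
    if (picked.indexOf(candidate) === -1) picked.push(candidate);
  }
  return picked.slice(0, topK);
}
```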
---

### 7. **memory-relations.js** (Graph Operations)
**Location:** `src/resources/extensions/sf/memory-relations.js`

**Purpose:** Traverse memory relationship graphs: path finding and multi-hop confidence. These operations live in the same module as the Graph Layer described in section 3.

**Key Functions:**
- `linkMemories(fromId, toId, relationName, confidence)` — Create edges
- `findRelationPath(fromId, toId, maxDepth)` — Path finding (similar to BFS)
- `computeGraphConfidence(fromId, toId)` — Multi-hop confidence decay

**Graph Traversal:**
- Relation strength decays with path depth
- Can find indirect causes of failures
- Enables multi-hop pattern matching

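
A breadth-first search over the relation edges shows how multi-hop traversal and depth decay could fit together. This is a sketch under assumptions: `findRelationPathSketch` and the `RelationEdge` and `RelationPath` shapes are hypothetical, and multiplying edge confidences along the path is one plausible decay rule, not necessarily the one `computeGraphConfidence()` uses. BFS returns a path with the fewest hops, which is not always the path with the highest confidence.

```typescript
// Hypothetical sketch; field names mirror the memory_relations columns.
interface RelationEdge {
  fromId: string;
  toId: string;
  relationName: string;
  confidence: number;
}

interface RelationPath {
  nodes: string[];    // memory ids from start to target
  confidence: number; // product of edge confidences along the path
}

function findRelationPathSketch(
  edges: RelationEdge[],
  fromId: string,
  toId: string,
  maxDepth = 3,
): RelationPath | null {
  // Index outgoing edges by source id.
  const outgoing = new Map<string, RelationEdge[]>();
  for (const edge of edges) {
    const list = outgoing.get(edge.fromId) ?? [];
    list.push(edge);
    outgoing.set(edge.fromId, list);
  }

  const visited = new Set<string>([fromId]);
  const queue: RelationPath[] = [{ nodes: [fromId], confidence: 1 }];

  while (queue.length > 0) {
    const path = queue.shift() as RelationPath;
    const last = path.nodes[path.nodes.length - 1];
    if (last === toId) return path;
    if (path.nodes.length - 1 >= maxDepth) continue; // hop budget used up

    for (const edge of outgoing.get(last) ?? []) {
      if (visited.has(edge.toId)) continue; // guards against cycles
      visited.add(edge.toId);
      queue.push({
        nodes: [...path.nodes, edge.toId],
        // Each confidence is at most 1, so strength shrinks with every hop.
        confidence: path.confidence * edge.confidence,
      });
    }
  }
  return null;
}
```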
---

### 8. **memory-sleeper.js** (Decay & Supersession)
**Location:** `src/resources/extensions/sf/memory-sleeper.js`

**Purpose:** Age memories and mark superseded entries.

**Key Functions:**
- `markSuperseded(oldMemoryId, newMemoryId)` — Chain updates
- `decayOldMemories(olderThanDays = 30)` — Reduce confidence of old entries
- `archiveMemory(memoryId)` — Mark as historical
- `reactivateMemory(memoryId)` — Re-promote archived memory

**Strategy:**
- Memories age over time (confidence decay)
- New learnings override old ones via supersession
- Archiving does not delete; full history is kept
- Old memories remain searchable with lower weight

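
This document does not specify the decay curve, so the following is only one way confidence decay could work. The sketch assumes exponential decay with a 90-day half-life applied to the time past the `olderThanDays` threshold, plus a small floor so old memories stay searchable; `decayedConfidence`, `ageInDays`, the half-life, and the floor are all hypothetical.

```typescript
// Hypothetical decay rule; only the 30-day default threshold comes from the text above.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Age of a memory in days, from its ISO-8601 created_at value. */
function ageInDays(createdAtIso: string, nowMs: number): number {
  return (nowMs - Date.parse(createdAtIso)) / MS_PER_DAY;
}

function decayedConfidence(
  confidence: number,
  ageDays: number,
  olderThanDays = 30,
  halfLifeDays = 90, // assumed
  floor = 0.05,      // assumed
): number {
  if (ageDays <= olderThanDays) return confidence; // still fresh

  const overdueDays = ageDays - olderThanDays;
  const decayed = confidence * Math.pow(0.5, overdueDays / halfLifeDays);

  // Never decay below the floor, but never raise a confidence that started under it.
  return Math.max(Math.min(floor, confidence), decayed);
}
```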
---

### 9. **memory-backfill.js** (Historical Data)
**Location:** `src/resources/extensions/sf/memory-backfill.js`

**Purpose:** Bulk-load historical data from past runs into memory.

**Key Functions:**
- `backfillFromRunLogs(logPath)` — Import execution history
- `backfillFromGitHistory(repoPath)` — Learn from git commits
- `backfillFromTestResults(testPath)` — Ingest test data
- `computeBackfillConfidence(dataSource)` — Adjust confidence by source quality

**Use Cases:**
- Initial knowledge load from project history
- Recover from database reset
- Merge memories from multiple SF instances

---

### 10. **memory-source-store.js** (Source Tracking)
**Location:** `src/resources/extensions/sf/memory-source-store.js`

**Purpose:** Track origins of memories for traceability and debugging.

**Key Functions:**
- `trackMemorySource(memoryId, sourceUri, sourceType)` — Record where a memory came from
- `getMemorySources(memoryId)` — Audit trail of a memory
- `validateSourceFreshness(sourceUri)` — Check whether the source has been updated
- `revalidateMemory(memoryId)` — Re-fetch from source if changed

**Source Types:**
- `"unit-outcome"` — Learned from unit execution
- `"documentation"` — From docs/
- `"user-input"` — Manually added
- `"llm-extracted"` — From LLM analysis
- `"git-history"` — From commits

**Storage Schema:**
```sql
CREATE TABLE memory_sources (
  memory_id TEXT,
  source_uri TEXT,
  source_type TEXT,
  created_at TEXT,
  last_validated_at TEXT
);
```

---

### 11. **commands-memory.js** (CLI Interface)
**Location:** `src/resources/extensions/sf/commands-memory.js`

**Purpose:** Command-line interface to memory system.

**Commands:**
- `/sf memory list [category]` — List all memories (optionally filtered)
- `/sf memory search <query>` — Find memories by content
- `/sf memory note <content>` — Manually add memory
- `/sf memory status` — Memory database statistics
- `/sf memory decay [--older-than-days N]` — Age memories
- `/sf memory export <path.json>` — Export all memories to JSON
- `/sf memory import <path.json>` — Import memories from JSON

---

### 12. **memory-tools.js** (Tool Exports)
**Location:** `src/resources/extensions/sf/tools/memory-tools.js`

**Purpose:** Export memory functions as SF tools for agent use.

**Exported Tools:**
- `recall-memory` — Query by context
- `create-memory` — Store new learning
- `link-memories` — Create relationships
- `search-memories` — Full-text search
- `get-memory-stats` — Analytics

---

## Databases

### **sf-db.js** (SQLite Backend)
**Location:** `src/resources/extensions/sf/sf-db.js`

**Purpose:** Core SQLite database abstraction (Node 26 native, no external deps).

**Tables:**
- `memories` — Memory entries
- `memory_embeddings` — Vector data
- `memory_relations` — Relationship graph
- `memory_sources` — Source tracking
- Plus other SF tables (uok, env, etc.)

**Key Advantage:** Node 26.1+ ships native SQLite support (`node:sqlite`, part of Node since v22.5.0), so no third-party driver is needed.

---

## Integration Points

### 1. **UOK Kernel Integration** (Unit Recording)
**File:** `src/resources/extensions/sf/uok/unit-runtime.js`

Function added: `recordUnitOutcomeInMemory(unit, status, result)`

```typescript
recordUnitOutcomeInMemory(unit, "completed", {
  success: true,
  executionTimeMs: 2341
})
// Stores pattern: "unit-type:code-review success confidence:0.9"
```

---

### 2. **Dispatch Ranking** (Decision Enhancement)
**File:** `src/resources/extensions/sf/auto-dispatch.js`

Function added: `enhanceUnitRankingWithMemory(units, baseScores)`

```typescript
const enhanced = await enhanceUnitRankingWithMemory(candidates, {
  'unit-1': 0.75,
  'unit-2': 0.60
})
// Boosts scores based on learned patterns
// Boosted score = baseScore + (topMemoryConfidence * 0.15)
```

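
The boost formula quoted in the snippet above can be worked through with a small pure function. This is a sketch, not the real `enhanceUnitRankingWithMemory()`: `applyMemoryBoost` and its second parameter (a map from unit id to the confidence of its best-matching memory) are hypothetical, and the sketch does not clamp scores because the formula as stated does not. With a base score of 0.8 and a top memory confidence of 0.8, the boost is 0.8 × 0.15 = 0.12, giving about 0.92.

```typescript
// Hypothetical sketch of the scoring step only; memory lookup is assumed to have happened already.
const MEMORY_BOOST_WEIGHT = 0.15; // weight from the documented formula

interface RankedUnit {
  id: string;
  score: number;
}

function applyMemoryBoost(
  baseScores: Record<string, number>,
  topMemoryConfidence: Record<string, number>,
): RankedUnit[] {
  return Object.keys(baseScores)
    .map((id) => {
      // Units with no matching memory get no boost.
      const confidence = topMemoryConfidence[id] ?? 0;
      return { id, score: baseScores[id] + confidence * MEMORY_BOOST_WEIGHT };
    })
    .sort((a, b) => b.score - a.score); // highest score first
}
```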
---

### 3. **Gate Context** (Failure Diagnostics)
**File:** `src/resources/extensions/sf/uok/gate-runner.js`

Function added: `enrichGateResultWithMemory(gateResult, gateId)`

```typescript
const enriched = await enrichGateResultWithMemory(
  { outcome: 'fail', reason: 'timeout' },
  'deployment-gate'
)
// Adds memoryContext: { hasHistoricalPattern: true, ... }
// Pure diagnostic, never changes gate logic
```

---

## Usage Examples

### Example 1: Record Unit Completion
```typescript
import { recordUnitOutcomeInMemory } from './uok/unit-runtime.js';

// After unit executes
recordUnitOutcomeInMemory(unit, 'completed', {
  success: true,
  executionTimeMs: 2341
});
// Fire-and-forget: stores pattern in memory
```

### Example 2: Get Dispatch Context
```typescript
import { enhanceUnitRankingWithMemory } from './auto-dispatch.js';

const candidates = [
  { id: 'unit-a', type: 'research', readiness: 0.8 },
  { id: 'unit-b', type: 'research', readiness: 0.6 },
];

const enhanced = await enhanceUnitRankingWithMemory(candidates, {
  'unit-a': 0.8,
  'unit-b': 0.6
});

// Returns candidates ranked with memory boost
// { id: 'unit-a', score: 0.92 } (boosted by 0.12 = memory confidence 0.8 * 0.15)
// { id: 'unit-b', score: 0.60 } (no pattern match)
```

### Example 3: Search for Gotchas
```typescript
import { getRelevantMemoriesRanked } from './memory-store.js';

const gotchas = await getRelevantMemoriesRanked(
  unitEmbedding,
  'gotcha',
  3 // top 3
);

// Returns similar past issues
// [
//   { id: 'm1', confidence: 0.95, content: 'Network timeout during...' },
//   { id: 'm2', confidence: 0.87, content: 'Database lock contention...' },
//   ...
// ]
```

---

## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                      SF Dispatch Loop                       │
│                                                             │
│  ┌────────────────────────────────────────────────────┐     │
│  │ For each unit candidate:                           │     │
│  │   1. Base score (readiness, priority, etc.)        │     │
│  │   2. enhanceUnitRankingWithMemory()                │     │
│  │      ├─→ query memory for similar patterns         │     │
│  │      └─→ boost matching candidates                 │     │
│  │   3. Apply dispatch rules                          │     │
│  │   4. Return selected unit                          │     │
│  └────────────────────────────────────────────────────┘     │
│                            ↓                                │
│                  Execute Selected Unit                      │
│                            ↓                                │
│  ┌────────────────────────────────────────────────────┐     │
│  │ recordUnitOutcomeInMemory()                        │     │
│  │   ├─→ Extract pattern from result                  │     │
│  │   ├─→ Compute confidence (0.9 for success)         │     │
│  │   └─→ Store in memory (fire-and-forget)            │     │
│  └────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────────┐
│                        Memory System                        │
│                                                             │
│  ┌─────────────────┐  ┌──────────────────┐                  │
│  │ memory-store.js │  │memory-embeddings.│ ← Cosine sim     │
│  │  (CRUD layer)   │  │    (vectors)     │                  │
│  └─────────────────┘  └──────────────────┘                  │
│           ↓                     ↓                           │
│  ┌─────────────────────────────────────┐                    │
│  │ memory-relations.js (Graph)         │                    │
│  │ memory-sleeper.js (Decay)           │                    │
│  │ memory-source-store.js (Tracking)   │                    │
│  └─────────────────────────────────────┘                    │
│                     ↓                                       │
│  ┌─────────────────────────────┐                            │
│  │ SQLite (sf-db.js)           │                            │
│  │ Node 26 native sqlite       │                            │
│  └─────────────────────────────┘                            │
└─────────────────────────────────────────────────────────────┘
```

---

## Graceful Degradation

All memory operations follow the **fire-and-forget pattern**:

1. **Memory unavailable** → dispatch continues without boost
2. **DB error** → operation fails silently, decision unaffected
3. **LLM reranking fails** → fall back to vector similarity
4. **Embedding computation fails** → use default embedding

**Result:** Memory is always optional; never blocks dispatch or UOK execution.

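
The degradation rules above amount to wrapping every memory call so that failures turn into a neutral default. A minimal sketch, assuming hypothetical helpers `bestEffort` (for reads that need a fallback value) and `fireAndForget` (for writes whose result is never awaited); neither name comes from the SF codebase.

```typescript
// Hypothetical wrappers illustrating "memory is always optional".

/** Run a synchronous memory read; on any error return the fallback instead. */
function bestEffort<T>(operation: () => T, fallback: T): T {
  try {
    return operation();
  } catch {
    return fallback; // fail silently, caller's decision is unaffected
  }
}

/** Start an async memory write without awaiting it and without surfacing errors. */
function fireAndForget(operation: () => Promise<unknown>): void {
  try {
    // Swallow async failures so a rejected write never reaches the caller.
    operation().catch(() => undefined);
  } catch {
    // Swallow synchronous failures too (for example, the DB handle is missing).
  }
}

// Example: a failing lookup degrades to "no boost" and dispatch carries on.
const memoryBoosts = bestEffort<Record<string, number>>(() => {
  throw new Error("database is locked");
}, {});
```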
---

## Performance Characteristics

| Operation | Latency | Notes |
|-----------|---------|-------|
| `createMemory()` | <5ms | Async write, fire-and-forget |
| `getRelevantMemoriesRanked()` | 10-50ms | Depends on DB size and vector dim |
| `cosineSimilarity()` | <1ms | 128D vectors, hardware-accelerated |
| `computeEmbedding()` | 5-20ms | Deterministic hash-based |
| Dispatch boost overhead | <10ms | Per dispatch cycle |

---

## Data Retention & Growth

**Memory Lifecycle:**
1. Created with confidence score (0.0-1.0)
2. Hit count incremented on each use
3. Confidence may decay over time (via `memory-sleeper.js`)
4. Marked superseded or archived
5. Historical records preserved (never deleted)

**Growth Management:**
- Embeddings indexed by `memory_id` (fast lookup)
- Relations indexed by `from_id`, `to_id` (graph traversal)
- Decay and supersession keep stale data from dominating results
- Archiving does not grow the live table (archived entries are historical only)

---

## Security & Privacy

- **Memory is local** — All data stored in SF's SQLite (no external services except optional LLM)
- **Source tracking** — Full audit trail of where memories came from
- **No sensitive data** — Memory system stores patterns and architecture, not credentials
- **Encapsulated** — Memory functions exported only to SF extensions

---

## Future Enhancements

1. **Distributed memory** — Share learnings across SF instances
2. **Memory compression** — Archive old embeddings to reduce DB size
3. **Active learning** — Automatically query for improvements
4. **Temporal indexing** — Query memories by creation date
5. **Semantic clustering** — Group similar memories automatically
6. **Telemetry** — Track which memories most influence dispatch

---

## Architecture Decision: SF Tools, Not MCP

**Memory is NOT exposed as an MCP server.**

- **SF is an MCP *client* only** — SF consumes MCP tools from external services
- **Memory is internal SF infrastructure** — uses SQLite (Node 26 native)
- **Memory exported as SF tools** — LLM agents within SF call memory functions
- **No external exposure** — Memory system is not a service; it's SF's autonomous learning mechanism

This keeps memory **private to SF** and prevents:
- External memory pollution
- Uncontrolled confidence scoring
- Inconsistent learning patterns
- Loss of autonomy (memory decisions stay internal)

---

## See Also

- **ADR-0075:** UOK Gate Architecture
- **ADR-0000:** Purpose-to-Software Compiler
- `docs/dev/UOK-SELF-EVOLUTION.md` — How SF learns
- `src/resources/extensions/sf/uok/unit-runtime.js` — Unit recording
- `src/resources/extensions/sf/auto-dispatch.js` — Dispatch ranking
- `src/resources/extensions/sf/tools/memory-tools.js` — SF tool executors