From f2db20b4d62a5ef4751a2a7df38dd82195d3cc47 Mon Sep 17 00:00:00 2001
From: Mikael Hugo <mhugo@vega.hugo.dk>
Date: Wed, 6 May 2026 23:03:50 +0200
Subject: [PATCH] docs: add SQLite migration guide for Node 24 upgrade

Comprehensive guide for migrating from JSON to node:sqlite when Node 24 is available:
- Schema design (model_outcomes + model_stats tables)
- Phase-by-phase refactoring approach
- Data migration from JSON with backward compatibility
- Testing strategy with new SQLite-specific tests
- Future opportunities: dashboards, trend analysis, A/B testing, federated learning

This doc serves as a roadmap for ~2 days of work when Node 24 becomes standard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 docs/dev/SQLITE-MIGRATION.md | 312 +++++++++++++++++++++++++++++++++++
 1 file changed, 312 insertions(+)
 create mode 100644 docs/dev/SQLITE-MIGRATION.md

diff --git a/docs/dev/SQLITE-MIGRATION.md b/docs/dev/SQLITE-MIGRATION.md
new file mode 100644
index 000000000..59964a4a6
--- /dev/null
+++ b/docs/dev/SQLITE-MIGRATION.md
@@ -0,0 +1,312 @@
+# SQLite Migration Guide for Model Learning
+
+**Status**: Planned for Node 24.15.0 upgrade
+**Current**: JSON-based storage (model-learner.js, self-report-fixer.js)
+**Target**: Native `node:sqlite` integration
+
+## Why SQLite?
+
+1. **Zero dependencies**: Node 24+ has built-in `node:sqlite` (no package install)
+2. **Queryable**: SQL joins with UOK's `llm_task_outcomes` table for unified learning database
+3. **Transactional**: Atomic outcome recording prevents partial state corruption
+4. **Performant**: Indexes on (task_type, model_id) for per-task-type ranking queries
+5. **Durable**: WAL mode ensures data survives crashes
+
+## Current State (Node 20)
+
+### JSON-Based Storage
+- `model-learner.js`: `.sf/model-performance.json` (nested object hierarchy)
+  ```json
+  {
+    "execute-task": {
+      "gpt-4o": {
+        "successes": 42,
+        "failures": 3,
+        "successRate": 0.93
+      }
+    }
+  }
+  ```
+- `self-report-fixer.js`: Stateless (no persistent storage)
+- `triage-self-feedback.js`: Reads/writes `REQUIREMENTS.md`, `ARCHITECTURE.md`
+
+### Pain Points
+- Entire file read/write on every outcome (O(n) latency)
+- No queryable schema (must load all data, filter in-memory)
+- No transactions (partial failures possible)
+- No natural joins with UOK database
+
+## SQLite Schema (Target)
+
+### Table 1: model_outcomes
+Raw event log for every model outcome.
+
+```sql
+CREATE TABLE model_outcomes (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  task_type TEXT NOT NULL,           -- "execute-task", "plan-slice", etc.
+  model_id TEXT NOT NULL,             -- "gpt-4o", "claude-opus", etc.
+  success INTEGER NOT NULL,           -- 1 = success, 0 = failure
+  timeout INTEGER NOT NULL DEFAULT 0, -- 1 = timed out, 0 = normal
+  tokens_used INTEGER NOT NULL DEFAULT 0,
+  cost_usd REAL NOT NULL DEFAULT 0.0,
+  timestamp TEXT NOT NULL,            -- ISO 8601
+  FOREIGN KEY (task_type, model_id) REFERENCES model_stats(task_type, model_id)
+);
+
+CREATE INDEX idx_outcomes_task_model ON model_outcomes(task_type, model_id);
+CREATE INDEX idx_outcomes_timestamp ON model_outcomes(timestamp DESC);
+```
+
+### Table 2: model_stats
+Aggregated per-task-per-model statistics (updated atomically with each outcome).
+
+```sql
+CREATE TABLE model_stats (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  task_type TEXT NOT NULL,
+  model_id TEXT NOT NULL,
+  successes INTEGER NOT NULL DEFAULT 0,
+  failures INTEGER NOT NULL DEFAULT 0,
+  timeouts INTEGER NOT NULL DEFAULT 0,
+  total_tokens INTEGER NOT NULL DEFAULT 0,
+  total_cost REAL NOT NULL DEFAULT 0.0,
+  last_used TEXT,                    -- ISO 8601 timestamp of last outcome
+  UNIQUE(task_type, model_id)
+);
+
+CREATE INDEX idx_stats_task_model ON model_stats(task_type, model_id);
+```
+
+## Migration Steps
+
+### Phase 1: Refactor `ModelPerformanceTracker` (model-learner.js)
+
+**Before** (JSON):
+```javascript
+recordOutcome(taskType, modelId, outcome) {
+  if (!this.data[taskType]) this.data[taskType] = {};
+  if (!this.data[taskType][modelId]) {
+    this.data[taskType][modelId] = { successes: 0, failures: 0, ... };
+  }
+  const stats = this.data[taskType][modelId];
+  if (outcome.success) stats.successes += 1;
+  else stats.failures += 1;
+  this._save(); // Entire file rewrite
+}
+```
+
+**After** (SQLite):
+```javascript
+recordOutcome(taskType, modelId, outcome) {
+  this.db.exec("BEGIN");
+  
+  // Insert event
+  const insertStmt = this.db.prepare(`
+    INSERT INTO model_outcomes (task_type, model_id, success, timeout, ...)
+    VALUES (?, ?, ?, ?, ...)
+  `);
+  insertStmt.run(taskType, modelId, outcome.success ? 1 : 0, ...);
+  
+  // Upsert stats
+  const updateStmt = this.db.prepare(`
+    INSERT INTO model_stats (task_type, model_id, successes, ...)
+    VALUES (?, ?, ?, ...)
+    ON CONFLICT(task_type, model_id) DO UPDATE SET
+      successes = successes + ?,
+      failures = failures + ?,
+      ...
+  `);
+  updateStmt.run(...);
+  
+  this.db.exec("COMMIT");
+}
+```
+
+**Benefits**:
+- O(1) outcome recording (single INSERT)
+- Atomic transaction (both tables updated together)
+- No full-file rewrite
+
+### Phase 2: Update Query Methods
+
+**getRankedModels** → SQL SELECT with ORDER BY
+
+```javascript
+getRankedModels(taskType, minSamples = 3) {
+  const query = this.db.prepare(`
+    SELECT model_id, successes, failures, total_tokens, total_cost, last_used
+    FROM model_stats
+    WHERE task_type = ? AND (successes + failures) >= ?
+    ORDER BY (CAST(successes AS FLOAT) / (successes + failures)) DESC
+  `);
+  return query.all(taskType, minSamples).map(row => ({
+    modelId: row.model_id,
+    successRate: row.successes / (row.successes + row.failures),
+    ...
+  }));
+}
+```
+
+### Phase 3: Integrate with UOK Database (Optional)
+
+If UOK stores outcomes in its database, consider a **federated schema**:
+- Keep model_learner SQLite database separate (`.sf/model-performance.db`)
+- OR: Create view in UOK database that joins with UOK's `llm_task_outcomes`
+
+```sql
+-- In UOK database:
+CREATE VIEW model_performance AS
+SELECT 
+  outcome.task_type,
+  outcome.model_id,
+  COUNT(CASE WHEN outcome.success = 1 THEN 1 END) as successes,
+  COUNT(CASE WHEN outcome.success = 0 THEN 1 END) as failures,
+  SUM(outcome.tokens_used) as total_tokens,
+  SUM(outcome.cost_usd) as total_cost
+FROM llm_task_outcomes outcome
+GROUP BY outcome.task_type, outcome.model_id;
+```
+
+### Phase 4: Data Migration (JSON → SQLite)
+
+Create migration function in constructor:
+
+```javascript
+_initDb() {
+  const db = new DatabaseSync(this.dbPath);
+  // ... create tables ...
+  
+  // Migrate existing JSON data
+  if (existsSync(this.oldJsonPath)) {
+    const jsonData = JSON.parse(readFileSync(this.oldJsonPath, 'utf-8'));
+    this._migrateFromJson(db, jsonData);
+    // After migration: delete old JSON or archive
+  }
+  
+  return db;
+}
+
+_migrateFromJson(db, jsonData) {
+  db.exec("BEGIN");
+  
+  for (const [taskType, models] of Object.entries(jsonData)) {
+    for (const [modelId, stats] of Object.entries(models)) {
+      const insertStmt = db.prepare(`
+        INSERT INTO model_stats 
+        (task_type, model_id, successes, failures, timeouts, total_tokens, total_cost, last_used)
+        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
+      `);
+      insertStmt.run(
+        taskType, modelId,
+        stats.successes, stats.failures, stats.timeouts || 0,
+        stats.totalTokens, stats.totalCost, stats.lastUsed
+      );
+    }
+  }
+  
+  db.exec("COMMIT");
+}
+```
+
+## Testing Strategy
+
+### Unit Tests (No Changes Needed)
+Existing tests in `model-learner.test.ts` should pass unchanged:
+- `recordOutcome()` API remains the same
+- `getRankedModels()` returns same shape
+- `shouldDemote()`, `getABTestCandidates()` unchanged
+
+### Integration Tests (Add SQLite-Specific)
+```typescript
+test("persists to SQLite database", () => {
+  const learner = new ModelLearner(basePath);
+  learner.recordOutcome("execute-task", "gpt-4o", { success: true, tokensUsed: 100 });
+  
+  // Verify record in model_outcomes table
+  const query = learner.tracker.db.prepare(`
+    SELECT COUNT(*) as count FROM model_outcomes 
+    WHERE task_type = ? AND model_id = ?
+  `);
+  const result = query.get("execute-task", "gpt-4o");
+  expect(result.count).toBe(1);
+});
+
+test("transactions are atomic", () => {
+  // Simulate failure during upsert
+  // Verify both INSERT and UPDATE succeed or both rollback
+});
+```
+
+## Timeline
+
+1. **When Node 24.15.0 becomes standard** (6-8 weeks)
+   - Update `.nvmrc`, `package.json` engines
+   - Enable snap to run Node 24
+   
+2. **Migration PR** (2 days of work)
+   - Refactor `ModelPerformanceTracker` class
+   - Add migration function
+   - Test with existing unit tests
+   
+3. **Rollout** (1 day)
+   - Deploy with backward-compatible JSON→SQLite auto-migration
+   - Monitor for edge cases
+   - Archive old JSON files after 1 week
+
+## Backward Compatibility
+
+- **Auto-migrate**: On first run with Node 24, detect `.sf/model-performance.json` and import to SQLite
+- **Keep JSON**: Don't delete old JSON file immediately (keep for 1 week as backup)
+- **Graceful fallback**: If SQLite init fails, log error and fall back to JSON (degraded mode)
+
+## Future Opportunities
+
+Once SQLite is in place:
+
+1. **Dashboard**: Query performance metrics
+   ```sql
+   SELECT model_id, 
+     ROUND(100.0 * successes / (successes + failures), 1) as success_rate,
+     total_tokens, total_cost
+   FROM model_stats
+   WHERE task_type = ?
+   ORDER BY success_rate DESC;
+   ```
+
+2. **Trend analysis**: Model performance over time
+   ```sql
+   SELECT DATE(timestamp) as day, model_id, COUNT(*) as attempts,
+     SUM(success) as wins, 
+     ROUND(100.0 * SUM(success) / COUNT(*), 1) as daily_success_rate
+   FROM model_outcomes
+   WHERE task_type = ? AND timestamp > date('now', '-30 days')
+   GROUP BY day, model_id
+   ORDER BY day DESC;
+   ```
+
+3. **A/B testing**: Compare challenger vs incumbent in detail
+   ```sql
+   SELECT 
+     model_id,
+     COUNT(*) as trials,
+     SUM(success) as wins,
+     ROUND(AVG(tokens_used), 0) as avg_tokens,
+     ROUND(AVG(cost_usd), 4) as avg_cost
+   FROM model_outcomes
+   WHERE task_type = ? AND timestamp > ?
+   GROUP BY model_id;
+   ```
+
+4. **Federated learning**: Export performance data for cross-project analysis
+   ```sql
+   SELECT * FROM model_stats
+   WHERE successes + failures >= 10  -- High-confidence entries only
+   ORDER BY success_rate DESC;
+   ```
+
+## References
+
+- Node.js `node:sqlite` docs: https://nodejs.org/api/sqlite.html
+- UOK `llm_task_outcomes` schema: See `docs/dev/UOK-SELF-EVOLUTION.md`
+- SQLite WAL mode: https://www.sqlite.org/wal.html