docs: add SQLite migration guide for Node 24 upgrade
Comprehensive guide for migrating from JSON to `node:sqlite` when Node 24 is available:

- Schema design (model_outcomes + model_stats tables)
- Phase-by-phase refactoring approach
- Data migration from JSON with backward compatibility
- Testing strategy with new SQLite-specific tests
- Future opportunities: dashboards, trend analysis, A/B testing, federated learning

This doc serves as a roadmap for ~2 days of work when Node 24 becomes standard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
parent 034e7be216, commit f2db20b4d6
docs/dev/SQLITE-MIGRATION.md

# SQLite Migration Guide for Model Learning

**Status**: Planned for Node 24.15.0 upgrade
**Current**: JSON-based storage (model-learner.js, self-report-fixer.js)
**Target**: Native `node:sqlite` integration

## Why SQLite?

1. **Zero dependencies**: Node 24+ has built-in `node:sqlite` (no package install)
2. **Queryable**: SQL joins with UOK's `llm_task_outcomes` table for a unified learning database
3. **Transactional**: Atomic outcome recording prevents partial state corruption
4. **Performant**: Indexes on (task_type, model_id) for per-task-type ranking queries
5. **Durable**: WAL mode ensures data survives crashes

## Current State (Node 20)

### JSON-Based Storage

- `model-learner.js`: `.sf/model-performance.json` (nested object hierarchy)

```json
{
  "execute-task": {
    "gpt-4o": {
      "successes": 42,
      "failures": 3,
      "successRate": 0.93
    }
  }
}
```

- `self-report-fixer.js`: Stateless (no persistent storage)
- `triage-self-feedback.js`: Reads/writes `REQUIREMENTS.md`, `ARCHITECTURE.md`

### Pain Points

- Entire file read/write on every outcome (O(n) latency)
- No queryable schema (must load all data, filter in-memory)
- No transactions (partial failures possible)
- No natural joins with the UOK database

## SQLite Schema (Target)

### Table 1: model_outcomes

Raw event log for every model outcome.

```sql
CREATE TABLE model_outcomes (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  task_type TEXT NOT NULL,             -- "execute-task", "plan-slice", etc.
  model_id TEXT NOT NULL,              -- "gpt-4o", "claude-opus", etc.
  success INTEGER NOT NULL,            -- 1 = success, 0 = failure
  timeout INTEGER NOT NULL DEFAULT 0,  -- 1 = timed out, 0 = normal
  tokens_used INTEGER NOT NULL DEFAULT 0,
  cost_usd REAL NOT NULL DEFAULT 0.0,
  timestamp TEXT NOT NULL,             -- ISO 8601
  FOREIGN KEY (task_type, model_id) REFERENCES model_stats(task_type, model_id)
);

CREATE INDEX idx_outcomes_task_model ON model_outcomes(task_type, model_id);
CREATE INDEX idx_outcomes_timestamp ON model_outcomes(timestamp DESC);
```

### Table 2: model_stats

Aggregated per-task-per-model statistics (updated atomically with each outcome).

```sql
CREATE TABLE model_stats (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  task_type TEXT NOT NULL,
  model_id TEXT NOT NULL,
  successes INTEGER NOT NULL DEFAULT 0,
  failures INTEGER NOT NULL DEFAULT 0,
  timeouts INTEGER NOT NULL DEFAULT 0,
  total_tokens INTEGER NOT NULL DEFAULT 0,
  total_cost REAL NOT NULL DEFAULT 0.0,
  last_used TEXT,  -- ISO 8601 timestamp of last outcome
  UNIQUE(task_type, model_id)
);

CREATE INDEX idx_stats_task_model ON model_stats(task_type, model_id);
```

## Migration Steps

### Phase 1: Refactor `ModelPerformanceTracker` (model-learner.js)

**Before** (JSON):

```javascript
recordOutcome(taskType, modelId, outcome) {
  if (!this.data[taskType]) this.data[taskType] = {};
  if (!this.data[taskType][modelId]) {
    this.data[taskType][modelId] = { successes: 0, failures: 0, ... };
  }
  const stats = this.data[taskType][modelId];
  if (outcome.success) stats.successes += 1;
  else stats.failures += 1;
  this._save(); // Entire file rewrite
}
```

**After** (SQLite):

```javascript
recordOutcome(taskType, modelId, outcome) {
  this.db.exec("BEGIN");
  try {
    // Insert event
    const insertStmt = this.db.prepare(`
      INSERT INTO model_outcomes (task_type, model_id, success, timeout, ...)
      VALUES (?, ?, ?, ?, ...)
    `);
    insertStmt.run(taskType, modelId, outcome.success ? 1 : 0, ...);

    // Upsert stats
    const updateStmt = this.db.prepare(`
      INSERT INTO model_stats (task_type, model_id, successes, ...)
      VALUES (?, ?, ?, ...)
      ON CONFLICT(task_type, model_id) DO UPDATE SET
        successes = successes + ?,
        failures = failures + ?,
        ...
    `);
    updateStmt.run(...);

    this.db.exec("COMMIT");
  } catch (err) {
    this.db.exec("ROLLBACK"); // keep both tables consistent if either write fails
    throw err;
  }
}
```

**Benefits**:
- O(1) outcome recording (single INSERT)
- Atomic transaction (both tables updated together, or neither)
- No full-file rewrite

### Phase 2: Update Query Methods

**getRankedModels** → SQL SELECT with ORDER BY

```javascript
getRankedModels(taskType, minSamples = 3) {
  const query = this.db.prepare(`
    SELECT model_id, successes, failures, total_tokens, total_cost, last_used
    FROM model_stats
    WHERE task_type = ? AND (successes + failures) >= ?
    ORDER BY (CAST(successes AS REAL) / (successes + failures)) DESC
  `);
  return query.all(taskType, minSamples).map(row => ({
    modelId: row.model_id,
    successRate: row.successes / (row.successes + row.failures),
    ...
  }));
}
```

### Phase 3: Integrate with UOK Database (Optional)

If UOK stores outcomes in its database, consider a **federated schema**:
- Keep the model_learner SQLite database separate (`.sf/model-performance.db`)
- OR: Create a view in the UOK database that joins with UOK's `llm_task_outcomes`

```sql
-- In UOK database:
CREATE VIEW model_performance AS
SELECT
  outcome.task_type,
  outcome.model_id,
  COUNT(CASE WHEN outcome.success = 1 THEN 1 END) as successes,
  COUNT(CASE WHEN outcome.success = 0 THEN 1 END) as failures,
  SUM(outcome.tokens_used) as total_tokens,
  SUM(outcome.cost_usd) as total_cost
FROM llm_task_outcomes outcome
GROUP BY outcome.task_type, outcome.model_id;
```

### Phase 4: Data Migration (JSON → SQLite)

Create a migration function called from the constructor:

```javascript
// Requires, at the top of model-learner.js:
//   const { DatabaseSync } = require('node:sqlite');
//   const { existsSync, readFileSync } = require('node:fs');

_initDb() {
  const db = new DatabaseSync(this.dbPath);
  // ... create tables ...

  // Migrate existing JSON data
  if (existsSync(this.oldJsonPath)) {
    const jsonData = JSON.parse(readFileSync(this.oldJsonPath, 'utf-8'));
    this._migrateFromJson(db, jsonData);
    // After migration: delete old JSON or archive
  }

  return db;
}

_migrateFromJson(db, jsonData) {
  // Prepare once, outside the loops
  const insertStmt = db.prepare(`
    INSERT INTO model_stats
      (task_type, model_id, successes, failures, timeouts, total_tokens, total_cost, last_used)
    VALUES (?, ?, ?, ?, ?, ?, ?, ?)
  `);

  db.exec("BEGIN");
  for (const [taskType, models] of Object.entries(jsonData)) {
    for (const [modelId, stats] of Object.entries(models)) {
      insertStmt.run(
        taskType, modelId,
        stats.successes, stats.failures, stats.timeouts || 0,
        stats.totalTokens, stats.totalCost, stats.lastUsed
      );
    }
  }
  db.exec("COMMIT");
}
```

## Testing Strategy

### Unit Tests (No Changes Needed)

Existing tests in `model-learner.test.ts` should pass unchanged:
- `recordOutcome()` API remains the same
- `getRankedModels()` returns the same shape
- `shouldDemote()`, `getABTestCandidates()` unchanged

### Integration Tests (Add SQLite-Specific)

```typescript
test("persists to SQLite database", () => {
  const learner = new ModelLearner(basePath);
  learner.recordOutcome("execute-task", "gpt-4o", { success: true, tokensUsed: 100 });

  // Verify record in model_outcomes table
  const query = learner.tracker.db.prepare(`
    SELECT COUNT(*) as count FROM model_outcomes
    WHERE task_type = ? AND model_id = ?
  `);
  const result = query.get("execute-task", "gpt-4o");
  expect(result.count).toBe(1);
});

test("transactions are atomic", () => {
  // Simulate failure during upsert
  // Verify both INSERT and UPDATE succeed or both rollback
});
```

## Timeline

1. **When Node 24.15.0 becomes standard** (6-8 weeks)
   - Update `.nvmrc`, `package.json` engines
   - Enable snap to run Node 24

2. **Migration PR** (2 days of work)
   - Refactor `ModelPerformanceTracker` class
   - Add migration function
   - Test with existing unit tests

3. **Rollout** (1 day)
   - Deploy with backward-compatible JSON→SQLite auto-migration
   - Monitor for edge cases
   - Archive old JSON files after 1 week

## Backward Compatibility

- **Auto-migrate**: On first run with Node 24, detect `.sf/model-performance.json` and import it into SQLite
- **Keep JSON**: Don't delete the old JSON file immediately (keep it for 1 week as a backup)
- **Graceful fallback**: If SQLite init fails, log the error and fall back to JSON (degraded mode)

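The graceful-fallback bullet might look roughly like this; the factory callbacks are stand-ins for the real SQLite and JSON tracker constructors, which this sketch does not assume:

```javascript
// Sketch: degraded-mode fallback. initSqlite/initJson are hypothetical factories,
// not existing functions in model-learner.js.
function createTracker(initSqlite, initJson) {
  try {
    return { backend: 'sqlite', store: initSqlite() }; // preferred path
  } catch (err) {
    console.error(`SQLite init failed, falling back to JSON: ${err.message}`);
    return { backend: 'json', store: initJson() };     // degraded mode
  }
}
```

In practice `initSqlite` would construct the `DatabaseSync`-backed tracker and `initJson` the existing JSON one.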
## Future Opportunities

Once SQLite is in place:

1. **Dashboard**: Query performance metrics

   ```sql
   SELECT model_id,
          ROUND(100.0 * successes / (successes + failures), 1) as success_rate,
          total_tokens, total_cost
   FROM model_stats
   WHERE task_type = ?
   ORDER BY success_rate DESC;
   ```

2. **Trend analysis**: Model performance over time

   ```sql
   SELECT DATE(timestamp) as day, model_id, COUNT(*) as attempts,
          SUM(success) as wins,
          ROUND(100.0 * SUM(success) / COUNT(*), 1) as daily_success_rate
   FROM model_outcomes
   WHERE task_type = ? AND timestamp > date('now', '-30 days')
   GROUP BY day, model_id
   ORDER BY day DESC;
   ```

3. **A/B testing**: Compare challenger vs incumbent in detail

   ```sql
   SELECT
     model_id,
     COUNT(*) as trials,
     SUM(success) as wins,
     ROUND(AVG(tokens_used), 0) as avg_tokens,
     ROUND(AVG(cost_usd), 4) as avg_cost
   FROM model_outcomes
   WHERE task_type = ? AND timestamp > ?
   GROUP BY model_id;
   ```

4. **Federated learning**: Export performance data for cross-project analysis

   ```sql
   SELECT * FROM model_stats
   WHERE successes + failures >= 10  -- High-confidence entries only
   ORDER BY CAST(successes AS REAL) / (successes + failures) DESC;
   ```

## References

- Node.js `node:sqlite` docs: https://nodejs.org/api/sqlite.html
- UOK `llm_task_outcomes` schema: see `docs/dev/UOK-SELF-EVOLUTION.md`
- SQLite WAL mode: https://www.sqlite.org/wal.html