Mikael Hugo 61b4fecdaf fix(notices+db): complete NOTICE_KIND tagging, fix slice-dep query, cap error storage

NOTICE_KIND tagging:
- auto.js: ctrl-c-pause (USER_VISIBLE), auto-start-failed/session-lock-lost/
  stopAuto/debug-summary-written (SYSTEM_NOTICE), auto-no-command-ctx (USER_VISIBLE)
- loop.js: model-policy-blocked SYSTEM_NOTICE→BLOCKING_NOTICE (user must act),
  solver-eval results/infra-stop/consecutive-cooldowns (SYSTEM_NOTICE),
  phase-timeout/credential-cooldown-wait/iteration-error (TOOL_NOTICE); fix import order
- register-hooks.js: destructive-command (TOOL_NOTICE), gemini-preflight (SYSTEM_NOTICE)
- provider-error-pause.js: auto-resume (TOOL_NOTICE), scheduled-resume (SYSTEM_NOTICE),
  permanent-pause (BLOCKING_NOTICE)
- uok-parity-summary.js: parity warning (SYSTEM_NOTICE)

sf-db fixes:
- getActiveSliceFromDb: use slice_dependencies junction table instead of
  json_each(s.depends) — junction table is kept in sync by syncSliceDependencies
- capErrorForStorage: cap UOK run error blobs at 4 KB; excess spills to
  .sf/runtime/errors/<runId>.txt to prevent DB bloat from large stack traces

ARCHITECTURE.md:
- Document DB-first invariant; remove .sf/DECISIONS.md/.REQUIREMENTS.md/.KNOWLEDGE.md
  from tracked-file list (they are rendered projections, not authoritative sources)
- Add .sf/traces/ and .sf/metrics.db to gitignored list
- Update system-context assembly order to show DB-sourced decisions/requirements
- Correct system-context.ts → system-context.js

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2026-05-10 20:26:18 +02:00

10 KiB


Architecture

Purpose

Singularity Forge (SF) is the product. It runs long-horizon coding work through the Unified Operation Kernel (UOK), organized as milestones → slices → tasks. Each dispatch unit runs a fresh AI context, writes its output to disk, then terminates. UOK owns lifecycle, recovery, and the DB-backed run ledger; runtime files under .sf/runtime/ are projections for query, UI, and compatibility. A deterministic controller (not an LLM) reads canonical state and decides what to dispatch next. Core changes follow purpose-driven TDD: purpose and consumer first, then failing tests, then implementation. The user is the end-gate: autonomous mode delivers work for human review and does not merge to production unattended.

Codemap

| Path | Purpose |
| --- | --- |
| `src/loader.ts` | Entry point: initializes resources, registers extension |
| `src/headless.ts` | Non-interactive (headless) mode driver; exit codes 0/1/10/11/12 |
| `src/headless-events.ts` | Transcript event parsing and notification routing |
| `src/extension-registry.ts` | Registers SF as a coding-agent extension |
| `src/resources/extensions/sf/` | All SF extension source (TypeScript) |
| `src/resources/extensions/sf/auto/` | Autonomous workflow orchestrator (UOK lifecycle, dispatch, planning) |
| `src/resources/extensions/sf/bootstrap/` | Context injection, system prompt assembly |
| `src/resources/extensions/sf/prompts/` | Prompt templates (`.md`, loaded by `prompt-loader.ts`) |
| `src/resources/extensions/sf/tests/` | Unit and integration tests |
| `dist/resources/extensions/sf/` | Compiled JS (rebuilt by `npm run copy-resources`) |
| `~/.sf/agent/extensions/sf/` | Installed copy (synced from dist on startup) |
| `docs/` | Durable product, design, plan, reliability, and security context |
| `harness/` | Specs (behavior contracts), evals (model-output tests), graders |

State layout (`.sf/`)

.sf/ can be a symlink (external state, ~/.sf/projects/<hash>/) or a local directory (tracking-enabled per ADR-001).

Tracked in git (travel with the branch, per ADR-001):

.sf/milestones/     — roadmaps, plans, summaries, task plans (rendered projections from DB)
.sf/PROJECT.md      — project overview

Gitignored (runtime/ephemeral — managed by ensureGitInfoExclude() in .git/info/exclude):

.sf/activity/       — JSONL session dumps
.sf/audit/          — audit trail entries (primary: events.jsonl)
.sf/exec/           — in-flight execution state
.sf/forensics/      — crash forensics
.sf/journal/        — SF journal entries
.sf/model-benchmarks/ — model benchmark results
.sf/parallel/       — parallel dispatch coordination
.sf/reports/        — generated reports
.sf/runtime/        — dispatch records, timeout tracking, error spill files
.sf/traces/         — per-session trace JSONL (gate runs, git ops); latest symlink
.sf/worktrees/      — git worktree working directories
.sf/auto.lock       — crash detection sentinel
.sf/metrics.db      — token/cost metrics (dedicated DB, separate from sf.db)
.sf/sf.db*          — SQLite canonical structured state, priority order, validation/gate state, and UOK ledgers

The symlink case uses a blanket .sf gitignore pattern, because Git does not follow symlinks and so cannot track individual paths beneath one. The directory case uses granular patterns so planning artifacts remain trackable.

DB-first invariant: sf.db is the single source of truth for all structured state (milestones, slices, tasks, decisions, requirements, memories, self-feedback). Markdown files under .sf/ are rendered projections or human-editable inputs — they are never the authoritative source when the DB is open. Agents write to DB via tool calls (save_decision, save_knowledge, save_requirement, update_requirement), not by appending to .md files.

Key flows

Autonomous dispatch loop (src/resources/extensions/sf/auto/):

1. UOK reconciles the DB-backed ledger and runtime diagnostics into a typed state snapshot
2. Controller selects the next dispatch unit (research, plan, implement, verify, etc.) from canonical DB state
3. A fresh agent context is started with the task plan injected via system-context.js
4. Agent writes artifacts to disk, commits, exits
5. UOK records completion/recovery, updates projections, and repeats until milestone completes or a gate fails
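
The loop above can be sketched as a small deterministic controller. This is only an illustration under stated assumptions: `Unit`, `Snapshot`, `selectNextUnit`, and `runDispatchLoop` are hypothetical names, not SF APIs, and `runUnit` stands in for "start a fresh agent context, let it write artifacts, exit".

```typescript
// Hypothetical types for illustration; none of these names come from the SF codebase.
type UnitKind = "research" | "plan" | "implement" | "verify";

interface Unit {
  id: string;
  kind: UnitKind;
  done: boolean;
}

interface Snapshot {
  units: Unit[]; // assumed to be in dispatch order
  gateFailed: boolean;
}

// Deterministic selection: the first unfinished unit, or null when the loop should stop.
function selectNextUnit(snapshot: Snapshot): Unit | null {
  if (snapshot.gateFailed) return null;
  return snapshot.units.find((u) => !u.done) ?? null;
}

// Runs units one at a time until the milestone completes or a unit fails its gates.
function runDispatchLoop(
  snapshot: Snapshot,
  runUnit: (unit: Unit) => boolean,
): string[] {
  const dispatched: string[] = [];
  let next = selectNextUnit(snapshot);
  while (next !== null) {
    dispatched.push(next.id);
    const ok = runUnit(next);
    if (ok) next.done = true; // record completion
    else snapshot.gateFailed = true; // stop on gate failure
    next = selectNextUnit(snapshot);
  }
  return dispatched;
}
```

The point of the sketch is that selection depends only on the snapshot, so the same state always yields the same next unit.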

System context assembly (bootstrap/system-context.js): PREFERENCES.md → project knowledge (DB memories table) → ARCHITECTURE.md → CODEBASE.md → code intelligence → active decisions (DB) → active requirements (DB) → self-feedback (DB) → worktree/VCS blocks
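
The assembly order can be pictured as a fixed list of section keys that are concatenated in sequence. The sketch below is an assumption-laden illustration: the key names and `assembleSystemContext` are hypothetical, and the real system-context.js also applies scoping and budget limits that are not modeled here.

```typescript
// Hypothetical section keys mirroring the order described above.
const CONTEXT_ORDER = [
  "preferences", // PREFERENCES.md
  "knowledge", // project knowledge (DB memories table)
  "architecture", // ARCHITECTURE.md
  "codebase", // CODEBASE.md
  "codeIntelligence",
  "decisions", // active decisions (DB)
  "requirements", // active requirements (DB)
  "selfFeedback", // self-feedback (DB)
  "vcs", // worktree/VCS blocks
] as const;

type SectionKey = (typeof CONTEXT_ORDER)[number];

// Joins the non-empty sections in the fixed order, regardless of input key order.
function assembleSystemContext(
  sections: Partial<Record<SectionKey, string>>,
): string {
  return CONTEXT_ORDER
    .map((key) => sections[key])
    .filter((text): text is string => typeof text === "string" && text.trim() !== "")
    .join("\n\n");
}
```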

Write gate (bootstrap/write-gate.ts): All file writes in autonomous mode pass through a gate. Protected files (CLAUDE.md, CODEBASE.md, certain spec files) require explicit override.
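
A minimal sketch of such a gate follows. It assumes protection is decided by file basename and that the override is a boolean flag; both are assumptions for illustration, as are the names `checkWrite` and `WriteDecision`. The "certain spec files" mentioned above are not enumerated in this document, so they are left out.

```typescript
// Only the two protected files named in this document; the spec files are not listed here.
const PROTECTED_BASENAMES = new Set(["CLAUDE.md", "CODEBASE.md"]);

interface WriteDecision {
  allowed: boolean;
  reason: string;
}

// Hypothetical gate check: protected files are rejected unless an explicit override is given.
function checkWrite(path: string, override = false): WriteDecision {
  const base = path.split("/").pop() ?? path;
  if (PROTECTED_BASENAMES.has(base) && !override) {
    return { allowed: false, reason: `${base} is protected; explicit override required` };
  }
  return { allowed: true, reason: "ok" };
}
```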

UOK Dispatch State Machine (Five-Phase Loop)

UOK orchestrates work through a deterministic five-phase state machine:

PhaseDiscuss → PhasePlan → PhaseExecute → PhaseMerge → PhaseComplete
     ↓            ↓            ↓            ↓            ↓
  (discuss)    (plan)      (execute)    (merge)     (finalize)
     ↓            ↓            ↓            ↓            ↓
  gates       gates        gates       gates      validation
     ↓            ↓            ↓            ↓            ↓
  (continue or remediate)

Phase details:

| Phase | Purpose | Exit Conditions | Failure Path |
| --- | --- | --- | --- |
| PhaseDiscuss | Gather project context, requirements, scope | Gates pass (discussion-close gate) | Loop back for more context or escalate |
| PhasePlan | Create milestone/slice plans with success criteria | Gates pass (planning-approval gate) | Add remediation slices or replan |
| PhaseExecute | Implement tasks through the dispatch sequence | Gates pass (code-quality, test gates) | Isolate failed task, add recovery slices |
| PhaseMerge | Integrate slices, run end-to-end tests, merge branches | Gates pass (integration gate) | Add integration-fix slices, retry |
| PhaseComplete | Final validation, audit trail, summary, gate completion | Validation passes (acceptance gate) | Add remediation milestone or escalate |
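
The phase progression can be sketched as a transition function. This is a simplification under stated assumptions: `nextPhase` and the `"done"` sentinel are hypothetical, and every failure path in the table is collapsed into "stay in the current phase so remediation can run".

```typescript
// Phase names are taken from the table above; the ordering is the five-phase sequence.
const PHASES = [
  "PhaseDiscuss",
  "PhasePlan",
  "PhaseExecute",
  "PhaseMerge",
  "PhaseComplete",
] as const;

type Phase = (typeof PHASES)[number];

// Gates passed: advance to the next phase (or "done" after the last one).
// Gates failed: remain in the same phase; remediation is not modeled in this sketch.
function nextPhase(current: Phase, gatesPassed: boolean): Phase | "done" {
  if (!gatesPassed) return current;
  const i = PHASES.indexOf(current);
  return i === PHASES.length - 1 ? "done" : PHASES[i + 1];
}
```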

Error recovery:

- If a gate fails, UOK records the verdict and routes through phase-specific handlers
- Failed gates can trigger automatic remediation slices (new plan → execute loop)
- Stuck-loop detection: if the same unit repeats without progress after N attempts, invoke recovery protocol (timeout, manual review, or skip)
- Crash recovery: the .sf/auto.lock sentinel and the sf.db WAL enable recovery from an agent crash mid-phase
- Run errors are capped at 4 KB in uok_runs.error; payloads exceeding that spill to .sf/runtime/errors/<runId>.txt
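
The 4 KB error cap can be illustrated with a pure function that decides what goes in the DB column and where the spill file would live. This is a sketch, not the actual capErrorForStorage: the name `capErrorSketch`, the return shape, and the choice to keep a truncated prefix in the DB while the full payload goes to the spill file are all assumptions. No file is written here.

```typescript
const MAX_ERROR_BYTES = 4 * 1024; // 4 KB cap from the text above

interface CappedError {
  stored: string; // value that would go into uok_runs.error
  spillPath: string | null; // spill file location, or null when no spill is needed
}

function utf8Length(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: truncates on a character boundary so the stored text stays valid.
function capErrorSketch(runId: string, error: string): CappedError {
  if (utf8Length(error) <= MAX_ERROR_BYTES) {
    return { stored: error, spillPath: null };
  }
  let stored = "";
  let bytes = 0;
  for (const ch of error) {
    const size = utf8Length(ch);
    if (bytes + size > MAX_ERROR_BYTES) break;
    stored += ch;
    bytes += size;
  }
  return { stored, spillPath: `.sf/runtime/errors/${runId}.txt` };
}
```

Measuring in UTF-8 bytes rather than string length matters for stack traces that contain non-ASCII paths or messages.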

Gate Verdict Semantics

Gates run in parallel, and each returns one of three verdicts:

| Verdict | Meaning | Next Action |
| --- | --- | --- |
| passed | Gate question answerable; no concern blocking this phase | Proceed to next phase |
| failed | Gate question answerable; concern blocks phase progression | Record failure, optionally add remediation slice(s) |
| omitted | Gate question not applicable to this unit (e.g., no auth work → auth gate omitted) | Proceed (gate doesn't apply) |

Critical rule: omitted must have a one-line reason (e.g., "no auth surface"). Unexplained omitted verdicts are treated as failures and re-dispatched with explicit instruction to pick passed or failed.
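
The verdict rules, including the treatment of unexplained omitted verdicts, can be sketched as follows. `GateResult`, `effectiveVerdict`, and `phaseMayProceed` are hypothetical names used only for illustration; re-dispatch is not modeled.

```typescript
type Verdict = "passed" | "failed" | "omitted";

interface GateResult {
  gate: string;
  verdict: Verdict;
  reason?: string; // required in practice when verdict is "omitted"
}

// An omitted verdict without a reason is treated as a failure.
function effectiveVerdict(result: GateResult): Verdict {
  const hasReason = (result.reason ?? "").trim() !== "";
  if (result.verdict === "omitted" && !hasReason) return "failed";
  return result.verdict;
}

// The phase proceeds only if no gate is effectively failed.
function phaseMayProceed(results: GateResult[]): boolean {
  return results.every((r) => effectiveVerdict(r) !== "failed");
}
```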

Gate run history is written to .sf/traces/<traceId>.jsonl (append-only JSONL, not DB). Gate circuit-breaker state lives in the gate_circuit_breakers table in sf.db.

Outcome Learning for Model Selection

UOK tracks model success/failure per task-type using Bayesian updating:

P(model_i succeeds | task_type) = (successes + prior) / (total_trials + prior_weight)

Mechanism:

- After each task completes, UOK logs: { model, task_type, succeeded: bool, latency_ms, tokens }
- Model scores are updated dynamically; different models get different confidence per phase/task
- Prior weights prevent early abandonment (new models get the benefit of the doubt)
- Used by benchmark-selector.ts to route future similar tasks to higher-scoring models
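
The formula above has the form of a Beta-prior posterior mean: with a Beta(α, β) prior, the estimate after s successes in n trials is (s + α) / (n + α + β), so `prior` plays the role of α and `prior_weight` the role of α + β. The sketch below assumes a uniform prior (prior = 1, prior_weight = 2) as defaults; those values and the names `successEstimate` and `pickModel` are illustrative assumptions, not what benchmark-selector.ts actually uses.

```typescript
interface ModelStats {
  successes: number;
  trials: number;
}

// Smoothed success estimate: (successes + prior) / (trials + priorWeight).
function successEstimate(stats: ModelStats, prior = 1, priorWeight = 2): number {
  return (stats.successes + prior) / (stats.trials + priorWeight);
}

// Picks the model with the highest estimate for one task type (hypothetical helper).
function pickModel(statsByModel: Record<string, ModelStats>): string | null {
  let best: string | null = null;
  let bestScore = -Infinity;
  for (const model of Object.keys(statsByModel)) {
    const score = successEstimate(statsByModel[model]);
    if (score > bestScore) {
      best = model;
      bestScore = score;
    }
  }
  return best;
}
```

With these defaults an untried model scores 0.5, so it outranks a model that has succeeded once in four trials (2/6 ≈ 0.33). That is the "benefit of the doubt" effect described in the list above.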

Self-Evolution Mechanisms

Self-Report Collection

Agents and gates file issues via the report_issue tool during dispatch:

- Reports stored in self_feedback table in sf.db
- Triage pipeline (triage-self-feedback.js) runs at session start to cluster and prioritize entries
- High/critical entries surfaced in system context for the next planning round
- Status: Collection and triage injection are active

Knowledge Compounding

Knowledge entries are stored in the memories table in sf.db (category: knowledge):

- Agents write via save_knowledge tool (not by appending to files)
- Injected into agent prompts via system-context.js (DB query, keyword-scoped, budget-capped)
- knowledge-compounding.js distills high-confidence judgment-log entries after each milestone close
- Status: Storage, injection, and compounding are all active

Requirement Promotion

requirement-promoter.js sweeps self_feedback entries at session start:

- Clusters recurring feedback by kind (count ≥ 5 or spanning ≥ 3 milestones)
- Promotes clusters to the requirements table via upsertRequirement
- Promoted entries are marked resolved in self_feedback
- Status: Active
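
The promotion threshold can be sketched as a grouping pass over feedback entries. This is an illustration only: `FeedbackEntry`, its fields, and `kindsToPromote` are hypothetical, and the actual requirement-promoter.js may cluster differently. The write to the requirements table and the resolved marking are not modeled.

```typescript
interface FeedbackEntry {
  kind: string;
  milestone: string;
}

// Returns the kinds whose cluster meets either threshold: count >= 5 or >= 3 distinct milestones.
function kindsToPromote(
  entries: FeedbackEntry[],
  minCount = 5,
  minMilestones = 3,
): string[] {
  const byKind = new Map<string, { count: number; milestones: Set<string> }>();
  for (const entry of entries) {
    const cluster = byKind.get(entry.kind) ?? { count: 0, milestones: new Set<string>() };
    cluster.count += 1;
    cluster.milestones.add(entry.milestone);
    byKind.set(entry.kind, cluster);
  }
  const promoted: string[] = [];
  byKind.forEach((cluster, kind) => {
    if (cluster.count >= minCount || cluster.milestones.size >= minMilestones) {
      promoted.push(kind);
    }
  });
  return promoted.sort(); // sorted for a stable result
}
```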

Gate-Based Pattern Detection

Gates can detect and report repeated failure patterns (e.g., "same requirement-validation failure in S01 and S03")

Status: Logic exists per gate; no automatic aggregation across gates

Invariants

- UOK and the dispatch controller are pure TypeScript: no LLM decisions in the dispatch loop itself.
- Each dispatch unit runs in a fresh context, with no cross-turn state accumulation.
- Planning artifacts are tracked in git; runtime artifacts are never committed.
- DB-first: sf.db is the only executable truth. Agents read decisions, requirements, and knowledge from DB-injected context; they write back via tool calls. .md projection files are rendered outputs, not inputs.
- SF_RUNTIME_PATTERNS in gitignore.ts is the canonical source of truth for runtime paths. git-service.ts (RUNTIME_EXCLUSION_PATHS) and worktree-manager.ts (SKIP_* arrays) must stay synchronized with it.
- The user is the end-gate. SF delivers for review, not to production.
