
TODO

Dump anything here.


Cross-repo triage / unified backlog view

Today's dogfood: a scan across active repos found ~40 TODO.md files totalling ~10,000+ lines across /srv/infra, /srv/operations-memory, /home/mhugo/code/singularity-engine (27 subdir TODOs, 9,000+ lines), /home/mhugo/code/inference-fabric (8 crate TODOs), plus per-repo singletons in ace-coder, dks-web, vectordrive, centralcloud, etc.

The per-subdir files are not noise — most are substantive design specs scoped to their domain/crate/service. Collapsing them into a single root file would destroy useful structure.

The actual gap: no single way to see "what's queued across all the repos" at once. Today this requires walking N repos by hand.

Wanted:

sf headless triage-all-repos --config ~/.sf/repos.yaml

Where ~/.sf/repos.yaml is a list of repo paths and (optional) per-repo priority. For each repo:

  1. If TODO.md has non-template content, run triageTodoDump in that repo's SF db.
  2. After all repos triaged, emit a unified report: one row per backlog item across all repos, sortable by priority / tier / inserted_at.
  3. Optionally produce a single ~/.sf/cross-repo-view.md for quick human reading.
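A ~/.sf/repos.yaml for the flow above might look like this (the shape and field names are illustrative, not an existing format):

```yaml
# ~/.sf/repos.yaml — hypothetical shape; `priority` is optional per repo
repos:
  - path: /srv/infra
    priority: 1
  - path: /home/mhugo/code/singularity-engine
  - path: /home/mhugo/code/inference-fabric
    priority: 2
```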

Per-repo SF dbs stay separate (each repo owns its work); the cross-repo view is read-only aggregation.
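The read-only aggregation step could be as small as a merge-and-sort over rows already fetched from each repo's sf.db. A minimal sketch, assuming the per-repo reads have happened elsewhere (function and field names here are illustrative):

```javascript
// Merge backlog rows from several repos into one sortable view.
// `rowsByRepo` maps repo name -> rows read from that repo's sf.db.
function mergeBacklogs(rowsByRepo) {
  const merged = [];
  for (const [repo, rows] of Object.entries(rowsByRepo)) {
    for (const row of rows) merged.push({ repo, ...row });
  }
  // Sort by priority ascending (missing priority sorts last),
  // then by inserted_at, newest first.
  merged.sort((a, b) => {
    const pa = a.priority ?? Infinity;
    const pb = b.priority ?? Infinity;
    if (pa !== pb) return pa - pb;
    return (b.inserted_at ?? '').localeCompare(a.inserted_at ?? '');
  });
  return merged;
}
```

Keeping this a pure function over already-read rows preserves the invariant above: each repo's db is only ever opened read-only by the aggregator.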

Slash command /todo triage should actually invoke the typed backend

Observed today: sf --print "/todo triage" ran the agent, which read TODO.md and emitted a triage-shaped markdown response, but the agent never called the typed handleTodo / triageTodoDump backend — it re-implemented the flow in natural language via Read/Write tools. Side effect: a patched backend in commands-todo.js was bypassed entirely.

Wanted: when a slash command has a registered typed handler in the extension surface (i.e. handleTodo, handleNewMilestone, …), the agent's prompt should require the call go through that handler rather than letting the LLM improvise. The handler can be invoked as a tool call so the LLM still has narrative space, but the side effects (DB writes, file scaffolds, etc.) come from the typed path, not from raw Write/Edit on TODO.md.

Concretely:

  • In slash-commands.md (or wherever the slash dispatch prompt lives), enumerate handlers and forbid the LLM from "doing the work" itself when a typed handler exists.
  • Add an integration test that runs sf --print "/todo triage" against a fixture TODO.md and asserts that triage_runs rows appear in sf.db (i.e. the backend ran, not just the LLM).

Triage result needs structured tier/priority per item

Current shape:

result.implementation_tasks: string[]   // titles only
result.memory_requirements: string[]
result.harness_suggestions: string[]
result.docs_or_tests: string[]
result.unclear_notes: string[]
result.eval_candidates: { id, task_input, expected_behavior, … }[]

Tiers (T1 / T2 / T3) appear only in the LLM-prose tier list it appends to BUILD_PLAN.md. They are not present as a structured field per item. That blocks any downstream "for each Tier-1 item, scaffold a milestone" automation — the tier info is locked in prose.

Wanted: extend the triage JSON schema so each implementation task is

{ title: string, tier: "T1" | "T2" | "T3", rationale: string }

and update appendBacklogItems + a future milestone-escalator to read the structured tier rather than re-parsing markdown.
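With the structured shape in place, downstream automation becomes a filter rather than a markdown re-parse. A sketch of the guard a future milestone-escalator might use (validator and function names are hypothetical, not existing SF code):

```javascript
// Validate the proposed { title, tier, rationale } item shape and
// pick out Tier-1 tasks for milestone scaffolding.
const TIERS = new Set(['T1', 'T2', 'T3']);

function isValidTask(task) {
  return typeof task.title === 'string' &&
         TIERS.has(task.tier) &&
         typeof task.rationale === 'string';
}

function tier1Tasks(tasks) {
  return tasks.filter((t) => isValidTask(t) && t.tier === 'T1');
}
```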

Sha-track every source-of-truth markdown file, diff on change

Generalised from the milestone-files case: any markdown file that is a source of truth for SF or for humans navigating the repo should be sha-tracked, and any change since SF last saw it should surface as a diff for review (or auto-accept under a configured policy).

In scope (per repo):

  • Repo-level meta: AGENTS.md, README.md, STATUS.md, BACKLOG.md, STANDALONE.md, MIGRATION.md, etc. (any uppercase root-level .md)
  • Pointer: .github/copilot-instructions.md
  • Wiki: .sf/wiki/**/*.md
  • Planning: .sf/milestones/**/*.md (CONTEXT, MILESTONE-SUMMARY, ROADMAP, SUMMARY per milestone; PLAN / SUMMARY per slice; same per task)
  • ADRs: docs/adr/**/*.md (these should rarely change, so any edit is loud and worth surfacing)
  • Triage outputs: docs/plans/**/*.md

Explicit out of scope:

  • TODO.md — gets reset to empty template by /todo triage on every cycle; tracking churn here is just noise.
  • CHANGELOG.md / BUILD_PLAN.md — append-only by design; sha churn is expected, no signal in tracking.
  • node_modules, dist, vendored copies — irrelevant.

Storage in sf.db — sha + git ref, no content snapshots. Git is the version store; the DB is just a pointer:

CREATE TABLE tracked_md_files (
  relpath           TEXT PRIMARY KEY,  -- repo-relative path
  sha256            TEXT NOT NULL,     -- hash of last-seen content
  size_bytes        INTEGER NOT NULL,
  last_seen_at      TEXT NOT NULL,
  last_seen_commit  TEXT,              -- git SHA1 of HEAD when observed
  category          TEXT               -- 'meta'|'wiki'|'milestone'|'adr'|'plan'
);
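The `category` column above could be populated by a simple path classifier that mirrors the in-scope and out-of-scope lists. A sketch (exact matching rules are assumptions, not settled globs):

```javascript
// Map a repo-relative path to a tracked_md_files.category value,
// or null for paths that shouldn't be tracked. Mirrors the scope lists:
// TODO.md / CHANGELOG.md / BUILD_PLAN.md are explicitly excluded.
const EXCLUDED = new Set(['TODO.md', 'CHANGELOG.md', 'BUILD_PLAN.md']);

function classifyTrackedMd(relpath) {
  if (EXCLUDED.has(relpath) || !relpath.endsWith('.md')) return null;
  if (relpath.startsWith('docs/adr/')) return 'adr';
  if (relpath.startsWith('docs/plans/')) return 'plan';
  if (relpath.startsWith('.sf/wiki/')) return 'wiki';
  if (relpath.startsWith('.sf/milestones/')) return 'milestone';
  if (relpath === '.github/copilot-instructions.md') return 'meta';
  // Any uppercase root-level .md (README.md, AGENTS.md, STATUS.md, …)
  if (/^[A-Z][A-Z_-]*\.md$/.test(relpath)) return 'meta';
  return null;
}
```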

Diff source priority:

  1. Tracked + committed at observation (the common case): git diff <last_seen_commit> -- <path> shows everything since. Cheap, no blob, perfect history via git log <path> if needed.

  2. Tracked + uncommitted at observation (mid-edit corner): no git ref points at that exact content. Diff shows "changed since <last_seen_commit>" but the prior intermediate working-tree state isn't reconstructable. Acceptable trade-off — the main signal is "changed", and the operator can commit before letting SF observe if intermediate fidelity matters.

  3. Untracked / gitignored: not tracked in this table. SF-generated transient files don't belong in version control or in this audit.

History per file = git log <relpath> (already there, free). SF's DB just records "where I left off." No md_observation_log history table unless someone has a concrete need for an SF-side timeline.

On session start + each autonomous-cycle entry, walk the configured glob set, hash each file, diff against tracked_md_files.sha256. For each changed file:

  1. Surface to operator: "N files changed since SF last saw — review or accept?" with per-file diff (computed from git, not from a DB blob).
  2. On accept → update sha + last_seen_at. No content stored.
  3. New files (sha not in DB) → classify by glob category, store sha, continue.
  4. Deleted files → archive the DB row (mark inactive); don't purge until operator confirms.

Useful for:

  • hand-edits / cross-agent edits / git pulls (the original milestone-files motivation)
  • catching when an AGENTS.md drifted because someone edited it during a code review and nobody told SF
  • ADR drift detection — ADRs should almost never change; if one does, surface it loudly
  • treating .sf/wiki/* as living docs that need review when they drift from what sf has internalised

Storage cost: ~100 bytes per row (64-char sha256 hex plus the remaining columns), no content blobs. Negligible vs. the rest of sf.db.

Phases-helpers extension-load error on every SF run

Every sf … invocation today prints:

[sf] Extension load error Error: Failed to load extension
"/home/mhugo/.sf/agent/extensions/sf/index.js": The requested module
'./phases-helpers.js' does not provide an export named 'closeoutAndStop'

Non-fatal (SF continues), but noisy and a sign of stale state. Either:

  • A recent rename of closeoutAndStop in phases-helpers.js wasn't propagated to its caller, and npm run copy-resources quietly shipped the partial state, or
  • A test gap doesn't catch missing exports from phases-helpers.js.

Add an import-time sanity check (or a test that imports every entry in the extension index and asserts all required symbols resolve).
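Such a sanity check could be a tiny generic helper, run either at extension load time or from a test that imports every entry in the extension index (helper name and call site are hypothetical):

```javascript
// Throw early if a module is missing any required exported function,
// instead of failing later with "does not provide an export named ...".
function assertExports(mod, required, name = 'module') {
  const missing = required.filter((sym) => typeof mod[sym] !== 'function');
  if (missing.length) {
    throw new Error(`${name} is missing exports: ${missing.join(', ')}`);
  }
}
```

Usage would be along the lines of `assertExports(await import('./phases-helpers.js'), ['closeoutAndStop'], 'phases-helpers')`, which turns the current noisy-but-ignored load error into a hard, attributable failure in CI.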