TODO
Dump anything here.
Cross-repo triage / unified backlog view
Today's dogfood: a scan across active repos found roughly 40 TODO.md files
totalling more than 10,000 lines across /srv/infra, /srv/operations-memory,
/home/mhugo/code/singularity-engine (27 subdir TODOs, 9,000+ lines),
/home/mhugo/code/inference-fabric (8 crate TODOs), plus per-repo
singletons in ace-coder, dks-web, vectordrive, centralcloud, etc.
The per-subdir files are not noise — most are substantive design specs scoped to their domain/crate/service. Collapsing them into a single root file would destroy useful structure.
The actual gap: no single way to see "what's queued across all the repos" at once. Today this requires walking N repos by hand.
Wanted:
sf headless triage-all-repos --config ~/.sf/repos.yaml
Where ~/.sf/repos.yaml is a list of repo paths, each with an optional
per-repo priority. For each repo:
- If TODO.md has non-template content, run triageTodoDump in that repo's SF db.
- After all repos are triaged, emit a unified report: one row per backlog item across all repos, sortable by priority / tier / inserted_at.
- Optionally produce a single ~/.sf/cross-repo-view.md for quick human reading.
Per-repo SF dbs stay separate (each repo owns its work); the cross-repo view is read-only aggregation.
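The read-only aggregation step could look roughly like the sketch below. It assumes each repo's triage yields items with `title`, `tier`, and `insertedAt` fields and that a lower `priority` number means more urgent; the function name and the item shape are hypothetical, not part of SF today.

```javascript
// Hypothetical input shape (not an SF API):
// repos: [{ path, priority?, items: [{ title, tier, insertedAt }] }]
function compareStrings(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

function mergeBacklog(repos) {
  const rows = [];
  for (const repo of repos) {
    for (const item of repo.items) {
      rows.push({
        repo: repo.path,
        // Assumption: lower number = higher priority; unset sorts last.
        priority: repo.priority ?? Number.MAX_SAFE_INTEGER,
        tier: item.tier,
        title: item.title,
        insertedAt: item.insertedAt,
      });
    }
  }
  // Order by repo priority, then tier (T1 before T2 before T3),
  // then oldest first (ISO-8601 timestamps sort lexicographically).
  rows.sort(
    (a, b) =>
      a.priority - b.priority ||
      compareStrings(a.tier, b.tier) ||
      compareStrings(a.insertedAt, b.insertedAt)
  );
  return rows;
}
```

Nothing here writes to any per-repo db; the function only reads already-triaged items, which matches the "each repo owns its work" constraint.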
Slash command /todo triage should actually invoke the typed backend
Observed today: sf --print "/todo triage" ran the agent, which read
TODO.md and emitted a triage-shaped markdown response, but the agent
did not call handleTodo → triageTodoDump — it re-implemented the
flow in natural language via Read/Write tools. Side effect: a patched
backend in commands-todo.js was bypassed entirely.
Wanted: when a slash command has a registered typed handler in the
extension surface (i.e. handleTodo, handleNewMilestone, …), the
agent's prompt should require the call go through that handler rather
than letting the LLM improvise. The handler can be invoked as a tool
call so the LLM still has narrative space, but the side effects (DB
writes, file scaffolds, etc.) come from the typed path, not from raw
Write/Edit on TODO.md.
Concretely:
- In slash-commands.md (or wherever the slash dispatch prompt lives), enumerate handlers and forbid the LLM from "doing the work" itself when a typed handler exists.
- Add an integration test that runs sf --print "/todo triage" against a fixture TODO.md and asserts that triage_runs rows appear in sf.db (i.e. the backend ran, not just the LLM).
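One way to picture the typed path is a small registry that claims a slash command before the LLM gets a chance to improvise. This is a sketch only: `registerHandler` and `dispatchSlash` are hypothetical names, and the real extension surface may wire handlers differently.

```javascript
// Hypothetical registry; the real handleTodo / handleNewMilestone would be
// registered here in place of the stubs a test might use.
const typedHandlers = new Map();

function registerHandler(command, handler) {
  typedHandlers.set(command, handler);
}

// Returns { handled: true, result } when a typed handler owns the command,
// or { handled: false } so the caller can fall back to the LLM.
function dispatchSlash(input) {
  const match = /^\/(\S+)\s*(.*)$/.exec(input.trim());
  if (!match) return { handled: false };
  const [, command, args] = match;
  const handler = typedHandlers.get(command);
  if (!handler) return { handled: false };
  // Side effects (DB writes, scaffolds) happen inside the handler only.
  return { handled: true, result: handler(args) };
}
```

With something like this in place, the integration test above reduces to checking that the handler's side effects exist after `sf --print "/todo triage"` runs.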
Triage result needs structured tier/priority per item
Current shape:
result.implementation_tasks: string[] // titles only
result.memory_requirements: string[]
result.harness_suggestions: string[]
result.docs_or_tests: string[]
result.unclear_notes: string[]
result.eval_candidates: { id, task_input, expected_behavior, … }[]
Tiers (T1 / T2 / T3) appear only in the LLM-prose tier list it appends
to BUILD_PLAN.md. They are not present as a structured field per
item. That blocks any downstream "for each Tier-1 item, scaffold a
milestone" automation — the tier info is locked in prose.
Wanted: extend the triage JSON schema so each implementation task is
{ title: string, tier: "T1" | "T2" | "T3", rationale: string }
and update appendBacklogItems + a future milestone-escalator to read
the structured tier rather than re-parsing markdown.
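A minimal sketch of what consuming the proposed shape might look like, assuming the triage JSON delivers each implementation task as `{ title, tier, rationale }`. The helper names are hypothetical; appendBacklogItems and the milestone-escalator are not shown.

```javascript
const TIERS = new Set(["T1", "T2", "T3"]);

// Validates one implementation task against the proposed shape.
// Throws on anything malformed so a bad LLM response fails loudly
// instead of silently losing the tier.
function parseImplementationTask(raw) {
  if (typeof raw !== "object" || raw === null) {
    throw new TypeError("task must be an object");
  }
  const { title, tier, rationale } = raw;
  if (typeof title !== "string" || title.trim() === "") {
    throw new TypeError("task.title must be a non-empty string");
  }
  if (!TIERS.has(tier)) {
    throw new TypeError("task.tier must be one of T1, T2, T3");
  }
  if (typeof rationale !== "string") {
    throw new TypeError("task.rationale must be a string");
  }
  return { title: title.trim(), tier, rationale };
}

// Groups validated tasks by tier, e.g. so a later step can pick up
// every T1 item without re-parsing markdown.
function groupByTier(tasks) {
  const groups = { T1: [], T2: [], T3: [] };
  for (const task of tasks.map(parseImplementationTask)) {
    groups[task.tier].push(task);
  }
  return groups;
}
```

A legacy title-only string fails validation here, which makes an un-migrated triage response visible rather than quietly tierless.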
Sha-track every source-of-truth markdown file, diff on change
Generalised from the milestone-files case: any markdown file that is a source of truth for SF or for humans navigating the repo should be sha-tracked, and any change since SF last saw it should surface as a diff for review (or auto-accept under a configured policy).
In scope (per repo):
- Repo-level meta: AGENTS.md, README.md, STATUS.md, BACKLOG.md, STANDALONE.md, MIGRATION.md, etc. (any uppercase root-level .md)
- Pointer: .github/copilot-instructions.md
- Wiki: .sf/wiki/**/*.md
- Planning: .sf/milestones/**/*.md (CONTEXT, MILESTONE-SUMMARY, ROADMAP, SUMMARY per milestone; PLAN / SUMMARY per slice; same per task)
- ADRs: docs/adr/**/*.md (these should rarely change, so any edit is loud and worth surfacing)
- Triage outputs: docs/plans/**/*.md
Explicit out of scope:
- TODO.md: gets reset to empty template by /todo triage on every cycle; tracking churn here is just noise.
- CHANGELOG.md / BUILD_PLAN.md: append-only by design; sha churn is expected, no signal in tracking.
- node_modules, dist, vendored copies: irrelevant.
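The scope rules above can be sketched as a path classifier. The regexes are hand-written stand-ins for the globs (a real implementation would more likely use a glob library), mapping the pointer file to the 'meta' category is an assumption, and vendored copies are not detected here because the source does not say where they live.

```javascript
// Explicitly excluded paths win over every category match.
const OUT_OF_SCOPE = [
  /^TODO\.md$/,
  /^CHANGELOG\.md$/,
  /^BUILD_PLAN\.md$/,
  /(^|\/)(node_modules|dist)\//,
];

// [category, pattern] pairs, checked in order.
const CATEGORIES = [
  ["wiki", /^\.sf\/wiki\/.+\.md$/],
  ["milestone", /^\.sf\/milestones\/.+\.md$/],
  ["adr", /^docs\/adr\/.+\.md$/],
  ["plan", /^docs\/plans\/.+\.md$/],
  // Assumption: the pointer file is filed under 'meta'.
  ["meta", /^\.github\/copilot-instructions\.md$/],
  // Any uppercase root-level .md (AGENTS.md, README.md, ...).
  ["meta", /^[A-Z][A-Z0-9_-]*\.md$/],
];

// Returns the category for a repo-relative path, or null when the
// file should not be tracked at all.
function classify(relpath) {
  if (OUT_OF_SCOPE.some((re) => re.test(relpath))) return null;
  for (const [category, re] of CATEGORIES) {
    if (re.test(relpath)) return category;
  }
  return null;
}
```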
Storage in sf.db — sha + git ref, with snapshot only as a fallback
for uncommitted observations. SF generates many of these files
itself; storing every version in the DB would duplicate disk + git
for no benefit. But we still need a reference point to compute diffs
against — that's the versioning question.
CREATE TABLE tracked_md_files (
relpath TEXT PRIMARY KEY, -- repo-relative path
sha256 TEXT NOT NULL, -- hash of last-seen content
size_bytes INTEGER NOT NULL,
last_seen_at TEXT NOT NULL,
last_seen_commit TEXT, -- git SHA1 of HEAD when we saw it
uncommitted_snapshot BLOB, -- gzipped, ONLY if observed in working tree
category TEXT -- 'meta'|'wiki'|'milestone'|'adr'|'plan'
);
Versioning + diff source decision tree per file:
1. Observed at commit X (file was clean at the time) → store last_seen_commit = X, uncommitted_snapshot = NULL. Diff later = git show X:<path> vs current. Cheap, no DB blob.
2. Observed with uncommitted changes (working-tree state at time of observation) → store uncommitted_snapshot = gzip(content), last_seen_commit = HEAD-at-the-time-anyway. Diff later = unpack the snapshot vs current. Necessary because there is no git ref that ever held that exact content.
3. File untracked or in .gitignore (transient SF state, generated artifacts) → either skip tracking entirely (preferred), or treat it like case 2 (always store snapshot). Don't pretend a git ref exists when it doesn't.
In practice most md SF deals with is case 1 — committed at observation time — so the snapshot blob stays NULL for most rows. The DB stays small; the working-tree-edit corner case still has a clean diff.
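The three cases can be sketched as one function that decides which columns to write. It assumes the caller has already asked git whether the file is tracked and whether it differs from HEAD; `planObservation` and `snapshotText` are hypothetical names, and the sketch takes the preferred policy of skipping case 3.

```javascript
const { gzipSync, gunzipSync } = require("node:zlib");

// Decides what to store for one observed file.
// Returns null when the file should get no row at all.
function planObservation({ content, headCommit, trackedByGit, dirty }) {
  if (!trackedByGit) {
    // Case 3: untracked or ignored. Preferred policy: skip entirely.
    return null;
  }
  if (!dirty) {
    // Case 1: HEAD holds exactly this content, so the ref is enough.
    return { last_seen_commit: headCommit, uncommitted_snapshot: null };
  }
  // Case 2: no git ref ever held this content, so keep a gzipped copy.
  return {
    last_seen_commit: headCommit,
    uncommitted_snapshot: gzipSync(Buffer.from(content, "utf8")),
  };
}

// Recovers the diff baseline from a case-2 row.
function snapshotText(row) {
  return gunzipSync(row.uncommitted_snapshot).toString("utf8");
}
```

For a case-1 row the baseline would instead come from `git show <last_seen_commit>:<relpath>`, which is deliberately left out of the sketch.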
On session start + each autonomous-cycle entry, walk the configured
glob set, hash each file, diff against tracked_md_files.sha256.
For each changed file:
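- Surface to operator: "N files changed since SF last saw — review or accept?" with per-file diff (computed from git where a commit ref exists, otherwise from the uncommitted snapshot).
- On accept → update sha, last_seen_at, and last_seen_commit. Content is stored only in the uncommitted-snapshot fallback case.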
- New files (sha not in DB) → classify by glob category, store sha, continue.
- Deleted files → archive the DB row (mark inactive); don't purge until operator confirms.
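The scan itself reduces to comparing two maps. A minimal sketch, assuming the active rows of tracked_md_files have been loaded into a `Map` of relpath to sha256 and the glob walk has produced a `Map` of relpath to file content; both the function name and those in-memory shapes are assumptions.

```javascript
const { createHash } = require("node:crypto");

function sha256(content) {
  return createHash("sha256").update(content).digest("hex");
}

// tracked: Map<relpath, sha256 hex> loaded from tracked_md_files
// current: Map<relpath, content> produced by walking the glob set
function diffAgainstTracked(tracked, current) {
  const changed = [];
  const added = [];
  const deleted = [];
  for (const [relpath, content] of current) {
    const seen = tracked.get(relpath);
    if (seen === undefined) {
      added.push(relpath); // new file: classify and store its sha
    } else if (seen !== sha256(content)) {
      changed.push(relpath); // surface a diff to the operator
    }
  }
  for (const relpath of tracked.keys()) {
    if (!current.has(relpath)) {
      deleted.push(relpath); // mark the row inactive, do not purge
    }
  }
  return { changed, added, deleted };
}
```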
Useful for:
- hand-edits / cross-agent edits / git pulls (the original milestone-files motivation)
- catching when an AGENTS.md drifted because someone edited it during a code review and nobody told SF
- ADR drift detection — ADRs should almost never change; if one does, surface it loudly
- treating .sf/wiki/* as living docs that need review when they drift from what sf has internalised
Storage cost: on the order of 100-200 bytes per file (the hex sha256 alone is
64 characters, plus relpath, timestamp, and commit SHA) + optional gzipped
snapshot (typically 30-70 % of original size). Negligible vs. the
rest of sf.db.
Phases-helpers extension-load error on every SF run
Every sf … invocation today prints:
[sf] Extension load error Error: Failed to load extension
"/home/mhugo/.sf/agent/extensions/sf/index.js": The requested module
'./phases-helpers.js' does not provide an export named 'closeoutAndStop'
Non-fatal (SF continues), but noisy and a sign of stale state. Either:
- A recent rename of closeoutAndStop in phases-helpers.js wasn't propagated to its caller, and npm run copy-resources quietly shipped the partial state, or
- A test gap doesn't catch missing exports from phases-helpers.js.
Add an import-time sanity check (or a test that imports every entry in the extension index and asserts all required symbols resolve).
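Such a check could be sketched as below, assuming a hand-maintained manifest that lists which symbols each extension module must export. The manifest and both function names are hypothetical; the only real mechanism relied on is that a dynamic `import()` returns the module namespace, which can be inspected without triggering the link-time error shown above.

```javascript
// Returns the names from `required` that the module namespace lacks.
function missingExports(moduleNamespace, required) {
  return required.filter((name) => !(name in moduleNamespace));
}

// manifest: { [moduleSpecifier]: string[] of required export names }
// `load` defaults to dynamic import and is injectable for tests.
async function checkExtension(manifest, load = (spec) => import(spec)) {
  const problems = [];
  for (const [specifier, required] of Object.entries(manifest)) {
    const ns = await load(specifier);
    const missing = missingExports(ns, required);
    if (missing.length > 0) problems.push({ specifier, missing });
  }
  return problems;
}
```

Run as a test or at extension load, a non-empty result would name the module and the missing symbol (here, closeoutAndStop) before any caller trips over it.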