Captures a real bug class observed during today's session: nothing notices when a milestone file (CONTEXT.md, ROADMAP.md, slice PLAN.md, etc.) is edited out of band — by a human, another agent, or a git pull. SF keeps using the cached state and drifts. Wanted: per-file sha tracking in sf.db, diff surface on change, + hooks for accept/reject/import/archive. Storage cost negligible. Useful in concert with the cross-repo triage and slash-command routing gaps already in this TODO.md — together they close most of the "unattended SF actually works" surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TODO
Dump anything here.
Cross-repo triage / unified backlog view
Today's dogfood: a scan across active repos found ~40 TODO.md files
totalling 10,000+ lines across /srv/infra, /srv/operations-memory,
/home/mhugo/code/singularity-engine (27 subdir TODOs, 9,000+ lines),
/home/mhugo/code/inference-fabric (8 crate TODOs), plus per-repo
singletons in ace-coder, dks-web, vectordrive, centralcloud, etc.
The per-subdir files are not noise — most are substantive design specs scoped to their domain/crate/service. Collapsing them into a single root file would destroy useful structure.
The actual gap: no single way to see "what's queued across all the repos" at once. Today this requires walking N repos by hand.
Wanted:
sf headless triage-all-repos --config ~/.sf/repos.yaml
Where ~/.sf/repos.yaml is a list of repo paths and (optional) per-repo
priority. For each repo:
- If TODO.md has non-template content, run triageTodoDump in that repo's SF db.
- After all repos are triaged, emit a unified report: one row per backlog item across all repos, sortable by priority / tier / inserted_at.
- Optionally produce a single ~/.sf/cross-repo-view.md for quick human reading.
Per-repo SF dbs stay separate (each repo owns its work); the cross-repo view is read-only aggregation.
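The read-only aggregation step could look roughly like the sketch below. It is a minimal sketch, not existing SF code: buildCrossRepoView and renderMarkdown are hypothetical names, and it assumes each repo's triage yields items with title, tier, and an ISO-8601 inserted_at string, and that a lower priority number sorts first.

```javascript
// Hypothetical sketch: merge per-repo backlog items into one sorted, read-only view.
// Field names (priority, tier, inserted_at) are assumptions taken from the report columns above.

const TIER_ORDER = { T1: 0, T2: 1, T3: 2 };

// Repos without a configured priority sort after all prioritised repos.
const prio = (row) => row.priority ?? Number.MAX_SAFE_INTEGER;
// Unknown or missing tiers sort after T3.
const tierRank = (row) => TIER_ORDER[row.tier] ?? 99;

function buildCrossRepoView(repos) {
  const rows = [];
  for (const repo of repos) {
    for (const item of repo.items) {
      rows.push({
        repo: repo.path,
        priority: repo.priority,
        tier: item.tier,
        title: item.title,
        inserted_at: item.inserted_at,
      });
    }
  }
  // Sort by repo priority, then tier, then insertion time (oldest first).
  // Comparing inserted_at as strings is only valid for ISO-8601 timestamps.
  rows.sort(
    (a, b) =>
      prio(a) - prio(b) ||
      tierRank(a) - tierRank(b) ||
      a.inserted_at.localeCompare(b.inserted_at)
  );
  return rows;
}

// Render the rows as a markdown table, e.g. for ~/.sf/cross-repo-view.md.
function renderMarkdown(rows) {
  const header = "| repo | priority | tier | title |\n| --- | --- | --- | --- |";
  const body = rows.map(
    (r) => `| ${r.repo} | ${r.priority ?? "-"} | ${r.tier} | ${r.title} |`
  );
  return [header, ...body].join("\n");
}
```

Because the function only reads the per-repo results and returns new rows, each repo's SF db stays the owner of its own backlog.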
Slash command /todo triage should actually invoke the typed backend
Observed today: sf --print "/todo triage" ran the agent, which read
TODO.md and emitted a triage-shaped markdown response, but the agent
did not call handleTodo → triageTodoDump — it re-implemented the
flow in natural language via Read/Write tools. Side effect: a patched
backend in commands-todo.js was bypassed entirely.
Wanted: when a slash command has a registered typed handler in the
extension surface (i.e. handleTodo, handleNewMilestone, …), the
agent's prompt should require the call go through that handler rather
than letting the LLM improvise. The handler can be invoked as a tool
call so the LLM still has narrative space, but the side effects (DB
writes, file scaffolds, etc.) come from the typed path, not from raw
Write/Edit on TODO.md.
Concretely:
- In slash-commands.md (or wherever the slash dispatch prompt lives), enumerate handlers and forbid the LLM from "doing the work" itself when a typed handler exists.
- Add an integration test that runs sf --print "/todo triage" against a fixture TODO.md and asserts that triage_runs rows appear in sf.db (i.e. the backend ran, not just the LLM).
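One way to picture the routing rule is a small dispatch table that is consulted before the LLM gets to improvise. The sketch below is illustrative only: registerHandler, dispatchSlashCommand, and the in-memory triageRuns array are hypothetical stand-ins, and the registered function merely imitates what handleTodo → triageTodoDump would do against sf.db.

```javascript
// Hypothetical sketch of routing slash commands through registered typed handlers.
// The handler below is a stand-in; the real handlers live in the SF extension.

const handlers = new Map();

function registerHandler(command, handler) {
  handlers.set(command, handler);
}

// Returns the handler's result when a typed handler exists. Otherwise it
// reports routed: false, meaning the command may be handled in free form.
function dispatchSlashCommand(input) {
  const [command, ...args] = input.trim().split(/\s+/);
  const handler = handlers.get(command);
  if (!handler) {
    return { routed: false, command };
  }
  return { routed: true, command, result: handler(args) };
}

// Stand-in for the triage_runs table: side effects come from the typed path.
const triageRuns = [];

registerHandler("/todo", (args) => {
  if (args[0] === "triage") {
    triageRuns.push({ started_at: Date.now() });
    return "triaged";
  }
  return "unknown subcommand";
});
```

The integration test described above would then assert on the side effect (a new triage_runs row) and not on the shape of the LLM's markdown output.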
Triage result needs structured tier/priority per item
Current shape:
result.implementation_tasks: string[] // titles only
result.memory_requirements: string[]
result.harness_suggestions: string[]
result.docs_or_tests: string[]
result.unclear_notes: string[]
result.eval_candidates: { id, task_input, expected_behavior, … }[]
Tiers (T1 / T2 / T3) appear only in the prose tier list that the LLM
appends to BUILD_PLAN.md. They are not present as a structured field
per item. That blocks any downstream "for each Tier-1 item, scaffold a
milestone" automation, because the tier info is locked in prose.
Wanted: extend the triage JSON schema so each implementation task is
{ title: string, tier: "T1" | "T2" | "T3", rationale: string }
and update appendBacklogItems + a future milestone-escalator to read
the structured tier rather than re-parsing markdown.
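A consumer of the extended schema could validate each task and select by tier without touching markdown. The following is a sketch under the assumption that implementation_tasks becomes an array of { title, tier, rationale } objects as proposed above; parseImplementationTask and tasksForTier are hypothetical helper names.

```javascript
// Hypothetical sketch: validate structured implementation tasks and filter by tier.

const TIERS = new Set(["T1", "T2", "T3"]);

// Validates one structured implementation task; throws on a malformed item.
function parseImplementationTask(raw) {
  if (typeof raw !== "object" || raw === null) {
    throw new TypeError("task must be an object");
  }
  const { title, tier, rationale } = raw;
  if (typeof title !== "string" || title.length === 0) {
    throw new TypeError("title must be a non-empty string");
  }
  if (!TIERS.has(tier)) {
    throw new TypeError(`tier must be T1, T2, or T3 (got ${String(tier)})`);
  }
  if (typeof rationale !== "string") {
    throw new TypeError("rationale must be a string");
  }
  return { title, tier, rationale };
}

// Returns the validated tasks of one tier, e.g. as input to a milestone escalator.
function tasksForTier(result, tier) {
  return result.implementation_tasks
    .map(parseImplementationTask)
    .filter((task) => task.tier === tier);
}
```

Failing loudly on an unknown tier keeps a malformed LLM response from silently dropping items out of the Tier-1 automation.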
Detect manual edits to milestone files (sha-tracked, diff on change)
Milestone files (CONTEXT.md, MILESTONE-SUMMARY.md, ROADMAP.md,
SUMMARY.md + each slice's PLAN.md / SUMMARY.md + each task's
PLAN.md / SUMMARY.md) are source of truth for SF planning. Today
nothing notices if a human (or another agent) edits one out of band:
SF keeps using the in-memory or DB-cached state, drifts from disk, and
downstream tools see a different milestone than the human just wrote.
Wanted: on session start (and on each autonomous-cycle entry), walk
.sf/milestones/**/*.md, hash each file, compare to the last-known
sha in sf.db (new columns milestone_files.sha256 and
milestone_files.last_seen_at). For any file whose sha has changed:
- Compute the diff against the last-seen content (stored alongside the sha as a compressed blob, or just re-fetched from git if the file is tracked).
- Surface to the operator: "Milestone M003-abc123 CONTEXT.md changed since SF last saw it — review or accept?" with the diff inline.
- If accepted (or in autonomous mode with a configured policy), update the DB-cached version + sha and continue. If rejected, restore from the last-known content.
- New files (sha not in DB) → import as if they were a fresh new-milestone scaffold and add to the index.
- Deleted files (DB has sha but file is gone) → mark the milestone archived and prompt the operator before purging.
Useful for: hand-edits, cross-agent edits (another LLM in a different session modified the milestone), git pulls that bring in upstream changes to milestone files, the milestone-from-triage path I sketched earlier (so the autonomous loop notices its own scaffold).
Storage cost: roughly 40 bytes per file (a raw SHA-256 digest is 32
bytes, plus last_seen_at) plus optional compressed snapshot. Negligible
vs. the rest of sf.db.
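The comparison step described above can be sketched as a pure function over two maps. This is an illustration, not SF code: classifyMilestoneFiles is a hypothetical name, the known map stands in for the sha column read from sf.db, and the onDisk map stands in for the result of walking .sf/milestones/**/*.md.

```javascript
// Hypothetical sketch: classify milestone files by comparing SHA-256 digests.
const { createHash } = require("node:crypto");

// Hex-encoded SHA-256 of a string or Buffer.
function sha256(content) {
  return createHash("sha256").update(content).digest("hex");
}

// known:  Map<path, sha256 hex> as last recorded (stand-in for sf.db rows)
// onDisk: Map<path, file content> as currently found on disk
function classifyMilestoneFiles(known, onDisk) {
  const report = { unchanged: [], changed: [], added: [], deleted: [] };
  for (const [path, content] of onDisk) {
    const previous = known.get(path);
    if (previous === undefined) {
      report.added.push(path); // sha not in DB: import as a new scaffold
    } else if (previous === sha256(content)) {
      report.unchanged.push(path);
    } else {
      report.changed.push(path); // surface a diff, then accept or reject
    }
  }
  for (const path of known.keys()) {
    if (!onDisk.has(path)) {
      report.deleted.push(path); // DB has sha but file is gone: archive, then prompt
    }
  }
  return report;
}
```

Each bucket maps onto one of the hooks above: changed files go to the accept/reject prompt, added files to import, and deleted files to archive.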
Phases-helpers extension-load error on every SF run
Every sf … invocation today prints:
[sf] Extension load error Error: Failed to load extension
"/home/mhugo/.sf/agent/extensions/sf/index.js": The requested module
'./phases-helpers.js' does not provide an export named 'closeoutAndStop'
Non-fatal (SF continues), but noisy and a sign of stale state. Either:
- A recent rename of closeoutAndStop in phases-helpers.js wasn't propagated to its caller, and npm run copy-resources quietly shipped the partial state, or
- A test gap doesn't catch missing exports from phases-helpers.js.
Add an import-time sanity check (or a test that imports every entry in the extension index and asserts all required symbols resolve).
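Such a check could be as small as the sketch below. The helper names missingExports and assertExports are hypothetical, and the list of required symbols per module is assumed to be maintained by hand or derived from the callers; neither exists in SF today as far as this note knows.

```javascript
// Hypothetical sketch: verify that a loaded module exposes every symbol its callers need.

// Returns the names in `required` that the module namespace does not export.
function missingExports(moduleNamespace, required) {
  return required.filter((name) => !Object.hasOwn(moduleNamespace, name));
}

// Throws a single readable error naming the module and every missing symbol.
function assertExports(modulePath, moduleNamespace, required) {
  const missing = missingExports(moduleNamespace, required);
  if (missing.length > 0) {
    throw new Error(`${modulePath} is missing exports: ${missing.join(", ")}`);
  }
}
```

In a test, the namespace argument would come from a dynamic import() of each helper module, so a rename that was not propagated fails the test run before it reaches an installed extension.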