diff --git a/TODO.md b/TODO.md
index 4bc590508..0703d5fd2 100644
--- a/TODO.md
+++ b/TODO.md
@@ -1,109 +1,59 @@
 # TODO
 
-Dump anything here. `/todo triage` (and eventually the autonomous loop) processes it into proper backlog / harness / evals / docs.
+Dump anything here.
 
 ---
 
-## Auto-triage TODO.md on each autonomous cycle
+## Triage should auto-promote Tier 1 items into real milestones
 
-`commands-todo.js` already implements the triage (`/todo triage` →
-`triageTodoDump`). Today it's manual only. Wire it to the autonomous
-orchestrator so each cycle starts by checking if TODO.md has content
-beyond the empty template, and if so runs `triageTodoDump` before
-picking the next unit.
+Confirmed today: `/todo triage` works — it produces a plan doc and tiers
+items into `BUILD_PLAN.md`. **But it stops at tiering.** Tier 1 items —
+the ones marked blocking / correctness — should additionally trigger
+the milestone-creation path so they land as `.sf/milestones/MNNN/`
+scaffolds, not just tier-list rows that have to be promoted by hand.
 
-Triage cost is one LLM call (Minimax M2.7 etc per `PREFERRED_TRIAGE_MODEL_PATTERNS`),
-which is cheap relative to a cycle. Skip when TODO.md == `_EMPTY_TODO`
-template so cycles aren't penalised when there's nothing new.
+Concretely: at the end of `triageTodoDump()` in
+`src/resources/extensions/sf/commands-todo.js`, after items are
+inserted into `triage_runs`/`triage_items` and reflected in
+`BUILD_PLAN.md`, iterate items where `tier === "T1"` (or some configured
+threshold) and call into the same milestone-creation flow that
+`/sf new-milestone` uses. Each Tier-1 triage item's context (the
+already-LLM-rationalised "Why" + "Implementation note") becomes the
+spec for one milestone.
 
-Test data: this file. When the wire lands, this section gets converted
-to a real backlog item automatically and the file resets to the empty
-template.
+Two design decisions to make:
 
-## When SF needs to ask a question, resolve via probes instead
+1. **Auto-create vs queue.** Safer default is "queue, await human
+   approval before scaffolding files" — so triage emits a small
+   `triage_milestone_candidates` row and a follow-up command
+   (`/sf triage-milestones approve `) does the actual scaffold.
+   More aggressive: skip the human gate when there's high confidence
+   from the triage LLM that the item is well-scoped.
 
-Today: agent inside `headless new-milestone` calls
-`ask_user_questions`. In headless / autonomous mode that surface stalls
-or returns "tool unavailable", and the milestone never lands.
+2. **What counts as Tier 1.** The triage LLM already decides this
+   based on the spec wording. Document the criteria so it's stable
+   across model swaps (e.g. "correctness / blocking / safety = T1;
+   ergonomics / nice-to-have = T2-3").
 
-Wanted behaviour: replace blocking user questions with **adversarial-
-collaborative resolution**:
+When this ships, today's seven triage items (recorded in
+`docs/plans/todo-triage-2026-05-11-plan.md`) should retroactively get
+milestones for the two Tier 1 entries (headless unattended fix +
+adversarial-collaborative probes).
 
-- **Combatant probe** — adversarial agent challenges the assumption
-  behind the question. "Why do you need this answer? What if it's the
-  opposite? What evidence would change your mind?"
-- **Partner probe** — collaborative agent does targeted research to
-  surface the most likely answer from the codebase / existing context /
-  prior milestones.
-- Both run in parallel, with a short budget (e.g. 30 s / 2 tool calls
-  each).
-- If they converge → proceed with the resolved answer, note the
-  decision and confidence in the milestone artifact.
-- If they diverge AND no human is reachable → make the conservative
-  call (minimal scope) and flag the unresolved question in
-  `OPEN-QUESTIONS.md` for later human review.
-- Only fall back to actually asking the human if interactive mode is
-  available and the question is high-stakes.
+## Triage runner had an extension load error
 
-This makes `headless new-milestone --context …` actually finish
-unattended, which is the gap that's blocking batch backlog ingestion
-right now.
+Today's `/todo triage` run printed at the top of its output:
 
-## Headless `new-milestone` is broken in unattended mode
+```
+[sf] Extension load error Error: Failed to load extension
+"/home/mhugo/.sf/agent/extensions/sf/index.js":
+The requested module './phases-helpers.js' does not provide an
+export named 'closeoutAndStop'
+```
 
-Reproduce: `sf headless new-milestone --context-text "…complete spec…"`
-→ agent invokes `ask_user_questions` → "tool unavailable" error → no
-milestone created.
-
-Two paths to fix (either works, both ideal):
-
-1. Prompt-level: instruct the agent that when `--context` or
-   `--context-text` is provided, that's the complete spec, and to
-   proceed without follow-up questions. Cheap (prompt change).
-2. Tool-level: in headless mode without `--supervised`, have
-   `ask_user_questions` resolve through the combatant/partner probe
-   flow described above rather than failing.
-
-## Bulk roadmap import
-
-`sf headless import-roadmap --file BACKLOG.md` — read flat markdown
-with H2 sections and bullet items, emit one milestone per H2, slices
-per item, no LLM. Pure text → SF-structure transform.
-
-Useful for ingesting `BACKLOG.md` from `.sf/wiki/` (or from a human's
-roadmap file) without 16 LLM round-trips.
-
-Schema: H2 = milestone title. Following paragraph = milestone context.
-Each `- ⬜` bullet = one slice (`✅` filters out done items). Optional
-H3 = phase boundary inside the milestone.
-
-## `sf plan list` should have a TTY-free variant
-
-`sf plan list` fails with "Interactive mode requires a terminal" in
-non-TTY. The actual operation (list files in `.sf/milestones/`) needs
-no interaction. `sf plan list --plain` or `sf headless plan list`
-that emits one milestone-id-and-title per line would be enough.
-
-## Hand-authorable milestone scaffold
-
-Today a milestone is a directory tree with `CONTEXT.md`,
-`MILESTONE-SUMMARY.md`, `ROADMAP.md`, `SUMMARY.md`, plus `slices/SNN/`
-and `tasks/TNN/`. Naming uses an ID + 6-char hash that's not documented.
-
-A documented "minimum milestone" — say, just `CONTEXT.md` with
-frontmatter `id: MNNN\ntitle: …` — that SF will accept and auto-fill
-the rest of the tree from on first operation. Lets humans (or other
-tools) hand-author milestones when SF's LLM scaffold is unavailable or
-overkill.
-
-## Discoverable `--answers` schema
-
-`sf headless` has `--answers ` for pre-supplying interactive
-answers, but the answer schema for each command isn't discoverable.
-
-`sf headless new-milestone --print-answer-schema` that emits the JSON
-schema of every question the command *might* ask, so a caller can
-pre-supply rather than running interactively first to record them.
-Complements the probe-resolution flow above — if probes converge,
-use that; if they diverge but the caller pre-supplied an answer via
-`--answers`, use that instead of falling back to OPEN-QUESTIONS.md.
+Triage itself still completed successfully, so it's non-fatal — but
+some extension in the agent harness expected `closeoutAndStop` from
+`phases-helpers.js` and didn't find it. Either the symbol was renamed
+and a caller wasn't updated, or `npm run copy-resources` didn't sync
+the right file. Likely caught by a test if there's one for the
+phases-helpers exports list.
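
Note on the added "Triage should auto-promote Tier 1 items" entry: the step it describes (iterate items where `tier === "T1"` and queue them rather than scaffold immediately) could look roughly like the sketch below. This is only an illustration of the safer "queue" default. The function name `selectMilestoneCandidates`, the item fields (`title`, `why`, `implementationNote`), and the `pending_approval` status are assumptions made for the example; none of them are confirmed to exist in `commands-todo.js`, and the real `triage_milestone_candidates` row shape is not specified in the TODO.

```javascript
// Hypothetical sketch: names and field shapes are assumptions, not the
// actual commands-todo.js API.

// Tiers that get promoted; the TODO suggests this could be configurable.
const DEFAULT_PROMOTE_TIERS = ["T1"];

/**
 * Turn tiered triage items into queued milestone candidates.
 * Nothing is scaffolded here; each candidate waits for human approval.
 */
function selectMilestoneCandidates(items, promoteTiers = DEFAULT_PROMOTE_TIERS) {
  const tiers = new Set(promoteTiers);
  return items
    .filter((item) => tiers.has(item.tier))
    .map((item) => ({
      title: item.title,
      // The "Why" and "Implementation note" text becomes the milestone spec.
      spec: [item.why, item.implementationNote].filter(Boolean).join("\n\n"),
      status: "pending_approval",
    }));
}

// Usage with made-up items: only the T1 entry becomes a candidate.
const exampleItems = [
  { tier: "T1", title: "Example blocking item", why: "Breaks unattended runs.", implementationNote: "Fix at the tool level." },
  { tier: "T2", title: "Example nice-to-have item", why: "Ergonomics.", implementationNote: "Add a flag." },
];
const candidates = selectMilestoneCandidates(exampleItems);
console.log(candidates.map((c) => c.title));
```

Keeping the selection a pure function leaves the two open design decisions (auto-create vs queue, and what counts as Tier 1) as configuration around it rather than logic inside it.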