singularity-forge/TODO.md
Mikael Hugo 7ef58422b1
TODO: feature requests for batch backlog ingestion + probe-based resolution
Real dogfood for the auto-triage feature: this is the unstructured dump
that the autonomous cycle should pick up and process into proper backlog
items the next time it runs. Until auto-triage is wired up, the contents
serve as a written spec for what's needed.

Two flagship features:

- Auto-triage TODO.md on each autonomous cycle. `commands-todo.js`
  already implements `/todo triage` (manual). Wire it to the autonomous
  orchestrator and skip when TODO.md == _EMPTY_TODO.

- When the LLM would ask a clarifying question, replace it with parallel
  combatant + partner probes (adversarial-challenge +
  collaborative-research) and only fall back to asking a human if probes
  diverge AND interactive mode is available. This unblocks unattended
  `headless new-milestone` (the gap that blocked batch backlog
  ingestion today).

Plus five smaller items (headless milestone stall fix, bulk
import-roadmap, TTY-free plan list, hand-authorable milestone scaffold,
discoverable --answers schema) carried over from the
centralcloud-ops SF-IMPROVEMENTS.md observations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 19:09:26 +02:00


TODO

Dump anything here. /todo triage (and eventually the autonomous loop) processes it into proper backlog / harness / evals / docs.


Auto-triage TODO.md on each autonomous cycle

`commands-todo.js` already implements the triage (`/todo triage` → `triageTodoDump`). Today it's manual only. Wire it into the autonomous orchestrator so each cycle starts by checking whether TODO.md has content beyond the empty template, and if so runs `triageTodoDump` before picking the next unit.

Triage costs one LLM call (Minimax M2.7 or similar, per `PREFERRED_TRIAGE_MODEL_PATTERNS`), which is cheap relative to a full cycle. Skip when TODO.md matches the `_EMPTY_TODO` template so cycles aren't penalised when there's nothing new.

Test data: this file. When the wire lands, this section gets converted to a real backlog item automatically and the file resets to the empty template.

When SF needs to ask a question, resolve via probes instead

Today: the agent inside `headless new-milestone` calls `ask_user_questions`. In headless / autonomous mode that surface stalls or returns "tool unavailable", and the milestone never lands.

Wanted behaviour: replace blocking user questions with adversarial + collaborative resolution:

  • Combatant probe — adversarial agent challenges the assumption behind the question. "Why do you need this answer? What if it's the opposite? What evidence would change your mind?"
  • Partner probe — collaborative agent does targeted research to surface the most likely answer from the codebase / existing context / prior milestones.
  • Both run in parallel, with a short budget (e.g. 30 s / 2 tool calls each).
  • If they converge → proceed with the resolved answer, note the decision and confidence in the milestone artifact.
  • If they diverge AND no human is reachable → make the conservative call (minimal scope) and flag the unresolved question in OPEN-QUESTIONS.md for later human review.
  • Only fall back to actually asking the human if interactive mode is available and the question is high-stakes.

This makes headless new-milestone --context … actually finish unattended, which is the gap that's blocking batch backlog ingestion right now.
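The converge/diverge decision above can be sketched as follows. The probe functions, the string-equality convergence test, and the result shape are all assumptions for illustration, not SF's real API:

```javascript
// Sketch: run combatant + partner probes in parallel under a shared
// time budget, then converge, ask a human, or flag an open question.
async function resolveQuestion(question, opts) {
  const { combatantProbe, partnerProbe, budgetMs = 30_000, interactive = false, askHuman } = opts;
  const withBudget = (p) =>
    Promise.race([p, new Promise((res) => setTimeout(() => res(null), budgetMs))]);

  const [challenge, research] = await Promise.all([
    withBudget(combatantProbe(question)), // adversarial: attack the assumption
    withBudget(partnerProbe(question)),   // collaborative: research likely answer
  ]);

  const converged = challenge && research && challenge.answer === research.answer;
  if (converged) {
    return { answer: research.answer, confidence: "probe-converged" };
  }
  if (interactive && askHuman) {
    return { answer: await askHuman(question), confidence: "human" };
  }
  // Diverged or timed out with no human reachable: proceed conservatively
  // and flag the unresolved question for later review.
  return { answer: research ? research.answer : null, confidence: "open-question", flag: "OPEN-QUESTIONS.md" };
}
```

A real convergence test would likely be semantic (an LLM judge or embedding similarity) rather than string equality, but the control flow is the same.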

Headless new-milestone is broken in unattended mode

Reproduce: sf headless new-milestone --context-text "…complete spec…" → agent invokes ask_user_questions → "tool unavailable" error → no milestone created.

Two paths to fix (either works, both ideal):

  1. Prompt-level: instruct the agent that when --context or --context-text is provided, that's the complete spec, and to proceed without follow-up questions. Cheap (prompt change).
  2. Tool-level: in headless mode without --supervised, have ask_user_questions resolve through the combatant/partner probe flow described above rather than failing.
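Path 2 could be a thin shim at the point where tools are registered; every name here is illustrative, not SF's actual tool-registration API:

```javascript
// Illustrative shim: in headless mode without --supervised, reroute
// ask_user_questions through a probe resolver instead of erroring.
function wireAskUserQuestions(tools, { headless, supervised, resolveViaProbes }) {
  if (!headless || supervised) return tools; // interactive path unchanged
  return {
    ...tools,
    ask_user_questions: async (questions) => {
      const answers = [];
      for (const q of questions) answers.push(await resolveViaProbes(q));
      return answers;
    },
  };
}
```

The agent keeps calling the same tool name either way, so the prompt-level fix (path 1) and this shim are independent and can land separately.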

Bulk roadmap import

sf headless import-roadmap --file BACKLOG.md — read flat markdown with H2 sections and bullet items, emit one milestone per H2, slices per item, no LLM. Pure text → SF-structure transform.

Useful for ingesting BACKLOG.md from .sf/wiki/ (or from a human's roadmap file) without 16 LLM round-trips.

Schema: H2 = milestone title. Following paragraph = milestone context. Each `- ⬜` bullet = one slice (done items are filtered out). Optional H3 = phase boundary inside the milestone.
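The schema above is a pure text transform, so the whole parser fits in a page. A sketch — the output object shape is an assumption, not SF's real milestone structure:

```javascript
// Flat markdown roadmap → milestone/slice structure, no LLM involved.
function parseRoadmap(markdown) {
  const milestones = [];
  let current = null;
  let phase = null;
  for (const line of markdown.split("\n")) {
    const h2 = line.match(/^## (.+)/);     // milestone title
    const h3 = line.match(/^### (.+)/);    // phase boundary
    const todo = line.match(/^- ⬜ (.+)/); // open slice; "- ✅" never matches
    if (h2) {
      current = { title: h2[1].trim(), context: "", slices: [] };
      phase = null;
      milestones.push(current);
    } else if (h3 && current) {
      phase = h3[1].trim();
    } else if (todo && current) {
      current.slices.push({ title: todo[1].trim(), phase });
    } else if (current && current.slices.length === 0 && line.trim() && !line.startsWith("- ")) {
      // Paragraph between the H2 and the first bullet becomes context.
      current.context += (current.context ? " " : "") + line.trim();
    }
  }
  return milestones;
}
```

Note `/^## /` requires a space after exactly two hashes, so H3 lines never collide with the H2 match.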

sf plan list should have a TTY-free variant

sf plan list fails with "Interactive mode requires a terminal" in non-TTY environments. The actual operation (listing files in .sf/milestones/) needs no interaction. `sf plan list --plain` or `sf headless plan list`, emitting one milestone id and title per line, would be enough.

Hand-authorable milestone scaffold

Today a milestone is a directory tree with CONTEXT.md, MILESTONE-SUMMARY.md, ROADMAP.md, SUMMARY.md, plus slices/SNN/ and tasks/TNN/. Naming uses an ID + 6-char hash that's not documented.

Wanted: a documented "minimum milestone" — say, just CONTEXT.md with frontmatter (`id: MNNN`, `title: …`) — that SF accepts and auto-fills the rest of the tree from on first operation. This lets humans (or other tools) hand-author milestones when SF's LLM scaffold is unavailable or overkill.
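A minimum hand-authored milestone might look like this (the frontmatter keys come from the proposal above; everything else is illustrative):

```markdown
---
id: M042
title: Example hand-authored milestone
---

Free-form context: why this milestone exists and what "done" means.
SF fills in MILESTONE-SUMMARY.md, ROADMAP.md, SUMMARY.md, slices/
and tasks/ on first operation.
```

The undocumented 6-char hash suffix would presumably be generated by SF at adoption time rather than required from the author.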

Discoverable --answers schema

sf headless has --answers <path> for pre-supplying interactive answers, but the answer schema for each command isn't discoverable.

Wanted: `sf headless new-milestone --print-answer-schema`, which emits the JSON schema of every question the command might ask, so a caller can pre-supply answers rather than running interactively first to record them. This complements the probe-resolution flow above: if probes converge, use that; if they diverge but the caller pre-supplied an answer via --answers, use that instead of falling back to OPEN-QUESTIONS.md.
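A sketch of what --print-answer-schema might emit — the question names and shape here are invented for illustration, not SF's actual schema:

```json
{
  "command": "new-milestone",
  "questions": {
    "scope": {
      "type": "string",
      "description": "One-line statement of what is in and out of scope"
    },
    "risk_tolerance": {
      "type": "string",
      "enum": ["conservative", "balanced", "aggressive"],
      "default": "conservative"
    }
  }
}
```

A caller would then write `{"scope": "…", "risk_tolerance": "balanced"}` to a file and pass it via `--answers <path>` without ever opening an interactive session.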