singularity-forge/TODO.md
Mikael Hugo 617608347d fix(sf): align auto-mode prompts to canonical sf_task_complete / sf_slice_complete
Auto-mode prompts called legacy aliases (sf_complete_task, sf_complete_slice)
while guided used canonical (sf_task_complete, sf_slice_complete). The
divergence was locked in by the test 'auto execute-task requires legacy
completion alias until prompt contract is aligned' — explicit tech debt
marker.

Migrated:
- workflow-mcp.ts getRequiredWorkflowToolsForAutoUnit: returns canonical
- prompts/execute-task.md: 4 callsites
- prompts/complete-slice.md: 3 callsites
- prompts/reactive-execute.md: checked, no callsites in this file
- workflow-mcp.test.ts: assertion + transport-error fixtures
- Test rename: 'requires legacy completion alias' → 'requires canonical'

The aliases stay registered (sf_complete_task → sf_task_complete) so
external callers and old session resumes don't break. tool-naming.test.ts
still asserts both names route to the same handler.

Resolves: sf-moohqbza-yyq8sd.
Tests: workflow-mcp + tool-naming 29/29 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 17:25:53 +02:00


# Raw Dump Inbox
## Eval Candidates
1. Test note for CI mode verification
---
## SF Hardening Backlog From Claude Code Scan
### Goal
Make it impossible for SF auto-mode to loop on broken docs, missing artifacts, stale runtime state, or ambiguous background-unit completion. `ROADMAP.md` must become a rendered, human-readable artifact, not executable dispatch state.
### Track 1 - Canonical State And No-Doc-Loop Dispatch
- [ ] Add `getCanonicalMilestonePlan(basePath, milestoneId)` as the only dispatch-facing milestone plan accessor.
- [ ] Prefer DB slice rows when DB is available and populated.
- [ ] Add `.sf/milestones/Mxxx/Mxxx-ROADMAP.json` as the structured fallback projection.
- [ ] Treat `ROADMAP.md` as rendered display only.
- [ ] Keep Markdown parsing only for import, migration, doctor repair, and parser tests.
- [ ] Move parallel research, single-slice research, prior-slice guard, UAT/validation dispatch, and prompt slice enumeration to canonical plan state.
- [ ] Add generated marker/hash metadata to `ROADMAP.md`.
- [ ] Stop dispatch when DB/projection/Markdown disagree.
- [ ] Add `/sf doctor --fix` support to re-render generated roadmap artifacts from canonical state.
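
The source-precedence rules in this track can be sketched as a pure function. This is a minimal sketch, not the planned `getCanonicalMilestonePlan` implementation: the `SliceRow` and `PlanSources` shapes, the `resolveCanonicalPlan` name, and the stop messages are all assumptions made for illustration, and only DB-versus-projection disagreement is checked here.

```typescript
// Hypothetical shapes: the real DB rows and ROADMAP.json schema are not specified in this TODO.
interface SliceRow {
  id: string;
  title: string;
  done: boolean;
}

interface PlanSources {
  dbSlices: SliceRow[] | null; // null when the DB is unavailable
  projectionSlices: SliceRow[] | null; // parsed Mxxx-ROADMAP.json, null when missing
  markdownExists: boolean; // ROADMAP.md is present on disk
}

type PlanResult =
  | { kind: "plan"; source: "db" | "projection"; slices: SliceRow[] }
  | { kind: "stop"; reason: string };

// Two plans agree when they list the same slice ids in the same order.
function sameSliceIds(a: SliceRow[], b: SliceRow[]): boolean {
  if (a.length !== b.length) return false;
  return a.every((row, i) => row.id === b[i]?.id);
}

function resolveCanonicalPlan(src: PlanSources): PlanResult {
  // "Available and populated": an empty DB result does not count as a plan.
  const db = src.dbSlices !== null && src.dbSlices.length > 0 ? src.dbSlices : null;
  const projection = src.projectionSlices;

  if (db !== null && projection !== null && !sameSliceIds(db, projection)) {
    return { kind: "stop", reason: "DB and projection disagree; run /sf doctor" };
  }
  if (db !== null) return { kind: "plan", source: "db", slices: db };
  if (projection !== null) return { kind: "plan", source: "projection", slices: projection };
  if (src.markdownExists) {
    // Markdown is display-only, so it is never parsed for dispatch.
    return { kind: "stop", reason: "only ROADMAP.md exists; run doctor or migration" };
  }
  return { kind: "stop", reason: "no milestone plan found" };
}
```

Returning a `stop` result instead of throwing keeps the decision inspectable by callers such as the headless query, which is a design choice of this sketch rather than something the backlog prescribes.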
### Track 2 - Unit Runtime FSM
- [ ] Introduce durable unit runtime state under `.sf/runtime/units/*.json`.
- [ ] Model unit states: `queued`, `claimed`, `running`, `progress`, `completed`, `failed`, `blocked`, `cancelled`, `stale`, `runaway-recovered`, `notified`.
- [ ] Persist `retryCount`, `maxRetries`, `lastHeartbeatAt`, `lastProgressAt`, `lastOutputAt`, `outputPath`, `watchdogReason`, and `notifiedAt`.
- [ ] Prevent redispatch for terminal units with `notifiedAt`.
- [ ] Allow retry only when status is retryable and retry budget remains.
- [ ] Require explicit reset to rerun failed synthetic units like `parallel-research`.
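
The redispatch rules above could look roughly like the following decision function. It is a sketch under stated assumptions: the TODO does not say which statuses are terminal or retryable, so the `TERMINAL`, `RETRYABLE`, and `SYNTHETIC_TYPES` sets, the `unitType` field, and the `decideRedispatch` name are hypothetical.

```typescript
type UnitStatus =
  | "queued" | "claimed" | "running" | "progress" | "completed" | "failed"
  | "blocked" | "cancelled" | "stale" | "runaway-recovered" | "notified";

// Subset of the persisted fields listed above, plus a hypothetical unitType.
interface UnitRuntime {
  unitId: string;
  unitType: string;
  status: UnitStatus;
  retryCount: number;
  maxRetries: number;
  notifiedAt: string | null;
}

// Assumption: these groupings are illustrative, not taken from the SF codebase.
const TERMINAL = new Set<UnitStatus>(["completed", "failed", "cancelled", "runaway-recovered", "notified"]);
const RETRYABLE = new Set<UnitStatus>(["failed", "stale"]);
const SYNTHETIC_TYPES = new Set<string>(["parallel-research"]);

type Decision = "dispatch" | "retry" | "skip" | "needs-reset";

function decideRedispatch(unit: UnitRuntime): Decision {
  if (unit.status === "queued") return "dispatch";
  // Terminal and already notified: never dispatch again.
  if (TERMINAL.has(unit.status) && unit.notifiedAt !== null) return "skip";
  // Failed synthetic units rerun only after an explicit reset.
  const failedLike = unit.status === "failed" || unit.status === "runaway-recovered";
  if (failedLike && SYNTHETIC_TYPES.has(unit.unitType)) return "needs-reset";
  // Retry only while the budget remains.
  if (RETRYABLE.has(unit.status) && unit.retryCount < unit.maxRetries) return "retry";
  // In-flight units and exhausted budgets fall through to skip.
  return "skip";
}
```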
### Track 3 - Progress And Liveness
- [ ] Separate heartbeat, progress, and output growth.
- [ ] Treat silent-but-running as valid only when heartbeat is fresh.
- [ ] Add watchdog classifiers: dead PID, expired lease, no heartbeat, no output growth, permission prompt, interactive prompt, runaway recovery.
- [ ] Extend `sf headless --output-format json query` with active unit, status, elapsed time, retry count, watchdog reason, last progress time, and output path.
- [ ] Later: render TUI/footer status rows from the same runtime model.
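
One way to keep heartbeat and output growth separate is to classify them in a fixed order, so that a quiet unit with a fresh heartbeat is still reported as alive. The sketch below covers only four of the classifiers listed above (prompt detection and runaway recovery are left out), and the field names, thresholds, and the choice to report `no-output-growth` as advisory are assumptions.

```typescript
// Timestamps are epoch milliseconds; all field names are hypothetical.
interface LivenessSample {
  pidAlive: boolean;
  leaseExpiresAt: number;
  lastHeartbeatAt: number;
  lastOutputAt: number;
}

interface LivenessThresholds {
  heartbeatStaleMs: number;
  outputStaleMs: number;
}

type WatchdogReason = "dead-pid" | "expired-lease" | "no-heartbeat" | "no-output-growth";

interface LivenessVerdict {
  alive: boolean;
  reason: WatchdogReason | null;
}

function classifyLiveness(s: LivenessSample, nowMs: number, t: LivenessThresholds): LivenessVerdict {
  if (!s.pidAlive) return { alive: false, reason: "dead-pid" };
  if (nowMs > s.leaseExpiresAt) return { alive: false, reason: "expired-lease" };
  if (nowMs - s.lastHeartbeatAt > t.heartbeatStaleMs) return { alive: false, reason: "no-heartbeat" };
  // Silent but running: the heartbeat is fresh, so the unit stays alive and
  // the missing output is surfaced only as an advisory reason.
  if (nowMs - s.lastOutputAt > t.outputStaleMs) return { alive: true, reason: "no-output-growth" };
  return { alive: true, reason: null };
}
```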
### Track 4 - Event And Interrupt Policy
- [ ] Add explicit event origins: `user-message`, `system-steer`, `task-notification`, `memory-event`, `background-completion`, `permission-request`.
- [ ] Add interrupt behaviors: `interrupt`, `queue`, `block`, `drop-if-stale`.
- [ ] Default user-origin messages to interrupt active work.
- [ ] Default system/task/memory events to non-interrupting unless explicitly marked.
- [ ] Scope queues so main loop does not consume subagent-directed events and subagents do not consume main user prompts.
- [ ] Ensure background completions enqueue once and set `notifiedAt`.
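
The default policy and queue scoping above might be modelled as a lookup table plus a scoped dequeue. This is a sketch only: the `SfEvent` shape, the `scope` field, and the choice of `queue` as the non-interrupting default for every non-user origin are assumptions, since the track only says those events must not interrupt unless marked.

```typescript
type EventOrigin =
  | "user-message" | "system-steer" | "task-notification"
  | "memory-event" | "background-completion" | "permission-request";

type InterruptBehavior = "interrupt" | "queue" | "block" | "drop-if-stale";

interface SfEvent {
  id: string;
  origin: EventOrigin;
  scope: string; // hypothetical: "main" or a subagent id
  behavior?: InterruptBehavior; // explicit marking overrides the default
}

// Assumption: "queue" stands in for "non-interrupting" here.
const DEFAULT_BEHAVIOR: Record<EventOrigin, InterruptBehavior> = {
  "user-message": "interrupt",
  "system-steer": "queue",
  "task-notification": "queue",
  "memory-event": "queue",
  "background-completion": "queue",
  "permission-request": "queue",
};

function resolveBehavior(event: SfEvent): InterruptBehavior {
  return event.behavior ?? DEFAULT_BEHAVIOR[event.origin];
}

// Removes and returns only the events addressed to `scope`;
// events for other scopes stay in the queue untouched.
function takeForScope(queue: SfEvent[], scope: string): SfEvent[] {
  const taken = queue.filter((e) => e.scope === scope);
  const rest = queue.filter((e) => e.scope !== scope);
  queue.length = 0;
  queue.push(...rest);
  return taken;
}
```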
### Track 5 - Tool, Plugin, And Permission Boundaries
- [ ] Add explicit tool contracts for read/write behavior, concurrency safety, permission requirements, and interrupt behavior.
- [ ] Treat each background worker as a durable task with `taskId`, `parentUnitId`, status, output path, retry budget, and notification marker.
- [ ] Add doctor checks for suspicious hook/tool/plugin config.
- [ ] Ensure runtime permission errors name the denied tool/action and the relevant policy.
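
A tool contract and a permission error that names the denied tool, action, and policy could be shaped like this. Every name here (`ToolContract`, `PermissionDeniedError`, `checkPermission`, the field names, and the message format) is hypothetical; the sketch only illustrates the requirement that the error text identifies what was denied and why.

```typescript
// Hypothetical contract shape covering the four concerns listed above.
interface ToolContract {
  name: string;
  writes: boolean; // false means read-only
  concurrencySafe: boolean;
  requiredPermission: string | null; // null means no permission is needed
  interruptBehavior: "cancel" | "finish";
}

class PermissionDeniedError extends Error {
  tool: string;
  action: string;
  policy: string;

  constructor(tool: string, action: string, policy: string) {
    super(`Permission denied: tool "${tool}" may not ${action} (policy: ${policy})`);
    this.name = "PermissionDeniedError";
    this.tool = tool;
    this.action = action;
    this.policy = policy;
  }
}

// Throws when the contract requires a permission that has not been granted.
function checkPermission(
  contract: ToolContract,
  action: string,
  granted: ReadonlySet<string>,
  policy: string,
): void {
  const needed = contract.requiredPermission;
  if (needed !== null && !granted.has(needed)) {
    throw new PermissionDeniedError(contract.name, action, policy);
  }
}
```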
### Track 6 - Release And Privacy Hygiene
- [ ] Add packed-artifact scanner before release.
- [ ] Fail release artifacts containing inline source maps, `sourcesContent`, `.ts/.tsx` source, local absolute paths, secrets, or debug-only strings.
- [ ] Use `npm pack --dry-run --json` plus unpack inspection for npm artifacts.
- [ ] Add telemetry wrappers that allow numeric/boolean metadata by default and require reviewed wrappers for string metadata.
- [ ] Add tests for no-telemetry mode and no-nonessential-network mode.
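
The core of the artifact scanner can be sketched as a pure function over already-unpacked files, which keeps it easy to test against fixtures. The rule names, the `PackedFile` shape, the absolute-path pattern, and the exemption for `.d.ts` declaration files are assumptions; secret and debug-string detection are left out of this sketch.

```typescript
// Hypothetical in-memory view of an unpacked artifact.
interface PackedFile {
  path: string;
  content: string;
}

interface Finding {
  path: string;
  rule: "typescript-source" | "sources-content" | "inline-source-map" | "local-absolute-path";
}

function scanPackedFiles(files: PackedFile[]): Finding[] {
  const findings: Finding[] = [];
  for (const file of files) {
    // Assumption: declaration files are allowed to ship, raw .ts/.tsx source is not.
    if (/\.tsx?$/.test(file.path) && !file.path.endsWith(".d.ts")) {
      findings.push({ path: file.path, rule: "typescript-source" });
    }
    if (file.content.includes('"sourcesContent"')) {
      findings.push({ path: file.path, rule: "sources-content" });
    }
    if (/sourceMappingURL=data:/.test(file.content)) {
      findings.push({ path: file.path, rule: "inline-source-map" });
    }
    // Illustrative pattern for common home-directory prefixes only.
    if (/(?:\/Users\/|\/home\/|[A-Za-z]:\\Users\\)/.test(file.content)) {
      findings.push({ path: file.path, rule: "local-absolute-path" });
    }
  }
  return findings;
}
```

A release step would fail when the returned list is non-empty; how the files are obtained from the packed artifact is outside this sketch.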
### Eval Candidates
- [ ] Bad roadmap: DB has 2 slices, `ROADMAP.md` has 6 stale rows. Expected: dispatch uses DB/projection or stops; never dispatches stale rows.
- [ ] Projection fallback: DB unavailable, `ROADMAP.json` exists. Expected: dispatch succeeds from projection.
- [ ] Legacy unsafe fallback: DB unavailable, only `ROADMAP.md` exists. Expected: dispatch stops with doctor/migration instruction.
- [ ] Drift detection: `ROADMAP.md` marker hash mismatches projection. Expected: `/sf doctor` reports drift.
- [ ] Drift repair: `/sf doctor --fix` re-renders Markdown and clears drift.
- [ ] Synthetic unit failure: `parallel-research` is `runaway-recovered`. Expected: cannot redispatch unless explicitly reset.
- [ ] Notification idempotency: terminal unit with `notifiedAt` does not enqueue another completion notification.
- [ ] Retry budget: retryable failure increments `retryCount`; exceeding `maxRetries` becomes terminal.
- [ ] Stale heartbeat: a unit with a missing heartbeat becomes `stale` instead of being redispatched indefinitely.
- [ ] Interrupt policy: user steering interrupts active work; memory/system/task notifications do not interrupt by default.
- [ ] Queue scoping: subagent-scoped notifications are not consumed by the main loop.
- [ ] Release scanner: fixture artifact containing `sourcesContent` fails; clean packed artifact passes.
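
The drift detection and drift repair candidates above depend on a generated marker in `ROADMAP.md`. A minimal sketch of that check follows, assuming a hypothetical HTML-comment marker format and using a simple non-cryptographic FNV-1a hash over UTF-16 code units as a stand-in for whatever digest the project chooses.

```typescript
// FNV-1a, 32-bit, over UTF-16 code units. A stand-in digest for illustration only.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical marker format; the TODO does not define one.
const MARKER = /<!-- sf-generated hash=([0-9a-f]{8}) -->/;

// Renders Markdown with a marker tied to the projection content.
function renderRoadmap(projectionJson: string, body: string): string {
  return `<!-- sf-generated hash=${fnv1a(projectionJson)} -->\n${body}`;
}

type DriftStatus = "ok" | "drift" | "unmarked";

function detectDrift(markdown: string, projectionJson: string): DriftStatus {
  const match = MARKER.exec(markdown);
  if (match === null) return "unmarked";
  return match[1] === fnv1a(projectionJson) ? "ok" : "drift";
}
```

In this sketch, repair amounts to calling `renderRoadmap` again with the current projection, after which `detectDrift` reports `ok`.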
### Verification Commands
- [ ] `npx vitest run src/resources/extensions/sf/tests/parallel-research-dispatch.test.ts --config vitest.config.ts --reporter=verbose`
- [ ] Focused tests for canonical plan, doctor drift, runtime FSM, interrupt policy, and release scanner.
- [ ] `npm run typecheck:extensions`
- [ ] `sf headless --output-format json query` against bad-roadmap and failed-runtime fixtures.
### Implementation Order
1. Canonical plan accessor and structured roadmap projection.
2. Dispatch migration away from Markdown parsing.
3. Doctor drift detection and repair.
4. Unit runtime FSM and redispatch policy.
5. Headless query liveness fields.
6. Event interrupt policy.
7. Tool/plugin permission contracts.
8. Release artifact scanner and privacy wrappers.