prompts/parallel-research-slices.md step 3 told the dispatcher to verify
research at `.sf/{{mid}}/`, but slice research files actually live at
`.sf/milestones/{{mid}}/slices/<sliceId>/<sliceId>-RESEARCH.md`. Step 3
verification could only ever fail.
prompts/validate-milestone.md sent the three milestone-validation reviewer
agents to wrong paths:
- parentTrace pointed at `.sf/{{milestoneId}}/S0X-SUMMARY.md` (slice
summaries actually live at `.sf/milestones/{{milestoneId}}/slices/S0X/`)
- Reviewer A read `.sf/{{milestoneId}}/REQUIREMENTS.md` (the file is at
project-level `.sf/REQUIREMENTS.md`)
- Reviewer A scanned `.sf/{{milestoneId}}/` for slice SUMMARYs (wrong dir)
- Reviewer C read `.sf/{{milestoneId}}/CONTEXT.md` (actual file is
`.sf/milestones/{{milestoneId}}/{{milestoneId}}-CONTEXT.md`)
Reviewers would either return false MISSING / FAIL verdicts or have to
re-discover the layout.
docs/dev/ADR-{008,009}-IMPLEMENTATION-PLAN.md "Related ADR" links pointed
to absolute paths inside a contributor's old Mac (`/Users/jeremymcspadden/
Github/sf-2/...`). Replaced with sibling-file relative paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ADR-009 Implementation Plan
Related ADR: ADR-009-orchestration-kernel-refactor.md
Status: Draft
Date: 2026-04-14
Target Window: 8-10 waves (incremental, no big-bang rewrite)
Objective
Implement ADR-009 by migrating SF orchestration internals to a Unified Orchestration Kernel (UOK) without breaking existing CLI/web/MCP workflows. The UOK has six control planes:
- Plan
- Execution
- Model
- Gate
- GitOps
- Audit
The first production-safe outcome is:
- existing auto-mode behavior remains stable
- new kernel contracts exist behind feature flags
- every turn is traceable with deterministic gate outcomes
Non-Goals
- Rewriting user-facing command surfaces
- Replacing all legacy modules in a single PR
- Introducing new provider auth flows that bypass existing compliance boundaries
- Forcing `burn-max` behavior as default
Constraints
- Maintain current runtime compatibility and defaults
- Preserve existing state-on-disk and DB-backed transition model
- Keep provider-agnostic behavior while enforcing provider-specific policy constraints
- All migration steps must be reversible behind flags
- High-risk changes require parity tests against existing behavior
Program Structure
Implementation is organized into parallel workstreams and executed in waves.
Workstream A: Kernel Contracts and Orchestrator Spine
Goal: define typed contracts and a new orchestration spine without changing behavior.
Primary targets:
- `src/resources/extensions/sf/auto.ts`
- `src/resources/extensions/sf/auto/loop.ts`
- `src/resources/extensions/sf/auto/types.ts`
- `src/resources/extensions/sf/auto/session.ts`
Deliverables:
- `TurnContract` and `TurnResult` types
- `GateResult` envelope
- kernel entrypoint that wraps current dispatch loop via adapter
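As a rough illustration of the adapter-first idea, the sketch below wraps a stand-in for the legacy dispatch loop and translates its outcome into a structured result. The type names `TurnContract`, `TurnResult`, and `GateResult` come from the plan; every field name, the `LegacyDispatch` signature, and the `runTurn` helper are hypothetical, and the real loop is presumably asynchronous.

```typescript
// Hypothetical shapes: the field names are illustrative assumptions,
// not the contracts defined by ADR-009.
type GateStatus = "pass" | "fail" | "retry" | "manual-attention";

interface GateResult {
  gateId: string;
  status: GateStatus;
  detail?: string;
}

interface TurnContract {
  turnId: string;
  unitId: string;
  phase: string;
}

interface TurnResult {
  turnId: string;
  ok: boolean;
  gates: GateResult[];
}

// Stand-in for the existing dispatch loop (kept synchronous for brevity).
type LegacyDispatch = (unitId: string) => { success: boolean };

// Adapter: call the legacy loop unchanged, then translate its outcome
// into the structured TurnResult envelope.
function runTurn(contract: TurnContract, legacy: LegacyDispatch): TurnResult {
  const outcome = legacy(contract.unitId);
  return {
    turnId: contract.turnId,
    ok: outcome.success,
    gates: [
      { gateId: "legacy-dispatch", status: outcome.success ? "pass" : "fail" },
    ],
  };
}
```

Because the adapter only translates, behavior stays with the legacy loop, which is what makes a parity comparison meaningful.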
Workstream B: Gate Plane
Goal: normalize all checks into a unified gate runner.
Primary targets:
- `src/resources/extensions/sf/verification-gate.ts`
- `src/resources/extensions/sf/auto-verification.ts`
- `src/resources/extensions/sf/pre-execution-checks.ts`
- `src/resources/extensions/sf/post-execution-checks.ts`
- `src/resources/extensions/sf/milestone-validation-gates.ts`
Deliverables:
- unified gate registry and execution API
- deterministic failure classes and retry policies
- explicit terminal status persistence
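One possible shape for a unified gate registry is sketched here. It runs gates in registration order and stops at the first failure, so the recorded terminal status is deterministic. The `GateRegistry` class, the three failure classes, and the record layout are assumptions made for illustration; the plan does not specify them.

```typescript
// Hypothetical failure classes; the plan does not enumerate them.
type FailureClass = "transient" | "deterministic" | "policy";

type GateOutcome =
  | { status: "pass" }
  | { status: "fail"; failureClass: FailureClass; reason: string };

interface Gate {
  id: string;
  check: (input: string) => GateOutcome;
}

interface GateRecord {
  gateId: string;
  outcome: GateOutcome;
}

class GateRegistry {
  private readonly gates: Gate[] = [];

  register(gate: Gate): void {
    if (this.gates.some((g) => g.id === gate.id)) {
      throw new Error(`duplicate gate: ${gate.id}`);
    }
    this.gates.push(gate);
  }

  // Run gates in registration order and stop at the first failure,
  // so the same input always yields the same terminal record.
  run(input: string): GateRecord[] {
    const records: GateRecord[] = [];
    for (const gate of this.gates) {
      const outcome = gate.check(input);
      records.push({ gateId: gate.id, outcome });
      if (outcome.status === "fail") break;
    }
    return records;
  }
}
```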
Workstream C: Model Plane + Policy Engine
Goal: enable any-model-any-phase through requirement-based selection plus policy filtering.
Primary targets:
- `src/resources/extensions/sf/model-router.ts`
- `src/resources/extensions/sf/auto-model-selection.ts`
- `src/resources/extensions/sf/preferences-models.ts`
- `src/resources/extensions/sf/model-cost-table.ts`
- `src/resources/extensions/sf/custom-execution-policy.ts`
Deliverables:
- requirement vector builder for units
- policy filter before capability scoring
- new `burn-max` profile
- policy decision audit events
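The ordering "policy filter before capability scoring" could look roughly like the sketch below: denied candidates are removed first, and only the remainder is ranked against the requirement vector. The `ModelCandidate` shape, the weighted-sum score, and the `selectModel` helper are assumptions for illustration and do not reflect the existing router.

```typescript
// Hypothetical candidate and requirement shapes.
interface ModelCandidate {
  id: string;
  provider: string;
  capabilities: Record<string, number>;
}

// Requirement vector: capability name -> weight.
type Requirements = Record<string, number>;

// Returns true when the candidate is allowed for this unit.
type PolicyFilter = (model: ModelCandidate) => boolean;

function selectModel(
  candidates: ModelCandidate[],
  requirements: Requirements,
  policy: PolicyFilter,
): { chosen: ModelCandidate | undefined; denied: string[] } {
  // 1. Policy first: denied models are never scored.
  const denied = candidates.filter((m) => !policy(m)).map((m) => m.id);
  const allowed = candidates.filter((m) => policy(m));

  // 2. Score the survivors with a simple weighted sum (an assumed metric).
  const score = (m: ModelCandidate): number =>
    Object.keys(requirements).reduce(
      (sum, cap) => sum + (requirements[cap] ?? 0) * (m.capabilities[cap] ?? 0),
      0,
    );

  // 3. Sort by score, breaking ties by id so the result is deterministic.
  const ranked = allowed
    .slice()
    .sort((a, b) => score(b) - score(a) || a.id.localeCompare(b.id));

  return { chosen: ranked[0], denied };
}
```

Returning the denied ids alongside the choice gives the caller what it needs to emit policy decision audit events.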
Workstream D: Execution Graph (Agents/Subagents/Parallel/Teams)
Goal: move to one DAG scheduler contract.
Primary targets:
- `src/resources/extensions/sf/reactive-graph.ts`
- `src/resources/extensions/sf/slice-parallel-orchestrator.ts`
- `src/resources/extensions/sf/parallel-orchestrator.ts`
- `src/resources/extensions/sf/graph.ts`
- `src/resources/extensions/sf/unit-runtime.ts`
Deliverables:
- typed node kinds (`unit`, `hook`, `subagent`, `team-worker`, `verification`, `reprocess`)
- shared dependency/conflict resolver
- scheduler adapter for current parallel and reactive paths
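To make the single-DAG contract concrete, here is a minimal sketch of typed nodes plus a deterministic dependency resolver. The node kinds are the ones listed above; the `GraphNode` shape and the `scheduleOrder` function are hypothetical, and conflict resolution (for example file locks) is deliberately left out.

```typescript
type NodeKind =
  | "unit"
  | "hook"
  | "subagent"
  | "team-worker"
  | "verification"
  | "reprocess";

// Hypothetical node shape; only the kinds come from the plan.
interface GraphNode {
  id: string;
  kind: NodeKind;
  dependsOn: string[];
}

// Resolve an execution order in "waves": every node whose dependencies
// are already done becomes ready, and ready nodes are sorted by id so
// the order is deterministic. Throws when no progress is possible.
function scheduleOrder(nodes: GraphNode[]): string[] {
  const done: string[] = [];
  let pending = nodes.slice();
  while (pending.length > 0) {
    const ready = pending
      .filter((n) => n.dependsOn.every((d) => done.indexOf(d) !== -1))
      .map((n) => n.id)
      .sort();
    if (ready.length === 0) {
      const stuck = pending.map((n) => n.id).join(", ");
      throw new Error(`cycle or missing dependency among: ${stuck}`);
    }
    for (const id of ready) done.push(id);
    pending = pending.filter((n) => ready.indexOf(n.id) === -1);
  }
  return done;
}
```

Nodes that become ready in the same wave have no dependency on each other, so a parallel scheduler could run them concurrently while a single worker runs them in the listed order.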
Workstream E: GitOps Transaction Layer
Goal: guarantee git action and metadata record per turn.
Primary targets:
- `src/resources/extensions/sf/git-service.ts`
- `src/resources/extensions/sf/auto-post-unit.ts`
- `src/resources/extensions/sf/auto-unit-closeout.ts`
- `src/resources/extensions/sf/auto-worktree.ts`
Deliverables:
- `turn-start -> stage -> checkpoint -> publish -> record` transaction API
- configurable turn action mode (`commit|snapshot|status-only`)
- closeout gate integration for git failures
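The transaction sequence above can be modelled as a small state machine that rejects out-of-order steps. This is only a sketch: the step names and action modes are taken from the deliverables, while the `TurnTransaction` class and its methods are hypothetical and no git commands are run.

```typescript
const STEPS = ["turn-start", "stage", "checkpoint", "publish", "record"] as const;
type Step = (typeof STEPS)[number];
type TurnAction = "commit" | "snapshot" | "status-only";

// Hypothetical transaction tracker; it records step order only.
class TurnTransaction {
  readonly turnId: string;
  readonly action: TurnAction;
  readonly history: Step[] = [];

  constructor(turnId: string, action: TurnAction) {
    this.turnId = turnId;
    this.action = action;
  }

  // Accept a step only if it is the next one in the fixed sequence.
  advance(step: Step): void {
    const expected = STEPS[this.history.length];
    if (step !== expected) {
      throw new Error(
        `turn ${this.turnId}: expected ${String(expected)}, got ${step}`,
      );
    }
    this.history.push(step);
  }

  isComplete(): boolean {
    return this.history.length === STEPS.length;
  }
}
```

An error thrown by `advance` is the kind of signal a closeout gate could surface as a git failure.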
Workstream F: Unified Audit Plane
Goal: unify journal/activity/metrics into a causal event model.
Primary targets:
- `src/resources/extensions/sf/journal.ts`
- `src/resources/extensions/sf/activity-log.ts`
- `src/resources/extensions/sf/metrics.ts`
- `src/resources/extensions/sf/workflow-logger.ts`
- `src/resources/extensions/sf/sf-db.ts`
Deliverables:
- common `AuditEventEnvelope`
- trace/turn IDs on all events
- append-only JSONL raw log + DB projection index
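A minimal sketch of the raw-log-plus-projection split follows, kept in memory so it stays self-contained. `AuditEventEnvelope`, `traceId`, `turnId`, and `causedBy` are named in the plan; the remaining fields and both helper functions are assumptions, and a real implementation would append to a file and write the projection into DB tables.

```typescript
// Hypothetical envelope fields beyond traceId/turnId/causedBy.
interface AuditEventEnvelope {
  eventId: string;
  traceId: string;
  turnId: string;
  causedBy?: string; // eventId of the event that triggered this one
  type: string;
  payload: unknown;
}

// Raw log: one JSON document per line, in arrival order.
function toJsonl(events: AuditEventEnvelope[]): string {
  return events.map((e) => JSON.stringify(e)).join("\n") + "\n";
}

// Projection: rebuild a traceId -> events index from the raw log.
function projectByTrace(jsonl: string): Record<string, AuditEventEnvelope[]> {
  const index: Record<string, AuditEventEnvelope[]> = {};
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const event = JSON.parse(line) as AuditEventEnvelope;
    const bucket = index[event.traceId] ?? [];
    bucket.push(event);
    index[event.traceId] = bucket;
  }
  return index;
}
```

Since the projection is derived entirely from the raw log, it can be dropped and rebuilt without losing information.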
Workstream G: Plan Plane v2
Goal: formal multi-round clarify/research/draft/compile flow.
Primary targets:
- `src/resources/extensions/sf/guided-flow.ts`
- `src/resources/extensions/sf/preparation.ts`
- `src/resources/extensions/sf/auto/phases.ts`
- `src/resources/extensions/sf/auto-prompts.ts`
- prompt templates under `src/resources/extensions/sf/prompts/`
Deliverables:
- bounded multi-round question loop
- plan compile step producing executable unit graph
- plan gate fail-closed behavior
Wave Plan (Execution Order)
Wave 0: Baseline and Flag Scaffolding
Purpose: establish safe rollout controls and baseline telemetry.
Tasks:
- Add feature flags:
  - `uok.enabled`
  - `uok.gates.enabled`
  - `uok.model_policy.enabled`
  - `uok.execution_graph.enabled`
  - `uok.gitops.enabled`
  - `uok.audit_unified.enabled`
  - `uok.plan_v2.enabled`
- Add no-op kernel wrapper around current auto loop
- Add baseline metrics for parity comparison
Exit criteria:
- zero behavior change with all flags off
- parity telemetry collected for existing loop
Verification:
- `npm run typecheck:extensions`
- `npm run test:unit`
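The default-off flag wiring for Wave 0 might be sketched as follows. The flag names are the ones listed in the tasks; the `resolveFlags` and `planeEnabled` helpers are hypothetical, and treating `uok.enabled` as a master switch over the per-plane flags is an assumption, not something the plan states.

```typescript
const UOK_FLAGS = [
  "uok.enabled",
  "uok.gates.enabled",
  "uok.model_policy.enabled",
  "uok.execution_graph.enabled",
  "uok.gitops.enabled",
  "uok.audit_unified.enabled",
  "uok.plan_v2.enabled",
] as const;

type UokFlag = (typeof UOK_FLAGS)[number];
type FlagState = Record<UokFlag, boolean>;

// Every flag defaults to off; overrides are applied on top.
function resolveFlags(overrides: Partial<FlagState> = {}): FlagState {
  const state = {} as FlagState;
  for (const flag of UOK_FLAGS) {
    state[flag] = overrides[flag] ?? false;
  }
  return state;
}

// Assumption: a plane is active only when both the master flag
// and its own flag are on.
function planeEnabled(state: FlagState, flag: UokFlag): boolean {
  return state["uok.enabled"] && state[flag];
}
```

With no overrides, `planeEnabled` is false for every plane, which matches the "zero behavior change with all flags off" exit criterion.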
Wave 1: Contract Extraction
Purpose: create stable internal API boundaries.
Tasks:
- Introduce:
  - `TurnContract`
  - `UnitExecutionContext`
  - `GateResult`
  - `FailureClass`
  - `TurnCloseoutRecord`
- Adapter layer from legacy auto loop into contracts
- Add contract fixtures and serialization tests
Exit criteria:
- current auto dispatch runs through adapter path without behavior change
- all turn outcomes represented in structured result type
Verification:
- targeted tests in `src/resources/extensions/sf/tests/*auto*`
- `npm run test:unit`
Wave 2: Gate Plane Unification
Purpose: centralize pre/in/post checks and retries.
Tasks:
- Build `gate-runner` and gate registry
- Port existing checks into registered gates:
  - policy/input/execution/artifact/verification/closeout
- Implement deterministic retry matrix by failure class
Exit criteria:
- every unit passes through gate runner
- explicit gate result persisted for pass/fail/retry/manual-attention
Verification:
- extend `verification-gate.test.ts`
- extend `validation-gate-patterns.test.ts`
- add integration tests for retry escalation
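A deterministic retry matrix keyed by failure class could be as small as a lookup table, as in this sketch. The class names, retry counts, and escalation targets are made-up values for illustration; only the outcome labels (retry, fail, manual-attention) echo the exit criteria above.

```typescript
// Hypothetical failure classes and limits.
type FailureClass = "transient" | "verification" | "policy";
type RetryDecision = "retry" | "fail" | "manual-attention";

interface RetryRule {
  maxRetries: number;
  onExhausted: "fail" | "manual-attention";
}

const RETRY_MATRIX: Record<FailureClass, RetryRule> = {
  transient: { maxRetries: 3, onExhausted: "manual-attention" },
  verification: { maxRetries: 1, onExhausted: "fail" },
  policy: { maxRetries: 0, onExhausted: "fail" },
};

// `retriesSoFar` counts retries already performed for this gate.
// The decision depends only on the inputs, so it is reproducible.
function decideRetry(
  failureClass: FailureClass,
  retriesSoFar: number,
): RetryDecision {
  const rule = RETRY_MATRIX[failureClass];
  return retriesSoFar < rule.maxRetries ? "retry" : rule.onExhausted;
}
```

Keeping the matrix as data rather than branching logic makes retry escalation straightforward to cover with table-driven tests.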
Wave 3: Model Plane + Policy Filter
Purpose: enable requirement-based selection constrained by policy.
Tasks:
- Add requirement extraction from unit metadata
- Insert policy filter before model scoring
- Add `burn-max` token profile wiring
- Emit model policy allow/deny events
Exit criteria:
- units can select any eligible model across phases
- policy-denied routes fail before dispatch
- fallback chains remain deterministic
Verification:
- extend `model-cost-table.test.ts`
- extend model routing tests (`interactive-routing-bypass`, `tool-compatibility`, related router suites)
- add policy denial regression tests
Wave 4: Execution Graph Scheduler
Purpose: unify hooks/subagents/parallel/team work under one scheduler contract.
Tasks:
- Introduce graph scheduler facade
- Map reactive execution nodes to shared node model
- Map slice/milestone parallel orchestrators onto scheduler
- Add file IO conflict lock integration
Exit criteria:
- same task set can execute in deterministic single-worker or parallel graph mode
- no deadlock under known reactive/parallel fixtures
Verification:
- `slice-parallel-orchestrator.test.ts`
- `slice-parallel-conflict.test.ts`
- `sidecar-queue.test.ts`
- integration: `src/resources/extensions/sf/tests/integration/*.test.ts`
Wave 5: GitOps Transactions Per Turn
Purpose: enforce turn-level git actions and closeout discipline.
Tasks:
- Implement turn transaction API
- Wire turn transactions into auto closeout path
- Add configurable `turn_action` and `turn_push` semantics
- Persist git transaction metadata into audit stream
Exit criteria:
- each turn has a git transaction record
- blocked git states surface as closeout gate failures
Verification:
- `git-service` integration tests
- worktree-related integration suites
- closeout and merge regression suites
Wave 6: Unified Audit Plane
Purpose: converge logging/metrics/journal into one causal model.
Tasks:
- Define `AuditEventEnvelope` schema
- Add `traceId`, `turnId`, `causedBy` to event emitters
- Write projection pipeline into DB index tables
- Maintain append-only raw JSONL logs
Exit criteria:
- action-level traceability across model/tool/git/gate/test events
- legacy readers remain functional through compatibility projection
Verification:
- `workflow-logger*.test.ts`
- `workflow-events.test.ts`
- `journal` and `metrics` regression tests
Wave 7: Plan Plane v2
Purpose: deliver full multi-round planning and compile-to-unit graph.
Tasks:
- Implement bounded clarify rounds
- Add explicit research synthesis stage
- Add plan compile stage with dependency graph output
- Add plan gate with fail-closed checks
Exit criteria:
- full roadmap and unit graph produced before execution begins (when enabled)
- invalid plans cannot proceed to execution
Verification:
- prompt and plan parsing tests
- planning tool tests (`plan-milestone`, `plan-slice`, `plan-task`)
- discuss/guided flow regression tests
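"Fail-closed" for the plan gate means that anything other than an explicit pass blocks execution, including a check that throws or an empty check list. The sketch below illustrates that rule; the `PlanUnit` shape, the `planGate` function, and the two example checks are all hypothetical.

```typescript
// Hypothetical compiled-plan unit.
interface PlanUnit {
  id: string;
  dependsOn: string[];
}

type PlanCheck = (units: PlanUnit[]) => boolean;

// Fail closed: pass only when at least one check exists and every
// check returns true. A throwing check counts as a failure.
function planGate(units: PlanUnit[], checks: PlanCheck[]): "pass" | "fail" {
  if (checks.length === 0) return "fail";
  for (const check of checks) {
    try {
      if (check(units) !== true) return "fail";
    } catch {
      return "fail";
    }
  }
  return "pass";
}

// Example checks: the plan is non-empty and every dependency resolves.
const hasUnits: PlanCheck = (units) => units.length > 0;
const dependenciesResolve: PlanCheck = (units) => {
  const ids = units.map((u) => u.id);
  return units.every((u) => u.dependsOn.every((d) => ids.indexOf(d) !== -1));
};
```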
Wave 8: Legacy Branch Retirement + Default Flip
Purpose: reduce maintenance burden and enable UOK as default.
Tasks:
- remove superseded code paths in `auto.ts`, `auto-phases`, and legacy closeout paths
- keep legacy fallback behind emergency flag for one release window
- update docs and preferences reference
Exit criteria:
- UOK default in stable channel
- no critical parity regressions in one full release cycle
Verification:
- full `npm test`
- smoke + integration suites
- targeted manual UAT for CLI/web/headless
Testing and Validation Matrix
1. Unit
- contract serialization
- gate runner behavior by failure class
- model policy filter decisions
- git transaction state machine
- event envelope schema validation
2. Integration
- auto dispatch across plan/execute/complete/reassess/uat
- worktree/branch/none isolation behaviors
- parallel and reactive execution parity
- policy-denied dispatch fast-fail
3. End-to-End
- greenfield milestone from discuss -> plan -> execute -> complete -> merge
- failure reprocessing (test failure, tool failure, model failure)
- full audit trace reconstruction by `traceId`
- provider compliance scenarios (allowed vs denied paths)
4. Parity Harness
- replay selected historical workflows against legacy and UOK paths
- compare:
  - state transitions
  - produced artifacts
  - gate decisions
  - commit outcomes
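The comparison step of the parity harness can be sketched as a field-by-field diff of two recorded runs. The four compared dimensions are the ones listed above; the `RunTrace` shape (plain string lists) and the `diffTraces` helper are assumptions, and a real harness would likely need to normalize timestamps and ids before comparing.

```typescript
// Hypothetical recorded-run shape: one ordered list per dimension.
interface RunTrace {
  stateTransitions: string[];
  artifacts: string[];
  gateDecisions: string[];
  commitOutcomes: string[];
}

const PARITY_FIELDS: (keyof RunTrace)[] = [
  "stateTransitions",
  "artifacts",
  "gateDecisions",
  "commitOutcomes",
];

// Return the dimensions on which the legacy and UOK runs disagree.
// An empty result means parity on all compared dimensions.
function diffTraces(legacy: RunTrace, uok: RunTrace): (keyof RunTrace)[] {
  return PARITY_FIELDS.filter(
    (field) => JSON.stringify(legacy[field]) !== JSON.stringify(uok[field]),
  );
}
```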
Rollout Strategy
Stages
- Internal dogfood with flags on
- Beta cohort opt-in via project preference
- General availability with flags default-on
- Legacy fallback removed after stability window
Safety Controls
- runtime kill-switch for each plane
- release-note explicit migration warnings
- auto-rollback trigger on critical regressions (gates, git integrity, state corruption)
Data and Schema Changes
Expected schema additions:
- audit projection tables in `sf.db`
- gate result persistence tables
- turn transaction metadata
Rules:
- additive migrations only until Wave 8
- keep backwards-compatible readers during migration window
Dependencies
- Stable contract definitions before gate/model/scheduler rewires
- Gate plane before gitops hard enforcement
- Model policy engine before enabling any-model-any-phase by default
- Audit envelope before legacy logger removal
- Plan v2 before enforcing front-loaded planning defaults
Risk Register
Risk 1: Hidden Coupling in Auto Loop
Impact: migration bugs due to implicit side effects.
Mitigation: adapter-first extraction and parity harness before path switch.
Risk 2: Parallel Deadlocks
Impact: blocked runs or inconsistent state.
Mitigation: graph-level deadlock checks, IO lock tests, staged rollout behind flags.
Risk 3: Git Noise / Team Workflow Friction
Impact: commit churn and review overhead.
Mitigation: milestone squash defaults and configurable turn transaction modes.
Risk 4: Policy Drift Across Providers
Impact: compliance regressions.
Mitigation: provider policy registry tests and release checklist gates.
Risk 5: Telemetry Volume Growth
Impact: storage/perf pressure in long-running projects.
Mitigation: append-only raw + indexed projection + retention policies.
Definition of Done (ADR-009)
ADR-009 is complete when all are true:
- UOK path is default and stable.
- All units execute through unified gate runner.
- Model selection supports any eligible model in any phase with policy enforcement.
- Hooks/agents/subagents/parallel/team execution runs through one scheduler contract.
- Turn-level git transaction record exists for every executed turn.
- Unified audit events provide causal traceability across orchestration, model, tool, git, and test actions.
- Plan v2 can produce a complete unit graph with fail-closed plan gate.
- `burn-max` profile is available and policy-safe.
- Legacy orchestration branches are retired or behind emergency-only fallback.
- CLI/web/headless behavior remains user-compatible.
Recommended Immediate Next Tasks (Week 1)
- Add Wave 0 feature flags and default-off wiring.
- Introduce contract types and adapter shell (Wave 1 scaffolding).
- Add parity telemetry capture for legacy loop baseline.
- Land initial tests for contract serialization and turn result envelopes.