land 6 parallel codex/sonnet rescue outputs
- R016 swarm bus deliver verify (uok/swarm-dispatch.js + test): _busDispatch now force-refreshes the target inbox and verifies messageId visibility before returning ok:true; the ack-without-deliver failure class is closed.
- R082 drift detection UokGate (uok/drift-detection-gate.js + test): single-task + sweep scope; 3 drift classes (artifact-missing, prose-status mismatch, broken-import); follows the ADR-0075 id/type/execute -> GateResult contract.
- R087 PDD typed contracts (engine-types.js + test): ADR-0000 8 PDD fields + 7-dim run-control policy + ADR-0075 GateResult typedefs and validators.
- R090 planning-execute lane split (auto/unit-lanes.js + auto/loop.js + 2 tests): lane classifier + capacity-aware tick dispatcher; the SF_LANES=0 fallback is byte-equivalent to pre-R090 behavior.
- R053 + R054 Wiggums detectors (detectors/repeated-feedback-kind.js + detectors/artifact-flap.js + 2 tests); R055 stale-lock + R056 periodic-runner source landed without tests (gap filed as self-feedback).
- M053 per-repo supervisor design + skeleton (supervisor/repo-supervisor.js + test + design doc): RepoSupervisor class, zero module-global state, tick stub, failure isolation; M056 trust boundary called out as follow-up.

85/85 tests green across the 8 new test files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent eaac4f0bd3
commit d2ff4e84ba

19 changed files with 2740 additions and 1 deletions
57  docs/dev/drafts/M053-per-repo-supervisor.md  Normal file

@@ -0,0 +1,57 @@
# M053 Per-Repo Supervisor

SF needs a lightweight supervisor layer so one SF operator can watch and steer many independent repositories without collapsing them into one planning state. ADR-0000 frames SF as a purpose-to-software compiler; this milestone treats each repository as a separate compilation unit with its own `.sf/` directory, its own SQLite state, and its own autonomous-loop boundary.

The goal is to supervise N independent repos from a single SF process or from a coordinated set of SF processes. A supervisor decides when a repo is eligible for one autonomous tick, reports current health to an operator, and leaves actual task execution to later milestones. M053 is deliberately not a dispatch implementation. It defines the process boundary and the status contract needed before real work can be scheduled safely.
## Architecture

The core runtime unit is one `RepoSupervisor` instance per repository. The class has no module-global state. Each instance owns its `repoPath`, `supervisorId`, inferred `.sf/sf.db` path, runaway-guard counters, dispatch queue placeholder, lifecycle flags, and status counters. A caller can create many supervisors in the same Node process without shared mutable state leaking between repositories.

Each supervisor is responsible for opening or receiving its own database connection for the repo-local `.sf/sf.db`. The current skeleton records the inferred database path and keeps a placeholder connection state; real SQLite access belongs in the next implementation slice. The important contract is that repo A never writes repo B's `.sf/` state and that cross-repo roll-ups are read-only summaries produced above the per-repo boundary.

The dispatch queue is also per instance. For M053, `tick()` only logs intent such as `would dispatch unit X` and returns a result object. It must not import or call the existing autonomous loop, `parallel-orchestrator.js`, `swarm-dispatch.js`, or any detector. This avoids the known module-global state in the old orchestrator and keeps the new supervisor boundary testable.
## Lifecycle

`start()` marks the supervisor as running and prepares repo-local resources. Future versions will open the SQLite connection and load pending queue state here.

`tick()` is one autonomous loop iteration for exactly one repository. It checks lifecycle state, returns `{ status: "paused" }` while paused, selects the next queued unit placeholder, logs the dispatch intent, and records the result. All exceptions are caught inside the method so one repo cannot crash the whole supervising process.

`pause()` prevents future ticks from dispatching work while preserving counters and queue state. `resume()` clears the pause flag. `stop()` closes the supervisor boundary for that repo and marks its placeholder database connection as closed. `getStatus()` returns a plain JSON-safe object containing `repoId`, `lastTickAt`, `lastTickResult`, `paused`, `ticksTotal`, and `errorsTotal`.
## Coordination

Two coordination models are viable.

Option A: each supervisor runs independently while sharing a global rate limiter for scarce resources such as LLM requests, Codex sessions, and heavy disk I/O. This keeps the POC small. Supervisors remain isolated, but the outer process injects shared budget controls before calling `tick()`.

Option B: a central scheduler chooses which supervisor ticks next based on repo priority, staleness, backoff state, and resource availability. This gives stronger fairness and operator control, but it needs a richer scheduling policy and more status history before it is worth the complexity.

M053 recommends Option A for the POC. It proves the per-repo boundary and status contract without creating a new orchestration kernel. Option B is the evolution path once multiple repos are being supervised and starvation, priority, or budget contention becomes observable.
## Failure Isolation

A panic in one repo supervisor must not take down other supervisors. `tick()` catches every error, increments `errorsTotal`, records `lastTickResult`, and returns an error result instead of throwing. The next layer should wrap each supervisor tick in its own try/catch as a second barrier.

The supervisor should also carry a small circuit breaker. After repeated failures, the supervisor can stop dispatching and report a degraded state until an operator resets it or a backoff window expires. The skeleton keeps the counters needed for that policy; the real breaker thresholds should be introduced with behaviour tests when dispatch is wired.
## Persistence

Each repository's state lives in that repository's `.sf/sf.db`. There are no cross-repo writes. Repo-local status output may be written to `.sf/supervisor-status.json` by the caller using `getStatus()`. A top-level operator can produce a roll-up file from those plain status objects, but that roll-up is a report, not an authority that mutates repo-local planning state.

This preserves the SF planning model: `.sf/sf.db` remains the canonical structured store for a repo, while generated JSON status is an operator-facing projection.
## Operator Interface

The intended operator surface is `sf supervisor status`. It should show one row per supervised repository with repo id, path, pause state, last tick time, last tick result, total ticks, total errors, and any circuit-breaker state. The command can also emit JSON for automation.

Each repo may expose `.sf/supervisor-status.json`, written from `RepoSupervisor.getStatus()`. A supervising process can additionally write a top-level roll-up containing all repo statuses and aggregate resource budget information. The roll-up should identify stale repos, paused repos, and repos with recent errors without reaching into another repo's `.sf/sf.db`.
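A plausible shape for `.sf/supervisor-status.json` follows. Field names come from the `getStatus()` contract above; the values and the timestamp encoding are illustrative assumptions.

```json
{
  "repoId": "repo-a",
  "lastTickAt": "2025-01-01T00:00:00Z",
  "lastTickResult": { "status": "ok", "note": "would dispatch unit X" },
  "paused": false,
  "ticksTotal": 42,
  "errorsTotal": 1
}
```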
## Open Questions

Secrets per repo are unresolved. A supervised repo may need different provider keys, different vault scopes, or no network access at all. The supervisor should not assume the operator process has permission to reuse its own secrets in every repo.

LLM key rotation is also unresolved. Shared rate limiting works for a POC, but production supervision needs a policy for provider quotas, key health, rotation, and per-repo attribution.

The repository trust boundary belongs to M056. A supervisor that watches arbitrary repos must distinguish trusted, restricted, and untrusted repos before it runs tools, loads instructions, or passes repo content to external models. M053 only creates the lightweight supervision boundary; it does not solve trust policy.
auto/loop.js

@@ -28,6 +28,12 @@ import { dispatchSelfFeedbackInlineFixIfNeeded } from "../self-feedback-drain.js
import { recordSelfFeedback } from "../self-feedback.js";
import { detectSameUnitLoop } from "../detectors/same-unit-loop.js";
import { insertSelfFeedbackEntry } from "../sf-db.js";
import {
  areLanesEnabled,
  classifyUnitLane,
  getExecuteLaneCapacity,
  getPlanningLaneCapacity,
} from "./unit-lanes.js";
import { getDatabase } from "../sf-db/sf-db-core.js";
import {
  ExecutionGraphScheduler,
@@ -745,6 +751,13 @@ export async function autoLoop(ctx, pi, s, deps) {
    // every iteration while still picking up entries filed during
    // the run (which previously sat until next session_start).
    lastInlineFixDispatchIteration: 0,
    // ── R090: per-tick lane in-flight counters ────────────────────────
    // Reset at the start of each tick. Track how many planning/execute
    // units have been dispatched this tick so the lane capacity gates
    // can enforce caps without blocking other-lane candidates.
    // Only active when areLanesEnabled() is true.
    tickPlanningDispatched: 0,
    tickExecuteDispatched: 0,
  };
  let consecutiveErrors = 0;
  let consecutiveCooldowns = 0;
@@ -809,6 +822,9 @@ export async function autoLoop(ctx, pi, s, deps) {
      const backoffMs = Math.min(30_000, 1000 * 2 ** Math.min(5, stuckCycles));
      await new Promise((resolve) => setTimeout(resolve, backoffMs));
    }
    // ── R090: reset per-tick lane counters at top of each iteration ──
    loopState.tickPlanningDispatched = 0;
    loopState.tickExecuteDispatched = 0;
    // ── Journal: per-iteration flow grouping ──
    const flowId = randomUUID();
    let seqCounter = 0;
@@ -1443,6 +1459,46 @@ export async function autoLoop(ctx, pi, s, deps) {
      iterData = dispatchResult.data;
      observedUnitType = iterData.unitType;
      observedUnitId = iterData.unitId;
      // ── R090: lane-aware capacity gate ───────────────────────────────
      // When SF_LANES is enabled, enforce per-tick planning/execute caps.
      // If the selected unit's lane is already at capacity for this tick,
      // skip it (continue) so the loop can re-derive state and pick a
      // candidate from the other lane. When SF_LANES=0, this block is a
      // no-op — back-compat one-at-a-time behavior is preserved exactly.
      if (areLanesEnabled()) {
        const _dispatchedLane = classifyUnitLane(iterData.unitType);
        const _planCap = getPlanningLaneCapacity();
        const _execCap = getExecuteLaneCapacity();
        const _laneDispatched =
          _dispatchedLane === "planning"
            ? loopState.tickPlanningDispatched
            : loopState.tickExecuteDispatched;
        const _laneCap = _dispatchedLane === "planning" ? _planCap : _execCap;
        if (_laneDispatched >= _laneCap) {
          debugLog("autoLoop", {
            phase: "lane-cap-skip",
            lane: _dispatchedLane,
            unitType: iterData.unitType,
            unitId: iterData.unitId,
            dispatched: _laneDispatched,
            cap: _laneCap,
          });
          finishTurn("skipped", "none", "lane-cap-saturated");
          continue;
        }
        // Lane has capacity — record the dispatch
        if (_dispatchedLane === "planning") {
          loopState.tickPlanningDispatched++;
        } else {
          loopState.tickExecuteDispatched++;
        }
        debugLog("autoLoop", {
          phase: "lane-dispatch",
          lane: _dispatchedLane,
          unitType: iterData.unitType,
          unitId: iterData.unitId,
        });
      }
      if (maybeSkipSameUnitDispatchLoop(ic, iterData)) {
        finishTurn("skipped", "manual-attention", "same-unit-dispatch-loop");
        continue;
159  src/resources/extensions/sf/auto/unit-lanes.js  Normal file

@@ -0,0 +1,159 @@
/**
 * auto/unit-lanes.js — Planning/Execute lane classification for R090.
 *
 * Splits autonomous units into two conceptual lanes so the planner can
 * work on milestone N while the executor works on milestone N-1.
 *
 * Exports:
 *   LANE_PLANNING             — string constant "planning"
 *   LANE_EXECUTE              — string constant "execute"
 *   classifyUnitLane()        — pure function: unitType → lane
 *   areLanesEnabled()         — reads SF_LANES env; default true
 *   getPlanningLaneCapacity() — reads SF_LANES_PLANNING_N; default 1
 *   getExecuteLaneCapacity()  — reads SF_LANES_EXECUTE_N; default 1
 *   selectLaneAwareUnits()    — pick up to cap units per lane from a candidate list
 */

export const LANE_PLANNING = "planning";
export const LANE_EXECUTE = "execute";

/**
 * Explicit membership map for the well-known planning-lane unit types.
 * Anything not listed here is classified by fallback rules, then defaults
 * to the execute lane.
 */
const PLANNING_TYPES = new Set([
  "plan-milestone",
  "plan-slice",
  "research-slice",
  "assess-slice",
  "synthesize-slice",
  "reassess-roadmap",
  "validate-milestone",
]);

/**
 * Classify a unit type into its lane.
 *
 * Planning lane:
 *   - Any type in PLANNING_TYPES (explicit set)
 *   - Any type ending in "-research"
 *   - Any type starting with "plan-" or "research-"
 *
 * Execute lane (default):
 *   - execute-task, complete-task, complete-slice, complete-milestone
 *   - repair-* (any type starting with "repair-")
 *   - Anything else not matched above
 *
 * @param {string} unitType
 * @returns {"planning" | "execute"}
 */
export function classifyUnitLane(unitType) {
  if (typeof unitType !== "string" || unitType.length === 0) {
    return LANE_EXECUTE;
  }

  // Explicit set check (fast path for common types)
  if (PLANNING_TYPES.has(unitType)) {
    return LANE_PLANNING;
  }

  // Suffix rule: ends with "-research"
  if (unitType.endsWith("-research")) {
    return LANE_PLANNING;
  }

  // Prefix rules
  if (unitType.startsWith("plan-") || unitType.startsWith("research-")) {
    return LANE_PLANNING;
  }

  // Default: execute lane
  return LANE_EXECUTE;
}

/**
 * Return true when lane splitting is active.
 *
 * Default: ON (lanes enabled).
 * Set SF_LANES=0 or SF_LANES=false to disable (back-compat one-at-a-time mode).
 *
 * @param {NodeJS.ProcessEnv} [env]
 * @returns {boolean}
 */
export function areLanesEnabled(env = process.env) {
  const val = env.SF_LANES;
  if (val === "0" || val === "false") return false;
  return true;
}

/**
 * How many planning-lane units may be dispatched per tick.
 *
 * Default: 1. Override with SF_LANES_PLANNING_N.
 *
 * @param {NodeJS.ProcessEnv} [env]
 * @returns {number}
 */
export function getPlanningLaneCapacity(env = process.env) {
  const raw = env.SF_LANES_PLANNING_N;
  if (raw !== undefined && raw !== "") {
    const n = Number.parseInt(raw, 10);
    if (Number.isFinite(n) && n > 0) return n;
  }
  return 1;
}

/**
 * How many execute-lane units may be dispatched per tick.
 *
 * Default: 1. Override with SF_LANES_EXECUTE_N.
 *
 * @param {NodeJS.ProcessEnv} [env]
 * @returns {number}
 */
export function getExecuteLaneCapacity(env = process.env) {
  const raw = env.SF_LANES_EXECUTE_N;
  if (raw !== undefined && raw !== "") {
    const n = Number.parseInt(raw, 10);
    if (Number.isFinite(n) && n > 0) return n;
  }
  return 1;
}

/**
 * Walk a list of candidate units and select up to planningCap planning-lane
 * units and up to executeCap execute-lane units.
 *
 * The walk continues past filled slots: a saturated execute slot does NOT
 * stop the scan — subsequent planning candidates can still fill the planning
 * slot (and vice-versa).
 *
 * @param {Array<{unitType: string, unitId?: string, [k: string]: unknown}>} candidates
 *   Ordered list of ready units (front = highest priority).
 * @param {number} planningCap Maximum planning-lane dispatches per tick.
 * @param {number} executeCap Maximum execute-lane dispatches per tick.
 * @returns {Array<{unitType: string, unitId?: string, [k: string]: unknown}>}
 *   The selected subset, in the order they were encountered.
 */
export function selectLaneAwareUnits(candidates, planningCap, executeCap) {
  let planningRemaining = planningCap;
  let executeRemaining = executeCap;
  const selected = [];

  for (const candidate of candidates) {
    if (planningRemaining <= 0 && executeRemaining <= 0) break;

    const lane = classifyUnitLane(candidate.unitType);
    if (lane === LANE_PLANNING && planningRemaining > 0) {
      selected.push(candidate);
      planningRemaining--;
    } else if (lane === LANE_EXECUTE && executeRemaining > 0) {
      selected.push(candidate);
      executeRemaining--;
    }
    // else: this lane is saturated — keep scanning for the other lane
  }

  return selected;
}
69  src/resources/extensions/sf/detectors/artifact-flap.js  Normal file

@@ -0,0 +1,69 @@
/**
 * artifact-flap.js — detect artifacts oscillating between two hashes.
 *
 * Purpose: stop autonomous mode from repeatedly rewriting a generated artifact
 * back and forth while satisfying no stable predicate.
 *
 * Consumer: periodic detector sweeps over caller-provided artifact hash history.
 */
export const MIN_FLAPS = 3;
export const WINDOW_MS = 30 * 60 * 1000;

function timestampMs(value) {
  if (typeof value === "number" && Number.isFinite(value)) return value;
  if (typeof value === "string") {
    const parsed = Date.parse(value);
    if (Number.isFinite(parsed)) return parsed;
  }
  return 0;
}

/**
 * Detect whether one artifact path is flapping between two content hashes.
 *
 * Purpose: distinguish real artifact evolution from two-state oscillation so
 * SF can stop retrying a predicate that is not converging.
 *
 * Consumer: periodic detector runner.
 */
export function detectArtifactFlap(artifactHistory, options = {}) {
  const minFlaps = options.minFlaps ?? MIN_FLAPS;
  const windowMs = options.windowMs ?? WINDOW_MS;
  const now = options.now ?? Date.now();
  const cutoff = now - windowMs;
  const rows = Array.isArray(artifactHistory) ? artifactHistory : [];
  const byPath = new Map();

  for (const row of rows) {
    if (!row?.path || !row.hash) continue;
    const occurredAt = timestampMs(row.timestamp);
    if (occurredAt < cutoff) continue;
    const entries = byPath.get(row.path) ?? [];
    entries.push({ hash: row.hash, timestamp: occurredAt });
    byPath.set(row.path, entries);
  }

  for (const [path, entries] of byPath) {
    entries.sort((left, right) => left.timestamp - right.timestamp);
    const hashes = entries.map((entry) => entry.hash);
    const uniqueHashes = new Set(hashes);
    let flaps = 0;
    for (let index = 1; index < hashes.length; index += 1) {
      if (hashes[index] !== hashes[index - 1]) flaps += 1;
    }
    if (flaps >= minFlaps && uniqueHashes.size <= 2) {
      return {
        stuck: true,
        reason: "artifact-flap",
        signature: {
          path,
          flaps,
          uniqueHashes: uniqueHashes.size,
          windowMs,
        },
      };
    }
  }

  return { stuck: false };
}
120  src/resources/extensions/sf/detectors/periodic-runner.js  Normal file

@@ -0,0 +1,120 @@
/**
 * periodic-runner.js — run Wiggums detectors on a bounded cadence.
 *
 * Purpose: centralize periodic stuck-state detection so autonomous surfaces can
 * share detector throttling, error isolation, and signatures.
 *
 * Consumer: SF autonomous mode at run-control boundaries.
 */
import { detectArtifactFlap } from "./artifact-flap.js";
import { detectRepeatedFeedbackKind } from "./repeated-feedback-kind.js";
import { detectSameUnitLoop } from "./same-unit-loop.js";
import { detectStaleLock } from "./stale-lock.js";
import { detectZeroProgress } from "./zero-progress.js";

export const SWEEP_CADENCE_MS = 60 * 1000;

async function optionalDetector(modulePath, exportName) {
  try {
    const module = await import(modulePath);
    return module?.[exportName] ?? null;
  } catch {
    return null;
  }
}

function defaultDetectors(ctx, options) {
  return [
    {
      name: "same-unit-loop",
      run: () => detectSameUnitLoop(ctx?.unitId, ctx?.recentDispatches, options),
    },
    {
      name: "zero-progress",
      run: () =>
        detectZeroProgress(
          ctx?.unitMetrics,
          {
            tool_calls: 0,
            elapsedMs: 0,
            fingerprint: ctx?.sessionFingerprint,
          },
          {
            ...options,
            collectWorktreeFingerprint: () => ctx?.sessionFingerprint ?? null,
          },
        ),
    },
    {
      name: "repeated-feedback-kind",
      run: () => detectRepeatedFeedbackKind(ctx?.recentFeedback, options),
    },
    {
      name: "artifact-flap",
      run: () => detectArtifactFlap(ctx?.artifactHistory, options),
    },
    {
      name: "stale-lock",
      run: () => detectStaleLock(ctx?.lockPaths, options),
    },
  ];
}

/**
 * Run all available periodic stuck-state detectors once.
 *
 * Purpose: return a compact sweep result while keeping individual detector
 * failures from blocking the rest of the autonomy health check.
 *
 * Consumer: machine-surface autonomous controllers.
 */
export async function runDetectorSweep(ctx = {}, options = {}) {
  const startedAt = Date.now();
  const throttleMs = options.throttleMs ?? 60_000;
  const throttleState =
    options.throttleState instanceof Map ? options.throttleState : new Map();
  // Pair each optional detector with its name before filtering, so a failed
  // import cannot shift the name mapping onto the wrong detector.
  const optionalSpecs = [
    {
      name: "model-route-flap",
      fn: await optionalDetector("./model-route-flap.js", "detectModelRouteFlap"),
    },
    {
      name: "prompt-drift",
      fn: await optionalDetector("./prompt-drift.js", "detectPromptDrift"),
    },
  ];
  const optionalDetectors = optionalSpecs
    .filter((spec) => spec.fn)
    .map((spec) => ({ name: spec.name, run: () => spec.fn(ctx, options) }));
  const detectors =
    options.detectors ?? defaultDetectors(ctx, options).concat(optionalDetectors);
  const detectorsFired = [];
  let totalChecked = 0;

  for (const detector of detectors) {
    const lastFiredAt = throttleState.get(detector.name);
    if (
      typeof lastFiredAt === "number" &&
      startedAt - lastFiredAt < throttleMs
    ) {
      continue;
    }
    totalChecked += 1;
    try {
      const result = await detector.run();
      if (result?.stuck) {
        throttleState.set(detector.name, startedAt);
        detectorsFired.push({
          name: detector.name,
          signature: result.signature,
        });
      }
    } catch (error) {
      options.onError?.(error, detector.name);
    }
  }

  return {
    detectorsFired,
    totalChecked,
    durationMs: Date.now() - startedAt,
  };
}
detectors/repeated-feedback-kind.js

@@ -0,0 +1,72 @@
/**
 * repeated-feedback-kind.js — detect repeated self-feedback for one target.
 *
 * Purpose: stop autonomous mode from re-emitting the same self-feedback signal
 * against the same planning target without changing strategy.
 *
 * Consumer: periodic detector sweeps before the next autonomous run-control
 * boundary.
 */
export const MIN_OCCURRENCES = 3;
export const WINDOW_MS = 30 * 60 * 1000;

function timestampMs(value) {
  if (typeof value === "number" && Number.isFinite(value)) return value;
  if (typeof value === "string") {
    const parsed = Date.parse(value);
    if (Number.isFinite(parsed)) return parsed;
  }
  return 0;
}

function canonicalTarget(occurredIn) {
  if (!occurredIn || typeof occurredIn !== "object") return "";
  if (occurredIn.task) return `task:${occurredIn.task}`;
  if (occurredIn.slice) return `slice:${occurredIn.slice}`;
  if (occurredIn.milestone) return `milestone:${occurredIn.milestone}`;
  return "";
}

/**
 * Detect repeated self-feedback of the same kind against the same target.
 *
 * Purpose: identify loops where SF diagnoses the same local failure repeatedly
 * and should escalate or change plan instead of appending another identical
 * feedback row.
 *
 * Consumer: periodic detector runner.
 */
export function detectRepeatedFeedbackKind(recentFeedback, options = {}) {
  const minOccurrences = options.minOccurrences ?? MIN_OCCURRENCES;
  const windowMs = options.windowMs ?? WINDOW_MS;
  const now = options.now ?? Date.now();
  const cutoff = now - windowMs;
  const rows = Array.isArray(recentFeedback) ? recentFeedback : [];
  const groups = new Map();

  for (const row of rows) {
    const kind = row?.kind;
    if (!kind) continue;
    const occurredAt = timestampMs(row.timestamp);
    if (occurredAt < cutoff) continue;
    const target = canonicalTarget(row.occurredIn);
    const key = `${kind}\0${target}`;
    const group = groups.get(key) ?? { kind, target, occurrences: 0 };
    group.occurrences += 1;
    groups.set(key, group);
    if (group.occurrences >= minOccurrences) {
      return {
        stuck: true,
        reason: "repeated-feedback-kind",
        signature: {
          kind,
          target,
          occurrences: group.occurrences,
          windowMs,
        },
      };
    }
  }

  return { stuck: false };
}
98  src/resources/extensions/sf/detectors/stale-lock.js  Normal file

@@ -0,0 +1,98 @@
/**
 * stale-lock.js — detect lock files whose owning process is gone.
 *
 * Purpose: unblock autonomous operation when a prior SF process died after
 * writing a lock but before releasing it.
 *
 * Consumer: periodic detector sweeps before dispatch selection.
 */
import { promises as fs } from "node:fs";

function parsePid(contents) {
  const trimmed = contents.trim();
  if (!trimmed) return null;
  try {
    const parsed = JSON.parse(trimmed);
    const pid = Number(parsed?.pid);
    return Number.isInteger(pid) && pid > 0 ? pid : null;
  } catch {
    const pid = Number(trimmed);
    return Number.isInteger(pid) && pid > 0 ? pid : null;
  }
}

function lockedAtMs(contents, stats) {
  try {
    const parsed = JSON.parse(contents);
    const lockedAt = parsed?.lockedAt;
    if (typeof lockedAt === "number" && Number.isFinite(lockedAt)) {
      return lockedAt;
    }
    if (typeof lockedAt === "string") {
      const value = Date.parse(lockedAt);
      if (Number.isFinite(value)) return value;
    }
  } catch {
    // Plain PID locks fall back to mtime.
  }
  return stats.mtimeMs;
}

function isProcessAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (error) {
    return error?.code === "EPERM";
  }
}

/**
 * Detect stale lock files and optionally remove the first stale lock found.
 *
 * Purpose: let SF recover from abandoned process locks without treating live
 * locks as failures.
 *
 * Consumer: periodic detector runner.
 */
export async function detectStaleLock(lockPaths, options = {}) {
  const autoRecover = options.autoRecover ?? false;
  const now = options.now ?? Date.now();
  const paths = Array.isArray(lockPaths) ? lockPaths : [];

  for (const lockPath of paths) {
    let contents;
    let stats;
    try {
      [contents, stats] = await Promise.all([
        fs.readFile(lockPath, "utf8"),
        fs.stat(lockPath),
      ]);
    } catch (error) {
      if (error?.code === "ENOENT") continue;
      throw error;
    }

    const pid = parsePid(contents);
    if (!pid || isProcessAlive(pid)) continue;

    let recovered = false;
    if (autoRecover) {
      await fs.unlink(lockPath);
      recovered = true;
    }

    return {
      stuck: true,
      reason: "stale-lock",
      signature: {
        lockPath,
        deadPid: pid,
        ageMs: Math.max(0, now - lockedAtMs(contents, stats)),
      },
      recovered,
    };
  }

  return { stuck: false };
}
engine-types.js

@@ -5,4 +5,338 @@
 * Only `node:` imports are permitted. All engine/policy interfaces
 * depend on these types; nothing here depends on SF internals.
 */
export {};

const REQUIRED_PDD_FIELDS = [
  "purpose",
  "consumer",
  "contract",
  "failureBoundary",
  "evidence",
  "invariants",
];

const RISK_LEVELS = ["low", "medium", "high", "critical"];
const BLAST_RADIUS_LEVELS = ["unit", "slice", "milestone", "project", "org"];
const CUSTOMER_IMPACT_LEVELS = [
  "none",
  "internal",
  "customer-visible",
  "customer-critical",
];
const GATE_OUTCOMES = ["pass", "fail", "retry", "manual-attention"];
const FAILURE_CLASSES = [
  "policy",
  "verification",
  "execution",
  "artifact",
  "git",
  "timeout",
  "input",
  "closeout",
  "manual-attention",
  "unknown",
];

const RISK_SCORES = { low: 0.85, medium: 0.55, high: 0.25, critical: 0 };
const BLAST_RADIUS_SCORES = {
  unit: 1,
  slice: 0.75,
  milestone: 0.5,
  project: 0.25,
  org: 0,
};
const CUSTOMER_IMPACT_SCORES = {
  none: 1,
  internal: 0.75,
  "customer-visible": 0.35,
  "customer-critical": 0,
};

/**
 * @typedef {object} PDDFields
 * @property {string} purpose
 * @property {string} consumer
 * @property {string} contract
 * @property {string} failureBoundary
 * @property {string|object} evidence
 * @property {string[]|string} [nonGoals]
 * @property {string[]|string} invariants
 * @property {string[]|string} [assumptions]
 */

/**
 * Validate PDD fields before a plan or implementation can claim purpose coverage.
 *
 * Purpose: protect spec-first flow from running units whose consumer, contract,
 * evidence, or invariant boundaries are absent.
 *
 * Consumer: SF planning and gate code that needs a leaf-node PDD contract check.
 *
 * @param {unknown} input
 * @returns {{ ok: boolean, missing: string[], errors: string[] }}
 */
export function validatePDDFields(input) {
  const missing = [];
  const errors = [];

  if (!isRecord(input)) {
    return {
      ok: false,
      missing: [...REQUIRED_PDD_FIELDS],
      errors: ["input must be an object"],
    };
  }

  for (const field of REQUIRED_PDD_FIELDS) {
    if (!hasNonEmptyValue(input[field])) {
      missing.push(field);
    }
  }

  for (const field of ["purpose", "consumer", "contract", "failureBoundary"]) {
    if (input[field] !== undefined && !isNonEmptyString(input[field])) {
      errors.push(`${field} must be a non-empty string`);
    }
  }

  if (
    input.evidence !== undefined &&
    !isNonEmptyString(input.evidence) &&
    !isRecord(input.evidence)
  ) {
    errors.push("evidence must be a non-empty string or object");
  }

  for (const field of ["nonGoals", "invariants", "assumptions"]) {
    if (input[field] !== undefined && !isStringOrStringArray(input[field])) {
      errors.push(`${field} must be a string or string array`);
    }
  }

  return { ok: missing.length === 0 && errors.length === 0, missing, errors };
}

/**
 * @typedef {object} RunControlPolicy
 * @property {number} confidence
 * @property {number|"low"|"medium"|"high"|"critical"} risk
|
||||
* @property {number} reversibility
|
||||
* @property {"unit"|"slice"|"milestone"|"project"|"org"} blastRadius
|
||||
* @property {number} cost
|
||||
* @property {boolean} legal
|
||||
* @property {"none"|"internal"|"customer-visible"|"customer-critical"} customerImpact
|
||||
*/
|
||||
|
||||
/**
|
||||
* Validate run-control policy dimensions before autonomous action scoring.
|
||||
*
|
||||
* Purpose: keep run-control decisions grounded in bounded, typed policy inputs
|
||||
* instead of implicit judgment calls.
|
||||
*
|
||||
* Consumer: SF gates that decide whether work can proceed autonomously,
|
||||
* assisted, or with manual attention.
|
||||
*
|
||||
* @param {unknown} input
|
||||
* @returns {{ ok: boolean, errors: string[] }}
|
||||
*/
|
||||
export function validateRunControlPolicy(input) {
|
||||
const errors = [];
|
||||
|
||||
if (!isRecord(input)) {
|
||||
return { ok: false, errors: ["input must be an object"] };
|
||||
}
|
||||
|
||||
if (!isUnitInterval(input.confidence)) {
|
||||
errors.push("confidence must be a number between 0 and 1");
|
||||
}
|
||||
if (
|
||||
!(
|
||||
isUnitInterval(input.risk) ||
|
||||
RISK_LEVELS.includes(/** @type {string} */ (input.risk))
|
||||
)
|
||||
) {
|
||||
errors.push(
|
||||
"risk must be a number between 0 and 1 or low|medium|high|critical",
|
||||
);
|
||||
}
|
||||
if (!isUnitInterval(input.reversibility)) {
|
||||
errors.push("reversibility must be a number between 0 and 1");
|
||||
}
|
||||
if (
|
||||
!BLAST_RADIUS_LEVELS.includes(/** @type {string} */ (input.blastRadius))
|
||||
) {
|
||||
errors.push("blastRadius must be unit|slice|milestone|project|org");
|
||||
}
|
||||
if (!isFiniteNumber(input.cost) || input.cost < 0) {
|
||||
errors.push("cost must be a non-negative number");
|
||||
}
|
||||
if (typeof input.legal !== "boolean") {
|
||||
errors.push("legal must be a boolean");
|
||||
}
|
||||
if (
|
||||
!CUSTOMER_IMPACT_LEVELS.includes(
|
||||
/** @type {string} */ (input.customerImpact),
|
||||
)
|
||||
) {
|
||||
errors.push(
|
||||
"customerImpact must be none|internal|customer-visible|customer-critical",
|
||||
);
|
||||
}
|
||||
|
||||
return { ok: errors.length === 0, errors };
|
||||
}
|
||||
|
||||
/**
|
||||
* Score a run-control policy as a deterministic 0..1 actionability value.
|
||||
*
|
||||
* Purpose: give autonomous run-control gates a stable ordering where higher
|
||||
* confidence and reversibility, lower risk, narrower blast radius, lower cost,
|
||||
* no legal scope, and lower customer impact all increase actionability.
|
||||
*
|
||||
* Consumer: SF policy gates that need a scalar threshold without losing the
|
||||
* underlying typed policy dimensions.
|
||||
*
|
||||
* @param {RunControlPolicy} policy
|
||||
* @returns {number}
|
||||
*/
|
||||
export function scoreRunControlPolicy(policy) {
|
||||
if (!isRecord(policy)) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
const blastRadius =
|
||||
BLAST_RADIUS_SCORES[
|
||||
/** @type {keyof typeof BLAST_RADIUS_SCORES} */ (policy.blastRadius)
|
||||
] ?? 0;
|
||||
const customerImpact =
|
||||
CUSTOMER_IMPACT_SCORES[
|
||||
/** @type {keyof typeof CUSTOMER_IMPACT_SCORES} */ (policy.customerImpact)
|
||||
] ?? 0;
|
||||
|
||||
return clamp01(
|
||||
clamp01(policy.confidence) * 0.2 +
|
||||
normalizeRisk(policy.risk) * 0.2 +
|
||||
clamp01(policy.reversibility) * 0.15 +
|
||||
blastRadius * 0.15 +
|
||||
normalizeCost(policy.cost) * 0.1 +
|
||||
(policy.legal === false ? 1 : 0) * 0.1 +
|
||||
customerImpact * 0.1,
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* @typedef {object} GateResult
|
||||
* @property {"pass"|"fail"|"retry"|"manual-attention"} outcome
|
||||
* @property {"policy"|"verification"|"execution"|"artifact"|"git"|"timeout"|"input"|"closeout"|"manual-attention"|"unknown"} failureClass
|
||||
* @property {string} rationale
|
||||
* @property {string} [findings]
|
||||
* @property {string} [recommendation]
|
||||
*/
|
||||
|
||||
/**
|
||||
* Build a validated gate result object.
|
||||
*
|
||||
* Purpose: keep gate closeout payloads shaped consistently across policy,
|
||||
* verification, execution, artifact, git, timeout, input, and manual paths.
|
||||
*
|
||||
* Consumer: SF gate runners and closeout recorders that exchange plain
|
||||
* serializable gate results.
|
||||
*
|
||||
* @param {"pass"|"fail"|"retry"|"manual-attention"} outcome
|
||||
* @param {{ failureClass?: "policy"|"verification"|"execution"|"artifact"|"git"|"timeout"|"input"|"closeout"|"manual-attention"|"unknown", rationale?: string, findings?: string, recommendation?: string }} opts
|
||||
* @returns {GateResult}
|
||||
*/
|
||||
export function buildGateResult(outcome, opts = {}) {
|
||||
if (!GATE_OUTCOMES.includes(outcome)) {
|
||||
throw new TypeError(`invalid gate outcome: ${String(outcome)}`);
|
||||
}
|
||||
|
||||
const failureClass = opts.failureClass ?? "unknown";
|
||||
if (!FAILURE_CLASSES.includes(failureClass)) {
|
||||
throw new TypeError(`invalid failureClass: ${String(failureClass)}`);
|
||||
}
|
||||
if (!isNonEmptyString(opts.rationale)) {
|
||||
throw new TypeError("rationale must be a non-empty string");
|
||||
}
|
||||
|
||||
const result = { outcome, failureClass, rationale: opts.rationale };
|
||||
if (opts.findings !== undefined) {
|
||||
result.findings = opts.findings;
|
||||
}
|
||||
if (opts.recommendation !== undefined) {
|
||||
result.recommendation = opts.recommendation;
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
/**
|
||||
* Return true when a value has the minimum valid GateResult contract.
|
||||
*
|
||||
* Purpose: let gate consumers fail closed on malformed closeout payloads before
|
||||
* recording or acting on them.
|
||||
*
|
||||
* Consumer: SF gate result readers that accept plain JSON from multiple gate
|
||||
* implementations.
|
||||
*
|
||||
* @param {unknown} value
|
||||
* @returns {value is GateResult}
|
||||
*/
|
||||
export function isGateResult(value) {
|
||||
return (
|
||||
isRecord(value) &&
|
||||
GATE_OUTCOMES.includes(/** @type {string} */ (value.outcome)) &&
|
||||
isNonEmptyString(value.rationale) &&
|
||||
(value.failureClass === undefined ||
|
||||
FAILURE_CLASSES.includes(/** @type {string} */ (value.failureClass)))
|
||||
);
|
||||
}
|
||||
|
||||
function isRecord(value) {
|
||||
return value !== null && typeof value === "object" && !Array.isArray(value);
|
||||
}
|
||||
|
||||
function isNonEmptyString(value) {
|
||||
return typeof value === "string" && value.trim().length > 0;
|
||||
}
|
||||
|
||||
function isStringOrStringArray(value) {
|
||||
return (
|
||||
isNonEmptyString(value) ||
|
||||
(Array.isArray(value) && value.length > 0 && value.every(isNonEmptyString))
|
||||
);
|
||||
}
|
||||
|
||||
function hasNonEmptyValue(value) {
|
||||
return (
|
||||
isNonEmptyString(value) || isStringOrStringArray(value) || isRecord(value)
|
||||
);
|
||||
}
|
||||
|
||||
function isFiniteNumber(value) {
|
||||
return typeof value === "number" && Number.isFinite(value);
|
||||
}
|
||||
|
||||
function isUnitInterval(value) {
|
||||
return isFiniteNumber(value) && value >= 0 && value <= 1;
|
||||
}
|
||||
|
||||
function clamp01(value) {
|
||||
if (!isFiniteNumber(value)) {
|
||||
return 0;
|
||||
}
|
||||
return Math.min(1, Math.max(0, value));
|
||||
}
|
||||
|
||||
function normalizeRisk(value) {
|
||||
if (isUnitInterval(value)) {
|
||||
return 1 - value;
|
||||
}
|
||||
return RISK_SCORES[/** @type {keyof typeof RISK_SCORES} */ (value)] ?? 0;
|
||||
}
|
||||
|
||||
function normalizeCost(value) {
|
||||
if (!isFiniteNumber(value) || value < 0) {
|
||||
return 0;
|
||||
}
|
||||
return 1 / (1 + value / 100);
|
||||
}
|
||||
|
|
|
|||
147  src/resources/extensions/sf/supervisor/repo-supervisor.js  Normal file

@@ -0,0 +1,147 @@
import { join, resolve } from "node:path";

/**
 * Supervises one repository as an independent SF compilation unit.
 *
 * Purpose: keep repo-local autonomous state isolated so a multi-repo operator can
 * tick one repo without sharing dispatch queues, database handles, or guard state
 * with another repo.
 *
 * Consumer: future `sf supervisor` machine and CLI surfaces that coordinate many
 * repository supervisors from one process.
 */
export class RepoSupervisor {
  constructor({ repoPath, supervisorId, options = {} }) {
    if (!repoPath) {
      throw new Error("repoPath is required");
    }
    if (!supervisorId) {
      throw new Error("supervisorId is required");
    }

    this.repoPath = resolve(repoPath);
    this.supervisorId = supervisorId;
    this.repoId = options.repoId ?? this.repoPath;
    this.dbPath = options.dbPath ?? join(this.repoPath, ".sf", "sf.db");
    this.dbConnection = { path: this.dbPath, open: false };
    this.dispatchQueue = [...(options.dispatchQueue ?? [{ id: "stub-unit" }])];
    this.runawayGuard = {
      consecutiveErrors: 0,
      circuitOpen: false,
    };
    this.logger = options.logger ?? console;
    this.shouldFailTick = options.shouldFailTick;
    this.now = options.now ?? (() => new Date());
    this.started = false;
    this.paused = false;
    this.stopped = false;
    this.lastTickAt = null;
    this.lastTickResult = null;
    this.ticksTotal = 0;
    this.errorsTotal = 0;
  }

  async start() {
    this.started = true;
    this.stopped = false;
    this.dbConnection.open = true;
  }

  async tick() {
    if (this.paused) {
      const result = { dispatched: false, errors: [], status: "paused" };
      this.recordTick(result);
      return result;
    }

    try {
      if (this.stopped) {
        throw new Error("supervisor is stopped");
      }
      if (!this.started) {
        await this.start();
      }
      if (this.runawayGuard.circuitOpen) {
        const result = {
          dispatched: false,
          errors: [],
          status: "circuit_open",
        };
        this.recordTick(result);
        return result;
      }
      if (typeof this.shouldFailTick === "function") {
        await this.shouldFailTick();
      }

      const unit = this.dispatchQueue.shift();
      const now = this.now().toISOString();
      this.lastTickAt = now;
      this.ticksTotal += 1;

      if (!unit) {
        const result = { dispatched: false, errors: [], status: "idle" };
        this.lastTickResult = result;
        this.runawayGuard.consecutiveErrors = 0;
        return result;
      }

      this.logger.info?.(
        `[${this.supervisorId}] ${this.repoPath}: would dispatch unit ${unit.id}`,
      );
      const result = {
        dispatched: true,
        errors: [],
        status: "ok",
        unitId: unit.id,
      };
      this.lastTickResult = result;
      this.runawayGuard.consecutiveErrors = 0;
      return result;
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      const result = {
        dispatched: false,
        error: message,
        errors: [message],
        status: "error",
      };
      this.errorsTotal += 1;
      this.runawayGuard.consecutiveErrors += 1;
      this.lastTickAt = this.now().toISOString();
      this.lastTickResult = result;
      return result;
    }
  }

  async pause() {
    this.paused = true;
  }

  async resume() {
    this.paused = false;
  }

  async stop() {
    this.stopped = true;
    this.started = false;
    this.dbConnection.open = false;
  }

  getStatus() {
    return {
      repoId: this.repoId,
      lastTickAt: this.lastTickAt,
      lastTickResult: this.lastTickResult,
      paused: this.paused,
      ticksTotal: this.ticksTotal,
      errorsTotal: this.errorsTotal,
    };
  }

  recordTick(result) {
    this.lastTickAt = this.now().toISOString();
    this.lastTickResult = result;
    this.ticksTotal += 1;
  }
}
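The failure-isolation property the class comment claims (zero module-global state, per-repo error accounting) can be illustrated with a reduced stand-in. `MiniSupervisor` below is a hypothetical sketch, not the shipped `RepoSupervisor`: it strips away the dispatch queue and circuit breaker and keeps only the tick-error boundary, to show that one repo's failing tick never mutates another supervisor's counters:

```javascript
// Hypothetical reduced supervisor: all state is per instance, so a throwing
// tick in one repo is recorded locally and cannot bleed into a sibling.
class MiniSupervisor {
  constructor(repoId, work) {
    this.repoId = repoId; // per-repo identity; no module-global state
    this.work = work; // tick body supplied by the caller, may throw
    this.errorsTotal = 0;
    this.lastTickResult = null;
  }

  tick() {
    try {
      this.lastTickResult = { status: "ok", value: this.work() };
    } catch (error) {
      // Failure boundary: record and return, never rethrow across repos.
      this.errorsTotal += 1;
      this.lastTickResult = { status: "error", error: error.message };
    }
    return this.lastTickResult;
  }
}

const healthy = new MiniSupervisor("repo-a", () => "dispatched");
const failing = new MiniSupervisor("repo-b", () => {
  throw new Error("boom");
});
healthy.tick();
failing.tick();
```

After both ticks, `repo-b` holds an error result and an incremented `errorsTotal`, while `repo-a` is untouched; the shipped class adds the queue, guard, and DB-handle state on top of the same boundary.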
@@ -0,0 +1,91 @@
/**
 * detector-artifact-flap.test.mjs — artifact flap detector contracts.
 *
 * Purpose: prove R054 catches two-hash oscillation without treating normal
 * multi-hash artifact evolution as stuck.
 */
import assert from "node:assert/strict";
import { test } from "vitest";
import {
  MIN_FLAPS,
  WINDOW_MS,
  detectArtifactFlap,
} from "../detectors/artifact-flap.js";

const NOW = Date.parse("2026-05-17T12:00:00.000Z");

function artifact(overrides = {}) {
  return {
    path: "docs/specs/runtime.md",
    hash: "a",
    timestamp: NOW - 5 * 60 * 1000,
    ...overrides,
  };
}

test("detectArtifactFlap_when_two_hashes_transition_at_threshold_returns_stuck", () => {
  const result = detectArtifactFlap(
    [
      artifact({ hash: "a", timestamp: NOW - 20 * 60 * 1000 }),
      artifact({ hash: "b", timestamp: NOW - 15 * 60 * 1000 }),
      artifact({ hash: "a", timestamp: NOW - 10 * 60 * 1000 }),
      artifact({ hash: "b", timestamp: NOW - 5 * 60 * 1000 }),
    ],
    { now: NOW },
  );

  assert.equal(result.stuck, true);
  assert.equal(result.reason, "artifact-flap");
  assert.deepEqual(result.signature, {
    path: "docs/specs/runtime.md",
    flaps: MIN_FLAPS,
    uniqueHashes: 2,
    windowMs: WINDOW_MS,
  });
});

test("detectArtifactFlap_when_transitions_are_below_threshold_returns_not_stuck", () => {
  const result = detectArtifactFlap(
    [
      artifact({ hash: "a", timestamp: NOW - 15 * 60 * 1000 }),
      artifact({ hash: "b", timestamp: NOW - 10 * 60 * 1000 }),
      artifact({ hash: "a", timestamp: NOW - 5 * 60 * 1000 }),
    ],
    { now: NOW },
  );

  assert.deepEqual(result, { stuck: false });
});

test("detectArtifactFlap_when_more_than_two_hashes_change_returns_not_stuck", () => {
  const result = detectArtifactFlap(
    [
      artifact({ hash: "a", timestamp: NOW - 20 * 60 * 1000 }),
      artifact({ hash: "b", timestamp: NOW - 15 * 60 * 1000 }),
      artifact({ hash: "c", timestamp: NOW - 10 * 60 * 1000 }),
      artifact({ hash: "a", timestamp: NOW - 5 * 60 * 1000 }),
    ],
    { now: NOW },
  );

  assert.deepEqual(result, { stuck: false });
});

test("detectArtifactFlap_when_multiple_artifacts_flap_returns_first_flapping_path", () => {
  const result = detectArtifactFlap(
    [
      artifact({ path: "b.md", hash: "a", timestamp: NOW - 20 * 60 * 1000 }),
      artifact({ path: "a.md", hash: "x", timestamp: NOW - 20 * 60 * 1000 }),
      artifact({ path: "b.md", hash: "b", timestamp: NOW - 15 * 60 * 1000 }),
      artifact({ path: "a.md", hash: "y", timestamp: NOW - 15 * 60 * 1000 }),
      artifact({ path: "b.md", hash: "a", timestamp: NOW - 10 * 60 * 1000 }),
      artifact({ path: "a.md", hash: "x", timestamp: NOW - 10 * 60 * 1000 }),
      artifact({ path: "b.md", hash: "b", timestamp: NOW - 5 * 60 * 1000 }),
      artifact({ path: "a.md", hash: "y", timestamp: NOW - 5 * 60 * 1000 }),
    ],
    { now: NOW },
  );

  assert.equal(result.stuck, true);
  assert.equal(result.signature.path, "b.md");
});
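The tests above pin down the R054 contract without showing the detector body. A hypothetical sketch of that contract (the shipped implementation is `detectors/artifact-flap.js`; the `WINDOW_MS` and `MIN_FLAPS` values here are assumptions, not the shipped constants): an artifact "flaps" when its in-window history alternates between exactly two hashes with at least the threshold number of transitions.

```javascript
// Assumed constants for illustration only; the shipped detector exports its own.
const WINDOW_MS = 30 * 60 * 1000;
const MIN_FLAPS = 3;

function detectFlapSketch(history, { now }) {
  // Group in-window entries by artifact path.
  const byPath = new Map();
  for (const entry of history) {
    if (now - entry.timestamp > WINDOW_MS) continue;
    if (!byPath.has(entry.path)) byPath.set(entry.path, []);
    byPath.get(entry.path).push(entry);
  }

  for (const [path, entries] of byPath) {
    entries.sort((a, b) => a.timestamp - b.timestamp);
    const uniqueHashes = new Set(entries.map((e) => e.hash)).size;
    let flaps = 0;
    for (let i = 1; i < entries.length; i += 1) {
      if (entries[i].hash !== entries[i - 1].hash) flaps += 1;
    }
    // Exactly two hashes oscillating = flap; three or more = normal evolution.
    if (uniqueHashes === 2 && flaps >= MIN_FLAPS) {
      return {
        stuck: true,
        reason: "artifact-flap",
        signature: { path, flaps, uniqueHashes, windowMs: WINDOW_MS },
      };
    }
  }
  return { stuck: false };
}
```

The two-unique-hash condition is what keeps normal multi-hash evolution (the `a → b → c → a` test case) from being misread as stuck.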
@@ -0,0 +1,81 @@
/**
 * detector-repeated-feedback-kind.test.mjs — repeated feedback detector contracts.
 *
 * Purpose: prove R053 fires only when the same feedback kind targets the same
 * planning object often enough inside the detector window.
 */
import assert from "node:assert/strict";
import { test } from "vitest";
import {
  MIN_OCCURRENCES,
  WINDOW_MS,
  detectRepeatedFeedbackKind,
} from "../detectors/repeated-feedback-kind.js";

const NOW = Date.parse("2026-05-17T12:00:00.000Z");

function feedback(overrides = {}) {
  return {
    kind: "zero-progress",
    occurredIn: { task: "T01" },
    timestamp: NOW - 5 * 60 * 1000,
    ...overrides,
  };
}

test("detectRepeatedFeedbackKind_when_same_kind_and_target_reaches_threshold_returns_stuck", () => {
  const result = detectRepeatedFeedbackKind(
    [
      feedback(),
      feedback({ timestamp: NOW - 10 * 60 * 1000 }),
      feedback({ timestamp: NOW - 20 * 60 * 1000 }),
    ],
    { now: NOW },
  );

  assert.equal(result.stuck, true);
  assert.equal(result.reason, "repeated-feedback-kind");
  assert.deepEqual(result.signature, {
    kind: "zero-progress",
    target: "task:T01",
    occurrences: MIN_OCCURRENCES,
    windowMs: WINDOW_MS,
  });
});

test("detectRepeatedFeedbackKind_when_kind_or_target_differs_returns_not_stuck", () => {
  const result = detectRepeatedFeedbackKind(
    [
      feedback(),
      feedback({ kind: "stale-lock" }),
      feedback({ occurredIn: { task: "T02" } }),
    ],
    { now: NOW },
  );

  assert.deepEqual(result, { stuck: false });
});

test("detectRepeatedFeedbackKind_when_exactly_one_below_threshold_returns_not_stuck", () => {
  const result = detectRepeatedFeedbackKind(
    Array.from({ length: MIN_OCCURRENCES - 1 }, (_, index) =>
      feedback({ timestamp: NOW - index * 60 * 1000 }),
    ),
    { now: NOW },
  );

  assert.deepEqual(result, { stuck: false });
});

test("detectRepeatedFeedbackKind_when_oldest_entry_is_outside_window_returns_not_stuck", () => {
  const result = detectRepeatedFeedbackKind(
    [
      feedback({ timestamp: NOW - WINDOW_MS - 1 }),
      feedback({ timestamp: NOW - 10 * 60 * 1000 }),
      feedback({ timestamp: NOW - 20 * 60 * 1000 }),
    ],
    { now: NOW },
  );

  assert.deepEqual(result, { stuck: false });
});
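These R053 tests likewise imply a counting contract: fire only when the same `(kind, target)` pair recurs enough times inside the window. A hypothetical sketch of that contract follows (the shipped detector is `detectors/repeated-feedback-kind.js`; the constants and the `targetKey` encoding here are assumptions inferred from the test expectations):

```javascript
// Assumed constants for illustration; the shipped detector exports its own.
const WINDOW_MS = 60 * 60 * 1000;
const MIN_OCCURRENCES = 3;

// Encode occurredIn as "scope:id", e.g. { task: "T01" } -> "task:T01",
// matching the signature.target shape the tests assert.
function targetKey(occurredIn) {
  const [scope, id] = Object.entries(occurredIn)[0];
  return `${scope}:${id}`;
}

function detectRepeatedSketch(feedback, { now }) {
  const counts = new Map();
  for (const entry of feedback) {
    if (now - entry.timestamp > WINDOW_MS) continue; // outside window: ignored
    const key = `${entry.kind}|${targetKey(entry.occurredIn)}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  for (const [key, occurrences] of counts) {
    if (occurrences >= MIN_OCCURRENCES) {
      const [kind, target] = key.split("|");
      return {
        stuck: true,
        reason: "repeated-feedback-kind",
        signature: { kind, target, occurrences, windowMs: WINDOW_MS },
      };
    }
  }
  return { stuck: false };
}
```

Grouping on the composite key is what makes the "kind OR target differs" case stay below threshold: three entries that share neither both fields never accumulate on one counter.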
150  src/resources/extensions/sf/tests/dispatcher-lanes.test.mjs  Normal file

@@ -0,0 +1,150 @@
/**
 * R090/S02 — Two-lane dispatcher integration tests.
 *
 * Tests the lane-aware dispatch selection logic that is used by loop.js.
 * The dispatch sink is mocked — these tests verify SELECTION decisions, not
 * unit execution.
 *
 * Scenarios:
 * 1. lanes ON, cap 1/1: plan-slice-A + execute-task-B dispatched in same tick
 * 2. saturated execute slot does not block subsequent planning dispatch
 * 3. SF_LANES=0 → only one dispatch per tick (back-compat)
 * 4. planningCap=2 → two planning units can dispatch in one tick
 * 5. execute-only queue: only execute units are selected
 * 6. mixed queue exhausts both caps before end of list
 */
import { describe, expect, it } from "vitest";
import {
  areLanesEnabled,
  getExecuteLaneCapacity,
  getPlanningLaneCapacity,
  selectLaneAwareUnits,
} from "../auto/unit-lanes.js";

// ── Helpers ───────────────────────────────────────────────────────────────────

/**
 * Simulate a dispatch tick over a candidate list.
 *
 * When lanes are enabled: selects up to planCap planning + execCap execute units.
 * When lanes are disabled: selects only the first candidate (back-compat).
 *
 * Returns the list of dispatched units (just their unitType/unitId).
 *
 * @param {Array<{unitType: string, unitId: string}>} candidates
 * @param {NodeJS.ProcessEnv} env
 */
function simulateTick(candidates, env = {}) {
  if (!areLanesEnabled(env)) {
    // Back-compat: one-at-a-time
    return candidates.length > 0 ? [candidates[0]] : [];
  }
  const planCap = getPlanningLaneCapacity(env);
  const execCap = getExecuteLaneCapacity(env);
  return selectLaneAwareUnits(candidates, planCap, execCap);
}

// ── Test data ─────────────────────────────────────────────────────────────────

const MIXED_QUEUE = [
  { unitType: "plan-slice", unitId: "plan-slice-A" },
  { unitType: "execute-task", unitId: "execute-task-B" },
  { unitType: "plan-slice", unitId: "plan-slice-C" },
  { unitType: "execute-task", unitId: "execute-task-D" },
  { unitType: "plan-slice", unitId: "execute-slice-E" },
];

// ── Tests ─────────────────────────────────────────────────────────────────────

describe("R090/S02 — two-lane dispatcher", () => {
  it("dispatches plan-slice-A AND execute-task-B in the same tick when lanes ON (cap 1/1)", () => {
    const dispatched = simulateTick(MIXED_QUEUE, {}); // default: lanes ON, cap 1/1
    expect(dispatched).toHaveLength(2);
    expect(dispatched[0].unitId).toBe("plan-slice-A");
    expect(dispatched[1].unitId).toBe("execute-task-B");
  });

  it("saturated execute slot does NOT block subsequent planning dispatch", () => {
    // Execute cap = 0 → only planning units should be selected,
    // and the scan must continue past execute candidates to find them.
    const env = { SF_LANES_EXECUTE_N: "0" };
    // We call selectLaneAwareUnits directly here to sidestep the env parse
    // (getExecuteLaneCapacity returns default 1 for "0"). Use cap=0 directly.
    const selected = selectLaneAwareUnits(MIXED_QUEUE, 1, 0);
    expect(selected).toHaveLength(1);
    expect(selected[0].unitType).toBe("plan-slice");
    // Scan must have continued past execute-task-B to find plan-slice-A
    expect(selected[0].unitId).toBe("plan-slice-A");
  });

  it("saturated planning slot does NOT block subsequent execute dispatch", () => {
    // Planning cap = 0 → only execute units should be selected
    const selected = selectLaneAwareUnits(MIXED_QUEUE, 0, 1);
    expect(selected).toHaveLength(1);
    expect(selected[0].unitType).toBe("execute-task");
    expect(selected[0].unitId).toBe("execute-task-B");
  });

  it("SF_LANES=0 → only one dispatch per tick (back-compat one-at-a-time)", () => {
    const dispatched = simulateTick(MIXED_QUEUE, { SF_LANES: "0" });
    expect(dispatched).toHaveLength(1);
    expect(dispatched[0].unitId).toBe("plan-slice-A");
  });

  it("SF_LANES=false → only one dispatch per tick (back-compat)", () => {
    const dispatched = simulateTick(MIXED_QUEUE, { SF_LANES: "false" });
    expect(dispatched).toHaveLength(1);
    expect(dispatched[0].unitId).toBe("plan-slice-A");
  });

  it("planningCap=2 → two planning units can dispatch in one tick", () => {
    const selected = selectLaneAwareUnits(MIXED_QUEUE, 2, 1);
    // Expected: plan-slice-A (planning), execute-task-B (execute), plan-slice-C (planning)
    expect(selected).toHaveLength(3);
    const types = selected.map((u) => u.unitType);
    expect(types.filter((t) => t === "plan-slice")).toHaveLength(2);
    expect(types.filter((t) => t === "execute-task")).toHaveLength(1);
  });

  it("execute-only queue: selects only execute units", () => {
    const execQueue = [
      { unitType: "execute-task", unitId: "T1" },
      { unitType: "complete-slice", unitId: "S1" },
      { unitType: "execute-task", unitId: "T2" },
    ];
    const selected = selectLaneAwareUnits(execQueue, 1, 1);
    expect(selected).toHaveLength(1); // only 1 execute cap, no planning found
    expect(selected[0].unitType).toBe("execute-task");
  });

  it("planning-only queue: selects only planning units", () => {
    const planQueue = [
      { unitType: "plan-slice", unitId: "P1" },
      { unitType: "research-slice", unitId: "R1" },
    ];
    const selected = selectLaneAwareUnits(planQueue, 1, 1);
    expect(selected).toHaveLength(1); // only 1 planning cap, no execute found
    expect(selected[0].unitId).toBe("P1");
  });

  it("mixed queue exhausts both caps without overselecting", () => {
    const dispatched = simulateTick(MIXED_QUEUE, {}); // lanes ON, cap 1/1
    // Must not dispatch more than cap allows
    const planningCount = dispatched.filter(
      (u) => u.unitType === "plan-slice",
    ).length;
    const executeCount = dispatched.filter(
      (u) => u.unitType === "execute-task",
    ).length;
    expect(planningCount).toBeLessThanOrEqual(1);
    expect(executeCount).toBeLessThanOrEqual(1);
  });

  it("areLanesEnabled returns true by default", () => {
    expect(areLanesEnabled({})).toBe(true);
  });

  it("areLanesEnabled returns false for SF_LANES=0", () => {
    expect(areLanesEnabled({ SF_LANES: "0" })).toBe(false);
  });
});
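The selection behavior these tests exercise reduces to a single ordered scan that fills two lanes up to independent caps. A hypothetical sketch (the shipped implementation is `auto/unit-lanes.js`; the set of unit types counted as "planning" here is an assumption based on the test data):

```javascript
// Assumed planning-lane membership; the shipped classifier may differ.
const PLANNING_TYPES = new Set(["plan-slice", "research-slice"]);

function selectLaneAwareSketch(candidates, planCap, execCap) {
  const selected = [];
  let plans = 0;
  let execs = 0;
  // Single ordered scan: a saturated lane is skipped over, it never blocks
  // later candidates in the other lane.
  for (const unit of candidates) {
    if (PLANNING_TYPES.has(unit.unitType)) {
      if (plans < planCap) {
        selected.push(unit);
        plans += 1;
      }
    } else if (execs < execCap) {
      selected.push(unit);
      execs += 1;
    }
  }
  return selected;
}
```

Because each lane has its own counter, setting one cap to 0 degenerates to a filtered scan of the other lane, which is exactly the "saturated slot does NOT block" behavior the tests assert.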
237  src/resources/extensions/sf/tests/drift-detection-gate.test.mjs  Normal file

@@ -0,0 +1,237 @@
/**
 * drift-detection-gate.test.mjs — ADR-0075 drift detection gate contracts.
 *
 * Purpose: prove task completion drift is detected from mocked task rows and a
 * temporary working tree without requiring a live sf.db.
 */
import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";
import { afterEach, describe, expect, it } from "vitest";
import driftDetectionGate from "../uok/drift-detection-gate.js";

const tmpRoots = [];

afterEach(() => {
  for (const root of tmpRoots.splice(0)) {
    rmSync(root, { recursive: true, force: true });
  }
});

function makeProject() {
  const root = mkdtempSync(join(tmpdir(), "sf-drift-detection-gate-"));
  tmpRoots.push(root);
  return root;
}

function writeProjectFile(root, relPath, content = "") {
  const fullPath = join(root, relPath);
  mkdirSync(dirname(fullPath), { recursive: true });
  writeFileSync(fullPath, content, "utf-8");
  return fullPath;
}

function taskRow(overrides = {}) {
  return {
    milestone_id: "M014",
    slice_id: "S03",
    id: "R082",
    status: "complete",
    key_files: JSON.stringify(["docs/plans/R082.md"]),
    narrative: "Implemented and verified.",
    verification_result: "pass",
    known_issues: "None.",
    deviations: "None.",
    full_summary_md: "# R082\nComplete.",
    ...overrides,
  };
}

async function runSingle(root, row) {
  return driftDetectionGate.execute({
    basePath: root,
    scope: "single-task",
    taskRow: row,
  });
}

function expectGateResultShape(result) {
  expect(result).toHaveProperty("outcome");
  expect(result).toHaveProperty("failureClass");
  expect(result).toHaveProperty("rationale");
  expect(["pass", "fail", "retry", "manual-attention"]).toContain(
    result.outcome,
  );
  expect(result.failureClass).toBe("verification");
  expect(typeof result.rationale).toBe("string");
  expect(result.rationale.length).toBeGreaterThan(0);
}

describe("driftDetectionGate", () => {
  it("exports_adr0075_gate_contract", () => {
    expect(driftDetectionGate.id).toBe("drift-detection");
    expect(driftDetectionGate.type).toBe("verification");
    expect(typeof driftDetectionGate.execute).toBe("function");
  });

  it("checkArtifactsExist_when_complete_artifact_missing_returns_fail", async () => {
    const root = makeProject();
    const result = await runSingle(
      root,
      taskRow({ key_files: JSON.stringify(["docs/plans/missing.md"]) }),
    );

    expectGateResultShape(result);
    expect(result.outcome).toBe("fail");
    expect(result.rationale).toContain(
      "task marked complete but artifact at docs/plans/missing.md missing on disk",
    );
    expect(result.findings).toEqual(["docs/plans/missing.md"]);
    expect(result.recommendation).toBe(
      "revert task status to in_progress + re-dispatch",
    );
  });

  it("checkArtifactsExist_when_complete_artifact_exists_returns_pass", async () => {
    const root = makeProject();
    writeProjectFile(root, "docs/plans/R082.md", "# plan\n");

    const result = await runSingle(root, taskRow());

    expectGateResultShape(result);
    expect(result.outcome).toBe("pass");
  });

  it("detectProseStatusMismatch_when_complete_prose_says_blocked_returns_manual_attention", async () => {
    const root = makeProject();
    writeProjectFile(root, "docs/plans/R082.md", "# plan\n");

    const result = await runSingle(
      root,
      taskRow({
        narrative: "Implementation is blocked on missing schema evidence.",
      }),
    );

    expectGateResultShape(result);
    expect(result.outcome).toBe("manual-attention");
    expect(result.rationale).toContain("prose indicates blocked");
    expect(result.findings[0]).toContain("blocked");
  });

  it("detectProseStatusMismatch_when_complete_prose_is_clean_returns_pass", async () => {
    const root = makeProject();
    writeProjectFile(root, "docs/plans/R082.md", "# plan\n");

    const result = await runSingle(
      root,
      taskRow({
        narrative: "Implemented the verification gate.",
        full_summary_md: "Verification passed and artifacts are present.",
      }),
    );

    expectGateResultShape(result);
    expect(result.outcome).toBe("pass");
  });

  it("checkImportsResolve_when_src_artifact_import_missing_returns_fail", async () => {
    const root = makeProject();
    writeProjectFile(
      root,
      "src/resources/extensions/sf/uok/example.js",
      'import { missing } from "./missing.js";\nexport const value = missing;\n',
    );

    const result = await runSingle(
      root,
      taskRow({
        key_files: JSON.stringify([
          "src/resources/extensions/sf/uok/example.js",
        ]),
      }),
    );

    expectGateResultShape(result);
    expect(result.outcome).toBe("fail");
    expect(result.rationale).toBe(
      "import to ./missing.js resolves to missing file",
    );
    expect(result.findings[0]).toContain("example.js:1 -> ./missing.js");
  });

  it("checkImportsResolve_when_src_artifact_import_exists_returns_pass", async () => {
    const root = makeProject();
    writeProjectFile(
      root,
      "src/resources/extensions/sf/uok/example.js",
      'import { present } from "./present.js";\nexport const value = present;\n',
    );
    writeProjectFile(
      root,
      "src/resources/extensions/sf/uok/present.js",
      "export const present = true;\n",
    );

    const result = await runSingle(
      root,
      taskRow({
        key_files: JSON.stringify([
          "src/resources/extensions/sf/uok/example.js",
        ]),
      }),
    );

    expectGateResultShape(result);
    expect(result.outcome).toBe("pass");
  });

  it("execute_when_sweep_mode_iterates_mock_db_rows_and_returns_worst_outcome", async () => {
    const root = makeProject();
    writeProjectFile(root, "docs/plans/clean.md", "# clean\n");
|
||||
const rows = [
|
||||
taskRow({
|
||||
id: "T01",
|
||||
key_files: JSON.stringify(["docs/plans/clean.md"]),
|
||||
}),
|
||||
taskRow({
|
||||
id: "T02",
|
||||
key_files: JSON.stringify(["docs/plans/missing.md"]),
|
||||
}),
|
||||
taskRow({
|
||||
id: "T03",
|
||||
key_files: JSON.stringify(["docs/plans/clean.md"]),
|
||||
narrative: "Needs review before it can really be considered done.",
|
||||
}),
|
||||
];
|
||||
const seen = [];
|
||||
|
||||
const result = await driftDetectionGate.execute({
|
||||
basePath: root,
|
||||
scope: "sweep",
|
||||
dbRead: (query) => {
|
||||
seen.push(query);
|
||||
return rows;
|
||||
},
|
||||
});
|
||||
|
||||
expect(seen).toHaveLength(1);
|
||||
expect(seen[0].kind).toBe("recently-completed-tasks");
|
||||
expectGateResultShape(result);
|
||||
expect(result.outcome).toBe("fail");
|
||||
expect(result.findings).toContain("docs/plans/missing.md");
|
||||
});
|
||||
|
||||
it("execute_when_single_task_missing_task_row_returns_manual_attention_shape", async () => {
|
||||
const root = makeProject();
|
||||
|
||||
const result = await driftDetectionGate.execute({
|
||||
basePath: root,
|
||||
scope: "single-task",
|
||||
});
|
||||
|
||||
expectGateResultShape(result);
|
||||
expect(result.outcome).toBe("manual-attention");
|
||||
expect(result.rationale).toContain("ctx.taskRow");
|
||||
});
|
||||
});

157  src/resources/extensions/sf/tests/engine-types.test.mjs  Normal file
@@ -0,0 +1,157 @@
/**
 * engine-types.test.mjs — leaf-node SF engine contract tests.
 *
 * Purpose: pin typed PDD, run-control, and gate result contracts before other
 * engine modules consume them.
 */
import { describe, expect, test } from "vitest";

import {
  buildGateResult,
  isGateResult,
  scoreRunControlPolicy,
  validatePDDFields,
  validateRunControlPolicy,
} from "../engine-types.js";

const validPDDFields = {
  purpose: "Keep autonomous work tied to a production purpose.",
  consumer: "autonomous dispatch",
  contract: "typed purpose fields",
  failureBoundary: "manual attention when fields are missing",
  evidence: "unit test",
  invariants: ["required fields are present"],
};

const validRunControlPolicy = {
  confidence: 0.8,
  risk: "low",
  reversibility: 0.9,
  blastRadius: "slice",
  cost: 12,
  legal: false,
  customerImpact: "internal",
};

describe("validatePDDFields", () => {
  for (const field of [
    "purpose",
    "consumer",
    "contract",
    "failureBoundary",
    "evidence",
    "invariants",
  ]) {
    test(`validatePDDFields_when_${field}_missing_rejects_with_missing_field`, () => {
      const input = { ...validPDDFields };
      delete input[field];

      const result = validatePDDFields(input);

      expect(result.ok).toBe(false);
      expect(result.missing).toEqual([field]);
    });
  }

  test("validatePDDFields_when_full_valid_input_accepts", () => {
    const result = validatePDDFields({
      ...validPDDFields,
      nonGoals: "avoid refactoring consumers",
      assumptions: ["leaf-node module stays import-free"],
    });

    expect(result).toEqual({ ok: true, missing: [], errors: [] });
  });
});

describe("validateRunControlPolicy", () => {
  test("validateRunControlPolicy_when_numbers_out_of_range_rejects", () => {
    const result = validateRunControlPolicy({
      ...validRunControlPolicy,
      confidence: 1.5,
      reversibility: -0.1,
    });

    expect(result.ok).toBe(false);
    expect(result.errors).toContain(
      "confidence must be a number between 0 and 1",
    );
    expect(result.errors).toContain(
      "reversibility must be a number between 0 and 1",
    );
  });

  test("validateRunControlPolicy_when_enums_invalid_rejects", () => {
    const result = validateRunControlPolicy({
      ...validRunControlPolicy,
      risk: "severe",
      blastRadius: "planet",
      customerImpact: "external",
    });

    expect(result.ok).toBe(false);
    expect(result.errors).toContain(
      "risk must be a number between 0 and 1 or low|medium|high|critical",
    );
    expect(result.errors).toContain(
      "blastRadius must be unit|slice|milestone|project|org",
    );
    expect(result.errors).toContain(
      "customerImpact must be none|internal|customer-visible|customer-critical",
    );
  });

  test("validateRunControlPolicy_when_full_valid_input_accepts", () => {
    const result = validateRunControlPolicy(validRunControlPolicy);

    expect(result).toEqual({ ok: true, errors: [] });
  });
});

describe("scoreRunControlPolicy", () => {
  test("scoreRunControlPolicy_when_fixed_input_is_deterministic_in_unit_interval", () => {
    const first = scoreRunControlPolicy(validRunControlPolicy);
    const second = scoreRunControlPolicy(validRunControlPolicy);

    expect(first).toBe(second);
    expect(first).toBeGreaterThanOrEqual(0);
    expect(first).toBeLessThanOrEqual(1);
  });
});

describe("GateResult contracts", () => {
  test("isGateResult_when_valid_gate_result_returns_true", () => {
    const result = buildGateResult("fail", {
      failureClass: "policy",
      rationale: "Risk exceeds autonomous threshold.",
      findings: "customer-visible blast radius",
      recommendation: "request manual attention",
    });

    expect(isGateResult(result)).toBe(true);
  });

  test("isGateResult_when_malformed_returns_false", () => {
    expect(isGateResult(null)).toBe(false);
    expect(isGateResult({ outcome: "pass" })).toBe(false);
    expect(isGateResult({ outcome: "done", rationale: "ok" })).toBe(false);
  });

  test("buildGateResult_when_valid_input_creates_correct_shape", () => {
    const result = buildGateResult("pass", {
      rationale: "Verification passed.",
    });

    expect(result).toEqual({
      outcome: "pass",
      failureClass: "unknown",
      rationale: "Verification passed.",
    });
  });

  test("buildGateResult_when_outcome_invalid_throws", () => {
    expect(() =>
      buildGateResult("blocked", { rationale: "Unsupported outcome." }),
    ).toThrow(/invalid gate outcome/);
  });
});
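For readers skimming the diff, the GateResult helpers these tests pin down can be sketched as follows. This is a hypothetical sketch inferred from the assertions above — the landed `engine-types.js` is not part of this hunk, and its real implementation may differ:

```javascript
// Sketch of the GateResult contract exercised by the tests above.
// Assumption: outcome vocabulary and default failureClass inferred from assertions.
const GATE_OUTCOMES = new Set(["pass", "retry", "manual-attention", "fail"]);

function buildGateResult(outcome, { failureClass = "unknown", rationale, ...rest } = {}) {
  // Reject any outcome outside the four-value vocabulary.
  if (!GATE_OUTCOMES.has(outcome)) {
    throw new Error(`invalid gate outcome: ${outcome}`);
  }
  return { outcome, failureClass, rationale, ...rest };
}

function isGateResult(value) {
  // A GateResult is an object with a known outcome plus string rationale/failureClass.
  return (
    value !== null &&
    typeof value === "object" &&
    GATE_OUTCOMES.has(value.outcome) &&
    typeof value.rationale === "string" &&
    typeof value.failureClass === "string"
  );
}
```

The spread of `rest` lets optional fields like `findings` and `recommendation` pass through without the validator having to enumerate them.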

127  src/resources/extensions/sf/tests/r016-bus-deliver-verify.test.mjs  Normal file
@@ -0,0 +1,127 @@
/**
 * r016-bus-deliver-verify.test.mjs — R016 MessageBus dispatch delivery verification.
 *
 * Purpose: prove `_busDispatch()` does not acknowledge delivery until the target
 * PersistentAgent inbox can read the just-sent message from SQLite.
 *
 * Consumer: CI unit-test suite (`npm run test:unit`).
 */

import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, expect, test, vi } from "vitest";
import { _clearSfRootCache } from "../paths.js";
import { closeDatabase } from "../sf-db.js";
import { runAgentTurn } from "../uok/agent-runner.js";
import {
  BUS_DELIVERY_VERIFY_TIMEOUT_MS,
  SwarmDispatchLayer,
} from "../uok/swarm-dispatch.js";

vi.mock("@singularity-forge/coding-agent", () => ({
  runSubagent: vi.fn(async () => ({
    ok: true,
    output: "mock runner output",
    exitCode: 0,
    stderr: "",
  })),
}));

const tmpRoots = [];

beforeEach(() => {
  vi.clearAllMocks();
});

afterEach(() => {
  closeDatabase();
  _clearSfRootCache();
  for (const root of tmpRoots.splice(0)) {
    rmSync(root, { recursive: true, force: true });
  }
});

function makeProject() {
  const root = mkdtempSync(join(tmpdir(), "sf-r016-bus-"));
  tmpRoots.push(root);
  return root;
}

function buildEnvelope(unitId, payload = "work payload") {
  return {
    unitId,
    unitType: "task",
    workMode: "build",
    payload,
    priority: 5,
    scope: "r016",
  };
}

describe("R016 bus dispatch delivery verification", () => {
  test("dispatch_when_agent_inbox_cache_warm_forces_refresh_before_ack", async () => {
    const root = makeProject();
    const firstLayer = new SwarmDispatchLayer(root);
    const secondLayer = new SwarmDispatchLayer(root);

    const first = await firstLayer.dispatch(
      buildEnvelope("r016-first", "first"),
    );
    const secondSwarm = await secondLayer.getOrCreateSwarm();
    const warmAgent = secondSwarm.get(first.targetAgent);

    const firstTurn = await runAgentTurn(warmAgent);
    expect(firstTurn.turnsProcessed).toBe(1);
    expect(warmAgent.receive(true)).toHaveLength(0);

    const second = await secondLayer.dispatch(
      buildEnvelope("r016-second", "second"),
    );
    expect(second.ok).toBe(true);
    expect(second.messageId).toBeTruthy();

    const secondTurn = await runAgentTurn(warmAgent);
    expect(secondTurn.turnsProcessed).toBe(1);
    expect(secondTurn.response).toBe("mock runner output");
  });

  test("busDispatch_when_refresh_does_not_expose_sent_message_returns_delivery_verification_error", async () => {
    const root = makeProject();
    const layer = new SwarmDispatchLayer(root);
    const swarm = await layer.getOrCreateSwarm();
    const target = swarm.route(buildEnvelope("r016-failure-target"));
    const refreshSpy = vi
      .spyOn(target._inbox, "refresh")
      .mockImplementation(() => {
        target._inbox._messages = [];
        target._inbox._lastRefresh = Date.now();
      });
    vi.spyOn(target, "receive").mockReturnValue([]);

    const result = await layer._busDispatch(buildEnvelope("r016-failure"));

    expect(refreshSpy).toHaveBeenCalled();
    expect(result.ok).toBe(false);
    expect(result.error).toMatchObject({
      code: "BUS_DELIVERY_VERIFICATION_FAILED",
      messageId: expect.stringMatching(/^msg-/),
    });
    expect(result.error.reason).toContain("not visible");
  });

  test("busDispatch_when_message_visible_returns_success_shape", async () => {
    const root = makeProject();
    const layer = new SwarmDispatchLayer(root);

    const result = await layer._busDispatch(buildEnvelope("r016-happy"));

    expect(BUS_DELIVERY_VERIFY_TIMEOUT_MS).toBe(2000);
    expect(result.ok).toBe(true);
    expect(result.messageId).toMatch(/^msg-/);
    expect(result.targetAgent).toBeTruthy();
    expect(result.swarmName).toBe("default");
    expect(result.envelope.unitId).toBe("r016-happy");
    expect(result.error).toBeUndefined();
  });
});
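The verify-before-ack pattern these tests exercise generalizes beyond SQLite inboxes. A minimal sketch, with a hypothetical `inbox` interface (`append`/`refresh`/`read`) standing in for the real PersistentAgent inbox, which lives elsewhere in `uok/swarm-dispatch.js`:

```javascript
// Verify-before-ack: only return ok:true once the persisted message is
// readable back through the consumer-facing read path.
// Assumption: inbox exposes append/refresh/read; names are illustrative only.
async function dispatchWithVerify(inbox, message) {
  inbox.append(message);            // persist the message
  inbox.refresh({ force: true });   // bypass any warm in-memory cache
  const visible = inbox.read().some((m) => m.id === message.id);
  if (!visible) {
    // Ack-without-deliver class: persistence claimed success but the
    // consumer's read path cannot see the message yet.
    return {
      ok: false,
      error: {
        code: "BUS_DELIVERY_VERIFICATION_FAILED",
        messageId: message.id,
        reason: "message not visible after forced refresh",
      },
    };
  }
  return { ok: true, messageId: message.id };
}
```

The key design point: delivery is verified through the same read path the receiving agent uses, so a stale cache or write/read split cannot produce a false acknowledgement.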

169  src/resources/extensions/sf/tests/repo-supervisor.test.mjs  Normal file
@@ -0,0 +1,169 @@
/**
 * Repo Supervisor — skeleton behaviour contracts.
 *
 * Purpose: verify that the per-repo supervisor boundary keeps lifecycle, guard,
 * and dispatch-placeholder state isolated before any real dispatch is wired.
 *
 * Consumer: M053 supervisor implementation and future `sf supervisor` CLI work.
 */
import assert from "node:assert/strict";
import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, describe, it } from "vitest";
import { RepoSupervisor } from "../supervisor/repo-supervisor.js";

const tempRoots = [];

function makeRepo(name) {
  const root = mkdtempSync(join(tmpdir(), `sf-${name}-`));
  tempRoots.push(root);
  return root;
}

function makeLogger() {
  const lines = [];
  return {
    lines,
    info(message) {
      lines.push(message);
    },
  };
}

afterEach(() => {
  while (tempRoots.length > 0) {
    const root = tempRoots.pop();
    rmSync(root, { force: true, recursive: true });
  }
});

describe("RepoSupervisor", () => {
  it("isolates_state_between_two_repo_supervisors", async () => {
    const loggerA = makeLogger();
    const loggerB = makeLogger();
    const repoA = makeRepo("repo-a");
    const repoB = makeRepo("repo-b");
    const supervisorA = new RepoSupervisor({
      repoPath: repoA,
      supervisorId: "supervisor-a",
      options: {
        dispatchQueue: [{ id: "A-001" }],
        logger: loggerA,
      },
    });
    const supervisorB = new RepoSupervisor({
      repoPath: repoB,
      supervisorId: "supervisor-b",
      options: {
        dispatchQueue: [{ id: "B-001" }],
        logger: loggerB,
      },
    });

    const resultA = await supervisorA.tick();

    assert.equal(resultA.status, "ok");
    assert.equal(resultA.unitId, "A-001");
    assert.equal(supervisorA.getStatus().ticksTotal, 1);
    assert.equal(supervisorB.getStatus().ticksTotal, 0);
    assert.equal(supervisorA.getStatus().errorsTotal, 0);
    assert.equal(supervisorB.getStatus().errorsTotal, 0);
    assert.match(loggerA.lines[0], /would dispatch unit A-001/);
    assert.equal(loggerB.lines.length, 0);

    const resultB = await supervisorB.tick();

    assert.equal(resultB.status, "ok");
    assert.equal(resultB.unitId, "B-001");
    assert.equal(supervisorA.getStatus().ticksTotal, 1);
    assert.equal(supervisorB.getStatus().ticksTotal, 1);
    assert.notEqual(supervisorA.dbPath, supervisorB.dbPath);
  });

  it("tick_returns_result_object_and_catches_simulated_errors", async () => {
    const supervisor = new RepoSupervisor({
      repoPath: makeRepo("error-repo"),
      supervisorId: "supervisor-error",
      options: {
        logger: makeLogger(),
        shouldFailTick() {
          throw new Error("simulated failure");
        },
      },
    });

    const result = await supervisor.tick();

    assert.deepEqual(result, {
      dispatched: false,
      error: "simulated failure",
      errors: ["simulated failure"],
      status: "error",
    });
    assert.equal(supervisor.getStatus().errorsTotal, 1);
    assert.equal(supervisor.getStatus().lastTickResult.status, "error");
  });

  it("pause_resume_toggles_tick_dispatch_behaviour", async () => {
    const logger = makeLogger();
    const supervisor = new RepoSupervisor({
      repoPath: makeRepo("paused-repo"),
      supervisorId: "supervisor-paused",
      options: {
        dispatchQueue: [{ id: "P-001" }],
        logger,
      },
    });

    await supervisor.pause();
    const pausedResult = await supervisor.tick();

    assert.deepEqual(pausedResult, {
      dispatched: false,
      errors: [],
      status: "paused",
    });
    assert.equal(supervisor.getStatus().paused, true);
    assert.equal(logger.lines.length, 0);

    await supervisor.resume();
    const resumedResult = await supervisor.tick();

    assert.equal(supervisor.getStatus().paused, false);
    assert.equal(resumedResult.status, "ok");
    assert.equal(resumedResult.unitId, "P-001");
    assert.match(logger.lines[0], /would dispatch unit P-001/);
  });

  it("getStatus_shape_matches_design_contract", async () => {
    const supervisor = new RepoSupervisor({
      repoPath: makeRepo("status-repo"),
      supervisorId: "supervisor-status",
      options: {
        dispatchQueue: [{ id: "S-001" }],
        logger: makeLogger(),
        repoId: "status-repo",
      },
    });

    assert.deepEqual(Object.keys(supervisor.getStatus()), [
      "repoId",
      "lastTickAt",
      "lastTickResult",
      "paused",
      "ticksTotal",
      "errorsTotal",
    ]);

    await supervisor.tick();
    const status = supervisor.getStatus();

    assert.equal(status.repoId, "status-repo");
    assert.equal(typeof status.lastTickAt, "string");
    assert.equal(status.lastTickResult.status, "ok");
    assert.equal(status.paused, false);
    assert.equal(status.ticksTotal, 1);
    assert.equal(status.errorsTotal, 0);
  });
});
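A skeleton satisfying these contracts fits in one class with no module-global state. The sketch below is hypothetical — inferred from the assertions above, not the landed `supervisor/repo-supervisor.js` — and the `dbPath` layout is an assumption:

```javascript
// Hypothetical RepoSupervisor skeleton: per-instance state only, tick() never
// throws (failure isolation), getStatus() key order fixed by the design contract.
class RepoSupervisor {
  constructor({ repoPath, supervisorId, options = {} }) {
    this.repoPath = repoPath;
    this.supervisorId = supervisorId;
    this.dbPath = `${repoPath}/.sf/state.db`; // assumed per-repo DB location
    this._queue = options.dispatchQueue ?? [];
    this._logger = options.logger ?? { info() {} };
    this._shouldFailTick = options.shouldFailTick ?? null;
    this._status = {
      repoId: options.repoId ?? supervisorId,
      lastTickAt: null,
      lastTickResult: null,
      paused: false,
      ticksTotal: 0,
      errorsTotal: 0,
    };
  }

  async pause() { this._status.paused = true; }
  async resume() { this._status.paused = false; }

  async tick() {
    if (this._status.paused) {
      return { dispatched: false, errors: [], status: "paused" };
    }
    this._status.lastTickAt = new Date().toISOString();
    this._status.ticksTotal += 1;
    try {
      if (this._shouldFailTick) this._shouldFailTick(); // test hook
      const unit = this._queue.shift();
      if (unit) this._logger.info(`would dispatch unit ${unit.id}`); // dispatch stub
      const result = { status: "ok", dispatched: false, errors: [], unitId: unit?.id };
      this._status.lastTickResult = result;
      return result;
    } catch (error) {
      // Failure isolation: errors become data, never escape the tick boundary.
      this._status.errorsTotal += 1;
      const result = {
        dispatched: false,
        error: error.message,
        errors: [error.message],
        status: "error",
      };
      this._status.lastTickResult = result;
      return result;
    }
  }

  getStatus() { return { ...this._status }; }
}
```

Because every field hangs off `this`, two supervisors over different repos cannot share counters or queues, which is exactly what the isolation test checks.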

220  src/resources/extensions/sf/tests/unit-lanes.test.mjs  Normal file
@@ -0,0 +1,220 @@
/**
 * R090/S01 — unit-lanes module tests.
 *
 * Covers:
 * - classifyUnitLane: each known type, fallback for unknown
 * - areLanesEnabled: default (on), SF_LANES=0, SF_LANES=false
 * - getPlanningLaneCapacity / getExecuteLaneCapacity: defaults and overrides
 * - selectLaneAwareUnits: basic selection and cross-lane scan behaviour
 */
import { describe, expect, it } from "vitest";
import {
  LANE_EXECUTE,
  LANE_PLANNING,
  areLanesEnabled,
  classifyUnitLane,
  getExecuteLaneCapacity,
  getPlanningLaneCapacity,
  selectLaneAwareUnits,
} from "../auto/unit-lanes.js";

// ── classifyUnitLane ──────────────────────────────────────────────────────────

describe("classifyUnitLane — explicit planning types", () => {
  it("classifies plan-milestone as planning", () => {
    expect(classifyUnitLane("plan-milestone")).toBe(LANE_PLANNING);
  });

  it("classifies plan-slice as planning", () => {
    expect(classifyUnitLane("plan-slice")).toBe(LANE_PLANNING);
  });

  it("classifies research-slice as planning", () => {
    expect(classifyUnitLane("research-slice")).toBe(LANE_PLANNING);
  });

  it("classifies assess-slice as planning", () => {
    expect(classifyUnitLane("assess-slice")).toBe(LANE_PLANNING);
  });

  it("classifies synthesize-slice as planning", () => {
    expect(classifyUnitLane("synthesize-slice")).toBe(LANE_PLANNING);
  });

  it("classifies reassess-roadmap as planning", () => {
    expect(classifyUnitLane("reassess-roadmap")).toBe(LANE_PLANNING);
  });

  it("classifies validate-milestone as planning", () => {
    expect(classifyUnitLane("validate-milestone")).toBe(LANE_PLANNING);
  });
});

describe("classifyUnitLane — prefix/suffix rules", () => {
  it("classifies arbitrary plan-* types as planning", () => {
    expect(classifyUnitLane("plan-whatever")).toBe(LANE_PLANNING);
    expect(classifyUnitLane("plan-deep-dive")).toBe(LANE_PLANNING);
  });

  it("classifies arbitrary research-* types as planning", () => {
    expect(classifyUnitLane("research-codebase")).toBe(LANE_PLANNING);
  });

  it("classifies types ending in -research as planning", () => {
    expect(classifyUnitLane("deep-research")).toBe(LANE_PLANNING);
    expect(classifyUnitLane("codebase-research")).toBe(LANE_PLANNING);
  });
});

describe("classifyUnitLane — execute types", () => {
  it("classifies execute-task as execute", () => {
    expect(classifyUnitLane("execute-task")).toBe(LANE_EXECUTE);
  });

  it("classifies complete-task as execute", () => {
    expect(classifyUnitLane("complete-task")).toBe(LANE_EXECUTE);
  });

  it("classifies complete-slice as execute", () => {
    expect(classifyUnitLane("complete-slice")).toBe(LANE_EXECUTE);
  });

  it("classifies complete-milestone as execute", () => {
    expect(classifyUnitLane("complete-milestone")).toBe(LANE_EXECUTE);
  });

  it("classifies repair-* as execute", () => {
    expect(classifyUnitLane("repair-something")).toBe(LANE_EXECUTE);
    expect(classifyUnitLane("repair-slice")).toBe(LANE_EXECUTE);
  });
});

describe("classifyUnitLane — fallback for unknown types", () => {
  it("defaults unknown types to execute lane", () => {
    expect(classifyUnitLane("completely-unknown-type")).toBe(LANE_EXECUTE);
    expect(classifyUnitLane("deploy")).toBe(LANE_EXECUTE);
    expect(classifyUnitLane("run-uat")).toBe(LANE_EXECUTE);
  });

  it("handles empty string as execute", () => {
    expect(classifyUnitLane("")).toBe(LANE_EXECUTE);
  });

  it("handles non-string as execute", () => {
    expect(classifyUnitLane(null)).toBe(LANE_EXECUTE);
    expect(classifyUnitLane(undefined)).toBe(LANE_EXECUTE);
    expect(classifyUnitLane(42)).toBe(LANE_EXECUTE);
  });
});

// ── areLanesEnabled ───────────────────────────────────────────────────────────

describe("areLanesEnabled", () => {
  it("returns true when SF_LANES is not set", () => {
    expect(areLanesEnabled({})).toBe(true);
  });

  it("returns true when SF_LANES=1", () => {
    expect(areLanesEnabled({ SF_LANES: "1" })).toBe(true);
  });

  it("returns true when SF_LANES=true", () => {
    expect(areLanesEnabled({ SF_LANES: "true" })).toBe(true);
  });

  it("returns false when SF_LANES=0", () => {
    expect(areLanesEnabled({ SF_LANES: "0" })).toBe(false);
  });

  it("returns false when SF_LANES=false", () => {
    expect(areLanesEnabled({ SF_LANES: "false" })).toBe(false);
  });
});

// ── getPlanningLaneCapacity ───────────────────────────────────────────────────

describe("getPlanningLaneCapacity", () => {
  it("returns 1 by default", () => {
    expect(getPlanningLaneCapacity({})).toBe(1);
  });

  it("returns the override value when SF_LANES_PLANNING_N is set", () => {
    expect(getPlanningLaneCapacity({ SF_LANES_PLANNING_N: "3" })).toBe(3);
  });

  it("ignores invalid values and returns default", () => {
    expect(getPlanningLaneCapacity({ SF_LANES_PLANNING_N: "abc" })).toBe(1);
    expect(getPlanningLaneCapacity({ SF_LANES_PLANNING_N: "0" })).toBe(1);
    expect(getPlanningLaneCapacity({ SF_LANES_PLANNING_N: "-1" })).toBe(1);
  });
});

// ── getExecuteLaneCapacity ────────────────────────────────────────────────────

describe("getExecuteLaneCapacity", () => {
  it("returns 1 by default", () => {
    expect(getExecuteLaneCapacity({})).toBe(1);
  });

  it("returns the override value when SF_LANES_EXECUTE_N is set", () => {
    expect(getExecuteLaneCapacity({ SF_LANES_EXECUTE_N: "2" })).toBe(2);
  });

  it("ignores invalid values and returns default", () => {
    expect(getExecuteLaneCapacity({ SF_LANES_EXECUTE_N: "" })).toBe(1);
    expect(getExecuteLaneCapacity({ SF_LANES_EXECUTE_N: "0" })).toBe(1);
  });
});

// ── selectLaneAwareUnits ──────────────────────────────────────────────────────

describe("selectLaneAwareUnits", () => {
  const units = [
    { unitType: "plan-slice", unitId: "A" },
    { unitType: "execute-task", unitId: "B" },
    { unitType: "plan-slice", unitId: "C" },
    { unitType: "execute-task", unitId: "D" },
    { unitType: "plan-slice", unitId: "E" },
  ];

  it("selects one planning and one execute unit with default caps", () => {
    const selected = selectLaneAwareUnits(units, 1, 1);
    expect(selected).toHaveLength(2);
    expect(selected[0].unitId).toBe("A"); // first planning candidate
    expect(selected[1].unitId).toBe("B"); // first execute candidate
  });

  it("saturated execute slot does not block subsequent planning dispatch", () => {
    // Only execute cap=0 — should only pick planning candidates
    const selected = selectLaneAwareUnits(units, 1, 0);
    expect(selected).toHaveLength(1);
    expect(selected[0].unitType).toBe("plan-slice");
  });

  it("saturated planning slot does not block subsequent execute dispatch", () => {
    // Only planning cap=0 — should only pick execute candidates
    const selected = selectLaneAwareUnits(units, 0, 1);
    expect(selected).toHaveLength(1);
    expect(selected[0].unitType).toBe("execute-task");
  });

  it("selects up to planningCap=2 planning units", () => {
    const selected = selectLaneAwareUnits(units, 2, 1);
    // Should pick A (planning), B (execute), C (planning)
    expect(selected).toHaveLength(3);
    expect(selected[0].unitId).toBe("A");
    expect(selected[1].unitId).toBe("B");
    expect(selected[2].unitId).toBe("C");
  });

  it("returns empty array for empty candidate list", () => {
    const selected = selectLaneAwareUnits([], 1, 1);
    expect(selected).toHaveLength(0);
  });

  it("stops early once both caps are exhausted", () => {
    const selected = selectLaneAwareUnits(units, 1, 1);
    // Should not scan past the point where both are saturated
    expect(selected.map((u) => u.unitId)).toEqual(["A", "B"]);
  });
});
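The capacity-aware selection these tests describe reduces to a single scan with two countdown slots. The sketch below is a hypothetical reconstruction consistent with the assertions above, not the landed `auto/unit-lanes.js`; the lane constants and classification rules are inferred from the test names:

```javascript
// Hypothetical sketch of the R090 lane classifier + selector, inferred from tests.
const LANE_PLANNING = "planning";
const LANE_EXECUTE = "execute";

// Planning lane: explicit planning types plus plan-*/research-*/*-research rules.
// Anything else (including non-strings) falls back to the execute lane.
function classifyUnitLane(unitType) {
  if (typeof unitType !== "string") return LANE_EXECUTE;
  const planningTypes = [
    "assess-slice",
    "synthesize-slice",
    "reassess-roadmap",
    "validate-milestone",
  ];
  if (
    unitType.startsWith("plan-") ||
    unitType.startsWith("research-") ||
    unitType.endsWith("-research") ||
    planningTypes.includes(unitType)
  ) {
    return LANE_PLANNING;
  }
  return LANE_EXECUTE;
}

// Single pass over candidates; each lane consumes its own capacity, so a
// saturated lane never blocks dispatch in the other lane.
function selectLaneAwareUnits(candidates, planningCap, executeCap) {
  const selected = [];
  let planningLeft = planningCap;
  let executeLeft = executeCap;
  for (const unit of candidates) {
    if (planningLeft <= 0 && executeLeft <= 0) break; // both lanes saturated
    const lane = classifyUnitLane(unit.unitType);
    if (lane === LANE_PLANNING && planningLeft > 0) {
      selected.push(unit);
      planningLeft -= 1;
    } else if (lane === LANE_EXECUTE && executeLeft > 0) {
      selected.push(unit);
      executeLeft -= 1;
    }
  }
  return selected;
}
```

Scanning past a saturated lane, rather than stopping at the first full slot, is what makes the "saturated execute slot does not block planning" behaviour fall out naturally.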

325  src/resources/extensions/sf/uok/drift-detection-gate.js  Normal file
@@ -0,0 +1,325 @@
/**
 * drift-detection-gate.js — ADR-0075 verification gate for task-state drift.
 *
 * Purpose: catch completed task rows whose recorded artifacts, completion prose,
 * or source imports no longer match the working tree before downstream UOK flows
 * trust stale completion evidence.
 *
 * Consumer: UOK gate-runner in single-task verification and sweep audits.
 */
import { existsSync, readFileSync } from "node:fs";
import { dirname, extname, isAbsolute, resolve } from "node:path";

const OUTCOME_RANK = {
  pass: 0,
  retry: 1,
  "manual-attention": 2,
  fail: 3,
};

const CODE_EXTENSIONS = new Set([".ts", ".tsx", ".js", ".jsx", ".mjs", ".cjs"]);
const RESOLUTION_EXTENSIONS = [
  ".ts",
  ".tsx",
  ".js",
  ".jsx",
  ".mjs",
  ".cjs",
  ".json",
];

/**
 * @typedef {import("./contracts.js").GateResult} GateResult
 */

function passResult(rationale = "No drift detected.") {
  return {
    outcome: "pass",
    failureClass: "verification",
    rationale,
  };
}

function normalizeArrayColumn(value) {
  if (Array.isArray(value))
    return value.filter((entry) => typeof entry === "string");
  if (typeof value !== "string" || value.trim() === "") return [];
  try {
    const parsed = JSON.parse(value);
    if (Array.isArray(parsed)) {
      return parsed.filter((entry) => typeof entry === "string");
    }
    if (typeof parsed === "string" && parsed.trim()) return [parsed.trim()];
  } catch {
    // Legacy rows may contain comma-separated paths.
  }
  return value
    .split(",")
    .map((entry) => entry.trim())
    .filter(Boolean);
}

function getTaskArtifacts(taskRow) {
  return normalizeArrayColumn(taskRow?.output_artifacts ?? taskRow?.key_files);
}

function isCompleteTask(taskRow) {
  return taskRow?.status === "complete";
}

function taskLabel(taskRow) {
  const parts = [taskRow?.milestone_id, taskRow?.slice_id, taskRow?.id].filter(
    (value) => typeof value === "string" && value.length > 0,
  );
  return parts.length > 0 ? parts.join("/") : "task";
}

function artifactAbsolutePath(artifactPath, basePath) {
  return isAbsolute(artifactPath)
    ? artifactPath
    : resolve(basePath, artifactPath);
}

function isSrcCodeArtifact(artifactPath, basePath) {
  const absPath = artifactAbsolutePath(artifactPath, basePath);
  const relPath = absPath.startsWith(resolve(basePath))
    ? absPath.slice(resolve(basePath).length + 1)
    : artifactPath;
  return (
    relPath.startsWith("src/") && CODE_EXTENSIONS.has(extname(artifactPath))
  );
}

function checkArtifactsExist(taskRow, basePath) {
  if (!isCompleteTask(taskRow)) return passResult("Task is not complete.");
  const missing = getTaskArtifacts(taskRow).filter(
    (artifactPath) => !existsSync(artifactAbsolutePath(artifactPath, basePath)),
  );
  if (missing.length === 0)
    return passResult("Completed task artifacts exist.");
  return {
    outcome: "fail",
    failureClass: "verification",
    rationale: `task marked complete but artifact at ${missing[0]} missing on disk`,
    findings: missing,
    recommendation: "revert task status to in_progress + re-dispatch",
  };
}

function proseFields(taskRow) {
  return [
    taskRow?.summary,
    taskRow?.result,
    taskRow?.narrative,
    taskRow?.verification_result,
    taskRow?.known_issues,
    taskRow?.deviations,
    taskRow?.full_summary_md,
  ]
    .filter((value) => typeof value === "string" && value.trim().length > 0)
    .join("\n");
}

function detectProseStatusMismatch(taskRow) {
  if (!isCompleteTask(taskRow)) return passResult("Task is not complete.");
  const prose = proseFields(taskRow);
  if (!prose) return passResult("Completed task has no drift prose markers.");
  const blockedPattern = /\b(blocked|blocker|unable to|deferred|todo)\b/i;
  const needsPattern =
    /\bneeds\s+(manual|follow[- ]?up|work|attention|implementation|fix|review)\b/i;
  if (!blockedPattern.test(prose) && !needsPattern.test(prose)) {
    return passResult(
      "Completed task prose does not indicate unfinished work.",
    );
  }
  return {
    outcome: "manual-attention",
    failureClass: "verification",
    rationale: `task marked complete but ${taskLabel(taskRow)} prose indicates blocked or unfinished work`,
    findings: prose
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => blockedPattern.test(line) || needsPattern.test(line))
      .slice(0, 5),
    recommendation:
      "review task summary and either update prose or reopen the task",
  };
}

function extractImportsWithRegex(source) {
  const imports = [];
  const lines = source.split("\n");
  const pattern =
    /(?:import\s+(?:[^'"]*?\s+from\s+)?|export\s+[^'"]*?\s+from\s+|require\s*\(\s*)(['"])([^'"]+)\1/g;
  let inBlockComment = false;
  for (let i = 0; i < lines.length; i += 1) {
    const line = lines[i];
    const trimmed = line.trimStart();
    if (inBlockComment) {
      if (line.includes("*/")) inBlockComment = false;
      continue;
    }
    const blockStart = line.indexOf("/*");
    const blockEnd = line.indexOf("*/");
    if (blockStart !== -1 && (blockEnd === -1 || blockEnd < blockStart)) {
      inBlockComment = true;
      continue;
    }
    if (trimmed.startsWith("//") || trimmed.startsWith("*")) continue;
    pattern.lastIndex = 0;
|
||||
let match;
|
||||
while ((match = pattern.exec(line)) !== null) {
|
||||
if (line.slice(0, match.index).includes("//")) continue;
|
||||
imports.push({ specifier: match[2], line: i + 1 });
|
||||
}
|
||||
}
|
||||
return imports;
|
||||
}
|
||||
|
||||
function candidateImportPaths(specifier, sourceFilePath, basePath) {
|
||||
const sourceDir = dirname(sourceFilePath);
|
||||
const rawPath = specifier.startsWith("/")
|
||||
? resolve(basePath, specifier.slice(1))
|
||||
: resolve(sourceDir, specifier);
|
||||
const candidates = [rawPath];
|
||||
const currentExt = extname(rawPath);
|
||||
if (currentExt) {
|
||||
candidates.push(rawPath);
|
||||
if ([".js", ".jsx", ".mjs", ".cjs"].includes(currentExt)) {
|
||||
candidates.push(rawPath.slice(0, -currentExt.length));
|
||||
}
|
||||
} else {
|
||||
for (const ext of RESOLUTION_EXTENSIONS) candidates.push(rawPath + ext);
|
||||
}
|
||||
const baseWithoutRuntimeExt = rawPath.replace(/\.(js|jsx|mjs|cjs)$/u, "");
|
||||
for (const ext of RESOLUTION_EXTENSIONS) {
|
||||
candidates.push(baseWithoutRuntimeExt + ext);
|
||||
candidates.push(resolve(baseWithoutRuntimeExt, `index${ext}`));
|
||||
}
|
||||
return [...new Set(candidates)];
|
||||
}
|
||||
|
||||
function importResolves(specifier, sourceFilePath, basePath) {
|
||||
return candidateImportPaths(specifier, sourceFilePath, basePath).some(
|
||||
(candidate) => existsSync(candidate),
|
||||
);
|
||||
}
|
||||
|
||||
function checkImportsResolve(filePath, basePath) {
|
||||
const absolutePath = artifactAbsolutePath(filePath, basePath);
|
||||
if (!isSrcCodeArtifact(filePath, basePath) || !existsSync(absolutePath)) {
|
||||
return passResult("No source imports to check.");
|
||||
}
|
||||
let source = "";
|
||||
try {
|
||||
source = readFileSync(absolutePath, "utf-8");
|
||||
} catch {
|
||||
return passResult("Source file could not be read for import drift check.");
|
||||
}
|
||||
const unresolved = extractImportsWithRegex(source)
|
||||
.filter(
|
||||
(entry) =>
|
||||
entry.specifier.startsWith(".") || entry.specifier.startsWith("/"),
|
||||
)
|
||||
.filter(
|
||||
(entry) => !importResolves(entry.specifier, absolutePath, basePath),
|
||||
);
|
||||
if (unresolved.length === 0) return passResult("Source imports resolve.");
|
||||
const first = unresolved[0];
|
||||
return {
|
||||
outcome: "fail",
|
||||
failureClass: "verification",
|
||||
rationale: `import to ${first.specifier} resolves to missing file`,
|
||||
findings: unresolved.map(
|
||||
(entry) => `${filePath}:${entry.line} -> ${entry.specifier}`,
|
||||
),
|
||||
recommendation: "restore the imported file or update the import path",
|
||||
};
|
||||
}
|
||||
|
||||
function checkTaskImports(taskRow, basePath) {
|
||||
const codeArtifacts = getTaskArtifacts(taskRow).filter((artifactPath) =>
|
||||
isSrcCodeArtifact(artifactPath, basePath),
|
||||
);
|
||||
return aggregateResults(
|
||||
codeArtifacts.map((artifactPath) =>
|
||||
checkImportsResolve(artifactPath, basePath),
|
||||
),
|
||||
"No source import drift detected.",
|
||||
);
|
||||
}
|
||||
|
||||
function aggregateResults(results, passRationale) {
|
||||
const actionable = results.filter(Boolean);
|
||||
if (actionable.length === 0) return passResult(passRationale);
|
||||
const worst = actionable.reduce((current, next) =>
|
||||
OUTCOME_RANK[next.outcome] > OUTCOME_RANK[current.outcome] ? next : current,
|
||||
);
|
||||
if (worst.outcome === "pass") return passResult(passRationale);
|
||||
return {
|
||||
...worst,
|
||||
findings: actionable
|
||||
.filter((result) => result.outcome === worst.outcome)
|
||||
.flatMap((result) =>
|
||||
Array.isArray(result.findings)
|
||||
? result.findings
|
||||
: result.findings
|
||||
? [result.findings]
|
||||
: [],
|
||||
),
|
||||
};
|
||||
}
|
||||
|
||||
function runTaskChecks(taskRow, basePath) {
|
||||
return aggregateResults(
|
||||
[
|
||||
checkArtifactsExist(taskRow, basePath),
|
||||
detectProseStatusMismatch(taskRow),
|
||||
checkTaskImports(taskRow, basePath),
|
||||
],
|
||||
`No drift detected for ${taskLabel(taskRow)}.`,
|
||||
);
|
||||
}
|
||||
|
||||
async function fetchSweepTasks(dbRead) {
|
||||
if (typeof dbRead === "function") {
|
||||
const rows = await dbRead({
|
||||
kind: "recently-completed-tasks",
|
||||
sql: "SELECT * FROM tasks WHERE status = 'complete' ORDER BY completed_at DESC LIMIT 50",
|
||||
});
|
||||
return Array.isArray(rows) ? rows : [];
|
||||
}
|
||||
if (dbRead?.prepare && typeof dbRead.prepare === "function") {
|
||||
const stmt = dbRead.prepare(
|
||||
"SELECT * FROM tasks WHERE status = 'complete' ORDER BY completed_at DESC LIMIT 50",
|
||||
);
|
||||
if (stmt?.all && typeof stmt.all === "function") return stmt.all();
|
||||
}
|
||||
return [];
|
||||
}
|
||||
|
||||
export const driftDetectionGate = {
|
||||
id: "drift-detection",
|
||||
type: "verification",
|
||||
async execute(ctx) {
|
||||
const basePath = ctx?.basePath ?? process.cwd();
|
||||
if (ctx?.scope === "sweep") {
|
||||
const rows = await fetchSweepTasks(ctx.dbRead);
|
||||
return aggregateResults(
|
||||
rows.map((taskRow) => runTaskChecks(taskRow, basePath)),
|
||||
`No drift detected across ${rows.length} recently completed task(s).`,
|
||||
);
|
||||
}
|
||||
if (!ctx?.taskRow) {
|
||||
return {
|
||||
outcome: "manual-attention",
|
||||
failureClass: "verification",
|
||||
rationale: "drift detection requires ctx.taskRow in single-task mode",
|
||||
recommendation: "provide taskRow or run with scope sweep and dbRead",
|
||||
};
|
||||
}
|
||||
return runTaskChecks(ctx.taskRow, basePath);
|
||||
},
|
||||
};
|
||||
|
||||
export default driftDetectionGate;
|
||||
|
|
@@ -31,6 +31,18 @@ import { AgentSwarm } from "./agent-swarm.js";
import { MessageBus } from "./message-bus.js";
import { createDefaultSwarm } from "./swarm-roles.js";

/**
 * Delivery verification timeout for bus-backed dispatch acknowledgement.
 *
 * Purpose: bound the time `_busDispatch()` spends proving a target agent inbox
 * can read a just-sent SQLite message before reporting delivery failure.
 *
 * Consumer: SwarmDispatchLayer._busDispatch().
 */
export const BUS_DELIVERY_VERIFY_TIMEOUT_MS = 2000;

const BUS_DELIVERY_VERIFY_BACKOFF_MS = [50, 100, 200, 400, 800];

// Lazily imported to avoid circular deps — agent-runner depends on runSubagent
// which should not be pulled in for pure message-routing code paths.
let _runAgentTurnFn = null;
@@ -54,6 +66,46 @@ function isMeaningfulAgentTurnEvent(event) {
  return true;
}

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

function refreshAndFindInboxMessage(agent, messageId) {
  agent._inbox.refresh();
  return agent.receive(false).find((message) => message.id === messageId);
}

async function verifyBusDelivery(agent, messageId) {
  const deadline = Date.now() + BUS_DELIVERY_VERIFY_TIMEOUT_MS;
  let attempt = 0;

  while (Date.now() <= deadline) {
    const visible = refreshAndFindInboxMessage(agent, messageId);
    if (visible) {
      return {
        ok: true,
        attempts: attempt + 1,
      };
    }

    const remainingMs = deadline - Date.now();
    if (remainingMs <= 0) break;

    const backoffMs =
      BUS_DELIVERY_VERIFY_BACKOFF_MS[
        Math.min(attempt, BUS_DELIVERY_VERIFY_BACKOFF_MS.length - 1)
      ];
    await sleep(Math.min(backoffMs, remainingMs));
    attempt++;
  }

  return {
    ok: false,
    attempts: attempt + 1,
    reason: `message ${messageId} not visible in target inbox after force refresh within ${BUS_DELIVERY_VERIFY_TIMEOUT_MS}ms`,
  };
}

async function runAgentTurnWithOuterWatchdogs(runAgentTurn, agent, opts = {}) {
  const { timeoutMs = 480_000, noOutputTimeoutMs, signal, onEvent } = opts;
  const controller = new AbortController();
@@ -316,8 +368,26 @@ export class SwarmDispatchLayer {
    };

    const messageId = this._bus.send(from, to, envelope.payload, metadata);
    const delivery = await verifyBusDelivery(target, messageId);

    if (!delivery.ok) {
      return {
        ok: false,
        messageId,
        targetAgent: target.identity.name,
        swarmName: this._swarmName,
        envelope,
        ...(memoryContext ? { memoryContext } : {}),
        error: {
          code: "BUS_DELIVERY_VERIFICATION_FAILED",
          messageId,
          reason: delivery.reason,
        },
      };
    }

    return {
      ok: true,
      messageId,
      targetAgent: target.identity.name,
      swarmName: this._swarmName,
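Stripped of the repo's bus and inbox wiring, the bounded retry-with-backoff shape that `verifyBusDelivery` implements can be sketched standalone. This is a minimal sketch, not the repo's API: `verifyWithBackoff` and its stubbed `isVisible` predicate are illustrative names standing in for the SQLite inbox refresh-and-find step.

```javascript
// Illustrative constants mirroring the commit's values.
const BACKOFF_MS = [50, 100, 200, 400, 800];
const TIMEOUT_MS = 2000;

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Poll a visibility predicate until it succeeds or the deadline passes,
// backing off along BACKOFF_MS and never sleeping past the deadline.
async function verifyWithBackoff(isVisible) {
  const deadline = Date.now() + TIMEOUT_MS;
  let attempt = 0;
  while (Date.now() <= deadline) {
    if (isVisible()) return { ok: true, attempts: attempt + 1 };
    const remainingMs = deadline - Date.now();
    if (remainingMs <= 0) break;
    // Clamp to the last backoff step once attempts exceed the table length.
    const backoffMs = BACKOFF_MS[Math.min(attempt, BACKOFF_MS.length - 1)];
    await sleep(Math.min(backoffMs, remainingMs));
    attempt++;
  }
  return { ok: false, attempts: attempt + 1 };
}

// Stub: the message becomes "visible" on the third poll.
let polls = 0;
verifyWithBackoff(() => ++polls >= 3).then((result) => {
  console.log(JSON.stringify(result));
});
```

The key property, which the commit's version shares, is that the verifier is bounded in both directions: it returns `ok: true` as soon as the message is visible, and it cannot spin past the deadline because each sleep is clamped to the remaining budget.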