Schema now accepts the same five levels used elsewhere in the codebase
(minimal/low/medium/high/bypassed) instead of the stale full/restricted/
sandbox triple. Docs and env test updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex audit (Q4) flagged that the mutation gate landed in slice 3a but
the test suite only verified the three earlier gates. Add coverage:
- agree-path: mutation-gate fires with outcome=fail, rejectedCount=1,
resolvedCount=0 (the test fixture has no real ledger entry for the
decision id, so markResolved rejects it — the gate correctly surfaces
the partial failure)
- disagree-path: mutation-gate does NOT fire (apply phase skipped)
Pins the 4-gate contract end-to-end. Suite: 4/4 in this file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the missing test case that confirms the fail-closed semantics
the parallel worker shipped in slice 3a: when the trace writer
cannot persist a UOK gate record (e.g. .sf/traces is unwritable),
runTriageApply MUST abort before any subagent runs and surface the
emission failure as the run error.
This pins down the contract codex Q5 noted as soft: enrichment
failures are debug-only, but PRIMARY gate emission for the apply
flow is hard-required. Without observable gates, an apply that
mutates the ledger has no audit trail — refusing is the right call.
Test asserts: trace-dir write failure → ok=false, error contains
"UOK gate emission failed for trusted-agent-source-gate", and the
mocked agentRunner was never invoked.
Suite: 1682/1682.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First production caller of the schema-v2 writer chain. Every
`sf headless triage --apply` invocation now emits four gate_run trace
events with surface=headless, runControl=supervised, permissionProfile=
high, traceId=flowId — making the gates visible in `status uok --json`
with coverageStatus: "ok" (or fail/manual-attention on reject paths).
Gates emitted, in order:
1. trusted-agent-source-gate — fires on the trust precondition:
pass: both triage-decider and rubber-duck are SF-shipped built-ins
fail: missing-agent OR non-builtin source OR untrusted custom runner
(covers all three pre-dispatch refusal paths so operators see the
failure in status uok, not just in the journal)
2. triage-plan-validation-gate — fires on the strict-parse contract:
pass: parseTriagePlanStrict returns a valid plan covering expectedIds
fail: missing marker / bad yaml / unknown id / outcome-required field missing
3. triage-apply-review-gate — fires on the rubber-duck verdict:
pass: rubber-duck: agree → apply phase proceeds
fail: rubber-duck disagreed → clean pause, no mutations
manual-attention: rubber-duck subagent failed to complete
4. triage-apply-mutation-gate — fires after applyTriagePlan:
pass: every approved mutation landed
fail: any rejected mutation
manual-attention: zero approved mutations (all decisions were "fix")
Includes counts in extra: resolvedCount, rejectedCount, pendingFixCount.
Reader-side fixes (codex review follow-up on slice 3a):
- getDistinctGateIds (sf-db-gates.js) now UNIONs trace-event IDs with
quality_gates DB IDs instead of returning trace IDs early when any
exist. The old behavior silently hid slice-scoped DB-only gates the
moment a flow-scoped trace landed.
- getGateMeta (headless-uok-status.ts) now reads BOTH trace events and
DB row, then picks whichever has the later evaluatedAt. Tie-break
prefers trace (flow-scoped gates with no quality_gates FK row are
trace-only). Old behavior preferred trace whenever surface was set,
regardless of timestamp.
Live verification: ran `sf headless triage --apply` 4 times against the
operator's environment (rubber-duck is a project-level override).
trusted-agent-source-gate now shows in `sf headless status uok --json`
with total: 4, fail: 4, coverageStatus: "ok" — proving the schema-v2
metadata round-trips through the trace events and reaches the
classifier.
Tests:
- headless-triage-uok-gates.test.ts (3 new tests): agree path emits
3 pass gates with v2 metadata; disagree path emits review fail;
unknown-id path emits validation fail with no review gate.
- Existing test suites adjusted for the GateMetadataRow →
GateRunContextRow rename (classifier helpers renamed consistently
across .ts source and the .mjs test mirror).
- Full SF + headless apply: 1681/1681.
Still legacy in production (slice 3b targets these next):
- phases-pre-dispatch.js gates: resource-version-guard, pre-dispatch-
health-gate, planning-flow-gate. None of these pass uokContext yet.
- phases-unit.js gates: unit-verification-gate, plan-gate.
- plan-slice.js: Q3/Q4/Q5/Q6/Q7/Q8 seed gates.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex review follow-up (2026-05-14) addressed all three remaining
issues from the earlier rescue pass:
1. Strict plan validation. parseTriagePlanStrict refuses the WHOLE
plan on any malformed item instead of silently dropping. Enforces:
- completion marker "Self-feedback triage complete" present
- exactly one fenced ```yaml block
- every decision has non-empty id + outcome ∈ {fix, promote, close}
- outcome-specific required fields (close → reason; promote →
reason + requirement_id; fix → proposed_approach)
- duplicate ids rejected
- when expectedIds is supplied, decisions must cover the candidate
set exactly — no extras (hallucinated ids), no missing
Returns ParseTriagePlanResult with {plan, error} so the caller can
surface the specific failure reason.
2. Custom-runner trust guard. runTriageApply refuses an injected
options.agentRunner unless allowUntrustedRunner is also explicitly
set. Production callers cannot inject a runner. Without this guard
a custom runner could side-channel-mutate the ledger despite the
read-only tool override (codex Q2).
3. Per-decision failure surfacing. applyTriagePlan now returns
{resolvedIds, rejectedIds, pendingFixIds} instead of just
resolvedIds. runTriageApply reports ok=false if rejectedIds is
non-empty, with the count + ids in the error message. Mutations
still happen one-by-one (no SQL transaction wrapping) but the
failure is no longer silent (codex Q3).
Tests: src/tests/headless-triage-apply.test.ts now covers:
- agree-path runs both agents in order; apply fails on missing
ledger entry → ok=false, rejectedIds populated (the realistic
contract for a test fixture without a seeded DB)
- custom runner without allowUntrustedRunner refuses, agentRunner
never invoked
- rubber-duck disagrees → clean pause, ok=false, agreed=false
- decider fails → skip rubber-duck
- unknown id in plan rejected before review
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codifies AC4 of sf-mp4w2dij-xm6cwj: the regex-only path is the
today-default fast mode. SF_SECURITY_FAST=1 is the explicit opt-in for
callers that want to assert "regex-only, no LLM escalation, sub-100ms"
regardless of any future tiered reviewer landing in the script.
Today the env var changes only the trailing status line so operators
can verify the contract is observable. When the LLM-backed review hook
(AC1) lands, the absence of SF_SECURITY_FAST becomes the trigger for
escalation; setting it=1 keeps offline / pre-commit callers on the
fast path. Locked in by tests in both the .sh and .mjs scanners.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Three .test.mjs files now import describe/it from vitest, matching the
harness CLAUDE.md mandates for the SF extension suite.
- schedule-e2e local readEntries threshold raised 50ms → 100ms with a
comment noting full-suite parallelism adds scheduler/filesystem jitter
on dev machines (CI threshold unchanged at 200ms).
- e2e-smoke "headless new-milestone without --context" timeout raised
10s → 30s so the exit-1 assertion isn't flaky under load.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Fold sf-usage-bar, sf-notify, sf-inturn-guard, sf-permissions,
slash-commands into sf extension (ui/, notifications/, guards/,
permissions/, commands/legacy/)
- Delete vectordrive extension
- Migrate uok/kernel.js to TypeScript (kernel.ts) with full interfaces
- Add allowJs/checkJs:false to tsconfig.resources.json for incremental TS migration
- Add symlink dedup to extension-discovery.ts (seenRealPaths Set)
- Add before_provider_request delegate back to native-search.js so
session budget tests exercise the middleware end-to-end
- Fix parseSfNativeTools() to return all SF manifest tools (drop sf_ filter)
- Fix test assertions: plan_milestone/complete_task/validate_milestone
- Remove subagent from app-smoke.test.ts (folded into sf/subagent/)
- Remove sf-permissions/sf-inturn-guard/subagent from features-inventory test
- Fix resolveSearchProvider autonomous mode test to pass 'auto' explicitly
- Remove legacy /clear slash command (conflicts with built-in clear_terminal)
- Update web-command-parity-contract.test.ts for clear removal
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create config.ts with McpServerConfig types and readMcpConfigs/getServerConfig
- Create auth.ts with buildHttpTransportOpts and createCliOAuthProvider
- Create connection-manager.ts with McpConnectionManager class
- Create index.ts re-exporting the public API
- Export McpConnectionManager and helpers from @singularity-forge/coding-agent
- Rewrite mcp-client extension as thin wrapper using McpConnectionManager
- Rewrite auth.js as re-export shim from @singularity-forge/coding-agent
- Update test to import buildHttpTransportOpts from @singularity-forge/coding-agent
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
v1 no longer exists — the suffix is just noise. Update all import sites
and rename the test file to match.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix memory-embeddings-llm-gateway tests: add queryInstruction field to
expected config objects after loadGatewayConfigFromEnv was updated to
return it
- Add STYLEGUIDE.md: SF code standards adapted from ace-coder patterns
(purpose doctrine, principles, anti-patterns STY001-012, thresholds,
naming, patterns, documentation sections)
- Phase 2 /sf prefix removal: update all web components, browser dispatch,
and tests to use direct commands (/autonomous, /stop, /next, /discuss,
/init, /new-milestone) instead of /sf-prefixed forms
- workflow-actions.ts: all command strings updated
- chat-mode.tsx: SF_ACTIONS array updated
- project-welcome.tsx: primaryCommand values updated
- command-surface.tsx: fallback display updated
- remaining-command-panels.tsx: usage examples updated
- browser-slash-command-dispatch.ts: add stop/new-milestone/init to
SF_PASSTHROUGH_COMMANDS so they route correctly to the extension
- recovery-diagnostics-service.ts: suggestion commands updated
- welcome-screen.ts: hint text updated
- All affected tests updated to match new command strings
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Integration of 3 quick wins into existing UOK infrastructure:
1. Model Learning (Quick Win #2) → metrics.js
- Record outcomes to model-learner for per-task-type performance tracking
- Hook: recordUnitOutcome() now calls ModelLearner.recordOutcome()
- Fire-and-forget: never blocks outcome recording on learning failure
- Enables adaptive model routing decisions in downstream gates
2. Self-Report Fixing (Quick Win #1) → triage-self-feedback.js
- Auto-fix high-confidence reports (>0.85) in applyTriageReport()
- Hook: After triage and requirement promotion, apply auto-fixes
- Fire-and-forget: never blocks report application on fix failure
- Returns reportsAutoFixed count for triage metrics
3. Knowledge Injection (Quick Win #3) → already integrated in auto-prompts.js
- Already active in execute-task prompt template
- Semantic matching with graceful degradation
All integration points:
- Fire-and-forget: learning/fixing failures never block dispatch
- UOK-native: use existing outcome recording, db, gates
- Backward compatible: applyTriageReport now async, but callers handle it
- No new dependencies: all modules already in codebase
Testing: 2934 tests pass (no regressions from integration)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>