Commit graph

365 commits

Author SHA1 Message Date
Mikael Hugo
0b187b9f62 fix(headless): remove legacy v1 fallback path 2026-05-15 20:12:00 +02:00
Mikael Hugo
ced90e84a8 test(headless): update v2 migration tests for fatal-by-default fallback policy
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 20:00:02 +02:00
Mikael Hugo
92ff8186ba feat(prompts): add v2 migration regression tests + fix template variable drift
- Migrate all remaining v1 builders (research-milestone, complete-slice,
  run-uat, reassess-roadmap, deploy, smoke-production, release, rollback,
  challenge) from composeInlinedContext to composeUnitContext v2.
- Remove unused composeInlinedContext import from auto-prompts.js.
- Add 7 regression tests in auto-prompts-v2-migration.test.mjs covering
  all migrated builders.
- Fix template variable drift: deploy.md expected {{releaseVersion}} and
  release.md expected {{newVersion}} — neither builder provided them.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 19:46:13 +02:00
Mikael Hugo
725affd126 feat(self-feedback): purpose_anchor on entries (ADR-0000 restoration, v71)
SF is a purpose-to-software compiler — every self_feedback row must name
the milestone vision or slice goal it's filed against, so triage can
prioritize against purpose rather than treating each row as floating.

  - Schema v71 ALTERs self_feedback ADD COLUMN purpose_anchor TEXT.
    NULL allowed for legacy rows; fresh-DB CREATE includes the column.
  - sf-db-self-feedback.js: insertSelfFeedbackEntry accepts purposeAnchor
    (camelCase), stored as :purpose_anchor; listSelfFeedbackEntries({purpose})
    pushes a LIKE %fragment% filter into the DB layer so triage doesn't
    have to pull the full table.
  - rowToSelfFeedback exposes purposeAnchor, falling back to the JSON
    projection for legacy rows where the column is NULL.
  - headless-feedback CLI: `feedback add --purpose <fragment>` persists
    the anchor; `feedback list --purpose <fragment>` filters by it.
    Omission stays valid — restoration is additive, not breaking.
  - help-text + migration test updated; new vitest covers add/list
    round-trip, NULL-on-omit legacy compat, substring match, and the
    help-text documentation contract.

Restores the doctrine in docs/adr/0000-purpose-to-software-compiler.md:
"non-trivial artifacts must name their purpose and consumer."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 18:51:52 +02:00
Mikael Hugo
5e2c7a7166 merge(P1): vision quality gate on sf new-milestone (ADR-0000) 2026-05-15 18:45:08 +02:00
Mikael Hugo
aa0d57371e feat(headless): enforce ADR-0000 PDD-fields gate at new-milestone
Restoration of forgotten doctrine: ADR-0000 declares the eight PDD
fields (Purpose, Consumer, Contract, Failure boundary, Evidence,
Non-goals, Invariants, Assumptions) the purpose gate, but
`sf headless new-milestone --context <file>` was accepting any
context including empty or trivially-thin seed docs. This wires a
pre-create check that refuses the run when fields are missing or
too thin, naming exactly which ones so the operator can fix the
seed doc and retry.

- new src/resources/extensions/sf/headless-pdd-check.js: scans
  context for the eight fields (heading and inline-label forms) and
  reports missing/sparse, plus a minimum-spine check (Purpose +
  Consumer + Contract + Evidence-or-Falsifier).
- src/headless.ts calls the check after loadContext, before
  bootstrapping .sf/. Refusal exits 1 with formatPddRefusal text.
- --skip-pdd-check is the migration escape hatch (warning printed,
  PDD gate bypassed) for milestones that pre-date the gate.
- SF-internal auto-bootstrap (autonomous→new-milestone fallback)
  is exempted because the seed is SF-generated, not operator-PDD.
- vitest test covers missing-Purpose, missing-Consumer, all-8,
  sparse, inline-label form, Falsifier-as-Evidence spine, and the
  doctrine field order.
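
The field scan described above can be sketched roughly as follows. This is a minimal illustration, not the contents of headless-pdd-check.js: the names `hasField`, `missingFields`, and `hasMinimumSpine` and the exact regular expressions are assumptions, and the "sparse" (too thin) check is left out because its threshold is not stated in the commit message.

```typescript
// The eight PDD fields, in the order the commit message lists them.
const PDD_FIELDS = [
  "Purpose", "Consumer", "Contract", "Failure boundary",
  "Evidence", "Non-goals", "Invariants", "Assumptions",
];

// Hypothetical matcher: accepts a markdown heading ("## Purpose")
// or an inline label ("Purpose:", "**Purpose**:", "- Purpose:").
function hasField(context: string, field: string): boolean {
  const name = field.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const heading = new RegExp(`^#{1,6}\\s*${name}\\s*$`, "im");
  const inline = new RegExp(`^\\s*(?:[-*]\\s*)?\\**${name}\\**\\s*:`, "im");
  return heading.test(context) || inline.test(context);
}

// Fields that appear in neither form, in doctrine order.
function missingFields(context: string): string[] {
  return PDD_FIELDS.filter((field) => !hasField(context, field));
}

// Minimum spine: Purpose + Consumer + Contract + Evidence-or-Falsifier.
function hasMinimumSpine(context: string): boolean {
  return (
    hasField(context, "Purpose") &&
    hasField(context, "Consumer") &&
    hasField(context, "Contract") &&
    (hasField(context, "Evidence") || hasField(context, "Falsifier"))
  );
}
```

A caller would presumably refuse the run when `missingFields` is non-empty and print the returned names so the operator can fix the seed doc.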

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 18:37:06 +02:00
Mikael Hugo
5cd5e14160 feat(headless): surface memory auth pause 2026-05-15 18:16:08 +02:00
Mikael Hugo
90b8e7edf8 feat(headless): expose memory extraction diagnostics 2026-05-15 18:13:35 +02:00
Mikael Hugo
881fd5e304 feat(memory,state): runtime counters for memory injection + milestone work validation
Memory injection telemetry:
- Move counter writes from auto-prompts.js to memory-store.js (where
  getRelevantMemoriesRanked/getActiveMemoriesRanked actually fire).
- Track memory_inject_count and memory_inject_chars_total via
  runtime_counters table for headless-query reporting.

State-db validation:
- handleAllSlicesDone now checks if any slice carries real work
  (status=complete/done) before routing to validation.
- Milestones with all-skipped slices route to "reassess-roadmap"
  instead of asking the operator to validate non-existent work.
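
A rough sketch of the routing rule above. The function name and the "validate-milestone" route label are assumptions made for illustration; only "reassess-roadmap" and the complete/done statuses come from the commit message.

```typescript
interface SliceRow {
  status: string;
}

// Hypothetical route labels; "validate-milestone" is an assumed name.
type NextRoute = "validate-milestone" | "reassess-roadmap";

// Route to validation only when at least one slice carries real work.
function routeAfterAllSlicesDone(slices: SliceRow[]): NextRoute {
  const hasRealWork = slices.some(
    (slice) => slice.status === "complete" || slice.status === "done",
  );
  return hasRealWork ? "validate-milestone" : "reassess-roadmap";
}
```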

SM client defense:
- Filter foreign-tenant memories from SM query responses even when
  the server returns them (defense-in-depth).

Tests updated: memory-extraction-lifecycle, sf-db-migration,
headless-query-memory-injection, sm-client, memory-tenant-gate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 17:57:45 +02:00
Mikael Hugo
ff333ae067 feat(memory): surface injection token cost in headless query
The Project Memories section is rendered into every execute-task,
plan-slice, and research-slice prompt. At 10 memories × ~200 chars
each that's ~2K chars/turn injected into the context — real cost,
no operator-visible meter.

Adds two runtime_counters (already-existing key/value store):

  memory_inject_chars_total  — cumulative section size
  memory_inject_count        — number of injections

Written by buildProjectMemoriesSection() on every render. Both
writes sit inside a try/catch so a legacy DB without
runtime_counters silently skips rather than blocking prompt build.

`sf headless query` surfaces the cumulative + derived metrics as a
new top-level `memoryInjection` block:

  {
    total_chars: 12480,
    count: 8,
    avg_chars: 1560,
    estimated_total_tokens: 3120
  }

The block is omitted entirely when count is 0 (fresh project / no
prompts rendered yet) so it doesn't clutter the snapshot.

Operators can now correlate prompt size growth against autonomous
run cost without instrumenting the LLM call sites directly. The
estimated_total_tokens is chars/4 — a rough approximation since SF
doesn't tokenise the section, intentionally documented as such.
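
The derived metrics can be reproduced from the two counters with a few lines. This is a sketch under assumptions: the function name is hypothetical and the rounding behaviour is a guess, since the commit message only gives the chars/4 rule and one sample snapshot.

```typescript
// Field names follow the snapshot shown in the commit message.
interface MemoryInjection {
  total_chars: number;
  count: number;
  avg_chars: number;
  estimated_total_tokens: number;
}

// Returns undefined when nothing was injected, so the caller can omit the block.
function deriveMemoryInjection(
  totalChars: number,
  count: number,
): MemoryInjection | undefined {
  if (count === 0) return undefined;
  return {
    total_chars: totalChars,
    count,
    avg_chars: Math.round(totalChars / count), // rounding is an assumption
    estimated_total_tokens: Math.round(totalChars / 4), // rough chars/4 heuristic
  };
}
```

With the sample values above (12480 chars over 8 injections) this yields avg_chars 1560 and estimated_total_tokens 3120, matching the snapshot.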

Resolves sf-mp723yl9-rcxoeh filed via the headless feedback CLI.

Tests: 5 source-level invariants — type carries the section, query
reads counters by name, snapshot omits section on zero, write side
calls both counter functions, write is wrapped in try/catch with
documented failure-mode comment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 17:55:14 +02:00
Mikael Hugo
362af3d6a4 fix(headless): bypass rpc for status
2026-05-15 17:32:21 +02:00
Mikael Hugo
cf32e79578 feat(memory-embeddings): read SF_LLM_GATEWAY_KEY from env as auth.json fallback
Enables CI and containerised deployments without writing secrets to disk.
Auth.json still takes precedence when present.

- readGatewayFromAuthJson now falls back to SF_LLM_GATEWAY_KEY env var
- SF_LLM_GATEWAY_URL env var also supported for endpoint override
- Added tests for env fallback, auth.json preference, and default URL
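
The precedence rule can be sketched as below. Everything except the two env var names is an assumption: the function name, the return shape, and the placeholder default URL are illustrative, and the sketch reads the endpoint from the environment only because the commit message does not say whether auth.json also carries one.

```typescript
interface GatewayConfig {
  key: string;
  url: string;
  source: "auth.json" | "env";
}

// Placeholder only; the real default URL is not given in the commit message.
const DEFAULT_GATEWAY_URL = "https://gateway.example.invalid";

type Env = Record<string, string | undefined>;

// auth.json wins when it holds a key; otherwise fall back to SF_LLM_GATEWAY_KEY.
function resolveGateway(
  authJson: { key?: string } | undefined,
  env: Env,
): GatewayConfig | undefined {
  const url = env.SF_LLM_GATEWAY_URL ?? DEFAULT_GATEWAY_URL;
  if (authJson && authJson.key) {
    return { key: authJson.key, url, source: "auth.json" };
  }
  const envKey = env.SF_LLM_GATEWAY_KEY;
  if (envKey) {
    return { key: envKey, url, source: "env" };
  }
  return undefined; // no credential available from either source
}
```

In a CI job one would likely call `resolveGateway(undefined, process.env)` so that no secret has to be written to disk.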

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 17:13:40 +02:00
Mikael Hugo
d8f56e6704 feat(cli): add sf key subcommand for auth.json management
Surgical read/write access to ~/.sf/agent/auth.json without touching
the file directly. All mutations go through AuthStorage so file-lock
and chmod-600 invariants are always respected.

  sf key set    <provider> <api-key>   add/rotate stored key
  sf key get    <provider>             show masked key (last 4 chars)
  sf key remove <provider> [--yes]     remove credential
  sf key list                          list all providers + status

Rationale: SF's source of truth for credentials is auth.json at
runtime — env vars are only used during initial one-time provider
setup. Rotation needs an explicit, audit-friendly path, not implicit
env-driven re-reads. Keys are never echoed in full (last 4 chars
only); remove always prompts unless --yes.
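
The "last 4 chars only" display rule might look like this. The helper name `maskKey` and the handling of very short keys (fully masked) are assumptions, not taken from the implementation.

```typescript
// Mask all but the last `visible` characters of a credential.
// Keys no longer than `visible` are fully masked (an assumed policy).
function maskKey(apiKey: string, visible: number = 4): string {
  if (apiKey.length <= visible) {
    return "*".repeat(apiKey.length);
  }
  return "*".repeat(apiKey.length - visible) + apiKey.slice(-visible);
}
```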

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 16:37:04 +02:00
Mikael Hugo
351bfad41d fix(memory): extractTranscriptFromActivity now reads custom_message entries
Activity JSONL logs use `type: "custom_message"` with `customType: "sf-auto"`
for assistant reasoning content. The old code only checked `role === "assistant"`,
so every transcript was empty → extraction silently skipped every unit.

Fix: recognise both legacy (`role === "assistant"`) and modern
(`custom_message` with `sf-*` prefix) entry shapes. Also reads the
standalone `text` field used by custom messages.
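
A simplified sketch of the two entry shapes. It assumes string-valued `content` and `text` fields and one JSON object per line; the real extractTranscriptFromActivity may handle richer content structures, and the helper names here are hypothetical.

```typescript
interface ActivityEntry {
  type?: string;
  role?: string;
  customType?: string;
  content?: string;
  text?: string;
}

// Legacy shape: role === "assistant".
// Modern shape: type === "custom_message" with an "sf-" customType prefix.
function isAssistantEntry(entry: ActivityEntry): boolean {
  const legacy = entry.role === "assistant";
  const modern =
    entry.type === "custom_message" &&
    typeof entry.customType === "string" &&
    entry.customType.startsWith("sf-");
  return legacy || modern;
}

// Collect assistant text from a JSONL activity log, skipping unparseable lines.
function extractTranscript(jsonl: string): string {
  const parts: string[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let entry: ActivityEntry;
    try {
      entry = JSON.parse(line) as ActivityEntry;
    } catch {
      continue;
    }
    if (!isAssistantEntry(entry)) continue;
    // Custom messages carry a standalone `text` field; fall back to `content`.
    const body = entry.text ?? entry.content;
    if (typeof body === "string" && body.length > 0) parts.push(body);
  }
  return parts.join("\n");
}
```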

This is why memory_processed_units had 0 rows despite 34 activity logs.

Tests: 186 files / 1994 tests pass.
Type check: clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 16:13:26 +02:00
Mikael Hugo
0a332f4cba fix(headless): normalize auto alias to autonomous 2026-05-15 14:32:00 +02:00
Mikael Hugo
f3454de58a fix(triage): --run routes through runTriageApply{dryRun:true} via SF router
Closes sf-mp5khix3-9beona architecture-defect:triage-run-bypasses-sf-routing.

The legacy `runTriage` in self-feedback-drain.js hardcoded
DEFAULT_TRIAGE_MODEL="google-gemini-cli/gemini-3-pro-preview" and
dispatched via @singularity-forge/ai completeSimple (text-only, no
tools). The result: an autonomous triage path that produced a markdown
decision matrix operators had to manually apply via resolve_issue.

Now `--run` goes through runTriageApply with a new `dryRun: true`
option that:
- uses the same Phase 1/2 pipeline as --apply (triage-decider + review)
- pre-resolves the model via SF's router (rankTriageModelsViaRouter),
  no hardcoded model
- skips Phase 3 applyTriagePlan (read-only by design)
- uses permissionProfile="low" and relaxes the trusted-source +
  custom-runner guards for the inspection path
- prefixes flowId with "triage-run-" for clean trace separation

Legacy runTriage kept as @deprecated (still exercised by
self-feedback-drain.test.mjs unit tests that target completeSimple
dispatch directly).

Tests: 6 new in headless-triage-run-routing.test.ts covering dryRun
short-circuit, no ledger mutations, guard relaxation, router not
hardcoded, disagreement surfaces deciderOutput. Full triage suite:
35 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 09:20:43 +02:00
Mikael Hugo
81425230f5 fix(headless): do not restart graceful child exits 2026-05-15 07:25:06 +02:00
Mikael Hugo
62f886430c fix: run subagents in process by default 2026-05-15 03:59:34 +02:00
Mikael Hugo
8b0f0bbd65 fix: harden headless dogfood self-healing 2026-05-15 03:53:15 +02:00
Mikael Hugo
3ac5aede1e fix: repair headless runtime self-healing 2026-05-15 03:33:29 +02:00
Mikael Hugo
ca7ff554c3 feat(swarm): integrate LLM runner into AgentSwarm.run()
- Make AgentSwarm.run() async with optional enableLLM flag
- Wire runAgentTurn from agent-runner.js into all 4 topologies
  (round_robin, supervisor, dynamic, sleeptime)
- Update drainSleeptimeQueue to use runAgentTurn for actual LLM
  execution instead of passive inbox reading
- Export runAgentTurn, runAgentLoop, runSwarmTurn from uok/index.js
- Update PersistentAgent JSDoc to reflect runner exists
- Fix test imports after extension consolidation (ttsr, google-search)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 03:05:01 +02:00
Mikael Hugo
a3b68bb269 fix(env): align SF_PERMISSION_LEVEL enum with permission-profile values
Schema now accepts the same five levels used elsewhere in the codebase
(minimal/low/medium/high/bypassed) instead of the stale full/restricted/
sandbox triple. Docs and env test updated to match.
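
For illustration, validating the env value against the five levels could be done as follows. The constant and function names are hypothetical; the actual schema library used for env validation is not named in the commit message.

```typescript
// The five levels named in the commit message.
const PERMISSION_LEVELS = ["minimal", "low", "medium", "high", "bypassed"] as const;
type PermissionLevel = (typeof PERMISSION_LEVELS)[number];

// Returns the level when `raw` is one of the five values, else undefined.
function parsePermissionLevel(raw: string | undefined): PermissionLevel | undefined {
  const index = (PERMISSION_LEVELS as readonly string[]).indexOf(raw ?? "");
  return index >= 0 ? PERMISSION_LEVELS[index] : undefined;
}
```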

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:11:36 +02:00
Mikael Hugo
18aa257ede refactor: rename review gate agent 2026-05-14 19:43:01 +02:00
Mikael Hugo
62fbc5d57b refactor: align agent resource overlays 2026-05-14 19:32:41 +02:00
Mikael Hugo
7003da3f6a test(uok): assert triage-apply-mutation-gate fires after agree-path
Codex audit (Q4) flagged that the mutation gate landed in slice 3a but
the test suite only verified the three earlier gates. Add coverage:

- agree-path: mutation-gate fires with outcome=fail, rejectedCount=1,
  resolvedCount=0 (the test fixture has no real ledger entry for the
  decision id, so markResolved rejects it — the gate correctly surfaces
  the partial failure)
- disagree-path: mutation-gate does NOT fire (apply phase skipped)

Pins the 4-gate contract end-to-end. Suite: 4/4 in this file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 18:16:04 +02:00
Mikael Hugo
61d3031007 test(uok): fail-closed contract for triage-apply gate emission
Adds the missing test case that confirms the fail-closed semantics
the parallel worker shipped in slice 3a: when the trace writer
cannot persist a UOK gate record (e.g. .sf/traces is unwritable),
runTriageApply MUST abort before any subagent runs and surface the
emission failure as the run error.

This pins down the contract codex Q5 noted as soft: enrichment
failures are debug-only, but PRIMARY gate emission for the apply
flow is hard-required. Without observable gates, an apply that
mutates the ledger has no audit trail — refusing is the right call.

Test asserts: trace-dir write failure → ok=false, error contains
"UOK gate emission failed for trusted-agent-source-gate", and the
mocked agentRunner was never invoked.

Suite: 1682/1682.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 18:08:29 +02:00
Mikael Hugo
454e051aed feat(uok): slice 3a — triage --apply emits 4 schema-v2 UOK gates
First production caller of the schema-v2 writer chain. Every
`sf headless triage --apply` invocation now emits four gate_run trace
events with surface=headless, runControl=supervised, permissionProfile=
high, traceId=flowId — making the gates visible in `status uok --json`
with coverageStatus: "ok" (or fail/manual-attention on reject paths).

Gates emitted, in order:

  1. trusted-agent-source-gate — fires on the trust precondition:
       pass: both triage-decider and rubber-duck are SF-shipped built-ins
       fail: missing-agent OR non-builtin source OR untrusted custom runner
       (covers all three pre-dispatch refusal paths so operators see the
       failure in status uok, not just in the journal)
  2. triage-plan-validation-gate — fires on the strict-parse contract:
       pass: parseTriagePlanStrict returns a valid plan covering expectedIds
       fail: missing marker / bad yaml / unknown id / outcome-required field missing
  3. triage-apply-review-gate — fires on the rubber-duck verdict:
       pass: rubber-duck: agree → apply phase proceeds
       fail: rubber-duck disagreed → clean pause, no mutations
       manual-attention: rubber-duck subagent failed to complete
  4. triage-apply-mutation-gate — fires after applyTriagePlan:
       pass: every approved mutation landed
       fail: any rejected mutation
       manual-attention: zero approved mutations (all decisions were "fix")
     Includes counts in extra: resolvedCount, rejectedCount, pendingFixCount.
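
The outcome rule for the fourth gate can be written down compactly. This is a sketch with hypothetical names; the precedence (any rejection means fail, even with zero resolved) is inferred from the pass/fail/manual-attention descriptions above.

```typescript
type GateOutcome = "pass" | "fail" | "manual-attention";

interface MutationCounts {
  resolvedCount: number;
  rejectedCount: number;
  pendingFixCount: number;
}

// fail: any rejected mutation; manual-attention: nothing was approved to apply;
// pass: every approved mutation landed.
function mutationGateOutcome(counts: MutationCounts): GateOutcome {
  if (counts.rejectedCount > 0) return "fail";
  if (counts.resolvedCount === 0) return "manual-attention";
  return "pass";
}
```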

Reader-side fixes (codex review follow-up on slice 3a):

  - getDistinctGateIds (sf-db-gates.js) now UNIONs trace-event IDs with
    quality_gates DB IDs instead of returning trace IDs early when any
    exist. The old behavior silently hid slice-scoped DB-only gates the
    moment a flow-scoped trace landed.
  - getGateMeta (headless-uok-status.ts) now reads BOTH trace events and
    DB row, then picks whichever has the later evaluatedAt. Tie-break
    prefers trace (flow-scoped gates with no quality_gates FK row are
    trace-only). Old behavior preferred trace whenever surface was set,
    regardless of timestamp.
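
Both reader-side fixes reduce to small merge rules. The sketch below uses hypothetical names and assumes `evaluatedAt` is an ISO-8601 string; it is not the code in sf-db-gates.js or headless-uok-status.ts.

```typescript
interface GateRecord {
  gateId: string;
  outcome: string;
  evaluatedAt: string; // assumed to be an ISO-8601 timestamp
}

// Union of trace-event IDs and DB IDs, so neither source hides the other.
function distinctGateIds(traceIds: string[], dbIds: string[]): string[] {
  return Array.from(new Set([...traceIds, ...dbIds])).sort();
}

// Pick whichever record was evaluated later; on a tie, prefer the trace.
function pickGateMeta(
  trace: GateRecord | undefined,
  db: GateRecord | undefined,
): GateRecord | undefined {
  if (!trace) return db;
  if (!db) return trace;
  return Date.parse(db.evaluatedAt) > Date.parse(trace.evaluatedAt) ? db : trace;
}
```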

Live verification: ran `sf headless triage --apply` 4 times against the
operator's environment (rubber-duck is a project-level override).
trusted-agent-source-gate now shows in `sf headless status uok --json`
with total: 4, fail: 4, coverageStatus: "ok" — proving the schema-v2
metadata round-trips through the trace events and reaches the
classifier.

Tests:
  - headless-triage-uok-gates.test.ts (3 new tests): agree path emits
    3 pass gates with v2 metadata; disagree path emits review fail;
    unknown-id path emits validation fail with no review gate.
  - Existing test suites adjusted for the GateMetadataRow →
    GateRunContextRow rename (classifier helpers renamed consistently
    across .ts source and the .mjs test mirror).
  - Full SF + headless apply: 1681/1681.

Still legacy in production (slice 3b targets these next):
  - phases-pre-dispatch.js gates: resource-version-guard, pre-dispatch-
    health-gate, planning-flow-gate. None of these pass uokContext yet.
  - phases-unit.js gates: unit-verification-gate, plan-gate.
  - plan-slice.js: Q3/Q4/Q5/Q6/Q7/Q8 seed gates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 18:04:50 +02:00
Mikael Hugo
289bf9e264 fix(triage-apply): strict plan validation + custom-runner guard + per-decision failures
Codex review follow-up (2026-05-14) addressed all three remaining
issues from the earlier rescue pass:

1. Strict plan validation. parseTriagePlanStrict refuses the WHOLE
   plan on any malformed item instead of silently dropping. Enforces:
   - completion marker "Self-feedback triage complete" present
   - exactly one fenced ```yaml block
   - every decision has non-empty id + outcome ∈ {fix, promote, close}
   - outcome-specific required fields (close → reason; promote →
     reason + requirement_id; fix → proposed_approach)
   - duplicate ids rejected
   - when expectedIds is supplied, decisions must cover the candidate
     set exactly — no extras (hallucinated ids), no missing
   Returns ParseTriagePlanResult with {plan, error} so the caller can
   surface the specific failure reason.

2. Custom-runner trust guard. runTriageApply refuses an injected
   options.agentRunner unless allowUntrustedRunner is also explicitly
   set. Production callers cannot inject a runner. Without this guard
   a custom runner could side-channel-mutate the ledger despite the
   read-only tool override (codex Q2).

3. Per-decision failure surfacing. applyTriagePlan now returns
   {resolvedIds, rejectedIds, pendingFixIds} instead of just
   resolvedIds. runTriageApply reports ok=false if rejectedIds is
   non-empty, with the count + ids in the error message. Mutations
   still happen one-by-one (no SQL transaction wrapping) but the
   failure is no longer silent (codex Q3).
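
The decision-level rules from item 1 can be sketched as a whole-plan validator. This is an illustration only: it starts from already-parsed decisions (the completion-marker and single-yaml-block checks are omitted), and `validateDecisions` is a hypothetical name, not parseTriagePlanStrict itself.

```typescript
interface Decision {
  id: string;
  outcome: string;
  reason?: string;
  requirement_id?: string;
  proposed_approach?: string;
}

interface ValidationResult {
  plan: Decision[] | null;
  error: string | null;
}

type RequiredField = "reason" | "requirement_id" | "proposed_approach";

// Outcome-specific required fields, as listed in the commit message.
const REQUIRED_FIELDS = new Map<string, RequiredField[]>([
  ["close", ["reason"]],
  ["promote", ["reason", "requirement_id"]],
  ["fix", ["proposed_approach"]],
]);

// Refuses the WHOLE plan on the first malformed item instead of dropping it.
function validateDecisions(decisions: Decision[], expectedIds?: string[]): ValidationResult {
  const fail = (error: string): ValidationResult => ({ plan: null, error });
  const seen = new Set<string>();
  for (const d of decisions) {
    if (!d.id) return fail("decision has an empty id");
    if (seen.has(d.id)) return fail(`duplicate id: ${d.id}`);
    seen.add(d.id);
    const required = REQUIRED_FIELDS.get(d.outcome);
    if (!required) return fail(`unknown outcome for ${d.id}: ${d.outcome}`);
    for (const field of required) {
      if (!d[field]) return fail(`${d.id}: outcome ${d.outcome} requires ${field}`);
    }
  }
  if (expectedIds) {
    // Decisions must cover the candidate set exactly: no missing, no extras.
    for (const id of expectedIds) {
      if (!seen.has(id)) return fail(`missing decision for ${id}`);
    }
    for (const id of Array.from(seen)) {
      if (expectedIds.indexOf(id) < 0) return fail(`unexpected id: ${id}`);
    }
  }
  return { plan: decisions, error: null };
}
```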

Tests: src/tests/headless-triage-apply.test.ts now covers:
   - agree-path runs both agents in order; apply fails on missing
     ledger entry → ok=false, rejectedIds populated (the realistic
     contract for a test fixture without a seeded DB)
   - custom runner without allowUntrustedRunner refuses, agentRunner
     never invoked
   - rubber-duck disagrees → clean pause, ok=false, agreed=false
   - decider fails → skip rubber-duck
   - unknown id in plan rejected before review

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 17:19:12 +02:00
Mikael Hugo
fa9baf71d5 feat(secret-scan): SF_SECURITY_FAST contract for the regex-only fast path
Codifies AC4 of sf-mp4w2dij-xm6cwj: the regex-only path is the
today-default fast mode. SF_SECURITY_FAST=1 is the explicit opt-in for
callers that want to assert "regex-only, no LLM escalation, sub-100ms"
regardless of any future tiered reviewer landing in the script.

Today the env var changes only the trailing status line so operators
can verify the contract is observable. When the LLM-backed review hook
(AC1) lands, the absence of SF_SECURITY_FAST becomes the trigger for
escalation; setting it=1 keeps offline / pre-commit callers on the
fast path. Locked in by tests in both the .sh and .mjs scanners.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 07:57:02 +02:00
Mikael Hugo
32cfb6224b test: migrate node:test imports to vitest and stabilize timing thresholds
- Three .test.mjs files now import describe/it from vitest, matching the
  harness CLAUDE.md mandates for the SF extension suite.
- schedule-e2e local readEntries threshold raised 50ms → 100ms with a
  comment noting full-suite parallelism adds scheduler/filesystem jitter
  on dev machines (CI threshold unchanged at 200ms).
- e2e-smoke "headless new-milestone without --context" timeout raised
  10s → 30s so the exit-1 assertion isn't flaky under load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 21:30:21 +02:00
Mikael Hugo
0b5fa75c0d fix(lint): fix all pre-existing lint failures
- check-sf-extension-inventory.mjs: expand parseDirectRegisteredCommands()
  scan to include 7 more files (guards/inturn.js, notifications/notify.js,
  permissions/index.js, ui/usage-bar.js, commands/legacy/audit.js,
  commands/legacy/create-extension.js, commands/legacy/create-slash-command.js)
  and filter results by BASE_RUNTIME_COMMAND_NAMES to exclude doc-string false
  positives ("name" in create-slash-command.js template text)

- extension-manifest.json: remove 'clear' (subcommand of logs/notifications,
  never a top-level pi.registerCommand)

- packages/pi-agent-core/src/db/sf-db.ts: fix 23 noVoidTypeReturn errors
  - openDatabase: void → boolean (caller uses return value at line 5625)
  - claimEscalationOverride: void → boolean (caller checks at escalation.js:243)
  - resolveSelfFeedbackEntry: void → boolean (caller checks at self-feedback.js:387)
  - copyWorktreeDb: void → boolean (caller checks at reconcileWorktreeDb)
  - compactUokMessages: void → {before,after} (caller returns value at message-bus.js:238)
  - insertSessionTurn: void → bigint|null (caller uses id at session-recorder.js:104)
  - expireStaleMemories: void → number (caller uses count at auto-start.js:1047)
  - deleteMemorySourceRow: void → boolean (caller returns value at memory-source-store.js:107)
  - deleteMemoryEmbedding: void → boolean (caller returns value at memory-embeddings.js:328)
  - updateBacklogItemStatus: remove dead return expression (callers discard value)
  - removeBacklogItem: remove dead return expression (callers discard value)
  - updateGateCircuitBreaker: remove dead return {total,avgMs,...} (wrong-type
    code accidentally merged from getGateLatencyStats, never reachable)
  - markUokMessageRead: remove dead return true/false (callers discard value)

- Auto-fix formatting and organizeImports in ~30 source files (biome --write)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 04:02:31 +02:00
Mikael Hugo
adb449d642 fix: consolidate extensions into sf, migrate kernel.ts, fix test suite
- Fold sf-usage-bar, sf-notify, sf-inturn-guard, sf-permissions,
  slash-commands into sf extension (ui/, notifications/, guards/,
  permissions/, commands/legacy/)
- Delete vectordrive extension
- Migrate uok/kernel.js to TypeScript (kernel.ts) with full interfaces
- Add allowJs/checkJs:false to tsconfig.resources.json for incremental TS migration
- Add symlink dedup to extension-discovery.ts (seenRealPaths Set)
- Add before_provider_request delegate back to native-search.js so
  session budget tests exercise the middleware end-to-end
- Fix parseSfNativeTools() to return all SF manifest tools (drop sf_ filter)
- Fix test assertions: plan_milestone/complete_task/validate_milestone
- Remove subagent from app-smoke.test.ts (folded into sf/subagent/)
- Remove sf-permissions/sf-inturn-guard/subagent from features-inventory test
- Fix resolveSearchProvider autonomous mode test to pass 'auto' explicitly
- Remove legacy /clear slash command (conflicts with built-in clear_terminal)
- Update web-command-parity-contract.test.ts for clear removal

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 02:40:52 +02:00
Mikael Hugo
24592507c3 sf snapshot: uncommitted changes after 53m inactivity 2026-05-11 01:54:55 +02:00
Mikael Hugo
2dea73398d fix(learning): add save_knowledge to manifest, failure_mode to aggregator SELECT + index
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 23:18:02 +02:00
Mikael Hugo
3fba4bcb03 refactor(mcp): move MCP connection manager to packages/coding-agent/src/core/mcp/
- Create config.ts with McpServerConfig types and readMcpConfigs/getServerConfig
- Create auth.ts with buildHttpTransportOpts and createCliOAuthProvider
- Create connection-manager.ts with McpConnectionManager class
- Create index.ts re-exporting the public API
- Export McpConnectionManager and helpers from @singularity-forge/coding-agent
- Rewrite mcp-client extension as thin wrapper using McpConnectionManager
- Rewrite auth.js as re-export shim from @singularity-forge/coding-agent
- Update test to import buildHttpTransportOpts from @singularity-forge/coding-agent

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 22:19:46 +02:00
Mikael Hugo
d33e30e885 feat(notifications): NOTICE_KIND enum, schema v2 dedup, sf-db cleanup
- notification-store: schema v2 — repeatCount/lastTs merge for non-blocking
  notices; NOTICE_KIND enum (SYSTEM_NOTICE, TOOL_NOTICE, BLOCKING_NOTICE,
  USER_VISIBLE) for renderer classification without message parsing
- sf-db: remove gate_runs and audit_events tables (replaced by uok audit.js
  and trace-writer); schema reduced by ~370 lines
- notify-interceptor: tag auto-mode system notices with NOTICE_KIND.SYSTEM_NOTICE
- auto-prompts, guided-flow, system-context: use NOTICE_KIND on emit calls
- cli-status: expanded headless status surface + test coverage
- headless-types: new status fields
- Makefile/justfile: dev workflow improvements
- record-promoter, requirement-promoter: minor cleanup
- sf-db-migration tests: updated for dropped tables
- uok-gate-runner, uok-metrics, uok-outcome, uok-status tests: updated

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 20:13:58 +02:00
Mikael Hugo
280303ef9a fix(lint): reformat 6 files touched during web dep upgrade
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 12:10:10 +02:00
Mikael Hugo
02a4339a51 refactor: rename pi-* packages to forge-native names (Phase 1)
Rename all four packages/pi-* directories to forge-native names,
stripping the 'pi' identity and establishing forge's own:

- packages/pi-coding-agent → packages/coding-agent
- packages/pi-ai → packages/ai
- packages/pi-agent-core → packages/agent-core
- packages/pi-tui → packages/tui

Package names updated:
- @singularity-forge/pi-coding-agent → @singularity-forge/coding-agent
- @singularity-forge/pi-ai → @singularity-forge/ai
- @singularity-forge/pi-agent-core → @singularity-forge/agent-core
- @singularity-forge/pi-tui → @singularity-forge/tui

All import references, bare string references, path references,
internal variable names (_bundledPi*), and dist files updated.
@mariozechner/pi-* third-party compat aliases preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 11:28:01 +02:00
Mikael Hugo
05953e9599 fix(lint): restore 0 Biome diagnostics and fix web-mode-onboarding test timeout
- Remove/prefix unused imports and variables across 11 src/ files to clear
  74 diagnostics introduced by 37 subsequent commits since run #3
- Fix pre-existing timeout in web-mode-onboarding integration test:
  - Add timeoutMs: 120_000 to launchPackagedWebHost call (was unbounded)
  - Raise AbortSignal.timeout on simple fetches 10s → 30s (under parallel load)
  - Raise overall test timeout 180s → 420s (budget: 120+60+30+30+120+30=390s)
- Log autoresearch run #4 and update lessons in autoresearch.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 11:01:43 +02:00
Mikael Hugo
5dbd318a76 refactor(uok): rename scheduler-v2 and plan-v2 to drop v2 suffix
v1 no longer exists — the suffix is just noise. Update all import sites
and rename the test file to match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-09 14:45:02 +02:00
Mikael Hugo
22cbd83675 fix: update test snapshots for queryInstruction and complete /sf prefix Phase 2 deprecation
- Fix memory-embeddings-llm-gateway tests: add queryInstruction field to
  expected config objects after loadGatewayConfigFromEnv was updated to
  return it
- Add STYLEGUIDE.md: SF code standards adapted from ace-coder patterns
  (purpose doctrine, principles, anti-patterns STY001-012, thresholds,
  naming, patterns, documentation sections)
- Phase 2 /sf prefix removal: update all web components, browser dispatch,
  and tests to use direct commands (/autonomous, /stop, /next, /discuss,
  /init, /new-milestone) instead of /sf-prefixed forms
  - workflow-actions.ts: all command strings updated
  - chat-mode.tsx: SF_ACTIONS array updated
  - project-welcome.tsx: primaryCommand values updated
  - command-surface.tsx: fallback display updated
  - remaining-command-panels.tsx: usage examples updated
  - browser-slash-command-dispatch.ts: add stop/new-milestone/init to
    SF_PASSTHROUGH_COMMANDS so they route correctly to the extension
  - recovery-diagnostics-service.ts: suggestion commands updated
  - welcome-screen.ts: hint text updated
  - All affected tests updated to match new command strings

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-09 00:17:47 +02:00
Mikael Hugo
e4c951ff0c feat: improve sf runtime self-reload and safeguards 2026-05-08 23:52:35 +02:00
Mikael Hugo
fd06629f06 feat: add centralized LogTape logger module with dev/autonomous modes, PII redaction, and per-session file rotation
- Install @logtape/logtape, @logtape/pretty, @logtape/file, @logtape/redaction
- Create src/logger.ts with configureLogger() and getLogger() exports
- Dev mode: pretty console output with debug level
- Autonomous mode: JSON console + rotating file sink in .sf/logs/{sessionId}/
- PII redaction for API keys (sk-*, key-*, Bearer *) and home directory paths
- Category hierarchy: sf.core, sf.uok, sf.autonomous, sf.extension, sf.web
- Comprehensive tests in src/tests/logger.test.ts (10 tests)
- Wire configureLogger() into src/cli.ts startup path
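
The redaction targets listed above can be approximated with plain string replacement. This standalone sketch does not use the @logtape/redaction API; the function name, the patterns, and the "[REDACTED]" marker are all assumptions chosen for illustration.

```typescript
// Redact API-key-like tokens and collapse the home directory to "~".
function redactPii(message: string, homeDir: string): string {
  let out = message;
  // Bearer tokens: keep the scheme, hide the credential.
  out = out.replace(/Bearer\s+\S+/g, "Bearer [REDACTED]");
  // Keys shaped like sk-* or key-*.
  out = out.replace(/\b(?:sk|key)-[A-Za-z0-9_-]+/g, "[REDACTED]");
  // Home directory paths; split/join avoids regex-escaping the path.
  if (homeDir.length > 0) {
    out = out.split(homeDir).join("~");
  }
  return out;
}
```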

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2026-05-08 19:58:11 +02:00
Mikael Hugo
d548ea01c5 sf snapshot: uncommitted changes after 155m inactivity 2026-05-08 10:08:39 +02:00
Mikael Hugo
19bfc3d3f6 feat(sf): align node sqlite uok runtime 2026-05-08 03:01:20 +02:00
Mikael Hugo
d640aa0949 test(sf): align direct command web contracts 2026-05-08 01:48:50 +02:00
Mikael Hugo
b5893d1c28 Make SF direct command surface baseline 2026-05-08 01:34:07 +02:00
Mikael Hugo
6fc054e7c3 sf snapshot: uncommitted changes after 49m inactivity 2026-05-08 01:07:24 +02:00
Mikael Hugo
89677b7e9b sf snapshot: uncommitted changes after 110m inactivity 2026-05-08 00:17:47 +02:00
Mikael Hugo
deeb4dbd4e sf snapshot: uncommitted changes after 61m inactivity 2026-05-07 16:39:39 +02:00