- gap-audit prompt detection: Add DYNAMICALLY_LOADED_PROMPTS set for prompts
loaded through wrappers (research-slice, plan-slice, execute-task, etc.)
and detect loadPrompt calls with comma-separated args (#sf-moobj36l-ewu7js)
- gap-audit command detection: Detect exact match, prefix match, and
switch/case patterns for command dispatch (#sf-moobj36o-n8b7g9)
- empty task summary: Add isValidTaskSummary() to require non-empty content
with frontmatter or H1 before reconciliation marks task complete
(#sf-moobj36o-6rxy6e)
- journal write failures: Emit bounded health warning to .write-failures.jsonl
on journal write failure with per-session dedup (#sf-moobj36p-ikq3b2)
- resource sync manifest divergence: Add verifyManifestFilesExist() to check
all manifest-listed files exist on disk after hash match (#sf-moody5qi-8gbwp2)
- self-feedback markdown stale: Regenerate SELF-FEEDBACK.md from jsonl on
markResolved with resolved entries section (#sf-moobj36p-rlo95i)
- self-feedback context bloat: Cap entries to 20 max, 4000 chars, inject
compact summaries only with pointer to jsonl for full evidence
(#sf-moobj36p-ko6snt)
- hook-emitter types: Replace unknown with EventResult discriminated union,
implement emitExtensionEvent call with fallback warning when _pi missing
(#sf-moobmhwt-bxejb6, #sf-moobmhx4-gk9g83)
- export visualizer types: Add VisualizerExportData interface with proper
PhaseAggregate/SliceAggregate/ModelAggregate/ProjectTotals types
replacing any (#sf-moobmhx0-ow5fhy)
- native-edit-bridge: Already resolved (artifact removed from repo)
(#sf-moobj36q-z4id3u)
Three production gaps Codex's adversarial review flagged are now closed:
1. Real legacy-vs-UOK parity diff (per turn, per plane):
- parity-diff-capture.ts captures plan / graph / model-policy /
audit-envelope / gitops decisions for both paths and emits
ParityDiffEvent records to .sf/runtime/uok-parity.jsonl.
- parity-report.ts aggregates divergencesByPlane, populates
criticalMismatches with real divergence summaries, and tracks
enterEvents / exitEvents / missingExitEvents for symmetry.
2. Exit-event symmetry:
- sessionId / turnId now flow through enter+exit parity events.
- writeParityHeartbeat lets kernel/loop-adapter emit best-effort
diagnostics on plane failure paths so missing-exit gaps shrink.
3. Commit-gating on divergence or missing-exit:
- resolveParitySafeGitAction (in uok/gitops.ts) reads the parity
report and downgrades turn_action to status-only when divergence
count > 0 or missing-exit count > 0 — UOK can no longer commit
on top of unverified state.
- auto-post-unit.ts now resolves a configuredTurnAction from UOK
flags then asks the parity gate for the safe action; the gate's
decision is what flows to the actual git op.
- new test: tests/uok-gitops-commit-gate.test.ts.
- existing gitops-wiring assertion updated for the renamed
configuredTurnAction (semantic preserved).
Tests: 53/53 UOK pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Since Node >= 24 is the minimum engine, remove the better-sqlite3 fallback
chain from sf-db.ts, unit-ownership.ts, and cli-stats.ts. Use DatabaseSync
from node:sqlite directly. Also replace the `glob` npm package with built-in
node:fs/promises.glob and node:fs.globSync in pi-coding-agent LSP utils.
- Remove createRequire boilerplate and suppressSqliteWarning helper
- Simplify loadProvider() and openRawDb()
- Net -177 lines of fallback/middleware code
💘 Generated with Crush
Assisted-by: GLM-5.1 via Crush <crush@charm.land>
- Convert remaining node:test → vitest imports in packages/* and studio/*
- Fix mock.callCount() → mock.callCount property access for vitest compat
- Fix mock.calls[N].arguments → mock.calls[N] for vitest compat
- Update tsconfig.extensions.json to exclude test files from tsc
- Harden migrate-to-vitest-all.mjs regex for single quotes and optional semicolons
Add vitest.config.ts with forks pool, v8 coverage, and package aliases.
Run migrate-to-vitest.mjs to replace `from "node:test"` imports with
`from 'vitest'` across 749 test files, converting mock.fn→vi.fn and
mock.timers→vi fake timers where needed.
💘 Generated with Crush
Assisted-by: GLM-5.1 via Crush <crush@charm.land>
- openai-codex.ts: replace hand-rolled PKCE flow with simple read of
~/.codex/auth.json written by the real codex CLI after user authentication.
Removes ~250 lines of local callback server + browser dance code.
- openai-codex-responses.ts: minor residual cleanup
- openai-completions.ts: drop remaining `as any` stream_options cast
- anthropic-shared.ts: use `unknown` cast on thinkingNoBudget path
- pi-coding-agent/extensions/types.ts: minor type addition
- db-tools.ts: explicit AgentToolResult return type on execute handlers
- requesting-code-review/SKILL.md: prompt wording cleanup
- subagent/index.ts: capability registration wiring
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- package.json: add 'typecheck' script (build:pi + tsc --noEmit) so pi-ai
and pi-coding-agent typecheck under the same command surface SF uses.
- anthropic-shared.ts: replace 'as any' casts with proper Anthropic SDK
types (ServerToolUseBlockParam, WebSearchToolResultBlockParam,
CacheControlEphemeral). The cache_control variant is documented inline
so the cast is auditable.
- openai-completions.ts: drop the 'as any' on stream_options — the type
system can verify the assignment now.
- openai-codex-responses.ts, package-manager.ts, skills.ts: annotate the
three remaining empty catches with one-line WHY comments (best-effort
cleanup, malformed ignore files, partial directory traversal). Empty
catch with no rationale is an SF012 anti-pattern; with rationale it is
a deliberate fallback.
- oauth/github-copilot.ts, oauth/openai-codex.ts: add UPSTREAM AUDIT
blocks documenting why these hand-rolled OAuth flows stay hand-rolled
rather than delegating to @octokit/auth-oauth-device or @openai/codex.
AbortSignal coverage and provider-specific surface area are the gating
concerns; re-audit triggers are named.
Replace ~700 LOC of hand-rolled OAuth and onboarding with cli-core's own
getOauthClient + setupUser. The provider now reads ~/.gemini/oauth_creds.json
itself (via cli-core), refreshes tokens, and discovers the Code Assist
project + tier server-side — exactly like the real gemini CLI does.
- provider/google-gemini-cli.ts: drop apiKey={token,projectId} JSON
plumbing; getCodeAssistServer() uses cli-core for everything
- delete utils/oauth/google-gemini-cli.ts (457 LOC: hand-rolled login,
PKCE, callback server, discoverProject, onboardUser, tier handling)
- delete utils/oauth/google-oauth-utils.ts (201 LOC: only consumed by
the deleted gemini-cli helper)
- oauth/index.ts: remove gemini-cli from BUILT_IN_OAUTH_PROVIDERS
registry; google-gemini-cli is no longer SF-managed
- auth-storage.ts: update 3 error messages to direct users to the real
gemini CLI for authentication instead of the removed /login command
Login UX: users authenticate with the real gemini CLI; we just consume
~/.gemini/oauth_creds.json. Whole-provider disable goes through manual
settings.json edit (per-model toggle still works in interactive UI).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace brittle string-matching in headless-events.ts with structured
source/kind/blocking/dedupe_key metadata on notify() events. String
matching is preserved as a fallback for the ~940 untagged call sites.
- Add NotificationMetadata type to headless-types.ts (canonical definition)
- Extend rpc-types.ts notify event with optional metadata field
- Extend ExtensionUIContext.notify() signature with optional 3rd arg
- Pass metadata through RPC notify implementation in rpc-mode.ts
- Update headless-events.ts: isTerminalNotification, isBlockedNotification,
isMilestoneReadyNotification, isPauseNotification all check metadata first
- Update notification-store.ts: store metadata on NotificationEntry; use
metadata.dedupe_key as dedup key when provided (falls back to message hash)
- Update notify-interceptor.ts to thread metadata through to store + original
- Tag critical emit sites with structured metadata:
stopAuto → { kind: "terminal" } (+ blocking: true when reason includes "block")
pauseAuto → { kind: "terminal", blocking: true }
guided-flow milestone ready → { kind: "approval_request", blocking: true }
- Update notification-overlay.ts to prefer metadata.source for [label] display
- Add 17-test regression suite (notification-event-model.test.ts)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
84 files spanning provider capabilities, model routing, headless
runtime, sf auto subsystems, gitbook docs, and test coverage. Snapshotted
so headless auto can resume M004 (Production Readiness) S03
(Verification Gate Validation) on a clean tree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pi-mono Tier 0 #4 — manual port (sf went off-task; ported directly).
undici's default 300s bodyTimeout aborts long local-LLM SSE streams
(e.g. vLLM buffering a large tool call) with UND_ERR_BODY_TIMEOUT.
retry.provider.timeoutMs cannot lift this cap — it controls the
provider SDK's AbortController, not undici's per-socket idle timer.
Pass {bodyTimeout: 0, headersTimeout: 0} to EnvHttpProxyAgent. Provider
SDKs continue to enforce their own deadlines.
Type-check passes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pi-mono Tier 0 #1 (security) — sf-driven port.
Two upstream security fixes (pi-mono PR #3819, #3883) that escape
user-controlled session content before embedding in HTML exports.
Crafted session content (image mime types, image data, model IDs,
tool names, entry IDs) could otherwise inject markup at the export
boundary.
What sf changed in
packages/pi-coding-agent/src/core/export-html/template.js:
- Image tags: escape `mimeType` and `data` attributes for both
tool-result and user-message image renders (PR #3819).
- Session metadata: escape `msg.toolName`, `msg.role`, `entry.modelId`,
`entry.thinkingLevel`, `entry.type`, `entry.id`, and
`globalStats.models` (PR #3883).
- DOM id construction: renamed `entryId` → `entryDomId` and escape
`entry.id` to prevent attribute-breakout from a crafted id.
The existing `escapeHtml()` helper was used at every site; no new
helper introduced. Type-check passes.
Co-Authored-By: sf v2.75.1 (session 150fe2c1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two codex-rescue tasks landed together:
1. Auto-coerce JSON-schema validator: when a tool field declares
{type:"array", items:{type:"string"}} and the model sends a single
string, wrap it in [string] before validation instead of hard-rejecting.
Fixes the recurring "keyDecisions: must be array" rejection on
sf_complete_task that wasted retries.
2. Provider_model_allow filter (proper implementation with helpers):
- resolveProviderModelAllowList / isProviderModelAllowed /
filterModelsByProviderModelAllow helpers in preferences-models
- Wired into model-registry and auto-model-selection
- New tests/provider-model-allow.test.ts
Tools coerced: sf_complete_task, sf_complete_milestone, sf_plan_milestone,
sf_plan_slice, sf_replan_slice, sf_reassess_roadmap (key list fields).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <noreply@openai.com>
reassess-roadmap: flip default from true → false. Most reassess units
conclude "roadmap is fine" burning a session for no change; the
plan-slice prompt now carries a JIT preamble at zero cost. (#4778)
tool-execution: always prefer toolDefinition.label when non-empty,
even when label === name — allows tools to display their canonical
name explicitly. (#4758)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
showDeprecationWarnings ran setRawMode(true)/once('data')/setRawMode(false)/
pause() right before pi-tui's own stdin setup. That handoff is fragile —
buffered bytes and mode flips between the migration prompt and the TUI's
raw-mode setup can leave stdin cooked and line-buffered, producing the
'Enter does nothing + garbled typing' symptom.
Warnings now print non-blocking. They stay visible in scrollback above
the TUI, so users still see them without a blocking acknowledge step.
Antigravity (Google's IDE sandbox product, different from Gemini CLI) is
removed from:
src/onboarding.ts — drop from LLM_PROVIDER_IDS + OAuth-flow picker
src/pi-migration.ts — drop from LLM_PROVIDER_IDS migration list
src/web/onboarding-service.ts — drop from web-UI provider list
src/tests/integration/web-onboarding-contract.test.ts — update contract
src/resources/extensions/sf/doctor-providers.ts — drop from CLI_AUTH_PROVIDERS
src/resources/extensions/sf/key-manager.ts — drop UI listing
src/resources/extensions/sf-usage-bar/index.ts — delete entire quota fetcher block (~200 lines)
packages/pi-coding-agent/src/cli/args.ts — drop PI_AI_ANTIGRAVITY_VERSION doc
packages/pi-coding-agent/src/utils/proxy-server.ts — drop from claude provider chain
Reason: antigravity has no vendor-published core library we can embed
(unlike @google/gemini-cli-core for the Gemini CLI). Continuing to
hand-roll OAuth against daily-cloudcode-pa.sandbox.googleapis.com is
exactly the pattern Google has started banning for third-party tools.
Removing the code removes the ban risk.
pi-ai provider code, OAuth util, and models.generated entries for
google-antigravity are removed in follow-up commits (separated for
reviewability — each layer verified independently).
Build passes. Note: this is a breaking change for any user who had
google-antigravity configured — they'll need to migrate to
google-gemini-cli (OAuth), google (API key), or google-vertex.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Prior PROXY_FAMILY_PRIORITY table conflated "direct provider" with
"failover provider that happens to serve this family". Observed case:
claude-* family listed anthropic, google-antigravity, and
github-copilot all as "providers" — but only anthropic is the direct
vendor. google-antigravity re-serves Claude via Google's sandbox
IDE product (same endpoint as gemini-cli, different auth contract);
github-copilot re-serves via GitHub's paid platform.
This matters for the 429 fallback chain: a broken anthropic key
should try genuinely-vendored endpoints first (none, for Claude),
then fall into family_failover (antigravity, copilot), and only then
reach the generic GLOBAL_PROVIDER_FALLBACK (opencode, opencode-go,
openrouter, ollama-cloud). The old all-flat list hid this distinction.
New shape:
{ providers: [...], family_failover?: [...] }
Corrections applied:
claude-*: providers=[anthropic], failover=[google-antigravity, github-copilot]
gemini-*: providers=[google-gemini-cli, google, google-vertex],
failover=[github-copilot]
gpt-* / o* / codex-*: providers=[openai],
failover=[azure-openai-responses, openai-codex, github-copilot]
mimo-*: providers=[xiaomi] (new: was [] — Xiaomi MiMo Open Platform
is direct API at api.xiaomimimo.com / token-plan-sgp.xiaomimimo.com)
buildCandidateOrder stitches [direct, family_failover, global_fallback]
with deduplication. User overrides via settings.proxy.providerPriority
continue to replace only the direct-provider list, keeping family
failover and global fallback intact.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Four related improvements that landed in the working tree after the
auto-hardening merge but hadn't been committed:
1. auth_error as a distinct error type (auth-storage + retry-handler).
Previously invalid/expired API keys would retry the same failing
credential until the retry budget exhausted. Now:
- classifyErrorType() recognizes 401s, "invalid api key",
"authentication error", "unauthorized" etc as "auth_error"
- RetryHandler triggers cross-provider fallback on auth_error just
like it does for rate_limit and quota_exhausted — switch
providers rather than burning retries on a broken key
Outcome: a stale OPENCODE_API_KEY in sops now fails over to kimi or
minimax immediately instead of stalling the unit.
2. Multi-provider search-key detection (native-search.ts).
The "Web search: Set BRAVE_API_KEY" warning fired whenever a
non-Anthropic model lacked BRAVE_API_KEY, even when the user had
TAVILY_API_KEY or OLLAMA_API_KEY available. Now: the warning
suppresses if any of BRAVE/TAVILY/OLLAMA keys is present, and the
warning text lists all three options. Matches the preferences-
validation allow-list for search_provider.
3. MiniMax-M2.7-highspeed benchmark entry (model-benchmarks.json).
Routes the fast-tier variant of M2.7 through the Bayesian blender
with inherited RULER scores. Lets dynamic routing consider the
highspeed model when speed matters more than peak quality.
No regressions: the 41 pre-existing test failures in pi-coding-agent
(FallbackResolver chain-membership + LSP integration) are unchanged
relative to the prior commit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>