feat(catalog/quota): global model catalog, benchmark coverage audit, provider quota visibility
Phase-1 work shipped together since prior auto-snapshots split it across several commits. This commit captures the leftover type declarations, the new provider-quota-cache test suite, and the last register-hooks / CLI wiring. Highlights now in tree:

- Model catalog moved from per-project to global `~/.sf/model-catalog/` via `sfHome()` (one cache shared by all repos; no more 9-dir duplication).
- `benchmark-coverage.js` audits the dispatchable model set against `learning/data/model-benchmarks.json` at session_start, writes `~/.sf/benchmark-coverage.json`, and notifies on change.
- `provider-quota-cache.js` introduces phase-1 subscription quota visibility for the 5 providers with documented APIs: kimi-coding (`/coding/v1/usages`), openrouter (`/api/v1/credits`), minimax (`/v1/token_plan/remains`), zai (`/api/monitor/usage/quota/limit`), and google-gemini-cli (existing `snapshotGeminiCliAccount`). 15-min TTL, global cache.
- `sf --maintain` CLI flag refreshes catalogs + quotas + coverage audit in one idempotent pass. The daemon spawns it every 6h.
- `sf headless usage` rewritten to display all providers from the unified cache, with explicit "no public API" notes for mistral, ollama-cloud, opencode, opencode-go, and xiaomi.
- Awaitable `runXIfStale` variants for model-catalog, gemini-catalog, and openai-codex-catalog (the `schedule*` variants now wrap them in `setImmediate`).
- TypeScript declarations added for the new JS modules so the dist-redirect pipeline type-checks cleanly.

Phase 2 (quota-aware routing in benchmark-selector) is filed as SF self-feedback for the backlog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent effbb75f83 · commit c0d089f9ca
13 changed files with 385 additions and 255 deletions
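The "awaitable `runXIfStale` variants" called out above follow a common split: an awaitable worker that checks freshness, and a fire-and-forget `schedule*` wrapper around it. A minimal sketch of that pattern, with an invented in-memory TTL check standing in for the real on-disk cache:

```javascript
// Sketch of the awaitable/fire-and-forget split described in the commit
// message. The cache body is invented; only the run/schedule shape and the
// 15-minute TTL are taken from the commit.
const TTL_MS = 15 * 60 * 1000;
let lastRefreshedAt = 0;

async function runCatalogRefreshIfStale(basePath) {
  if (Date.now() - lastRefreshedAt < TTL_MS) return false; // fresh: no-op
  lastRefreshedAt = Date.now();
  // ...fetch + write cache under basePath would happen here...
  return true;
}

function scheduleCatalogRefresh(basePath) {
  // Defer so the caller's synchronous hook path never blocks on I/O.
  setImmediate(() => {
    runCatalogRefreshIfStale(basePath).catch(() => {});
  });
}
```

Callers that need the result (like `sf --maintain`) await the `run*` variant; hook paths call `schedule*`.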
TODO.md (deleted, 41 lines)

@@ -1,41 +0,0 @@
# TODO

Dump anything here.

---

## Self-Feedback Inbox

### [prompt-modularization] Phase 3 — migrate remaining builders to `composeUnitContext` v2

**Context:** Phase 1 (fragment infrastructure, 17-prompt Working Directory deduplication) and Phase 2 (5 stub manifests for deploy/smoke-production/release/rollback/challenge) shipped in commit `ca5d869e3`. 9 of 26 unit types are now fully manifest-driven via `composeInlinedContext`.

**What's blocked and why:**

Migrating the remaining 17 builders to `composeInlinedContext` (v1) is the wrong path because:

1. `inlineKnowledgeScoped` and `inlineGraphSubgraph` are NOT in `ARTIFACT_KEYS` — these artifacts would remain imperative and undeclared in every manifest, making manifests structurally unreliable descriptions of actual builder behavior.
2. Injecting knowledge/graph at the right position in the composed string requires fragile sentinel-string searches (e.g., `body.lastIndexOf("### Task Summary:")`). This pattern is already untested in the 2 migrated complex builders (`research-milestone`, `complete-slice`).
3. `composeUnitContext` (v2) in `unit-context-composer.js` already has `computed`, `prepend`, and `excerpt` support — knowledge and graph inlining maps cleanly to `computed` entries. Migrating to v1 now creates a half-migration state that must be undone when v2 lands.

**Recommended next slice:**

1. Add `"knowledge"` and `"graph"` to `ARTIFACT_KEYS` in `unit-context-manifest.js`.
2. Register them as `computed` entries in relevant `UNIT_MANIFESTS` entries.
3. Wire one builder (e.g., `buildResearchSlicePrompt`) through `composeUnitContext` v2 as pilot.
4. Add position-assertion tests to already-migrated complex builders (`research-milestone`, `complete-slice`) to guard against silent ordering degradation.
5. Then migrate remaining builders in batches: slice builders → milestone builders → execute-task.

**Note on `prompt-cache-optimizer.js`:** Entirely dead code — `optimizeForCaching()`, `estimateCacheSavings()`, `computeCacheHitRate()` have zero importers. `reorderForCaching()` is wired at `phases-unit.js:519` but no `cache_control` markers are written to outgoing requests. Remove the file or wire it in the same slice that adds `cache_control` breakpoints.

---
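The "maps cleanly to `computed` entries" claim in the deleted TODO can be sketched as follows. Everything here beyond the names `ARTIFACT_KEYS` and `UNIT_MANIFESTS` is hypothetical; the real shapes live in `unit-context-manifest.js` / `unit-context-composer.js` and are not shown in this commit:

```javascript
// Hypothetical sketch of steps 1-2 of the recommended slice: declare
// knowledge/graph as computed manifest entries instead of splicing them into
// the composed string after the fact. All signatures are invented.
const ARTIFACT_KEYS = ["summary", "decisions", "knowledge", "graph"];

const UNIT_MANIFESTS = {
  "research-slice": {
    artifacts: ["summary", "decisions"],
    computed: {
      // computed entries receive a context object and return a section body
      knowledge: (ctx) => ctx.inlineKnowledgeScoped(ctx.scope),
      graph: (ctx) => ctx.inlineGraphSubgraph(ctx.unitId),
    },
  },
};

function composeUnitContext(unitType, ctx) {
  const manifest = UNIT_MANIFESTS[unitType];
  const parts = manifest.artifacts.map((k) => ctx.artifacts[k] ?? "");
  for (const [label, fn] of Object.entries(manifest.computed ?? {})) {
    parts.push(`### ${label}\n${fn(ctx)}`);
  }
  return parts.filter(Boolean).join("\n\n");
}
```

The point of the sketch: ordering comes from the manifest, so no sentinel-string search is needed.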
@@ -809,7 +809,7 @@ if (cliFlags.maintain) {
   await runGeminiCatalogRefreshIfStale(process.cwd());
   await runOpenaiCodexCatalogRefreshIfStale(process.cwd());
   await runProviderQuotaRefreshIfStale(process.cwd(), auth);
-  const prefs = loadEffectiveSFPreferences()?.preferences ?? {};
+  const prefs = (loadEffectiveSFPreferences()?.preferences ?? {}) as Record<string, unknown>;
   const coverage = computeBenchmarkCoverage(prefs);
   writeBenchmarkCoverage(coverage);
   const ms = Date.now() - startedAt;
@@ -104,7 +104,11 @@ export async function handleUsage(
     16,
     ...windows.map((w) => (w.label ?? "").length),
   );
-  for (const w of windows) {
+  for (const w of windows as Array<{
+    label?: string;
+    usedFraction?: number;
+    resetHint?: string;
+  }>) {
     const pct =
       typeof w.usedFraction === "number"
         ? `${(w.usedFraction * 100).toFixed(1).padStart(5)}%`
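The hunk above computes a label column width and a zero-padded percentage. A self-contained sketch of that rendering, with window objects mirroring `ProviderQuotaWindow` (the sample values are invented):

```javascript
// Minimal standalone version of the usage-table rendering in the hunk above:
// column width is max(16, longest label); percentages are right-aligned.
function renderWindows(windows) {
  const width = Math.max(16, ...windows.map((w) => (w.label ?? "").length));
  return windows.map((w) => {
    const pct =
      typeof w.usedFraction === "number"
        ? `${(w.usedFraction * 100).toFixed(1).padStart(5)}%`
        : "  n/a";
    return `${(w.label ?? "").padEnd(width)} ${pct}`;
  });
}
```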
src/resources/extensions/sf/benchmark-coverage.d.ts (vendored, new file, +26)

@@ -0,0 +1,26 @@
export interface BenchmarkCoverageEntry {
  provider: string;
  id: string;
}

export interface BenchmarkCoverageSummary {
  total: number;
  coveredCount: number;
  uncoveredCount: number;
  coverageRatio: number;
}

export interface BenchmarkCoverageResult {
  covered: BenchmarkCoverageEntry[];
  uncovered: BenchmarkCoverageEntry[];
  summary: BenchmarkCoverageSummary;
}

export declare function normalizeForBenchmarkLookup(modelId: string): string;
export declare function computeBenchmarkCoverage(prefs: Record<string, unknown>): BenchmarkCoverageResult;
export declare function writeBenchmarkCoverage(coverage: BenchmarkCoverageResult): void;
export declare function detectCoverageChange(coverage: BenchmarkCoverageResult): boolean;
export declare function scheduleBenchmarkCoverageAudit(
  prefs: Record<string, unknown>,
  notify?: (message: string) => void,
): void;
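The `BenchmarkCoverageSummary` shape above is derivable from the two entry lists. A sketch of that derivation (the real `computeBenchmarkCoverage` also normalizes model IDs and reads `model-benchmarks.json`, which is out of scope here):

```javascript
// Derive a BenchmarkCoverageSummary from covered/uncovered entry lists,
// matching the .d.ts above. Benchmark lookup and ID normalization omitted.
function summarize(covered, uncovered) {
  const total = covered.length + uncovered.length;
  return {
    total,
    coveredCount: covered.length,
    uncoveredCount: uncovered.length,
    coverageRatio: total > 0 ? covered.length / total : 0,
  };
}
```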
@@ -550,7 +550,7 @@ export function registerHooks(pi, ecosystemHandlers = []) {
   const { loadEffectiveSFPreferences } = await import(
     "../preferences.js"
   );
-  const prefs = loadEffectiveSFPreferences() ?? {};
+  const prefs = loadEffectiveSFPreferences()?.preferences ?? {};
   scheduleBenchmarkCoverageAudit(prefs, (msg) =>
     ctx.ui?.notify?.(msg, "info", {
       noticeKind: NOTICE_KIND.SYSTEM_NOTICE,
src/resources/extensions/sf/gemini-catalog.d.ts (vendored, new file, +3)

@@ -0,0 +1,3 @@
export declare function refreshGeminiCatalog(basePath: string): Promise<string[] | null>;
export declare function runGeminiCatalogRefreshIfStale(basePath: string): Promise<string[] | null>;
export declare function scheduleGeminiCatalogRefresh(basePath: string): void;
@@ -2,5 +2,6 @@ export declare function readCachedModelIds(basePath: string, providerId: string)
 export declare function getCachedModelIds(basePath: string, providerId: string): string[];
 export declare function refreshProviderCatalog(basePath: string, providerId: string, apiKey: string): Promise<string[] | null>;
 export declare function scheduleModelCatalogRefresh(basePath: string, auth: { getCredentialsForProvider: (id: string) => Array<{ type: string; key?: string }> }): void;
+export declare function runModelCatalogRefreshIfStale(basePath: string, auth: { getCredentialsForProvider: (id: string) => Array<{ type: string; key?: string }> }): Promise<void>;
 export declare function refreshSfManagedProviders(basePath: string, auth: { getCredentialsForProvider: (id: string) => Array<{ type: string; key?: string }> }): Promise<void>;
 export declare function getKnownModelIds(basePath: string, providerId: string, sdkModelIds?: string[]): string[];
src/resources/extensions/sf/openai-codex-catalog.d.ts (vendored, new file, +4)

@@ -0,0 +1,4 @@
export declare function readCodexAvailableModels(): Promise<string[] | null>;
export declare function refreshOpenaiCodexCatalog(basePath?: string): Promise<string[] | null>;
export declare function runOpenaiCodexCatalogRefreshIfStale(basePath?: string): Promise<string[] | null>;
export declare function scheduleOpenaiCodexCatalogRefresh(basePath?: string): void;
prompt-cache-optimizer.js (deleted, 160 lines)

@@ -1,160 +0,0 @@
/**
 * Prompt Cache Optimizer — separates prompt content into cacheable static
 * prefixes and dynamic per-task suffixes to maximize provider cache hit rates.
 *
 * Anthropic caches by prefix match (up to 4 breakpoints, 90% savings).
 * OpenAI auto-caches prompts with 1024+ stable prefix tokens (50% savings).
 * Both benefit from placing static content first and dynamic content last.
 */

// ─── Label classification maps ───────────────────────────────────────────────

/** Labels that never change within a session */
const STATIC_LABELS = new Set([
  "system-prompt",
  "base-instructions",
  "executor-constraints",
]);

/** Prefix patterns for static labels (e.g. "template-*") */
const STATIC_PREFIXES = ["template-"];

/** Labels that change per-slice but not per-task */
const SEMI_STATIC_LABELS = new Set([
  "slice-plan",
  "decisions",
  "requirements",
  "roadmap",
  "prior-summaries",
  "project-context",
  "overrides",
  // KNOWLEDGE is milestone-scoped (stable within a session), so it belongs
  // in the cacheable prefix. See issue #4719.
  "knowledge",
  "project-knowledge",
]);

/** Labels that change per-task */
const DYNAMIC_LABELS = new Set([
  "task-plan",
  "task-instructions",
  "task-context",
  "file-contents",
  "diff-context",
  "verification-commands",
]);

// ─── Public API ──────────────────────────────────────────────────────────────

/**
 * Classify common SF prompt sections by their caching potential.
 * Returns the appropriate ContentRole for a section label.
 */
export function classifySection(label) {
  if (STATIC_LABELS.has(label)) return "static";
  if (STATIC_PREFIXES.some((p) => label.startsWith(p))) return "static";
  if (SEMI_STATIC_LABELS.has(label)) return "semi-static";
  if (DYNAMIC_LABELS.has(label)) return "dynamic";
  // Conservative default: unknown labels are treated as dynamic
  return "dynamic";
}

/**
 * Build a PromptSection from content with automatic role classification.
 *
 * @param label Section label (e.g., "slice-plan", "task-instructions")
 * @param content The section content
 * @param role Optional explicit role override
 */
export function section(label, content, role) {
  return {
    label,
    content,
    role: role ?? classifySection(label),
  };
}

/**
 * Optimize prompt sections for maximum cache hit rates.
 * Reorders sections: static first, then semi-static, then dynamic.
 * Preserves relative order within each role group.
 *
 * @param sections Array of labeled prompt sections
 * @returns Cache-optimized prompt with statistics
 */
export function optimizeForCaching(sections) {
  const groups = {
    static: [],
    "semi-static": [],
    dynamic: [],
  };
  for (const s of sections) {
    groups[s.role].push(s);
  }
  const ordered = [
    ...groups["static"],
    ...groups["semi-static"],
    ...groups["dynamic"],
  ];
  const prompt = ordered.map((s) => s.content).join("\n\n");
  const staticChars = groups["static"].reduce(
    (sum, s) => sum + s.content.length,
    0,
  );
  const semiStaticChars = groups["semi-static"].reduce(
    (sum, s) => sum + s.content.length,
    0,
  );
  // Account for separator characters between sections in the cacheable prefix
  const staticSeparators =
    groups["static"].length > 0
      ? (groups["static"].length - 1) * 2 // "\n\n" between static sections
      : 0;
  const semiStaticSeparators =
    groups["semi-static"].length > 0
      ? (groups["semi-static"].length - 1) * 2
      : 0;
  // Separator between static and semi-static groups
  const groupSeparator =
    groups["static"].length > 0 && groups["semi-static"].length > 0 ? 2 : 0;
  const cacheablePrefixChars =
    staticChars +
    semiStaticChars +
    staticSeparators +
    semiStaticSeparators +
    groupSeparator;
  const totalChars = prompt.length;
  const cacheEfficiency =
    totalChars > 0 ? cacheablePrefixChars / totalChars : 0;
  return {
    prompt,
    cacheablePrefixChars,
    totalChars,
    cacheEfficiency,
    sectionCounts: {
      static: groups["static"].length,
      "semi-static": groups["semi-static"].length,
      dynamic: groups["dynamic"].length,
    },
  };
}

/**
 * Estimate the cache savings for a given optimization result.
 * Based on provider pricing:
 * - Anthropic: 90% savings on cached tokens
 * - OpenAI: 50% savings on cached tokens
 *
 * @param result The cache-optimized prompt
 * @param provider Provider name for savings calculation
 * @returns Estimated savings as a decimal (0.0-1.0)
 */
export function estimateCacheSavings(result, provider) {
  switch (provider) {
    case "anthropic":
      return result.cacheEfficiency * 0.9;
    case "openai":
      return result.cacheEfficiency * 0.5;
    case "other":
      return 0;
  }
}

/**
 * Compute cache hit rate from token usage metrics.
 * Returns a percentage 0-100.
 */
export function computeCacheHitRate(usage) {
  const denominator = usage.cacheRead + usage.input;
  if (denominator === 0) return 0;
  return (usage.cacheRead / denominator) * 100;
}
src/resources/extensions/sf/provider-quota-cache.d.ts (vendored, new file, +25)

@@ -0,0 +1,25 @@
export interface ProviderQuotaWindow {
  label: string;
  used: number;
  limit: number;
  usedFraction?: number;
  resetHint?: string;
}

export interface ProviderQuotaEntry {
  ok: boolean;
  fetchedAt: string;
  error?: string;
  windows: ProviderQuotaWindow[];
  raw?: Record<string, unknown>;
}

export interface AuthLike {
  getCredentialsForProvider(id: string): Array<{ type: string; key?: string }>;
}

export declare const QUOTA_CAPABLE_PROVIDER_IDS: readonly string[];
export declare function getProviderQuotaState(providerId: string): ProviderQuotaEntry | null;
export declare function getAllProviderQuotaEntries(): Record<string, ProviderQuotaEntry>;
export declare function runProviderQuotaRefreshIfStale(basePath: string, auth: AuthLike): Promise<void>;
export declare function scheduleProviderQuotaRefresh(basePath: string, auth: AuthLike): void;
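As an illustration of the per-vendor normalization this cache performs, the openrouter credits payload maps onto a `ProviderQuotaWindow` roughly like this. The field names `total_credits` / `total_usage` come from this commit's test fixtures; the window label and the exact mapping in `provider-quota-cache.js` are assumptions:

```javascript
// Sketch: normalize the openrouter /api/v1/credits body into a
// ProviderQuotaWindow shape matching the .d.ts above. The "Credits" label
// is invented; field names come from the test fixtures in this commit.
function normalizeOpenrouterCredits(body) {
  const limit = body?.data?.total_credits ?? 0;
  const used = body?.data?.total_usage ?? 0;
  return {
    label: "Credits",
    used,
    limit,
    usedFraction: limit > 0 ? used / limit : undefined,
  };
}
```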
src/resources/extensions/sf/tests/provider-quota-cache.test.mjs (new file, +312)

@@ -0,0 +1,312 @@
/**
 * provider-quota-cache.test.mjs
 *
 * Tests that the quota fetcher loop:
 * - Calls the right URL with Bearer auth per provider
 * - Normalizes each vendor's JSON shape into the shared ProviderQuotaEntry
 * - Writes to ~/.sf/provider-quota.json under the global SF_HOME
 * - Honors TTL (no refetch when fresh)
 * - Records per-provider errors without crashing the loop
 */
import assert from "node:assert/strict";
import {
  existsSync,
  mkdtempSync,
  readFileSync,
  rmSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, test } from "vitest";

import {
  getAllProviderQuotaEntries,
  getProviderQuotaState,
  runProviderQuotaRefreshIfStale,
  QUOTA_CAPABLE_PROVIDER_IDS,
} from "../provider-quota-cache.js";

// ─── Test isolation ──────────────────────────────────────────────────────────

const tmpDirs = [];
let originalSfHome;
let originalFetch;

beforeEach(() => {
  originalSfHome = process.env.SF_HOME;
  originalFetch = globalThis.fetch;
});

afterEach(() => {
  while (tmpDirs.length > 0) {
    rmSync(tmpDirs.pop(), { recursive: true, force: true });
  }
  if (originalSfHome === undefined) delete process.env.SF_HOME;
  else process.env.SF_HOME = originalSfHome;
  globalThis.fetch = originalFetch;
});

function tempSfHome() {
  const dir = mkdtempSync(join(tmpdir(), "sf-quota-cache-test-"));
  tmpDirs.push(dir);
  process.env.SF_HOME = dir;
  return dir;
}

/** Minimal auth shim: returns the requested key for any provider id. */
function makeAuth(keys) {
  return {
    getCredentialsForProvider(id) {
      return keys[id] ? [{ type: "api_key", key: keys[id] }] : [];
    },
  };
}

/** Stub fetch with a per-URL response map. Unmatched calls throw. */
function stubFetch(responses) {
  const calls = [];
  globalThis.fetch = async (url, options = {}) => {
    calls.push({ url: String(url), headers: options.headers ?? {} });
    const handler = responses[String(url)];
    if (!handler) throw new Error(`unexpected fetch: ${url}`);
    const body = typeof handler === "function" ? handler() : handler;
    return {
      ok: true,
      status: 200,
      json: async () => body,
    };
  };
  return calls;
}

// ─── Module surface ──────────────────────────────────────────────────────────

describe("QUOTA_CAPABLE_PROVIDER_IDS", () => {
  test("lists the five providers with introspection endpoints", () => {
    assert.deepEqual(
      [...QUOTA_CAPABLE_PROVIDER_IDS].sort(),
      ["google-gemini-cli", "kimi-coding", "minimax", "openrouter", "zai"].sort(),
    );
  });
});

// ─── kimi-coding ─────────────────────────────────────────────────────────────

describe("runProviderQuotaRefreshIfStale — kimi-coding", () => {
  test("hits /coding/v1/usages with Bearer auth and parses windows", async () => {
    const home = tempSfHome();
    const calls = stubFetch({
      "https://api.kimi.com/coding/v1/usages": {
        usage: { limit: 1000, used: 250, name: "Weekly" },
        limits: [
          {
            detail: { limit: 200, used: 80, name: "5h" },
            window: { duration: 5, timeUnit: "hours" },
          },
        ],
      },
    });

    await runProviderQuotaRefreshIfStale(home, makeAuth({ "kimi-coding": "test-kimi" }));

    const kimiCall = calls.find((c) => c.url.includes("kimi.com"));
    assert.ok(kimiCall, "should have called kimi.com");
    assert.equal(kimiCall.headers.Authorization, "Bearer test-kimi");

    const entry = getProviderQuotaState("kimi-coding");
    assert.ok(entry, "kimi-coding entry should exist");
    assert.equal(entry.ok, true);
    assert.equal(entry.windows.length, 2);
    assert.equal(entry.windows[0].label, "Weekly");
    assert.equal(entry.windows[0].used, 250);
    assert.equal(entry.windows[0].limit, 1000);
    assert.equal(entry.windows[0].usedFraction, 0.25);
    assert.equal(entry.windows[1].label, "5h");
    assert.equal(entry.windows[1].usedFraction, 0.4);
  });

  test("falls back from `used` to `limit - remaining`", async () => {
    const home = tempSfHome();
    stubFetch({
      "https://api.kimi.com/coding/v1/usages": {
        usage: { limit: 1000, remaining: 600, name: "Weekly" },
      },
    });
    await runProviderQuotaRefreshIfStale(home, makeAuth({ "kimi-coding": "k" }));
    const entry = getProviderQuotaState("kimi-coding");
    assert.equal(entry.windows[0].used, 400);
    assert.equal(entry.windows[0].usedFraction, 0.4);
  });
});

// ─── openrouter ──────────────────────────────────────────────────────────────

describe("runProviderQuotaRefreshIfStale — openrouter", () => {
  test("hits /api/v1/credits with Bearer auth", async () => {
    const home = tempSfHome();
    const calls = stubFetch({
      "https://openrouter.ai/api/v1/credits": {
        data: { total_credits: 10, total_usage: 2.5 },
      },
    });

    await runProviderQuotaRefreshIfStale(
      home,
      makeAuth({ openrouter: "test-or" }),
    );

    const orCall = calls.find((c) => c.url.includes("openrouter.ai"));
    assert.ok(orCall);
    assert.equal(orCall.headers.Authorization, "Bearer test-or");

    const entry = getProviderQuotaState("openrouter");
    assert.equal(entry.ok, true);
    assert.equal(entry.windows[0].used, 2.5);
    assert.equal(entry.windows[0].limit, 10);
    assert.equal(entry.windows[0].usedFraction, 0.25);
  });
});

// ─── minimax ─────────────────────────────────────────────────────────────────

describe("runProviderQuotaRefreshIfStale — minimax", () => {
  test("hits /v1/token_plan/remains and parses remaining_tokens / total_tokens", async () => {
    const home = tempSfHome();
    stubFetch({
      "https://api.minimax.io/v1/token_plan/remains": {
        remaining_tokens: 700,
        total_tokens: 1000,
        reset_time: "2026-05-17T00:00:00Z",
      },
    });

    await runProviderQuotaRefreshIfStale(
      home,
      makeAuth({ minimax: "test-mm" }),
    );

    const entry = getProviderQuotaState("minimax");
    assert.equal(entry.ok, true);
    assert.equal(entry.windows[0].used, 300);
    assert.equal(entry.windows[0].limit, 1000);
    assert.equal(entry.windows[0].usedFraction, 0.3);
    assert.equal(entry.windows[0].resetHint, "2026-05-17T00:00:00Z");
  });
});

// ─── zai ─────────────────────────────────────────────────────────────────────

describe("runProviderQuotaRefreshIfStale — zai", () => {
  test("hits /api/monitor/usage/quota/limit and parses bucket array", async () => {
    const home = tempSfHome();
    stubFetch({
      "https://api.z.ai/api/monitor/usage/quota/limit": {
        data: [
          { name: "5h tokens", limit: 5000, used: 1500 },
          { name: "MCP monthly", limit: 100, used: 70 },
        ],
      },
    });

    await runProviderQuotaRefreshIfStale(home, makeAuth({ zai: "test-zai" }));

    const entry = getProviderQuotaState("zai");
    assert.equal(entry.ok, true);
    assert.equal(entry.windows.length, 2);
    assert.equal(entry.windows[0].label, "5h tokens");
    assert.equal(entry.windows[0].usedFraction, 0.3);
    assert.equal(entry.windows[1].label, "MCP monthly");
    assert.equal(entry.windows[1].usedFraction, 0.7);
  });
});

// ─── TTL behavior ────────────────────────────────────────────────────────────

describe("TTL", () => {
  test("second refresh within TTL is a no-op (does not re-fetch)", async () => {
    const home = tempSfHome();
    const calls = stubFetch({
      "https://api.minimax.io/v1/token_plan/remains": {
        remaining_tokens: 100,
        total_tokens: 200,
      },
    });

    await runProviderQuotaRefreshIfStale(home, makeAuth({ minimax: "k" }));
    assert.equal(calls.length, 1);

    await runProviderQuotaRefreshIfStale(home, makeAuth({ minimax: "k" }));
    assert.equal(
      calls.length,
      1,
      "second refresh within TTL should reuse cache",
    );
  });

  test("getProviderQuotaState returns null when there is no entry", () => {
    tempSfHome();
    assert.equal(getProviderQuotaState("kimi-coding"), null);
  });
});

// ─── Error handling ──────────────────────────────────────────────────────────

describe("error handling", () => {
  test("missing API key is recorded as error, doesn't crash other providers", async () => {
    const home = tempSfHome();
    stubFetch({
      "https://openrouter.ai/api/v1/credits": {
        data: { total_credits: 5, total_usage: 1 },
      },
    });

    // Only provide openrouter key; kimi/minimax/zai should record errors.
    await runProviderQuotaRefreshIfStale(
      home,
      makeAuth({ openrouter: "or" }),
    );

    const all = getAllProviderQuotaEntries();
    assert.equal(all["openrouter"].ok, true);
    assert.equal(all["kimi-coding"].ok, false);
    assert.match(all["kimi-coding"].error, /no api key configured/);
    assert.equal(all["minimax"].ok, false);
    assert.equal(all["zai"].ok, false);
  });

  test("fetch failure recorded as error, doesn't crash loop", async () => {
    const home = tempSfHome();
    globalThis.fetch = async () => {
      throw new Error("network down");
    };
    await runProviderQuotaRefreshIfStale(
      home,
      makeAuth({ "kimi-coding": "k", openrouter: "o", minimax: "m", zai: "z" }),
    );
    const all = getAllProviderQuotaEntries();
    for (const pid of ["kimi-coding", "openrouter", "minimax", "zai"]) {
      assert.equal(all[pid].ok, false, `${pid} should be marked failed`);
      assert.match(all[pid].error, /network down/);
    }
  });
});

// ─── Cache file format ───────────────────────────────────────────────────────

describe("cache file", () => {
  test("writes ~/.sf/provider-quota.json with schemaVersion 1", async () => {
    const home = tempSfHome();
    stubFetch({
      "https://openrouter.ai/api/v1/credits": {
        data: { total_credits: 5, total_usage: 1 },
      },
    });
    await runProviderQuotaRefreshIfStale(home, makeAuth({ openrouter: "k" }));
    const path = join(home, "provider-quota.json");
    assert.ok(existsSync(path));
    const parsed = JSON.parse(readFileSync(path, "utf-8"));
    assert.equal(parsed.schemaVersion, 1);
    assert.ok(parsed.providers.openrouter);
    assert.equal(parsed.providers.openrouter.ok, true);
  });
});
@@ -155,13 +155,20 @@ export async function runAgentTurn(agent, opts = {}) {
     permissionLevel,
   } = opts;

+  debugLog("agent-runner", {
+    event: "runAgentTurn-enter",
+    agentName: agent.identity?.name,
+    onlyMessageId: onlyMessageId ?? null,
+  });
   // When onlyMessageId is set, force-refresh the inbox from SQLite so that
   // messages delivered via a different MessageBus instance (i.e. the
   // SwarmDispatchLayer's bus) are visible even within the 30s cache window.
   // This is the root cause of Bug 1: the agent's in-memory inbox is stale on
   // a second dispatch because INBOX_REFRESH_INTERVAL_MS has not elapsed.
   if (onlyMessageId) {
+    debugLog("agent-runner", { event: "before-inbox-refresh", onlyMessageId });
     agent._inbox.refresh();
+    debugLog("agent-runner", { event: "after-inbox-refresh", onlyMessageId });
   }

   // When onlyMessageId is provided, isolate this message for surgical processing.
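The staleness bug described in the hunk above reduces to a TTL-guarded read that misses writes from another bus instance until the interval elapses. A sketch, where only `INBOX_REFRESH_INTERVAL_MS` is taken from the diff and the rest is illustrative:

```javascript
// Sketch of Bug 1: messages() serves a cache until the 30s interval elapses,
// so a targeted dispatch must call refresh() to bypass the TTL. Everything
// except INBOX_REFRESH_INTERVAL_MS is invented for illustration.
const INBOX_REFRESH_INTERVAL_MS = 30_000;

function makeInbox(readFromDb) {
  let cache = [];
  let lastReadAt = 0;
  return {
    messages() {
      if (Date.now() - lastReadAt >= INBOX_REFRESH_INTERVAL_MS) this.refresh();
      return cache;
    },
    refresh() {
      cache = readFromDb(); // force re-read regardless of TTL
      lastReadAt = Date.now();
    },
  };
}
```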
todo.md (deleted, 51 lines)

@@ -1,51 +0,0 @@
# TODO

Unimplemented items consolidated from root *.md files. Source file noted for each item.

---

## Critical / Correctness

- [x] Port `fix(security): harden project-controlled surfaces` — env isolation + transport cleanup done; gsd-2 trust/dedup hunks (server.ts, mcp-client/index.ts) not applicable (packages absent) *(BUILD_PLAN.md Tier 0.5 #2)*
- [x] Port agent-session/agent-end transition fixes — `_sessionSwitchInFlight` guard + `sessionSwitchGeneration` pattern implemented in auto/resolve.js + run-unit.js *(BUILD_PLAN.md Tier 0.5 #7-10)*

---

## Architecture / Design Gaps

- [x] Schema reconciliation: update SPEC.md to 3-table model (milestones/slices/tasks vs single `units`) *(BUILD_PLAN.md Tier 1.3)*
- [ ] Persistent agents v1 command surface — `/sf agent run|reset|delete|inspect` *(BUILD_PLAN.md Tier 2.1)*
- [ ] Intent chapters (`chapter_open`/`chapter_close` — crash-resume context) *(BUILD_PLAN.md Tier 2.3)*
- [ ] PhaseReview 3-pass review (establish-context → parallel chunked → synthesis) *(BUILD_PLAN.md Tier 2.4)*
- [x] `last_error` cap to 4 KB head+tail; full payload to file *(BUILD_PLAN.md Tier 2.6)*
- [x] Port workflow state machine hardening (gsd-2 `f2377eedd`, `b9a1c6743`, `153fb328a`, `381ccdef5`, `371b2eb31`) — Cluster F: 3 fail-open SUMMARY checks fixed in state.js + dispatch-guard.js *(BUILD_PLAN.md Tier 0.5 #13, UPSTREAM_CHERRY_PICK_CANDIDATES.md Cluster F)*
- [x] Port `fix(claude-code-cli): persist Always Allow for non-Bash tools` (gsd-2 `a88baeae9`) — already implemented; tests confirm *(BUILD_PLAN.md Tier 0.5 #11)*

---

## Medium Priority / Quality

- [x] Replace `isHeavyModelId()` name-matching heuristic with capability-based check *(PRODUCTION_AUDIT_GRADE.md #9, PRODUCTION_AUDIT.md 3.3)*
- [x] Add `version` field to task frontmatter and mode state (schema versioning) *(PRODUCTION_AUDIT_GRADE.md #8)*
- [ ] Integration tests for full remote steering pipeline *(PRODUCTION_AUDIT.md Long Term #10)*
- [x] Log `frontmatterErrors` in sf-db.js instead of silently dropping validation errors *(PRODUCTION_AUDIT.md 3.1)*
- [x] Search provider registry refactor — consolidate provider list across files into `SearchProviderRegistry` *(BUILD_PLAN.md Tier 1+)*
- [x] Update ARCHITECTURE.md self-evolution section (triage pipeline IS active; injection IS automatic now) *(ARCHITECTURE.md)*
- [x] Add Mermaid state machine diagram to ARCHITECTURE.md — task lifecycle stateDiagram-v2 added *(ARCHITECTURE.md)*
- [ ] Symlinked packages/resources/skills/sessions dedup (pi-mono PR #3818) *(BUILD_PLAN.md Tier 0 #6)*

---

## Long-term / Deferred

- [ ] Singularity Knowledge + Agent Platform (Go re-platform, ~12 weeks) *(BUILD_PLAN.md Tier 1+)*
- [ ] sf-worker SSH host (Go, `wish` + `xpty`, ~3 weeks) *(BUILD_PLAN.md Tier 4)*
- [ ] Charm TUI client (`sf-tui` in Go, ~12-16 weeks) *(BUILD_PLAN.md Tier 1+)*
- [ ] Flight recorder (`x/vcr`, ~3 weeks) *(BUILD_PLAN.md Tier 1+)*
- [ ] Full swarm chat for `subagent` tool (Option C, depends on persistent-agent layer) *(BUILD_PLAN.md Tier 1+)*
- [ ] Caveman input-side prompt compression (rewrite execute-task/plan-slice prompts) *(BUILD_PLAN.md Tier 1+)*
- [ ] Runtime input preprocessor (`terse_prompts: true` dispatch transform, ~3-4 days) *(BUILD_PLAN.md Tier 1+)*
- [ ] Judge calibration + eval runner service (Go/Charm, ~2-3 weeks post SM) *(BUILD_PLAN.md Tier 1+)*
- [ ] M009 promote-only adoption review — create `sf schedule` entry (2 weeks after M009 close) *(BACKLOG.md)*
- [ ] Establish pi-mono SDK sync cadence (recurring check schedule) *(BUILD_PLAN.md Tier 1+)*
- [ ] `scripts/port-from-gsd2.sh` automation script *(UPSTREAM_PORT_GUIDE.md)*
- [ ] TypeScript migration for UOK modules (`kernel.js`, etc.) *(PRODUCTION_AUDIT_COMPLETE.md, PRODUCTION_AUDIT_GRADE.md)*