feat(sf): auto-fill empty model fallbacks from benchmark picker (PDD)

Closes the gap that left the user's session paused on a quota error
with no fallback to switch to. Before this commit:
  - User pins models.execution: { model: gemini-3-flash-preview }
  - No fallbacks array → resolveModelWithFallbacksForUnit returns
    { primary, fallbacks: [] }
  - agent-end-recovery.ts line 348 checks fallbacks.length > 0 → false
  - Loop pauses on the first rate-limit, even though the user has
    other API-keyed providers available.

After: an empty/missing fallbacks array auto-fills from
resolveAutoBenchmarkPickForUnit (which picks API-keyed candidates
ranked by benchmark scores), excluding the user's pinned primary so
we never get a no-op switch to the same model.

PDD spec:

Purpose: out-of-the-box auto-switch to fallback models when a user
  pins only a primary. Matches user expectation that 'the system
  selects models automatically' when keys are available.
Consumer: agent-end-recovery.ts model-fallback flow on rate-limit.
Contract:
  1. models.<unit>: '<id>' (string, no fallbacks) → primary plus
     auto-filled fallbacks. Unchanged primary, fallbacks excluding
     primary.
  2. models.<unit>: { model: '<id>', fallbacks: ['a', 'b'] } (explicit
     non-empty) → unchanged. User intent respected.
  3. models.<unit>: { model: '<id>' } (object, no fallbacks) → auto-
     fill from benchmark picker.
  4. models.<unit>: { model: '<id>', fallbacks: [] } (explicit empty)
     → auto-fill (treat empty same as missing).
  5. No models config at all → unchanged behavior — full auto-pick.
Failure boundary:
  - resolveAutoBenchmarkPickForUnit returns undefined when no
    API-keyed providers exist → fallbacks stays empty (no candidates
    to switch to anyway).
  - autoBenchmark option still honored — set to false to opt out.
Evidence:
  - Smoke test: pinned 'gemini-3-flash-preview' with empty fallbacks +
    OPENROUTER_API_KEY + GEMINI_API_KEY in env → returns 4 fallbacks
    starting with minimax/MiniMax-M2.7. Primary not in fallbacks.
  - Existing 62 preferences tests + 5 rate-limit-model-fallback tests
    still pass — no regression.
Non-goals:
  - Cross-phase inheritance (planning falls back to execution config).
  - Persisting auto-filled fallbacks to PREFERENCES.md.
  - Mid-tool-call rate-limit recovery (different code path through
    pi-coding-agent's RetryHandler).
Invariants:
  - Safety: explicit non-empty user fallbacks NEVER overwritten —
    line userFallbacks.length > 0 short-circuits before auto-fill.
  - Liveness: empty arrays trigger auto-fill, so callers get a chain
    if any keys are configured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Mikael Hugo 2026-05-02 20:43:28 +02:00
parent e4a86ddf6f
commit 2c044f340f

View file

@ -429,7 +429,17 @@ export function resolveModelWithFallbacksForUnit(
// Normalize: string -> { model, fallbacks: [] }
if (typeof phaseConfig === "string") {
return { primary: phaseConfig, fallbacks: [] };
const auto = autoBenchmark
? resolveAutoBenchmarkPickForUnit(unitType, prefs?.preferences)
: undefined;
const autoFallbacks = (auto?.fallbacks ?? []).filter(
(id) => id !== phaseConfig,
);
// Don't double-include the user's primary if benchmarks picked it as #1.
if (auto && auto.primary !== phaseConfig) {
autoFallbacks.unshift(auto.primary);
}
return { primary: phaseConfig, fallbacks: autoFallbacks };
}
// When provider is explicitly set, prepend it to the model ID so the
@ -439,10 +449,24 @@ export function resolveModelWithFallbacksForUnit(
? `${phaseConfig.provider}/${phaseConfig.model}`
: phaseConfig.model;
return {
primary,
fallbacks: phaseConfig.fallbacks ?? [],
};
const userFallbacks = phaseConfig.fallbacks ?? [];
if (userFallbacks.length > 0) {
// Explicit fallbacks: respect them as-is, no auto-fill.
return { primary, fallbacks: userFallbacks };
}
// Auto-fill empty/missing fallbacks from the benchmark picker so a user
// who only pinned a primary still gets rate-limit auto-switch behavior
// out of the box. The auto-pick may include the user's primary as its #1
// — drop it to avoid `primary == fallbacks[0]` no-op switches.
const auto = autoBenchmark
? resolveAutoBenchmarkPickForUnit(unitType, prefs?.preferences)
: undefined;
const autoFallbacks = (auto?.fallbacks ?? []).filter((id) => id !== primary);
if (auto && auto.primary !== primary) {
autoFallbacks.unshift(auto.primary);
}
return { primary, fallbacks: autoFallbacks };
}
/**