fix(sf): classify 'exhausted your capacity / quota will reset after Ns' as rate-limit

Real failure caught from a user session: the provider returned
'Error: You have exhausted your capacity on this model. Your quota
will reset after 51s.' SF's classifier didn't match it (no 'rate
limit', no '429', no 'limit resets'), so the error fell through to
unknown → no auto-resume → loop paused indefinitely until a manual
/sf autonomous restart.

PDD spec:

Purpose: every legitimately transient quota error should auto-resume
  after the named cooldown, not pause indefinitely.
Consumer: classifyError() callers, ultimately the auto-loop.
Contract:
  - 'exhausted (your|the) (quota|capacity|usage)' → rate-limit
  - 'quota will reset' → rate-limit (paired with the above)
  - 'will reset after Ns' / 'will reset in Ns' → retryAfterMs = N*1000
Failure boundary: parse failure → 60s default (preserved).
Evidence: smoke test with 6 inputs:
   'exhausted your capacity ... will reset after 51s' → rate-limit/51000
   'rate limit exceeded' → rate-limit/60000 (unchanged)
   'Internal server error' → server/30000 (unchanged)
   '429 too many requests' → rate-limit/60000 (unchanged)
   'Invalid API key' → permanent (unchanged — still manual)
   'exhausted the usage. Will reset in 30s.' → rate-limit/30000
Non-goals: model-fallback-on-rate-limit (separate change — the
  provider-error-pause module currently waits and retries the same
  model; switching to the configured fallback model after the first
  rate-limit hit is a richer policy change).
Invariants:
  - Permanent classification still wins when no rate-limit pattern is
    present (auth/billing/invalid-key untouched).
  - Default 60s delay preserved when reset-time can't be parsed.
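
Illustrative sketch of the classification half of the contract (not
the project's real classifyError): it copies RATE_LIMIT_RE from this
change and runs it over the six smoke-test inputs. The helper name
isRateLimitMessage is hypothetical, and the real function also checks
permanent, server, and unsupported-model patterns that are omitted here.

```typescript
// Pattern copied verbatim from this commit.
const RATE_LIMIT_RE =
  /rate.?limit|too many requests|429|hit your limit|usage limit|quota (?:reached|hit|will reset)|limit.*resets?|exhausted (?:your|the) (?:quota|capacity|usage)/i;

// Hypothetical helper: true when the message would take the rate-limit branch.
// It deliberately ignores every other error kind.
function isRateLimitMessage(errorMsg: string): boolean {
  return RATE_LIMIT_RE.test(errorMsg);
}

// The six smoke-test inputs from the Evidence list, with the ellipsis in the
// first one filled in from the reported provider error.
const smokeInputs: string[] = [
  "Error: You have exhausted your capacity on this model. Your quota will reset after 51s.",
  "rate limit exceeded",
  "Internal server error",
  "429 too many requests",
  "Invalid API key",
  "exhausted the usage. Will reset in 30s.",
];

for (const msg of smokeInputs) {
  console.log(`${isRateLimitMessage(msg) ? "rate-limit" : "other"}: ${msg}`);
}
```

'Internal server error' and 'Invalid API key' do not match the new
alternatives, which is consistent with the invariant that permanent
classification is untouched.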

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mikael Hugo 2026-05-02 20:35:55 +02:00
parent f757a18417
commit e4a86ddf6f

@@ -51,7 +51,8 @@ const PERMANENT_RE =
 /auth|unauthorized|forbidden|invalid.*key|invalid.*api|billing|quota exceeded|account/i;
 // Include provider-specific quota-window phrasing like "hit your limit", "usage limit", "quota reached"
 const RATE_LIMIT_RE =
-/rate.?limit|too many requests|429|hit your limit|usage limit|quota (?:reached|hit)|limit.*resets?/i;
+/rate.?limit|too many requests|429|hit your limit|usage limit|quota (?:reached|hit|will reset)|limit.*resets?|exhausted (?:your|the) (?:quota|capacity|usage)/i;
+const RESET_QUOTA_DELAY_RE = /reset(?:s)?(?:\s+(?:in|after))?\s+(\d+)s/i;
 // Unsupported-model: provider rejected the model for the current account/plan (#4513).
 // Checked before `permanent` because PERMANENT_RE also matches /account/i.
 const UNSUPPORTED_MODEL_MODEL_RE = /\b(?:model|deployment)\b/i;
@@ -118,7 +119,12 @@ export function classifyError(
 if (retryAfterMs != null && retryAfterMs > 0) {
 return { kind: "rate-limit", retryAfterMs };
 }
-const resetMatch = errorMsg.match(RESET_DELAY_RE);
+// Try the existing "reset in Ns" first, then the broader
+// "reset(s)? (in|after) Ns" form that catches "Your quota will reset
+// after 51s" — common across providers (Anthropic capacity exhaustion,
+// OpenAI usage caps, etc.).
+const resetMatch =
+errorMsg.match(RESET_DELAY_RE) ?? errorMsg.match(RESET_QUOTA_DELAY_RE);
 const delayMs = resetMatch ? Number(resetMatch[1]) * 1000 : 60_000;
 return { kind: "rate-limit", retryAfterMs: delayMs };
 }
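
The delay-parsing fallback in the second hunk can be tried in isolation with the sketch below. It is an illustration under assumptions, not the module's code. RESET_QUOTA_DELAY_RE is copied from this commit, but the existing RESET_DELAY_RE is not shown in the diff, so ASSUMED_RESET_DELAY_RE is a stand-in based only on the "reset in Ns" wording of the code comment. The helper name parseResetDelayMs is also hypothetical.

```typescript
// ASSUMPTION: stand-in for the existing RESET_DELAY_RE, which this diff does
// not show. Modeled on the "reset in Ns" description in the code comment.
const ASSUMED_RESET_DELAY_RE = /reset in (\d+)s/i;

// Copied verbatim from this commit.
const RESET_QUOTA_DELAY_RE = /reset(?:s)?(?:\s+(?:in|after))?\s+(\d+)s/i;

// Hypothetical helper mirroring the fallback chain in the hunk above.
function parseResetDelayMs(errorMsg: string, defaultMs: number = 60_000): number {
  // String.prototype.match returns null when nothing matches, so `??`
  // only consults the broader pattern when the narrower one fails.
  const resetMatch =
    errorMsg.match(ASSUMED_RESET_DELAY_RE) ?? errorMsg.match(RESET_QUOTA_DELAY_RE);
  // Capture group 1 holds the seconds count; fall back to the default otherwise.
  return resetMatch ? Number(resetMatch[1]) * 1000 : defaultMs;
}

console.log(parseResetDelayMs("Your quota will reset after 51s.")); // 51000
console.log(parseResetDelayMs("Will reset in 30s."));               // 30000
console.log(parseResetDelayMs("quota resets in 2 minutes"));        // 60000
```

The last call shows the failure boundary: both patterns require a digit run followed directly by `s`, so a delay given in minutes or with a spelled-out unit after a space does not parse and the 60s default applies.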