fix(sf): classify 'exhausted your capacity / quota will reset after Ns' as rate-limit
Real failure caught from a user session: the provider returned 'Error: You have exhausted your capacity on this model. Your quota will reset after 51s.' SF's classifier didn't match it (no 'rate limit', no '429', no 'limit resets'), so it fell through to unknown → no auto-resume → loop paused indefinitely until a manual /sf autonomous restart.

PDD spec:

Purpose: every legitimately transient quota error should auto-resume after the named cooldown, not pause indefinitely.

Consumer: classifyError() callers, ultimately the auto-loop.

Contract:
- 'exhausted your|the (quota|capacity|usage)' → rate-limit
- 'quota will reset' → rate-limit (paired with the above)
- 'will reset after Ns' / 'will reset in Ns' → retryAfterMs = N*1000

Failure boundary: parse failure → 60s default (preserved).

Evidence: smoke test with 6 inputs:
✅ 'exhausted your capacity ... will reset after 51s' → rate-limit/51000
✅ 'rate limit exceeded' → rate-limit/60000 (unchanged)
✅ 'Internal server error' → server/30000 (unchanged)
✅ '429 too many requests' → rate-limit/60000 (unchanged)
✅ 'Invalid API key' → permanent (unchanged, still manual)
✅ 'exhausted the usage. Will reset in 30s.' → rate-limit/30000

Non-goals: model-fallback-on-rate-limit. That is a separate change: the provider-error-pause module currently waits and retries the same model, and switching to the configured fallback model after the first rate-limit hit is a richer policy change.

Invariants:
- Permanent classification still wins when no rate-limit pattern is present (auth/billing/invalid-key untouched).
- Default 60s delay preserved when reset-time can't be parsed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
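The matching half of the contract can be checked in isolation. The sketch below is illustrative only: the two patterns are copied from this commit's diff (before and after), while `compare` and `samples` are hypothetical names that do not exist in SF. It only tests whether a message is recognised as a rate limit. It does not reproduce the server or permanent branches of classifyError().

```typescript
// Patterns copied from the diff in this commit (old and new RATE_LIMIT_RE).
const OLD_RATE_LIMIT_RE =
  /rate.?limit|too many requests|429|hit your limit|usage limit|quota (?:reached|hit)|limit.*resets?/i;
const NEW_RATE_LIMIT_RE =
  /rate.?limit|too many requests|429|hit your limit|usage limit|quota (?:reached|hit|will reset)|limit.*resets?|exhausted (?:your|the) (?:quota|capacity|usage)/i;

// Hypothetical helper: reports whether each pattern recognises the message.
function compare(msg: string): { before: boolean; after: boolean } {
  return {
    before: OLD_RATE_LIMIT_RE.test(msg),
    after: NEW_RATE_LIMIT_RE.test(msg),
  };
}

// Messages taken from the commit description above.
const samples: string[] = [
  "Error: You have exhausted your capacity on this model. Your quota will reset after 51s.",
  "rate limit exceeded",
  "429 too many requests",
  "Invalid API key",
  "exhausted the usage. Will reset in 30s.",
];

for (const msg of samples) {
  const { before, after } = compare(msg);
  console.log(`${JSON.stringify(msg)} before=${before} after=${after}`);
}
```

Only the first and last samples should change from unmatched to matched. The already-recognised rate-limit messages and the invalid-key message behave the same under both patterns.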
parent f757a18417
commit e4a86ddf6f
1 changed file with 8 additions and 2 deletions
@@ -51,7 +51,8 @@ const PERMANENT_RE =
   /auth|unauthorized|forbidden|invalid.*key|invalid.*api|billing|quota exceeded|account/i;
 // Include provider-specific quota-window phrasing like "hit your limit", "usage limit", "quota reached"
 const RATE_LIMIT_RE =
-  /rate.?limit|too many requests|429|hit your limit|usage limit|quota (?:reached|hit)|limit.*resets?/i;
+  /rate.?limit|too many requests|429|hit your limit|usage limit|quota (?:reached|hit|will reset)|limit.*resets?|exhausted (?:your|the) (?:quota|capacity|usage)/i;
+const RESET_QUOTA_DELAY_RE = /reset(?:s)?(?:\s+(?:in|after))?\s+(\d+)s/i;
 // Unsupported-model: provider rejected the model for the current account/plan (#4513).
 // Checked before `permanent` because PERMANENT_RE also matches /account/i.
 const UNSUPPORTED_MODEL_MODEL_RE = /\b(?:model|deployment)\b/i;
@@ -118,7 +119,12 @@ export function classifyError(
     if (retryAfterMs != null && retryAfterMs > 0) {
       return { kind: "rate-limit", retryAfterMs };
     }
-    const resetMatch = errorMsg.match(RESET_DELAY_RE);
+    // Try the existing "reset in Ns" first, then the broader
+    // "reset(s)? (in|after) Ns" form that catches "Your quota will reset
+    // after 51s" - common across providers (Anthropic capacity exhaustion,
+    // OpenAI usage caps, etc.).
+    const resetMatch =
+      errorMsg.match(RESET_DELAY_RE) ?? errorMsg.match(RESET_QUOTA_DELAY_RE);
     const delayMs = resetMatch ? Number(resetMatch[1]) * 1000 : 60_000;
     return { kind: "rate-limit", retryAfterMs: delayMs };
   }
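The delay-parsing fallback in the second hunk can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the module's code: `RESET_DELAY_RE` is not shown in this diff, so the pattern used for it here is a guess based on the comment's "reset in Ns" description, and `parseResetDelayMs` and `DEFAULT_DELAY_MS` are hypothetical names. `RESET_QUOTA_DELAY_RE` is copied from the first hunk.

```typescript
// ASSUMPTION: the real RESET_DELAY_RE is not visible in this diff; this is a
// stand-in for the existing "reset in Ns" pattern that the comment refers to.
const RESET_DELAY_RE = /reset in (\d+)s/i;

// Copied from the first hunk of this commit.
const RESET_QUOTA_DELAY_RE = /reset(?:s)?(?:\s+(?:in|after))?\s+(\d+)s/i;

// The 60s default that the commit message says is preserved on parse failure.
const DEFAULT_DELAY_MS = 60_000;

// Hypothetical helper mirroring the patched lines: try the narrow pattern
// first, fall back to the broader one, and use the default if neither matches.
function parseResetDelayMs(errorMsg: string): number {
  const resetMatch =
    errorMsg.match(RESET_DELAY_RE) ?? errorMsg.match(RESET_QUOTA_DELAY_RE);
  return resetMatch ? Number(resetMatch[1]) * 1000 : DEFAULT_DELAY_MS;
}

console.log(parseResetDelayMs("Your quota will reset after 51s."));
console.log(parseResetDelayMs("exhausted the usage. Will reset in 30s."));
console.log(parseResetDelayMs("rate limit exceeded"));
```

Because `String.prototype.match` returns `null` when nothing matches, the `??` operator moves on to the broader pattern only when the narrow one fails. Both patterns require the digits to be followed directly by `s`, so a message that gives the wait in minutes, or with a space before the unit, falls back to the default.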