OpenAI API

OpenAI API 429 Too Many Requests

Fix OpenAI API 429 Too Many Requests errors by reducing bursts, adding retry backoff, lowering concurrency, and checking rate or quota limits.

Category
OpenAI API
Error signature
429 Too Many Requests
Quick fix
Reduce concurrency and request volume, add exponential backoff with jitter, respect Retry-After when present, and check provider usage and rate-limit dashboards.

What this error means

A 429 Too Many Requests response means the API accepted your authentication but throttled the workload: requests arrived too quickly, too many tokens were sent in the window, concurrency was too high, or the account or project is constrained by rate or usage limits.

Why this happens

A 429 is mainly a traffic-shaping problem. It is different from a 401 authentication failure: the key can be valid while the request pattern still exceeds request-per-minute, token-per-minute, or concurrency limits.

Bursty retries can make the problem worse. If every failed request immediately retries, a short throttle can turn into a sustained overload.
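Instead of retrying immediately, a client can back off with jitter and honor the server's Retry-After hint. A minimal sketch, assuming a zero-argument `send_request` callable (a hypothetical stand-in for your HTTP client) that returns an object with `status_code` and `headers`:

```python
import random
import time

def call_with_backoff(send_request, max_retries=5, base=1.0, cap=30.0):
    """Retry a throttled call with capped exponential backoff and jitter."""
    for attempt in range(max_retries):
        response = send_request()
        if response.status_code != 429:
            return response
        retry_after = response.headers.get("Retry-After")
        if retry_after is not None:
            # Trust the server's hint when it provides one.
            delay = float(retry_after)
        else:
            # Full jitter: uniform in [0, min(cap, base * 2^attempt)].
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
        time.sleep(delay)
    return send_request()  # final attempt, surfaced to the caller as-is
```

Because each client picks a random delay, failed requests spread out over time instead of retrying in synchronized bursts.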

Common causes

  1. Bursts from deploys, queue drains, or test runs.
  2. Immediate retries that turn a short throttle into sustained overload.
  3. High client concurrency, such as Promise.all over large request batches.
  4. Prompts or outputs large enough to hit token-per-minute limits.
  5. Account or project rate and usage limits set below what the workload needs.

Quick fixes

  1. Reduce concurrent requests and batch size before retrying the same workload.
  2. Add exponential backoff with jitter and respect any Retry-After header returned by the provider.
  3. Lower prompt size or max output tokens if token-per-minute limits are being hit.
  4. Check usage, quota, and rate-limit settings in the provider dashboard to distinguish throttling from exhausted quota.
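For the first fix, a small limiter can cap in-flight requests before any retry logic runs. A sketch using a semaphore (the class name and limit of 2 are illustrative, not a provider requirement):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class RequestLimiter:
    """Cap in-flight API calls so bursts stay under the concurrency limit."""

    def __init__(self, max_concurrent=2):
        self._sem = threading.Semaphore(max_concurrent)

    def run(self, fn, *args, **kwargs):
        # Block until a slot frees up, then run the call.
        with self._sem:
            return fn(*args, **kwargs)
```

Submitting work through `limiter.run` from a thread pool keeps at most `max_concurrent` requests in flight even when many tasks are queued at once.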

Copy-paste commands

Send one request and inspect rate-limit headers

curl -i https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  | grep -iE "HTTP/|retry-after|rate|limit|remaining"

Find aggressive retry loops in JavaScript or TypeScript

rg -n -t js -t ts "retry|setTimeout|setInterval|Promise\.all|concurr|p-limit|queue" .

Find retry loops in Python

rg -n "retry|backoff|sleep|asyncio\.gather|ThreadPoolExecutor|concurrent" .

Confirm the request volume from local logs

rg -n "429|Too Many Requests|rate limit|Retry-After" .

Platform-specific fixes

CI/CD

Serialize API-heavy test jobs or cap their parallelism. A single test run or deploy can create a burst that crosses request-per-minute limits, so keep CI request volume well below the workload's normal rate.

Production workers

Put worker requests behind a shared queue or concurrency limiter so a queue drain does not flood the API, and make retries honor Retry-After instead of firing immediately.

Real-world fixes

Step-by-step troubleshooting

  1. Confirm the error is 429 Too Many Requests and not a 401 or insufficient-quota response.
  2. Check whether the response includes Retry-After or rate-limit headers and log them without logging request content.
  3. Temporarily run the workload with concurrency set to 1; if it succeeds, add a queue or limiter.
  4. Inspect retry code for immediate loops, nested retries, or Promise.all over large request batches.
  5. Compare request count and token usage in the provider dashboard before asking for higher limits.
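Step 2 can be done with a small helper that pulls only the throttling-related headers out of a response, so logs never contain request content. The `x-ratelimit-*` names follow OpenAI's documented convention; verify them against your provider:

```python
RATE_LIMIT_HEADERS = (
    "retry-after",
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
)

def rate_limit_summary(headers):
    """Return only throttling headers, safe to log as-is."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return {name: lowered[name] for name in RATE_LIMIT_HEADERS if name in lowered}
```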

How to prevent it

Keep a client-side limiter in front of API calls, make exponential backoff with jitter the default retry policy, watch request and token usage against account limits in the provider dashboard, and alert on rising 429 rates before throttling becomes sustained.

FAQ

Is this caused by an invalid API key?

Usually no. 429 Too Many Requests means the request is being throttled. Invalid or missing keys usually produce 401-style authentication errors instead.

What should I check first?

Check concurrency, retry behavior, and recent request volume. A single deploy, queue drain, or test run can create a burst that crosses request-per-minute or token-per-minute limits.

What is exponential backoff?

It means waiting progressively longer between retries, often with a small random jitter, so clients do not retry in synchronized bursts.
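As a concrete illustration, here is the delay sequence for a base of 1 second, doubling per attempt, capped at 30 seconds, with up to 1 second of additive jitter (all numbers illustrative):

```python
import random

def delay_schedule(attempts=5, base=1.0, cap=30.0, seed=0):
    """Capped exponential delays with additive jitter: roughly 1, 2, 4, 8, 16 s."""
    rng = random.Random(seed)
    return [min(cap, base * 2 ** n) + rng.uniform(0, 1) for n in range(attempts)]
```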

How do I know the fix worked?

Run the same workload with logging for 429 responses. The fix is working when request volume stays under the limit and 429 Too Many Requests no longer appears during normal traffic.