What this error means

An OpenAI API rate limit error (HTTP 429) means the API rejected the request because the project, model, or account exceeded its allowed rate of requests or tokens.

Common causes

  • A batch job sends too many requests at once.
  • The prompt or response size consumes more tokens than expected.
  • Retry logic repeats failed requests immediately, adding traffic while the limit is still exceeded.
  • Your project has a lower limit than the workload requires.
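
To illustrate how request size interacts with the two kinds of limit, here is a minimal Python sketch. The function name and the numbers are hypothetical, and it assumes every request uses roughly the same number of tokens (prompt plus output). It is plain arithmetic, not a description of how the API meters usage.

```python
def effective_requests_per_minute(rpm_limit, tpm_limit, tokens_per_request):
    """Return the request rate that stays under both limits.

    rpm_limit: allowed requests per minute (hypothetical value from your dashboard)
    tpm_limit: allowed tokens per minute (hypothetical value from your dashboard)
    tokens_per_request: assumed average of prompt tokens plus output tokens
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    # Whichever limit is reached first sets the usable request rate.
    return min(rpm_limit, tpm_limit // tokens_per_request)


# Illustrative numbers only: large requests exhaust the token limit first.
print(effective_requests_per_minute(500, 200_000, 4_000))  # prints 50
```

With these made-up limits, a workload of 4,000-token requests is capped at 50 requests per minute by the token limit, well below the 500 requests per minute the request limit alone would allow.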

Quick fixes

  1. Add exponential backoff with jitter for retryable 429 responses.
  2. Lower concurrency for batch jobs and workers.
  3. Reduce prompt size or requested output length.
  4. Check project limits and billing status in the OpenAI dashboard.
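
The first fix can be sketched in Python as follows. This is a minimal example, not a definitive implementation: `RateLimited` is a hypothetical stand-in for whatever exception your client raises on a 429 response, and the function name, parameter names, and default delays are assumptions chosen for illustration. It uses "full jitter", where each wait is a random duration between zero and an exponentially growing cap.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def call_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Run call(); on RateLimited, retry with exponential backoff and full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # The cap doubles on each attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random duration in [0, cap) so that
            # concurrent workers do not all retry at the same moment.
            sleep(cap * rng())
```

To use it, wrap the request in a zero-argument function, for example `call_with_backoff(lambda: send_request(payload))`, where `send_request` is your own (hypothetical) function that raises `RateLimited` on a 429. The `sleep` and `rng` parameters exist only so the waiting can be replaced in tests.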

Step-by-step troubleshooting

  1. Log response status, request IDs, model names, and retry timing.
  2. Separate requests-per-minute (RPM) issues from tokens-per-minute (TPM) issues.
  3. Queue requests so workers do not all retry at the same instant.
  4. Cache repeated results where possible.
  5. If the workload is legitimate and already optimized, request a limit increase through the OpenAI dashboard.
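
One way to approach step 2 is to inspect the rate limit headers on the response. The Python sketch below is a heuristic under stated assumptions: it relies on the `x-ratelimit-remaining-requests` and `x-ratelimit-remaining-tokens` response headers, assumes they are present on the response you logged, and takes a caller-supplied estimate of the tokens the request needed. The function name and return values are hypothetical.

```python
def classify_rate_limit(headers, tokens_needed=0):
    """Guess which limit was hit: "requests", "tokens", or "unknown".

    headers: mapping of response header names to string values
    tokens_needed: your own estimate of the tokens the rejected request required
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        try:
            return int(lowered[name])
        except (KeyError, TypeError, ValueError):
            return None  # header missing or not a plain integer

    requests_left = as_int("x-ratelimit-remaining-requests")
    tokens_left = as_int("x-ratelimit-remaining-tokens")

    if requests_left is not None and requests_left <= 0:
        return "requests"
    if tokens_left is not None and (tokens_left <= 0 or tokens_left < tokens_needed):
        return "tokens"
    return "unknown"
```

Logging this label next to the status, request ID, and model name from step 1 makes it easier to see whether lowering concurrency (a request limit problem) or shrinking prompts and outputs (a token limit problem) is the more useful change.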