What this error means
This page covers a LiteLLM failure pattern in which the router's own throttling errors, "Deployment over defined rpm limit" and "Deployment over defined tpm limit", are categorized as upstream vendor rate limits (VENDOR_RATE_LIMIT) instead of LiteLLM router rate limits (LITELLM_RATE_LIMIT). Dashboards and callbacks that group by error category then attribute router-side deployment throttling to the vendor, which skews attribution and misroutes alerts. Based on the imported evidence, treat this as a LiteLLM-specific troubleshooting page rather than a generic API error.
Why this happens
GitHub PR #27708 in BerriAI/litellm reports that PR #27687 made VENDOR_RATE_LIMIT the default category for RateLimitError. As a result, the router-side raises in lowest_tpm_rpm_v2.py (5 instances) and model_rate_limit_check.py (4 instances) were silently labelled as vendor errors. This breaks dashboards and callback-based alerting that rely on the error category to separate LiteLLM-side throttles from vendor-side throttles. All 9 raises previously emitted category=vendor_rate_limit; the fix makes them emit litellm_rate_limit with the correct rate_limit_type. The problem sits in the LiteLLM proxy/routing layer and mainly affects enterprise LiteLLM proxy operators who monitor rate limits.
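The mechanism described above can be illustrated with a minimal, self-contained sketch. It does not use LiteLLM's real classes: the `Category` enum, the stand-in `RateLimitError`, its constructor signature, and the `"rpm"` value for `rate_limit_type` are all assumptions made for illustration. It only shows how a default category on an exception class mislabels any raise site that does not set the category explicitly.

```python
from enum import Enum


class Category(str, Enum):
    # Hypothetical labels mirroring the two categories named in the text.
    VENDOR_RATE_LIMIT = "vendor_rate_limit"
    LITELLM_RATE_LIMIT = "litellm_rate_limit"


class RateLimitError(Exception):
    """Stand-in exception; not LiteLLM's real class or signature."""

    def __init__(self, message, category=Category.VENDOR_RATE_LIMIT,
                 rate_limit_type=None):
        super().__init__(message)
        self.category = category
        self.rate_limit_type = rate_limit_type


def router_raise_before_fix():
    # The raise site relies on the default, so it is labelled as a vendor error.
    return RateLimitError("Deployment over defined rpm limit")


def router_raise_after_fix():
    # The raise site states its own category and which limit was hit.
    return RateLimitError(
        "Deployment over defined rpm limit",
        category=Category.LITELLM_RATE_LIMIT,
        rate_limit_type="rpm",  # assumed value, for illustration only
    )
```

A dashboard that groups by `category` would count the first error under the vendor and the second under the router, even though both carry the same message.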
Common causes
- RateLimitError defaults to the VENDOR_RATE_LIMIT category (a change introduced by PR #27687), so any raise that does not set a category is labelled as a vendor error.
- The router-side raises in lowest_tpm_rpm_v2.py (5 instances) and model_rate_limit_check.py (4 instances) emitted category=vendor_rate_limit until the fix described in PR #27708, which makes them emit litellm_rate_limit with the correct rate_limit_type.
- Dashboards and callback-based alerting split LiteLLM-side and vendor-side throttles by error category, so the wrong label puts router throttling in the vendor bucket.
Quick fixes
- Confirm the exact error signature matches: Deployment over defined rpm limit / Deployment over defined tpm limit, with dashboards or callbacks attributing these to the upstream vendor (VENDOR_RATE_LIMIT) instead of the LiteLLM router (LITELLM_RATE_LIMIT).
- Check the LiteLLM account, local tool state, and provider configuration involved in the failing workflow.
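- If the throttling itself is the problem, reduce request pressure, check the rpm/tpm limits defined for the deployment as well as vendor quota or plan limits, and retry with backoff instead of immediate repeated requests.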
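The "retry with backoff" advice can be sketched as exponential backoff with full jitter. This is a generic helper, not part of LiteLLM: `retry_with_backoff`, its parameters, and the `is_rate_limit` predicate are hypothetical names, and the caller decides what counts as a rate-limit error.

```python
import random
import time


def retry_with_backoff(call, is_rate_limit, max_attempts=5,
                       base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Run call(); on a rate-limit error, wait and try again.

    The wait is a random duration between 0 and an exponentially growing
    cap (full jitter), so concurrent clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failure.
            if not is_rate_limit(exc) or attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable so the helper can be exercised in tests without real waiting.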
Platform/tool-specific checks
- Verify the command, editor, extension, or API client that produced the error.
- Compare local settings with CI, deployment, or editor-level settings when the error appears in only one environment.
- Avoid deleting credentials, local model data, or project settings until the failing scope is clear.
Step-by-step troubleshooting
- Capture the exact error message and the command, editor action, or request that triggered it.
- Check whether the failure is account/auth, quota/rate, model/provider, local runtime, or deployment configuration.
- Review the source evidence below and compare it with your environment.
- Apply one change at a time and rerun the smallest failing action.
- Keep the working fix documented for the team or deployment environment.
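While comparing captured errors against the signature, a small helper can re-label events whose message shows they came from the router. This is a sketch of a possible stopgap on the dashboard or callback side, assuming your pipeline has access to the error message and the reported category as plain strings; `corrected_category` and the tuple it returns are hypothetical, and only the two message signatures and the two category names come from this page.

```python
# Message fragments that identify router-side throttling (from this page).
ROUTER_SIGNATURES = {
    "deployment over defined rpm limit": "rpm",
    "deployment over defined tpm limit": "tpm",
}


def corrected_category(message, reported_category):
    """Return (category, rate_limit_type) for a captured rate-limit error.

    If the message matches a router-side signature, the event is treated
    as a LiteLLM router throttle regardless of the reported category.
    Otherwise the reported category is passed through unchanged.
    """
    text = message.lower()
    for signature, limit_type in ROUTER_SIGNATURES.items():
        if signature in text:
            return ("litellm_rate_limit", limit_type)
    return (reported_category, None)
```

Matching on message text is brittle if the wording changes between versions, so this is better suited to triage than to a permanent fix.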
How to prevent it
- Keep provider/tool configuration documented.
- Record non-secret diagnostics such as tool version, provider name, model name, and command path.
- Add a lightweight check before CI or production workflows depend on the tool.
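Recording non-secret diagnostics can be as simple as the sketch below. The function names, the field names, and the list of secret hints are assumptions for illustration; the only external lookup is the installed version of the `litellm` package via the standard library's `importlib.metadata`.

```python
import platform
from importlib import metadata

# Substrings that suggest a field holds a credential (assumed list).
SECRET_HINTS = ("key", "token", "secret", "password")


def package_version(name):
    """Return the installed version of a package, or a marker if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_diagnostics(provider, model, extra=None):
    """Collect a diagnostics record with secret-looking fields redacted."""
    record = {
        "litellm_version": package_version("litellm"),
        "python_version": platform.python_version(),
        "provider": provider,
        "model": model,
    }
    for name, value in (extra or {}).items():
        is_secret = any(hint in name.lower() for hint in SECRET_HINTS)
        record[name] = "<redacted>" if is_secret else value
    return record
```

Attaching a record like this to an incident note makes it easier to tell whether a deployment already includes the categorization fix.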