Handling Errors and Retries Gracefully

Real applications have to deal with things going wrong: a bad request, an expired key, an empty balance, or a temporary hiccup. Handling these gracefully is the difference between an app that feels flaky and one that feels solid. This tutorial covers the HTTP status codes you will see from Model Database and how to retry safely with exponential backoff.

Because Model Database is OpenAI-compatible, errors follow the familiar HTTP status code conventions and return a JSON body describing what happened.

The status codes that matter

400 Bad Request — your request is malformed. Common causes: an invalid model ID, a missing messages array, or a value out of range. These will not succeed on retry; fix the request.
401 Unauthorized — your key is missing, wrong, or revoked. Check the Authorization: Bearer mdb_live_... header. Retrying will not help until the key is fixed.
402 Payment Required — your prepaid balance is too low to cover the request. Top up credit at the dashboard. Retrying without adding funds will keep failing.
429 Too Many Requests — you are sending requests too quickly. This is the main case where you should back off and retry.
5xx Server errors — a transient problem on the server side. These are usually safe to retry.

The key distinction: 4xx errors except 429 are your responsibility to fix and should not be blindly retried, while 429 and 5xx are transient and are good retry candidates.
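
As a rough illustration of that distinction, the list above can be written as a small lookup. This is only a sketch: the Action enum and classify_status function are hypothetical names introduced here, not part of Model Database or the OpenAI SDK, and treating every other status as non-retryable is an assumption.

```python
from enum import Enum


class Action(Enum):
    """What the caller should do next (hypothetical names for illustration)."""
    FIX_REQUEST = "fix the request"
    FIX_KEY = "fix the API key"
    ADD_CREDIT = "add credit"
    RETRY = "back off and retry"
    DO_NOT_RETRY = "do not retry"


def classify_status(status: int) -> Action:
    """Map an HTTP status code to the action described in the list above."""
    if status == 429 or 500 <= status < 600:
        return Action.RETRY
    if status == 400:
        return Action.FIX_REQUEST
    if status == 401:
        return Action.FIX_KEY
    if status == 402:
        return Action.ADD_CREDIT
    # Any other status (for example another 4xx): assumed not retryable.
    return Action.DO_NOT_RETRY
```

Keeping the mapping in one function means the retry loop and the user-facing messaging can both consult the same rules.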

Reading the error

Inspect the status code and body so you can branch correctly. With the OpenAI SDK, errors raise typed exceptions carrying a status code:

from openai import OpenAI, APIStatusError

client = OpenAI(base_url="https://modeldatabase.com/v1", api_key="mdb_live_...")

try:
    resp = client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Hello"}],
    )
except APIStatusError as e:
    print("status:", e.status_code)
    print("body:", e.response.text)
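
For logs, it can help to reduce the body to a single line. The sketch below assumes the body uses the OpenAI-style shape {"error": {"message": "..."}}; that shape is an assumption about Model Database, so the helper falls back to the raw text when it does not match. The name describe_error is hypothetical.

```python
import json


def describe_error(status_code: int, body_text: str) -> str:
    """Return a one-line summary of a failed response, suitable for logging."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = None  # not JSON (for example an HTML error page)

    # Assumed OpenAI-style shape: {"error": {"message": "..."}}
    message = None
    if isinstance(payload, dict):
        error = payload.get("error")
        if isinstance(error, dict):
            message = error.get("message")

    if not message:
        # Fall back to the raw body, truncated so logs stay readable.
        message = body_text.strip()[:200] or "(empty body)"
    return f"HTTP {status_code}: {message}"
```

Inside the except block above, this could be called as describe_error(e.status_code, e.response.text).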

Do not retry what cannot succeed

Retrying a 400, 401, or 402 just wastes time and hammers the API. Branch on the status code and only retry the transient ones:

def should_retry(status):
    return status == 429 or 500 <= status < 600

For a 402, the right action is to surface a clear message and prompt the user to add credit, not to loop.
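
One possible way to surface such messages is a small table keyed by status code. The wording and the names USER_MESSAGES and user_message are hypothetical; adapt them to your own app.

```python
# Hypothetical user-facing messages for failures that should not be retried.
USER_MESSAGES = {
    400: "The request was invalid. Check the model ID and message format.",
    401: "The API key is missing or invalid. Check your key and try again.",
    402: "Your prepaid balance is too low. Add credit to continue.",
}


def user_message(status: int) -> str:
    """Return an actionable message for a non-retryable failure."""
    return USER_MESSAGES.get(status, f"The request failed with status {status}.")
```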

Exponential backoff with jitter

When a request is retryable, wait longer between each attempt so you give the system time to recover. Exponential backoff doubles the delay each try (1s, 2s, 4s, ...), and jitter adds a small random amount so many clients do not all retry at the same instant:

import random
import time

from openai import OpenAI, APIStatusError

client = OpenAI(base_url="https://modeldatabase.com/v1", api_key="mdb_live_...")

def chat_with_retry(messages, model="anthropic/claude-sonnet-4-6", max_retries=5):
    """Call the chat endpoint, retrying 429 and 5xx with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except APIStatusError as e:
            if not (e.status_code == 429 or 500 <= e.status_code < 600):
                raise  # 400/401/402 etc: do not retry
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # 1s, 2s, 4s, 8s ... plus up to 1s of random jitter
            delay = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)

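This makes up to five attempts, waiting roughly 1s, 2s, 4s, and 8s (plus jitter) between them, and re-raises the last error if the problem persists. There is no wait after the fifth attempt. Note that the SDK's own automatic retries (two by default) still run inside each attempt unless you create the client with max_retries=0.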
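
The same pattern can be factored into a helper that does not depend on the network, which makes it easy to unit-test. The sketch below is one possible shape, not an SDK feature: retry_call, backoff_delay, and the cap parameter are hypothetical names introduced here, and the helper assumes the raised exception carries a status_code attribute, as APIStatusError does.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=30.0, max_jitter=1.0):
    """Delay before the next try: base * 2**attempt, capped, plus random jitter."""
    return min(cap, base * (2 ** attempt)) + random.uniform(0, max_jitter)


def retry_call(fn, max_attempts=5, sleep=time.sleep, delay_for=backoff_delay):
    """Call fn() until it succeeds, hits a non-retryable error, or runs out of attempts.

    Only exceptions with a status_code of 429 or 5xx are retried;
    everything else is re-raised immediately.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            retryable = status is not None and (status == 429 or 500 <= status < 600)
            if not retryable or attempt == max_attempts - 1:
                raise
            sleep(delay_for(attempt))
```

A call might look like retry_call(lambda: client.chat.completions.create(model=model, messages=messages)). Passing sleep and delay_for as parameters lets tests substitute fakes so they run instantly.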

The SDK has retries built in

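The OpenAI SDK already retries transient failures automatically: by default it retries connection errors and 408, 409, 429, and 5xx responses twice, with a short exponential backoff. You can change the retry count and the timeout when you create the client: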

client = OpenAI(
    base_url="https://modeldatabase.com/v1",
    api_key="mdb_live_...",
    max_retries=4,
    timeout=30.0,
)

For most apps this is enough. Add your own backoff loop when you need custom logic, such as different handling for 402.

Set timeouts and fail fast

Always set a request timeout so a stalled connection does not hang your app. Combine a sensible timeout with a bounded number of retries so worst-case latency stays predictable. Log the status code of every response and, on success, the X-MDB-Charged-USD header, so you can correlate failures with cost and traffic.
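
To see what "predictable" means in numbers, the worst case for a bounded retry loop can be estimated up front. This sketch makes simplifying assumptions: the timeout is treated as a hard limit per attempt, delays follow base * 2**attempt with at most max_jitter seconds added, and the SDK's built-in retries are disabled so they do not multiply the attempts. The function name is hypothetical.

```python
def worst_case_seconds(timeout, max_attempts, base=1.0, max_jitter=1.0):
    """Upper bound on total wall-clock time for a bounded retry loop.

    Assumes every attempt runs for the full timeout and every backoff
    sleep draws the maximum jitter.
    """
    sleeps = max_attempts - 1  # no sleep after the final attempt
    backoff = sum(base * (2 ** attempt) + max_jitter for attempt in range(sleeps))
    return timeout * max_attempts + backoff


print(worst_case_seconds(timeout=30.0, max_attempts=5))  # 169.0
```

With a 30-second timeout and five attempts, the bound is 150 seconds of request time plus 19 seconds of backoff, which may be far longer than a user will wait. Lowering the timeout or the attempt count is usually the simplest way to bring it down.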

A quick checklist

Branch on status: fix 400/401/402, retry 429/5xx.
Use exponential backoff with jitter, capped at a max number of attempts.
Set a timeout on every request.
Show users a clear, actionable message (especially "add credit" for 402).

Robust error handling keeps your app calm under pressure. Top up credit and check your key at your dashboard, and see the full error reference in the docs.
