
Handling Errors and Retries Gracefully

Priya Nair · Feb 6, 2026 · 4 min read

Real applications have to deal with things going wrong: a bad request, an expired key, an empty balance, or a temporary hiccup. Handling these gracefully is the difference between an app that feels flaky and one that feels solid. This tutorial covers the HTTP status codes you will see from Model Database and how to retry safely with exponential backoff.

Because Model Database is OpenAI-compatible, errors follow the familiar HTTP status code conventions and return a JSON body describing what happened.

The status codes that matter

The key distinction: 4xx errors other than 429 point to a problem with the request, the key, or the balance that you have to fix, so retrying them unchanged will only fail again. 429 and 5xx errors are usually transient and are good retry candidates.

Reading the error

Inspect the status code and body so you can branch correctly. With the OpenAI SDK, errors raise typed exceptions carrying a status code:

from openai import OpenAI, APIStatusError

client = OpenAI(base_url="https://modeldatabase.com/v1", api_key="mdb_live_...")

try:
    resp = client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Hello"}],
    )
except APIStatusError as e:
    print("status:", e.status_code)
    print("body:", e.response.text)
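
The raw body is easier to branch on once it is parsed. The sketch below assumes the body follows the OpenAI-style shape `{"error": {"message": ..., "type": ...}}`; that shape is an assumption here rather than something confirmed for every Model Database error, so the hypothetical helper `parse_error_body` falls back to the raw text when the body looks different.

```python
import json

def parse_error_body(text):
    """Return (message, error_type) from an error body.

    Assumes an OpenAI-style {"error": {"message": ..., "type": ...}} payload
    and falls back to the raw text for anything else (plain text, HTML, etc.).
    """
    try:
        data = json.loads(text)
    except (TypeError, ValueError):
        # Not JSON at all: return whatever text we were given.
        return (text or "", None)

    err = data.get("error") if isinstance(data, dict) else None
    if isinstance(err, dict):
        return (str(err.get("message", "")), err.get("type"))
    if isinstance(err, str):
        # Some gateways send {"error": "message"} instead of a nested object.
        return (err, None)
    return (text, None)
```

Inside the `except APIStatusError as e:` branch you could call `parse_error_body(e.response.text)` and log the message next to `e.status_code`.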

Do not retry what cannot succeed

Retrying a 400, 401, or 402 just wastes time and hammers the API. Branch on the status code and only retry the transient ones:

def should_retry(status):
    return status == 429 or 500 <= status < 600

For a 402, the right action is to surface a clear message and prompt the user to add credit, not to loop.
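
One way to keep that branching in a single place is a small function that returns either "retry" or a message to show the user. This is a minimal sketch: `handle_status` and `USER_MESSAGES` are hypothetical names, the wording is placeholder text, and the meanings attached to 400, 401, and 402 follow the standard HTTP conventions (bad request, unauthorized, payment required) rather than a confirmed Model Database error table.

```python
# Placeholder wording; adjust to your product's voice.
USER_MESSAGES = {
    400: "The request was invalid. Check the model name and message format.",
    401: "The API key was rejected. Check that it is correct and still active.",
    402: "Your balance is empty. Add credit to continue.",
}

def handle_status(status):
    """Return ("retry", None) for transient errors, else ("fail", message)."""
    if status == 429 or 500 <= status < 600:
        return ("retry", None)
    # Anything else will not succeed on a retry, so surface a message instead.
    message = USER_MESSAGES.get(status, f"Request failed with status {status}.")
    return ("fail", message)
```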

Exponential backoff with jitter

When a request is retryable, wait longer between each attempt so you give the system time to recover. Exponential backoff doubles the delay each try (1s, 2s, 4s, ...), and jitter adds a small random amount so many clients do not all retry at the same instant:

import time, random
from openai import OpenAI, APIStatusError

client = OpenAI(base_url="https://modeldatabase.com/v1", api_key="mdb_live_...")

def chat_with_retry(messages, model="anthropic/claude-sonnet-4-6", max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except APIStatusError as e:
            if not (e.status_code == 429 or 500 <= e.status_code < 600):
                raise  # 400/401/402 etc: do not retry
            if attempt == max_retries - 1:
                raise
            delay = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)

This tries up to five times, waiting 1s, 2s, 4s, and 8s (plus jitter) between attempts. There is no wait after the fifth attempt: if it also fails, the last error is re-raised for the caller to handle.
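
The loop above lets the delay keep doubling, which is fine for a handful of attempts but grows quickly if `max_retries` is raised. A common refinement is to cap the exponential part. The sketch below is one way to do that; `backoff_delay` is a hypothetical helper, and the 30-second cap is an arbitrary choice, not a recommended value from Model Database.

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0, jitter=1.0):
    """Seconds to wait after the given zero-based failed attempt.

    The exponential part (base * 2**attempt) is capped at `cap`, then up to
    `jitter` seconds of random delay is added on top.
    """
    exponential = min(cap, base * (2 ** attempt))
    return exponential + random.uniform(0, jitter)
```

Swapping `delay = backoff_delay(attempt)` into the retry loop keeps the first few waits the same (1s, 2s, 4s, 8s plus jitter) while bounding later ones at roughly 30 seconds.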

The SDK has retries built in

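The OpenAI SDK already retries certain transient errors automatically. By default it makes up to two retries, with a short exponential backoff, on connection errors and on 408, 409, 429, and 5xx responses. You can tune how many times: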

client = OpenAI(
    base_url="https://modeldatabase.com/v1",
    api_key="mdb_live_...",
    max_retries=4,
    timeout=30.0,
)

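For most apps this is enough. Add your own backoff loop when you need custom logic, such as different handling for 402. If you do, consider setting max_retries=0 on the client so the SDK's retries do not multiply with your own.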

Set timeouts and fail fast

Always set a request timeout so a stalled connection does not hang your app. Combine a sensible timeout with a bounded number of retries so worst-case latency stays predictable. Log the status code of every response, and the X-MDB-Charged-USD header on success, so you can correlate failures with cost and traffic.
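
To see what "predictable" means in numbers, you can estimate an upper bound on how long one logical call may take. The sketch below uses a hypothetical helper, `worst_case_seconds`. It assumes every attempt runs for the full timeout, every wait draws the maximum jitter, and the SDK's built-in retries are disabled so they do not add further attempts. It also treats the timeout as a per-attempt ceiling, which is an approximation.

```python
def worst_case_seconds(timeout, max_attempts, base=1.0, jitter=1.0):
    """Approximate upper bound on wall-clock time for one retried call.

    Every attempt is assumed to take the full `timeout`, and each of the
    (max_attempts - 1) waits is base * 2**i plus the maximum jitter.
    """
    waits = sum(base * (2 ** i) + jitter for i in range(max_attempts - 1))
    return max_attempts * timeout + waits
```

With a 30-second timeout and five attempts this comes to 169 seconds (150 seconds of requests plus 19 seconds of waiting), which may be far longer than a user-facing request should block. Lowering the timeout or the attempt count brings the bound down.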

A quick checklist

Robust error handling keeps your app calm under pressure. Top up credit and check your key in your dashboard, and see the full error reference in the docs.
