Getting Reliable JSON Out of LLMs

You want a model to return data your code can parse: an object with fixed keys, the right types, nothing else. Yet ask plainly and you'll get JSON wrapped in a friendly sentence, fenced in markdown, or subtly malformed. Getting reliable JSON is a solvable engineering problem, not a matter of luck.

This article covers the layered tactics that turn flaky extraction into a dependable pipeline on the Model Database API.

Ask for a schema, not "JSON"

Vague instructions produce vague output. Give the model the exact shape you expect, name every field, and state types. Provide one concrete example so it can pattern-match.

from openai import OpenAI
client = OpenAI(base_url="https://modeldatabase.com/v1", api_key="mdb_live_...")

system = """Extract fields and return ONLY a JSON object, no prose, no markdown fences.
Schema:
{
  "name": string,
  "email": string | null,
  "priority": "low" | "medium" | "high"
}"""

resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    temperature=0,
    messages=[
        {"role": "system", "content": system},
        {"role": "user", "content": "Hi, I'm Dana, urgent issue, [email protected]"},
    ],
)
print(resp.choices[0].message.content)

Setting temperature to 0 removes most sampling variability, though it does not guarantee identical output on every run. "ONLY a JSON object" plus "no markdown fences" targets the two most common failure modes.
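The prompt above states the schema but does not yet include the concrete example recommended earlier. Here is a minimal sketch of one way to add it. The helper name build_system_prompt and the sample message are invented for illustration and are not part of any API.

```python
import json

SCHEMA = """{
  "name": string,
  "email": string | null,
  "priority": "low" | "medium" | "high"
}"""

def build_system_prompt(schema: str, example_input: str, example_output: dict) -> str:
    """Combine the instruction, the schema, and one worked example."""
    # json.dumps guarantees the example shown to the model is itself valid JSON.
    example_json = json.dumps(example_output)
    return (
        "Extract fields and return ONLY a JSON object, no prose, no markdown fences.\n"
        f"Schema:\n{schema}\n"
        "Example:\n"
        f"Input: {example_input}\n"
        f"Output: {example_json}"
    )

# Hypothetical sample input and output, used only to show the pattern.
system = build_system_prompt(
    SCHEMA,
    "Hello, this is Sam. No rush, but my invoice looks off.",
    {"name": "Sam", "email": None, "priority": "low"},
)
print(system)
```

Choosing an example with a null email also shows the model what to do when a field is missing, rather than leaving it to guess.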

Use response_format when the model supports it

Model Database passes response_format through to the model. Models that support JSON mode will constrain output to valid JSON, eliminating fence and prose noise.

resp = client.chat.completions.create(
    model="openai/gpt-4o",
    response_format={"type": "json_object"},
    messages=[...],
)

This is a capability of the chosen model, and not every model honors it, so confirm with a quick test before relying on it. When supported, it is usually the single biggest reliability win.
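The quick test mentioned above could look something like the following sketch. It assumes a hypothetical wrapper, here called complete, that sends one prompt with response_format={"type": "json_object"} and returns the reply text. The lambdas at the bottom are offline stand-ins for real model calls.

```python
import json
from typing import Callable

def is_bare_json_object(text: str) -> bool:
    """True only if the whole reply parses as a single JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except (json.JSONDecodeError, TypeError):
        # TypeError covers a reply of None, for example an empty message body.
        return False

def probe_json_mode(complete: Callable[[str], str], trials: int = 3) -> bool:
    """Call the model a few times and require bare JSON every time."""
    prompt = 'Return a JSON object with a single key "ok" set to true.'
    return all(is_bare_json_object(complete(prompt)) for _ in range(trials))

# Offline stand-ins: one well-behaved reply, one wrapped in prose.
print(probe_json_mode(lambda _: '{"ok": true}'))                    # True
print(probe_json_mode(lambda _: 'Sure! Here it is: {"ok": true}'))  # False
```

A passing probe only shows that the replies observed were clean. It does not prove the model enforces JSON mode, so the defensive parsing layer is still worth keeping.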

Always parse defensively

Even with JSON mode, treat output as untrusted. Strip stray fences, parse in a try/except, and validate against a real schema. A library like Pydantic gives you typed objects and clear errors in one step.

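import json, re
from typing import Literal
from pydantic import BaseModel, ValidationError

class Ticket(BaseModel):
    name: str
    email: str | None  # null is allowed when no address is present
    priority: Literal["low", "medium", "high"]  # reject values outside the schema

def coerce(raw: str) -> Ticket:
    # Drop an opening ``` or ```json fence and a closing ``` fence if the model added them.
    raw = re.sub(r"^```(json)?|```$", "", raw.strip(), flags=re.M).strip()
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        # Valid JSON that is not an object (e.g. a list) cannot be unpacked into Ticket.
        raise json.JSONDecodeError("expected a JSON object", raw, 0)
    return Ticket(**data)  # raises ValidationError on missing keys or wrong types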
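Fence stripping does not help when the JSON is wrapped in a friendly sentence. One possible fallback, sketched here with only the standard library, is to scan for the first position where a JSON object actually parses. The function name first_json_object is hypothetical. It raises json.JSONDecodeError when nothing is found, so a caller that already catches that exception needs no changes.

```python
import json

def first_json_object(text: str) -> dict:
    """Return the first JSON object embedded in text, ignoring surrounding prose."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # This brace did not open valid JSON; try the next one.
            start = text.find("{", start + 1)
    raise json.JSONDecodeError("no JSON object found", text, 0)

reply = 'Sure! Here you go:\n{"name": "Dana", "email": null, "priority": "high"}\nAnything else?'
print(first_json_object(reply))
```

The result is still untrusted input, so it should go through the same schema validation as any other model output.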

Retry with the error message

When parsing fails, don't just retry blindly. Feed the validation error back to the model so it can correct itself. In practice, one targeted retry often resolves the failure.

def extract(messages, attempts=2):
    messages = list(messages)  # work on a copy so the caller's list is not mutated
    for _ in range(attempts):
        out = client.chat.completions.create(
            model="openai/gpt-4o-mini", temperature=0, messages=messages,
        ).choices[0].message.content
        try:
            return coerce(out)
        except (ValidationError, json.JSONDecodeError) as e:
            # Show the model its own output and the exact error so it can correct itself.
            messages += [
                {"role": "assistant", "content": out},
                {"role": "user", "content": f"That failed validation: {e}. Return corrected JSON only."},
            ]
    raise RuntimeError("could not get valid JSON")

Prefer tool calls for structured extraction

If your model supports function calling, defining a single tool with your JSON Schema is often more robust than free-form JSON, because the schema constrains the arguments directly. Set tool_choice to force that tool and read the structured arguments from the response.
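As a rough sketch of that approach, the following assumes the endpoint accepts the OpenAI-style tools and tool_choice parameters and returns tool calls in the OpenAI SDK's response shape. The tool name record_ticket is hypothetical, and the fake client at the bottom is an offline stand-in so the example runs without a network call.

```python
import json
from types import SimpleNamespace

# JSON Schema mirroring the ticket fields used in the earlier prompt.
TICKET_TOOL = {
    "type": "function",
    "function": {
        "name": "record_ticket",  # hypothetical tool name
        "description": "Record a support ticket extracted from the message.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": ["string", "null"]},
                "priority": {"type": "string", "enum": ["low", "medium", "high"]},
            },
            "required": ["name", "email", "priority"],
        },
    },
}

def extract_ticket(client, text: str, model: str = "openai/gpt-4o-mini") -> dict:
    """Force a call to record_ticket and return its parsed arguments."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": text}],
        tools=[TICKET_TOOL],
        # Naming the function forces the model to call this specific tool.
        tool_choice={"type": "function", "function": {"name": "record_ticket"}},
    )
    call = resp.choices[0].message.tool_calls[0]
    # Arguments arrive as a JSON string, so they still need parsing.
    return json.loads(call.function.arguments)

# Offline stand-in for the SDK client, returning a canned tool call.
def _fake_create(**kwargs):
    args = json.dumps({"name": "Dana", "email": None, "priority": "high"})
    call = SimpleNamespace(function=SimpleNamespace(name="record_ticket", arguments=args))
    message = SimpleNamespace(tool_calls=[call])
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)
ticket = extract_ticket(fake_client, "Hi, I'm Dana, urgent issue")
print(ticket)
```

With a real client, the parsed arguments should still pass through schema validation, since a forced tool call does not guarantee that every value is sensible.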

Honest limitations

No prompt guarantees valid JSON 100% of the time; always keep the parse-and-retry layer.
Strict schema constraints can cause a model to truncate or invent values to satisfy a required field. Allow null and validate ranges.
Large nested schemas raise error rates. Flatten where you can, and split very large extractions into multiple smaller calls.
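The last point, splitting a large extraction into smaller calls, might be sketched as follows. The names chunk_fields and extract_in_parts are hypothetical, and extract_part stands for whatever function performs one model call for a subset of fields. Here it is replaced by a canned stub so the example runs offline.

```python
from typing import Callable

def chunk_fields(fields: list[str], size: int) -> list[list[str]]:
    """Split the field list into groups of at most `size` names."""
    return [fields[i:i + size] for i in range(0, len(fields), size)]

def extract_in_parts(
    text: str,
    fields: list[str],
    extract_part: Callable[[str, list[str]], dict],
    size: int = 3,
) -> dict:
    """Run one extraction per field group and merge the results."""
    merged = {}
    for group in chunk_fields(fields, size):
        part = extract_part(text, group)
        # Keep only the requested keys; a missing field becomes None instead of being invented.
        merged.update({key: part.get(key) for key in group})
    return merged

# Canned stub standing in for a real model call per field group.
KNOWN = {"name": "Dana", "priority": "high", "order_id": "A-17"}

def fake_extract_part(text: str, group: list[str]) -> dict:
    return {key: KNOWN[key] for key in group if key in KNOWN}

fields = ["name", "email", "priority", "product", "order_id"]
result = extract_in_parts("Hi, I'm Dana ...", fields, fake_extract_part, size=2)
print(result)
```

Filtering each partial result to its own group keeps one call from overwriting another call's fields, and defaulting to None follows the advice to allow null rather than force a value.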

Start extracting structured data today with a key from your dashboard, and check which models support JSON mode and tools in the docs.
