Prompt engineering gets dismissed as fiddling with words. In production it's something more disciplined: a set of repeatable patterns that make outputs predictable, testable, and cheap to maintain. This article collects the patterns that hold up as your features grow, all usable directly on the Model Database API.
Separate the system prompt from the data
Put stable instructions in the system message and per-request data in user messages. This keeps your rules in one place, makes them cacheable in your own code, and prevents user input from rewriting your instructions.
from openai import OpenAI
client = OpenAI(base_url="https://modeldatabase.com/v1", api_key="mdb_live_...")
SYSTEM = "You are a support classifier. Output one label: billing, technical, or other."
def classify(text):
return client.chat.completions.create(
model="openai/gpt-4o-mini",
temperature=0,
messages=[
{"role": "system", "content": SYSTEM},
{"role": "user", "content": text},
],
).choices[0].message.content
Show, don't just tell (few-shot)
A couple of input/output examples communicate format and edge-case handling more reliably than a paragraph of description. Include examples as prior turns so the model treats them as the established pattern. Keep them few and representative; redundant examples cost tokens without adding signal.
Give the model room to think
For reasoning tasks, ask the model to work through steps before answering, then return only the conclusion in a structured field. This "reason then answer" pattern improves accuracy on multi-step problems.
PROMPT = """Think step by step, then end with a line:
FINAL: <yes|no>
Question: Is 51 prime?"""
Parse only the FINAL: line in code so the reasoning stays internal and your output stays clean.
Constrain the output shape
Ambiguous formats produce ambiguous output. State the exact shape: a label from a fixed set, JSON matching a schema, or a single sentence. The narrower the target, the more consistent the result and the easier it is to validate downstream.
Make prompts versioned assets
Treat prompts like code: store them in your repo, give each a version, and pin which version a feature uses. When you change a prompt, run it against your evaluation set before shipping. This is what makes prompt work scale, you can reason about changes instead of guessing.
- One job per prompt: a prompt that classifies and summarizes and translates will do all three poorly. Split them.
- Name your variables: use clear delimiters like triple backticks or XML-ish tags around injected data so the model knows where content begins and ends.
- Fail loudly: instruct the model what to output when input is empty or malformed.
Match the pattern to the model
Because Model Database exposes many models behind one endpoint, you can test the same prompt across tiers. A cheap model like openai/gpt-4o-mini may need more explicit examples; a stronger one like anthropic/claude-opus-4-8 may follow terse instructions. Don't over-engineer a prompt for a model that doesn't need it, and don't assume a prompt tuned for one model transfers to another, re-test on each.
Honest limitations
Prompting has a ceiling. If a task needs facts the model lacks, you need retrieval, not better wording. If it needs guaranteed structure, you need schema validation and retries, not just instructions. And prompts are brittle across model versions: a wording that works today may behave differently after a model update, which is exactly why you keep an evaluation set and version your prompts.
Test your prompts across models with one key from your dashboard, and find the full parameter list in the docs.