Build a Terminal Chat Tool in 50 Lines

Priya Nair · Dec 28, 2025 · 4 min read

Nothing cements how an API works like building a small, real tool with it. In this tutorial you will write a complete terminal chat application in about 50 lines of Python. It streams responses as they are generated, remembers the conversation, and lets you switch models on the fly, all through Model Database. An optional add-on prints how much each reply cost.

It is a genuinely useful little program, and a great template to extend.

Setup

Install the SDK and set your key:

pip install openai
export MDB_API_KEY="mdb_live_xxxxxxxxxxxxxxxxxxxxxxxx"
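
The program below reads the key with os.environ["MDB_API_KEY"], which raises a bare KeyError if the export was skipped. As an optional sketch, a helper like the hypothetical require_api_key below (not part of the SDK or of chat.py) exits with a readable message instead:

```python
import os
import sys
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ, name: str = "MDB_API_KEY") -> str:
    """Return the API key from the environment, or exit with a clear message.

    Hypothetical helper for illustration; the variable name matches the
    export shown above.
    """
    key = env.get(name, "").strip()
    if not key:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"{name} is not set. Export it before running chat.py.")
    return key
```

If you adopt it, pass api_key=require_api_key() when constructing the client.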

The full program

Create chat.py. The whole tool fits comfortably under 50 lines:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://modeldatabase.com/v1",
    api_key=os.environ["MDB_API_KEY"],
)

model = "anthropic/claude-sonnet-4-6"
messages = [{"role": "system", "content": "You are a helpful terminal assistant. Be concise."}]

print("MDB chat. Commands: /model <id>, /reset, /quit")

while True:
    try:
        user = input("\nyou > ").strip()
    except (EOFError, KeyboardInterrupt):
        break
    if not user:
        continue
    if user == "/quit":
        break
    if user == "/reset":
        messages = messages[:1]
        print("(history cleared)")
        continue
    if user.startswith("/model "):
        model = user.split(" ", 1)[1].strip()
        print(f"(switched to {model})")
        continue

    messages.append({"role": "user", "content": user})

    print("bot > ", end="", flush=True)
    reply = ""
    stream = client.chat.completions.create(model=model, messages=messages, stream=True)
    for chunk in stream:
        if not chunk.choices:  # some providers send chunks with no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            reply += delta
    print()
    messages.append({"role": "assistant", "content": reply})

How it works

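A few small pieces do all the work:

The client. The OpenAI SDK is pointed at Model Database through base_url, so the standard chat.completions calls work unchanged.

The messages list. Every user turn and every assistant reply is appended to it, and the whole list is sent with each request. That is how the tool remembers the conversation. /reset trims the list back to the system prompt.

Slash commands. /quit, /reset, and /model are handled before anything is sent to the API.

Streaming. With stream=True the reply arrives in chunks. Each chunk's delta.content is printed immediately and accumulated into reply, which is then stored as the assistant turn.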
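
The slash-command handling is the easiest piece to isolate. As an illustration only, the sketch below pulls it out of the loop into a hypothetical pure function, handle_command, that mirrors the checks in chat.py and can be exercised without calling the API:

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]


def handle_command(line: str, model: str, messages: List[Message]) -> Tuple[str, str, List[Message]]:
    """Interpret one line of input the way the chat.py loop does.

    Returns (action, model, messages), where action is one of:
      "quit"    - leave the loop
      "handled" - a command ran; prompt for the next line
      "send"    - ordinary text; send it to the model
    Hypothetical helper, not part of the original program.
    """
    if line == "/quit":
        return "quit", model, messages
    if line == "/reset":
        # Keep only the first entry, which is the system prompt.
        return "handled", model, messages[:1]
    if line.startswith("/model "):
        return "handled", line.split(" ", 1)[1].strip(), messages
    return "send", model, messages
```

In the loop this would read action, model, messages = handle_command(user, model, messages), followed by a break on "quit" and a continue on "handled".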

Try switching models live

Run python chat.py and chat. Then type a command to switch providers mid-session:

/model openai/gpt-4o
/model google/gemini-2.0-flash
/model deepseek/deepseek-chat

Because every model is behind the same API, the tool does not change at all, only the model string does. This makes it a handy way to compare how different models answer the same prompt.
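
Since only the model string changes, the comparison can also be scripted. A minimal sketch, assuming two hypothetical helpers, compare_models and make_ask, and a client built as in chat.py. Neither helper is part of the SDK, and the calls are non-streaming to keep the example short:

```python
from typing import Callable, Dict, List


def compare_models(ask: Callable[[str, str], str], models: List[str], prompt: str) -> Dict[str, str]:
    """Send the same prompt to each model id and collect the answers by model."""
    return {model: ask(model, prompt) for model in models}


def make_ask(client) -> Callable[[str, str], str]:
    """Build an ask(model, prompt) function from an OpenAI-compatible client."""
    def ask(model: str, prompt: str) -> str:
        completion = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        # content can be None for some responses; normalize to a string.
        return completion.choices[0].message.content or ""
    return ask
```

For example, compare_models(make_ask(client), ["openai/gpt-4o", "deepseek/deepseek-chat"], "Explain a mutex in one sentence.") returns a dict keyed by model id that you can print side by side.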

Add a cost readout (optional)

Want to see what each reply cost? Use a non-streaming call with the raw response to read the billing headers, then print them after the answer:

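# Replaces the streaming block inside the loop; the reply arrives all at once.
raw = client.chat.completions.with_raw_response.create(model=model, messages=messages)
completion = raw.parse()
reply = completion.choices[0].message.content
print(reply)
print(f"[charged ${raw.headers.get('X-MDB-Charged-USD')}, "
      f"balance ${raw.headers.get('X-MDB-Balance-USD')}]")
messages.append({"role": "assistant", "content": reply})  # keep the reply in history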

You can keep both modes and toggle between streaming and a cost readout with another slash command.
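
One way to wire up that toggle is sketched below. It assumes a hypothetical /cost command that flips a show_cost flag, plus two hypothetical helpers, format_cost and send. The header names are the ones used above; everything else is illustrative rather than a prescribed design:

```python
def format_cost(headers) -> str:
    """Build the cost line from response headers; empty if none are present."""
    charged = headers.get("X-MDB-Charged-USD")
    balance = headers.get("X-MDB-Balance-USD")
    if charged is None and balance is None:
        return ""
    return f"[charged ${charged}, balance ${balance}]"


def send(client, model, messages, show_cost: bool):
    """Print the reply and return (reply_text, cost_line).

    show_cost=True uses the non-streaming raw-response call so the billing
    headers are available; otherwise the reply is streamed as in chat.py.
    """
    if show_cost:
        raw = client.chat.completions.with_raw_response.create(model=model, messages=messages)
        reply = raw.parse().choices[0].message.content or ""
        print(reply)
        return reply, format_cost(raw.headers)
    reply = ""
    stream = client.chat.completions.create(model=model, messages=messages, stream=True)
    for chunk in stream:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            reply += delta
    print()
    return reply, ""
```

In the loop, a line such as if user == "/cost": show_cost = not show_cost; continue would flip the mode. After each call to send, print the cost line if it is non-empty and append the reply to messages as before.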

Ideas to extend it

In 50 lines you have a streaming, multi-model, memory-aware chat tool. Get your key and credit at your dashboard, and browse the full API in the docs.
