System Prompt
The persistent instruction
A standing instruction placed at the top of every conversation that defines the model's role, tone, rules, and boundaries.
When you talk to an LLM there are really two kinds of messages: the system prompt and the user message. The system prompt sits at the top of every conversation as a persistent instruction. The model sees it before producing every reply.
It typically contains: role definition ("You are a Turkish recipe assistant"), rules ("answer in 5 lines max"), format ("use bullet points"), constraints ("don't discuss politics"), and static context ("our company is Avva, founded 2018"). The user never sees it — it runs behind the curtain.
Modern models (Claude, GPT-4) are trained to give the system prompt extra weight. The same instruction buried in a user message might get ignored; placed in the system prompt, the model tends to stick to it.
A new waiter starting at your restaurant gets the menu, dress code, and "always say 'sir' not 'mate'" rules on day one. Those notes stay in their head all shift. The system prompt does the same — no matter how long the chat goes, the model keeps glancing back at "what's my role, what are my rules?"
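In API terms that persistence is literal: the same system string is re-sent with every call while only the message history grows. A minimal sketch with the Anthropic Python SDK, assuming the same setup as the example further down (the prompt text is made up; the model string is copied from that example):

from anthropic import Anthropic

client = Anthropic()
SYSTEM = "You are a terse assistant. Answer in one sentence."  # never changes

history = []
for question in ["What is an LLM?", "And what is a system prompt?"]:
    history.append({"role": "user", "content": question})
    reply = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=256,
        system=SYSTEM,       # re-sent on every single call
        messages=history,    # only this part grows turn by turn
    )
    history.append({"role": "assistant", "content": reply.content[0].text})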
You're building an e-commerce support bot. System prompt:

"You are Avva's customer support assistant. Reply only in Turkish. Use the get_order(id) tool for order info. Never offer a discount — escalate to a human for that. Keep replies under 4 sentences."

User asks "how many hours for shipping?" and the model replies in Turkish, short, using the tool. User asks "give me 50% off" and the model says "I'll transfer you to a representative." The system prompt nudges every turn.
from anthropic import Anthropic

client = Anthropic()

SYSTEM = """You are Avva's customer support assistant.
Reply only in Turkish. Maximum 4 sentences.
Never promise a discount — escalate to a human."""

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=SYSTEM,  # <-- persistent instruction
    messages=[
        {"role": "user", "content": "How long until my order ships?"}
    ],
)
print(msg.content[0].text)
# Model returns a Turkish, short, rule-following reply.

Use it for:
- Giving the model a role/persona (assistant, tutor, doctor, etc.)
- Locking output format (JSON, markdown, length)
- Do/don't rules (no politics, no leaking secrets)
- Static context: company info, date format, language preference
Not the place for:
- Info that changes per request: put it in the user message or context (see the sketch after this list)
- Very long document context — use RAG instead of stuffing the system prompt
- One-off prompt experiments — a plain user message is enough
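The first "don't" above is worth making concrete: pin the static rules in the system prompt and inject the per-request data into the user message. A sketch of that split; the order details and the answer helper are invented for illustration:

from anthropic import Anthropic

client = Anthropic()

SYSTEM = "You are Avva's support assistant. Reply in Turkish, max 4 sentences."  # static rules

def answer(question: str, order_status: str) -> str:
    # order_status changes per request, so it travels in the user message,
    # not in the system prompt
    msg = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=SYSTEM,
        messages=[{
            "role": "user",
            "content": f"Order status: {order_status}\n\nCustomer question: {question}",
        }],
    )
    return msg.content[0].text

print(answer("Where is my package?", "shipped May 1, in transit"))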
Common mistakes:

Writing it 5000 tokens long
A 5000-token system prompt is re-sent with every request, leaving 5000 fewer tokens of context for the conversation itself. Lead with the critical rules, drop the rest.
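You can measure the cost instead of guessing: the Anthropic SDK exposes a token-counting endpoint. A quick audit sketch (the API requires at least one message, hence the dummy turn; the SYSTEM string stands in for your real prompt):

from anthropic import Anthropic

client = Anthropic()
SYSTEM = "You are Avva's customer support assistant. ..."  # your real prompt here

count = client.messages.count_tokens(
    model="claude-sonnet-4-6",
    system=SYSTEM,
    messages=[{"role": "user", "content": "ping"}],  # minimal dummy turn
)
print(count.input_tokens)  # this many tokens ride along on every request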
Contradictory rules
'Never do math' plus 'answer every question' guarantees the model breaks one of them. Keep the rules consistent.
Treating it as secret
Some models will leak the system prompt when asked to 'recite the previous message'. Don't put API keys or user secrets in it.
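The safer pattern: the system prompt only names the tool, and the credential lives in the tool's server-side implementation, where the model never sees it. A sketch; the env var name and API endpoint are made up:

import os
import requests

ORDERS_API_KEY = os.environ["ORDERS_API_KEY"]  # lives in the environment, never in a prompt

SYSTEM = "Use the get_order(id) tool for order info."  # names the tool, carries no secret

def get_order(order_id: str) -> dict:
    # The key is attached here, server-side; the model only ever sees the JSON result.
    resp = requests.get(
        f"https://api.example.com/orders/{order_id}",
        headers={"Authorization": f"Bearer {ORDERS_API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()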