System prompt
What you will achieve
Set a system instruction ('Reply with only the word PONG.') and a user message
('ping'), then assert that the model replies PONG. One system field, every provider.
When and why you need this
The system prompt is the highest-priority instruction you give the model. It defines the persona, constraints, and context for the whole conversation. Without it, models tend to be verbose, off-topic, or inconsistent in tone.
Every provider places the system prompt differently:
- OpenAI: a { role: "system", content: "..." } entry inside the messages array.
- Anthropic: a top-level system field outside the messages array.
- Google: a top-level systemInstruction: { parts: [{ text: "..." }] } field.
If you hard-code one shape, you have to add conditionals as soon as you add another provider.
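The three placements listed above can be sketched as plain request bodies. This is a minimal illustration, not the SDK's implementation: buildRequestBody and the Provider type are hypothetical names, and the bodies leave out the model, token limits, and other required fields. The Google user turn assumes the contents/parts layout of the Gemini REST API.

```typescript
// Hypothetical helper: shows only where the system text lands for each provider.
// Other required request fields (model, token limits, auth) are omitted.
type Provider = 'openai' | 'anthropic' | 'google';

function buildRequestBody(
  provider: Provider,
  system: string,
  user: string,
): Record<string, unknown> {
  switch (provider) {
    case 'openai':
      // System text travels as an entry of the messages array.
      return {
        messages: [
          { role: 'system', content: system },
          { role: 'user', content: user },
        ],
      };
    case 'anthropic':
      // System text is a top-level field, outside the messages array.
      return {
        system,
        messages: [{ role: 'user', content: user }],
      };
    case 'google':
      // System text is wrapped in a systemInstruction object with parts.
      return {
        systemInstruction: { parts: [{ text: system }] },
        contents: [{ role: 'user', parts: [{ text: user }] }],
      };
  }
}

const bodies = (['openai', 'anthropic', 'google'] as const).map((p) =>
  buildRequestBody(p, 'Reply with only the word PONG.', 'ping'),
);
console.log(JSON.stringify(bodies, null, 2));
```

Every branch receives the same two strings; only the envelope differs, and that is the part a single system field hides from application code.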
Step by step
Step 1: Pass a system string to complete()
```typescript
import { complete } from '@combycode/llm-sdk';

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  system: 'Reply with only the word PONG.',
  prompt: 'ping',
  maxTokens: 16,
});

console.log(text); // 'PONG'
```
The system field is the only change from a basic completion. The SDK maps it to the
provider-correct placement before the request is sent. Your application sees one
uniform field.
Step 2: Use it for a persona-constrained assistant
A realistic system prompt establishes role, format, and constraints:
```typescript
import { complete } from '@combycode/llm-sdk';

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  // One instruction per line: role, then format, then constraints.
  system: [
    'You are a senior TypeScript engineer.',
    'Reply only with code blocks followed by one sentence of explanation.',
    'Never say "certainly" or "of course".',
  ].join('\n'),
  prompt: 'Write a function that debounces an async function.',
  maxTokens: 400,
});
```
Joining an array of lines with \n is idiomatic. Group the lines into logical sections
(role, format, constraints) to keep the system prompt easy to maintain.
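One way to keep a longer system prompt organized, sketched under assumptions: the PromptSections type and the renderSystemPrompt helper are hypothetical and not part of the SDK. Each section is an array of lines; the helper joins lines with \n and, unlike the single join above, separates sections with a blank line. The result is an ordinary string that can be passed as system.

```typescript
// Hypothetical helper, not part of the SDK: groups system-prompt lines by purpose.
type PromptSections = Record<string, string[]>;

function renderSystemPrompt(sections: PromptSections): string {
  return Object.values(sections)
    .filter((lines) => lines.length > 0) // skip empty sections
    .map((lines) => lines.join('\n')) // one instruction per line
    .join('\n\n'); // blank line between sections
}

const systemPrompt = renderSystemPrompt({
  role: ['You are a senior TypeScript engineer.'],
  format: ['Reply only with code blocks followed by one sentence of explanation.'],
  constraints: ['Never say "certainly" or "of course".'],
});

console.log(systemPrompt);
```

Naming the sections makes it clear where a new rule belongs, and an empty section disappears from the output instead of leaving a stray blank line.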
Step 3: Combine with a multi-turn conversation
The system prompt applies to every turn in the conversation. Pass it once, not per turn:
```typescript
import { complete, type Message } from '@combycode/llm-sdk';

const SYSTEM = 'You are a helpful assistant that always responds in French.';

const history: Message[] = [
  { role: 'user', content: 'Hello, who are you?' },
  { role: 'assistant', content: 'Bonjour! Je suis un assistant IA.' },
  { role: 'user', content: 'What is 2 + 2?' },
];

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  system: SYSTEM,
  prompt: history, // pass the message array as prompt
  maxTokens: 64,
});

console.log(text); // e.g. 'Deux plus deux font quatre.'
```
Step 4: Role messages inside the message array
You can also inject a role: 'system' message directly in the Message[] array.
The SDK extracts all system-role messages from the array, merges them with the top-level
system field (in the order: per-call system option, then extracted role-system
messages, then LLMClient.system), and sends the combined string through the
provider-correct channel.
```typescript
import { complete } from '@combycode/llm-sdk';

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  prompt: [
    { role: 'system', content: 'Always respond in uppercase.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
  maxTokens: 16,
});

console.log(text); // e.g. 'PARIS'
```
This form is useful when you assemble a message array programmatically and want the system instruction co-located with the messages it controls.
Your options
| Option | Where | Notes |
|---|---|---|
| system on complete() | Per-call string | Highest priority system text for this call. Merged before any role-system messages from the prompt array. |
| role: 'system' in Message[] | Per-call array entry | Extracted and merged after the top-level system. Lets you co-locate instructions with messages. |
| system on createLLM() | Client-level default | Applied to every call made from this client. Per-call system is prepended before this. |
| history.registry layers | Conversation-level | Preferred for dynamic, multi-contributor context. The AgentLoop uses this. See Layered context. |
Merge order when all three are present: per-call system option + extracted role-system
messages + LLMClient.system, joined with \n\n. This means the most-specific instruction
(per-call) renders first, which models follow more reliably.
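The merge order described above can be sketched as follows. This is an illustration under assumptions, not the SDK's source: ChatMessage, MergeInput, and mergeSystem are hypothetical names, and skipping empty parts is a guess at sensible behavior rather than documented behavior.

```typescript
// Illustrative only: hypothetical types and helper, not the SDK's internals.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

interface MergeInput {
  callSystem?: string; // per-call system option
  messages: ChatMessage[]; // prompt array, may contain role: 'system' entries
  clientSystem?: string; // client-level default
}

function mergeSystem({ callSystem, messages, clientSystem }: MergeInput) {
  // Pull system-role entries out of the array, keeping their original order.
  const extracted = messages.filter((m) => m.role === 'system').map((m) => m.content);
  const rest = messages.filter((m) => m.role !== 'system');

  // Order: per-call option, extracted role-system messages, client default.
  const system = [callSystem, ...extracted, clientSystem]
    .filter((part): part is string => typeof part === 'string' && part.length > 0)
    .join('\n\n');

  return { system, messages: rest };
}

const merged = mergeSystem({
  callSystem: 'Answer in one sentence.',
  messages: [
    { role: 'system', content: 'Always respond in uppercase.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
  clientSystem: 'You are a helpful assistant.',
});

console.log(merged.system);
console.log(merged.messages);
```

The returned messages array no longer contains system entries, so the combined string can be placed wherever the target provider expects it.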
Why not always use history.registry? The registry is the right tool for an
AgentLoop that manages a long-running conversation with multiple contributors updating
context turn-by-turn. For a one-shot complete() call the system string is simpler and
sufficient.
Compare the SDKs
The structural difference: official SDKs require you to know where each provider expects
its system instruction and to write provider-specific code for each. Anthropic’s SDK puts
system at the top level of the request; OpenAI requires a role: "system" message
in the messages array; Google requires a systemInstruction object with parts. ORXA
maps one system field to the right place automatically, and because it extracts any
role: 'system' entries from your message array, provider-idiomatic code you copy-paste
from OpenAI examples works without modification too.
Gotchas and next steps
System prompts count as input tokens. Long, detailed system prompts increase cost per call. Enable prompt caching for stable system prompts that are reused across many calls; Anthropic and OpenAI both cache system prefixes.
Provider differences in adherence. Some models follow system-prompt constraints more strictly than others. If a model ignores your constraint, try rephrasing as a numbered list of rules rather than a paragraph.
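As a sketch of that rephrasing, the hypothetical toNumberedRules helper below (not part of the SDK) turns a list of constraints into a numbered rule list that can be passed as the system string. The default heading text is an arbitrary choice.

```typescript
// Hypothetical helper: formats constraints as a numbered list of rules.
function toNumberedRules(rules: string[], heading = 'Follow these rules:'): string {
  const numbered = rules.map((rule, i) => `${i + 1}. ${rule.trim()}`);
  return [heading, ...numbered].join('\n');
}

const numberedSystem = toNumberedRules([
  'Reply only with code blocks followed by one sentence of explanation.',
  'Never say "certainly" or "of course".',
]);

console.log(numberedSystem);
```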
For long-running agents, use the registry. The system field on complete() is
per-call. If you have multiple code paths (memory manager, RAG plugin, ContextGuard) each
contributing a section to the system prompt, use history.registry so each writes its
own named layer without clobbering others.
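The idea of named layers can be illustrated with a small standalone sketch. It does not show the real history.registry API, which this page does not document; LayerRegistry and its methods are hypothetical. Each contributor writes under its own name, so updating one layer leaves the others untouched.

```typescript
// Standalone illustration of named layers; NOT the SDK's history.registry API.
class LayerRegistry {
  private layers = new Map<string, string>();

  // Create or overwrite a single named layer.
  set(name: string, text: string): void {
    this.layers.set(name, text);
  }

  // Remove a layer; returns true if it existed.
  remove(name: string): boolean {
    return this.layers.delete(name);
  }

  // Join non-empty layers in first-insertion order.
  render(): string {
    return Array.from(this.layers.values())
      .filter((text) => text.length > 0)
      .join('\n\n');
  }
}

const registry = new LayerRegistry();
registry.set('persona', 'You are a support assistant.');
registry.set('memory', 'The user prefers short answers.');
registry.set('rag', 'Relevant doc: refund policy v1.');

// The RAG contributor refreshes its own layer without touching the others.
registry.set('rag', 'Relevant doc: refund policy v2.');

console.log(registry.render());
```

With a single shared string, each contributor would have to parse and rewrite the whole prompt; keyed layers reduce that to one overwrite per contributor.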
Next steps:
- Multi-turn conversation — extend the conversation across turns
- Structured output — constrain the reply to a JSON schema
- Layered context — dynamic multi-contributor system prompts