
System prompt


Set a system instruction ('Reply with only the word PONG.') and a user message ('ping'), then assert that the model replies PONG. One system field works across every provider.

The system prompt is the highest-priority instruction you give the model. It defines the persona, constraints, and context for the whole conversation. Without it, models tend to be verbose, off-topic, or inconsistent in tone.

Every provider places the system prompt differently:

  • OpenAI — a { role: "system", content: "..." } entry inside the messages array.
  • Anthropic — a top-level system field outside the messages array.
  • Google — a top-level systemInstruction: { parts: [{ text: "..." }] } field.

If you hard-code one shape, you have to add a conditional for every additional provider you support.
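To make the difference concrete, here is a minimal sketch of the three request-body shapes listed above. The names `Provider` and `buildBody` are hypothetical and exist only for this illustration; other fields a real request needs (for example a token limit) are left out.

```typescript
// Illustrative only: where each provider expects the system text.
// `Provider` and `buildBody` are hypothetical names, not part of any SDK.
type Provider = 'openai' | 'anthropic' | 'google';

function buildBody(
  provider: Provider,
  model: string,
  system: string,
  userText: string,
): Record<string, unknown> {
  switch (provider) {
    case 'openai':
      // System text is the first entry of the messages array.
      return {
        model,
        messages: [
          { role: 'system', content: system },
          { role: 'user', content: userText },
        ],
      };
    case 'anthropic':
      // System text is a top-level field; messages hold only the turns.
      return {
        model,
        system,
        messages: [{ role: 'user', content: userText }],
      };
    case 'google':
      // System text is wrapped in a parts array under systemInstruction.
      return {
        systemInstruction: { parts: [{ text: system }] },
        contents: [{ role: 'user', parts: [{ text: userText }] }],
      };
  }
}
```

Every new provider adds another branch like these, which is the conditional logic a single system field is meant to remove.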

Step 1 — Pass a system string to complete()

import { complete } from '@combycode/llm-sdk';

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  system: 'Reply with only the word PONG.',
  prompt: 'ping',
  maxTokens: 16,
});
console.log(text); // 'PONG'

The system field is the only change from a basic completion. The SDK maps it to the provider-correct placement before the request is sent. Your application sees one uniform field.

Step 2 — Use it for a persona-constrained assistant


A realistic system prompt establishes role, format, and constraints:

import { complete } from '@combycode/llm-sdk';

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  // One rule per line: role, then output format, then constraints.
  system: [
    'You are a senior TypeScript engineer.',
    'Reply only with code blocks followed by one sentence of explanation.',
    'Never say "certainly" or "of course".',
  ].join('\n'),
  prompt: 'Write a function that debounces an async function.',
  maxTokens: 400,
});

Building the prompt as an array joined with \n keeps each rule on its own line. Group the lines into logical sections (role, format, constraints) to make the system prompt easier to maintain.
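One way to keep those sections explicit is sketched below. `PromptSection` and `buildSystemPrompt` are hypothetical helpers written for this example, not SDK exports; the rules are the ones from the snippet above.

```typescript
// Hypothetical helper: group rules into named sections, then flatten to one string.
interface PromptSection {
  name: string;    // label for maintainers only; not sent to the model
  lines: string[]; // one rule per entry
}

const sections: PromptSection[] = [
  { name: 'role', lines: ['You are a senior TypeScript engineer.'] },
  { name: 'format', lines: ['Reply only with code blocks followed by one sentence of explanation.'] },
  { name: 'constraints', lines: ['Never say "certainly" or "of course".'] },
];

function buildSystemPrompt(parts: PromptSection[]): string {
  // Sections render in array order, one rule per line.
  return parts.map((section) => section.lines.join('\n')).join('\n');
}

const systemPrompt = buildSystemPrompt(sections);
```

The resulting `systemPrompt` string can be passed as the system field exactly like the inline array.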

Step 3 — Combine with a multi-turn conversation


The system prompt applies to every turn in the conversation. Pass it once, not per-turn:

import { complete, type Message } from '@combycode/llm-sdk';

const SYSTEM = 'You are a helpful assistant that always responds in French.';

const history: Message[] = [
  { role: 'user', content: 'Hello, who are you?' },
  { role: 'assistant', content: 'Bonjour! Je suis un assistant IA.' },
  { role: 'user', content: 'What is 2 + 2?' },
];

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  system: SYSTEM,
  prompt: history, // pass the message array as prompt
  maxTokens: 64,
});
console.log(text); // e.g. 'Deux plus deux font quatre.'
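For a conversation that grows turn by turn, the same idea can be wrapped in a small helper that stores the system prompt once and appends to the history. This is a sketch under assumptions: `Conversation` is a hypothetical class, and the local `Message` type is a stand-in for the SDK's type so the block runs on its own.

```typescript
// Local stand-in for the SDK's Message type (assumption for this sketch).
type Message = { role: 'user' | 'assistant'; content: string };

// Hypothetical helper: holds the system prompt once and grows the history per turn.
class Conversation {
  private readonly system: string;
  private readonly history: Message[] = [];

  constructor(system: string) {
    this.system = system;
  }

  // Options for the next call: the same system string plus history and the new user turn.
  next(userText: string): { system: string; prompt: Message[] } {
    this.history.push({ role: 'user', content: userText });
    return { system: this.system, prompt: [...this.history] };
  }

  // Record the model's reply so the following turn includes it.
  record(reply: string): void {
    this.history.push({ role: 'assistant', content: reply });
  }
}
```

With the SDK, each turn would then spread `chat.next(input)` into the `complete()` options and call `chat.record(text)` with the result, assuming `prompt` accepts a message array as shown above.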

Step 4 — Role messages inside the message array


You can also put a role: 'system' message directly in the Message[] array. The SDK extracts all system-role messages from the array and merges them with the other system sources in this order: the per-call system option, then the extracted role-system messages, then LLMClient.system. It then sends the combined string through the provider-correct channel.

import { complete } from '@combycode/llm-sdk';

const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  prompt: [
    // Extracted by the SDK and sent through the provider's system channel.
    { role: 'system', content: 'Always respond in uppercase.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
  maxTokens: 16,
});
console.log(text); // e.g. 'PARIS'

This form is useful when you assemble a message array programmatically and want the system instruction co-located with the messages it controls.

| Option | Where | Notes |
| --- | --- | --- |
| system on complete() | Per-call string | Highest-priority system text for this call. Merged before any role-system messages from the prompt array. |
| role: 'system' in Message[] | Per-call array entry | Extracted and merged after the top-level system. Lets you co-locate instructions with messages. |
| system on createLLM() | Client-level default | Applied to every call made from this client. Per-call system text is placed before it. |
| history.registry layers | Conversation-level | Preferred for dynamic, multi-contributor context. The AgentLoop uses this. See Layered context. |

Merge order when all three are present: per-call system option + extracted role-system messages + LLMClient.system, joined with \n\n. This means the most-specific instruction (per-call) renders first, which models follow more reliably.
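The merge order described here can be sketched as a pure function. This is an illustration of the documented behavior, not the SDK's actual implementation: `mergeSystem` and the local `Message` type are hypothetical, and skipping empty or missing sources is an assumption.

```typescript
// Local stand-in for the SDK's Message type (assumption for this sketch).
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical re-creation of the documented merge order.
function mergeSystem(
  callSystem: string | undefined,
  prompt: Message[],
  clientSystem: string | undefined,
): { system: string; messages: Message[] } {
  // 1. Pull every role-system entry out of the message array, keeping order.
  const extracted = prompt.filter((m) => m.role === 'system').map((m) => m.content);

  // 2. Order: per-call option, extracted messages, client default.
  //    Missing or empty sources are skipped (an assumption of this sketch).
  const parts = [callSystem, ...extracted, clientSystem].filter(
    (p): p is string => typeof p === 'string' && p.length > 0,
  );

  // 3. Join with a blank line; the remaining messages go out without system entries.
  return {
    system: parts.join('\n\n'),
    messages: prompt.filter((m) => m.role !== 'system'),
  };
}
```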

Why not always use history.registry? The registry is the right tool for an AgentLoop that manages a long-running conversation with multiple contributors updating context turn-by-turn. For a one-shot complete() call the system string is simpler and sufficient.

import { complete } from '@combycode/llm-sdk';

const t0 = performance.now();
const { text } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  system: 'Reply with only the word PONG.',
  prompt: 'ping',
  maxTokens: 16,
});

console.log(JSON.stringify({ result: text.trim(), ms: Math.round(performance.now() - t0) }));

The structural difference: official SDKs require you to know where each provider expects its system instruction and to write provider-specific code for each. Anthropic's SDK puts system at the top level of the request; OpenAI requires a role: "system" message in the messages array; Google requires a systemInstruction object with parts. ORXA maps one system field to the right place automatically. Because it also extracts any role: 'system' entries from your message array, message arrays copied from OpenAI-style examples work without modification too.

System prompts count as input tokens. Long, detailed system prompts increase cost per call. Enable prompt caching for stable system prompts that are reused across many calls — Anthropic and OpenAI both cache system prefixes.

Provider differences in adherence. Some models follow system-prompt constraints more strictly than others. If a model ignores your constraint, try rephrasing as a numbered list of rules rather than a paragraph.
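A small helper can turn a list of constraints into that numbered form. `toNumberedRules` is a hypothetical name for this sketch, and whether numbering improves adherence depends on the model.

```typescript
// Hypothetical helper: render constraints as a numbered list, one rule per line.
function toNumberedRules(rules: string[]): string {
  return rules.map((rule, index) => `${index + 1}. ${rule}`).join('\n');
}

const rulesPrompt = toNumberedRules([
  'Reply only with code blocks followed by one sentence of explanation.',
  'Never say "certainly" or "of course".',
]);
// rulesPrompt:
// 1. Reply only with code blocks followed by one sentence of explanation.
// 2. Never say "certainly" or "of course".
```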

For long-running agents, use the registry. The system field on complete() is per-call. If you have multiple code paths (memory manager, RAG plugin, ContextGuard) each contributing a section to the system prompt, use history.registry so each writes its own named layer without clobbering others.
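The idea of named layers can be illustrated with a plain Map. This is a conceptual sketch only: `LayerRegistry` is a hypothetical class and does not reflect the API of history.registry, and joining layers with a blank line is an assumption.

```typescript
// Conceptual sketch of named layers; NOT the SDK's history.registry API.
class LayerRegistry {
  private readonly layers = new Map<string, string>();

  // Each contributor writes under its own name, so updates replace only that layer.
  set(name: string, text: string): void {
    this.layers.set(name, text);
  }

  remove(name: string): void {
    this.layers.delete(name);
  }

  // Layers render in first-insertion order, separated by a blank line (assumption).
  render(): string {
    return Array.from(this.layers.values()).join('\n\n');
  }
}

const registry = new LayerRegistry();
registry.set('memory', 'The user prefers concise answers.');
registry.set('rag', 'Relevant document: release notes.');
registry.set('memory', 'The user prefers detailed answers.'); // replaces only the memory layer
```

After the third call the rag layer is unchanged and the memory layer keeps its original position, which is the property that lets several code paths contribute without overwriting each other.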
