Grounded web search

What you will achieve

Ask a question that requires current information and receive a grounded answer — the model searches the web server-side and cites its sources. One complete() call with tools: [{ type: 'web_search' }] works across OpenAI, Anthropic, Google, and OpenRouter. Citations and grounding metadata stay in response.raw when you need them.

When and why

Web search is a server-side built-in tool: the provider’s infrastructure performs the search, injects results into the model’s context, and the model synthesises a grounded reply — no client-side execute function is involved.

Use it when:

The question is time-sensitive — current prices, release dates, recent events, latest software versions.
Hallucination risk is high — let the model cite real sources rather than generate plausible-sounding facts.
You don’t have a corpus — no document store to embed and retrieve; let the web be the retrieval layer.

Avoid it when:

The answer is in your own documents — use Retrieval (RAG) with embed() instead.
Latency is critical — each web search adds round-trips inside the provider’s infrastructure.
Cost matters — web search tools carry per-call pricing on top of token cost.

Step by step

Step 1 — Enable web search with one descriptor

import { complete } from '@combycode/llm-sdk';

const { text, response } = await complete({
  model: process.env.LLM_MODEL!,  // e.g. 'openai/gpt-4o' or 'anthropic/claude-opus-4.8'
  apiKey: process.env.LLM_API_KEY,
  prompt: 'What is the latest stable Python version? Cite a source.',
  tools: [{ type: 'web_search' }],
  maxTokens: 1024,
});

console.log(text); // 'Python 3.13.3 is the latest stable release (source: python.org)...'

No execute function — { type: 'web_search' } is a BuiltinTool. The SDK passes a no-op executor internally; the provider handles the search entirely server-side.

Step 2 — Understand what each provider receives

The SDK maps { type: 'web_search' } to the provider-native shape before sending:

Provider	Wire shape sent to API
OpenAI	`{ type: 'web_search_preview' }` (Responses API built-in)
Anthropic	`{ type: 'web_search_20250305', name: 'web_search', max_uses: 5 }`
Google	`{ googleSearch: {} }` (Gemini grounding tool)
OpenRouter	Appends `:online` suffix to the model id (e.g. `google/gemini-2.0-flash:online`)

You write one descriptor; the adapter translates it. The translation is in src/llm/providers/*/ and is transparent.

Step 3 — Access citations and grounding metadata

Citations and grounding data are in response.raw — the unmodified provider response:

const { text, response } = await complete({
  model: 'anthropic/claude-opus-4.8',
  prompt: 'What is the current EUR/USD exchange rate?',
  tools: [{ type: 'web_search' }],
  maxTokens: 512,
});

// Provider-specific citation location:
// OpenAI:    (response.raw as any).output_text  or look for url_citation items
// Anthropic: (response.raw as any) -- look for `citations` blocks in content
// Google:    (response.raw as any).candidates[0].groundingMetadata

console.log(text); // synthesised answer

The normalised response.text contains the final synthesised answer. Raw provider metadata (URL lists, grounding chunks, confidence scores) is in response.raw.

Step 4 — Mix web search with client-side tools

Web search and function tools can be combined. The model decides which to call:

import { complete, defineTool } from '@combycode/llm-sdk';

const lookupAccount = defineTool({
  name: 'lookup_account',
  description: 'Look up a customer account by email.',
  params: { email: 'string' },
  execute: async ({ email }) => JSON.stringify({ email, plan: 'pro', since: '2024-01' }),
});

const { text } = await complete({
  model: 'openai/gpt-4o',
  prompt: 'Check account user@example.com and also tell me the latest OpenAI news.',
  tools: [
    lookupAccount,              // client-side: your server executes this
    { type: 'web_search' },     // server-side: OpenAI executes this
  ],
  maxTokens: 1024,
});

When the model calls lookup_account, the SDK invokes your execute function. When it calls web search, nothing happens client-side.

Step 5 — Configure max_uses (Anthropic)

Anthropic’s web_search_20250305 tool supports a max_uses field. The adapter hardcodes it to 5. If you need a different value, pass a custom providerOptions override (not yet wired — use the client option to pass raw body overrides for now, or open a PR to expose max_uses via params).

Your options

BuiltinTool for web search:

{ type: 'web_search', params?: Record<string, unknown> }

Field	Notes
`type`	Must be `'web_search'`. Triggers the provider-specific mapping.
`params`	Passed through to the provider’s native tool entry. Provider-specific — check each provider’s API docs for valid keys. Currently used by OpenAI built-ins (e.g. `file_search` vector store ids).

Provider support matrix:

Provider	Mechanism	Who searches	`servedBy` metadata
OpenAI	`web_search_preview` (Responses API)	OpenAI infrastructure	In `response.raw` output items
Anthropic	`web_search_20250305`	Brave/Tavily (Anthropic-managed)	Citations in `response.raw` content blocks
Google	`{ googleSearch: {} }` grounding	Google Search	`groundingMetadata` in `response.raw`
OpenRouter	`:online` model suffix	Provider-dependent	Varies by routed model
xAI	Not supported	—	Pass xAI-specific tool directly via `providerOptions`

Cost implications:

Each provider bills web search separately from token cost. OpenAI charges per web search call; Anthropic charges per search query within the max_uses limit; Google charges for grounded generation. Check provider pricing pages before deploying at scale.

Controlling whether the model uses search:

With tools: [{ type: 'web_search' }] and default toolChoice, the model decides when to search. There is no required mode for built-in tools (only function tools support toolChoice: 'required'). Prompt the model explicitly (“use web search”) to bias it toward searching.

Compare the SDKs

import { complete } from '@combycode/llm-sdk';

// ONE unified web-search call across providers: the `web_search` builtin maps to
// each provider's grounding tool (openai web_search, anthropic web_search_2025…,
// google googleSearch, openrouter `:online` suffix). Citations live in the raw
// response under provider-specific keys — detect any of them.
const t0 = performance.now();
const { text, response } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  prompt: 'What is the latest stable Python version? Use web search and cite a source.',
  tools: [{ type: 'web_search' }],
  maxTokens: 1024,
});

const raw = JSON.stringify(response.raw ?? '');
const cited =
  /web_search_call|web_search_tool_result|groundingMetadata|url_citation|annotations|"citations"/i.test(
    raw,
  );
const result = cited ? 'cited' : text.trim() ? 'no-citation' : 'empty';
console.log(JSON.stringify({ result, ms: Math.round(performance.now() - t0) }));

OpenAI’s official SDK requires adding { type: 'web_search_preview' } directly; Anthropic requires { type: 'web_search_20250305', name: 'web_search', max_uses: 5 }; Google requires tools: [{ googleSearch: {} }]; OpenRouter requires appending :online to the model name. ORXA normalises all four to { type: 'web_search' } and handles the per-provider translation inside the adapter layer. The complete() call shape is identical regardless of provider.

Gotchas and next steps

OpenRouter uses :online, not a tool field. The adapter appends :online to the model id in the request body and removes the empty tools array. The model id in response.model will carry the :online suffix.

Citations are in response.raw, not response.text. The synthesised answer is in text. URL lists, source metadata, and grounding chunks are in response.raw under provider-specific keys — parse them according to the provider’s API response schema.

Google googleSearch is grounding, not a function call. Gemini does not use the tools parameter for grounding; instead it uses tools: [{ googleSearch: {} }] inside generateContent. The adapter handles this translation. Unlike function tools, there is no tool-call/tool-result round-trip — the grounding result is injected into the model’s context automatically.

Not available on every model. Anthropic’s web_search_20250305 requires a Claude 3.5+ model. OpenAI web_search_preview requires a supported GPT-4 / o-series model. Check provider docs for the current model list.

Combine with complete() options. tools: [{ type: 'web_search' }] is compatible with structured, maxTokens, temperature, and all other CompleteOptions. The agent loop runs internally when any tool is present.

Next steps:

Server-side built-in tools — full built-in tool reference (code interpreter, file search, image generation, MCP)
Embeddings — embed text for offline semantic retrieval
Retrieval (RAG) guide — combine embeddings and search into a RAG pipeline