Skip to content

Grounded web search

Ask a question that requires current information and receive a grounded answer — the model searches the web server-side and cites its sources. One complete() call with tools: [{ type: 'web_search' }] works across OpenAI, Anthropic, Google, and OpenRouter. Citations and grounding metadata stay in response.raw when you need them.

Web search is a server-side built-in tool: the provider’s infrastructure performs the search, injects results into the model’s context, and the model synthesises a grounded reply — no client-side execute function is involved.

Use it when:

  • The question is time-sensitive — current prices, release dates, recent events, latest software versions.
  • Hallucination risk is high — let the model cite real sources rather than generate plausible-sounding facts.
  • You don’t have a corpus — no document store to embed and retrieve; let the web be the retrieval layer.

Avoid it when:

  • The answer is in your own documents — use Retrieval (RAG) with embed() instead.
  • Latency is critical — each web search adds round-trips inside the provider’s infrastructure.
  • Cost matters — web search tools carry per-call pricing on top of token cost.

Step 1 — Enable web search with one descriptor

Section titled “Step 1 — Enable web search with one descriptor”
import { complete } from '@combycode/llm-sdk';
const { text, response } = await complete({
model: process.env.LLM_MODEL!, // e.g. 'openai/gpt-4o' or 'anthropic/claude-opus-4.8'
apiKey: process.env.LLM_API_KEY,
prompt: 'What is the latest stable Python version? Cite a source.',
tools: [{ type: 'web_search' }],
maxTokens: 1024,
});
console.log(text); // 'Python 3.13.3 is the latest stable release (source: python.org)...'

No execute function — { type: 'web_search' } is a BuiltinTool. The SDK passes a no-op executor internally; the provider handles the search entirely server-side.

Step 2 — Understand what each provider receives

Section titled “Step 2 — Understand what each provider receives”

The SDK maps { type: 'web_search' } to the provider-native shape before sending:

ProviderWire shape sent to API
OpenAI{ type: 'web_search_preview' } (Responses API built-in)
Anthropic{ type: 'web_search_20250305', name: 'web_search', max_uses: 5 }
Google{ googleSearch: {} } (Gemini grounding tool)
OpenRouterAppends :online suffix to the model id (e.g. google/gemini-2.0-flash:online)

You write one descriptor; the adapter translates it. The translation is in src/llm/providers/*/ and is transparent.

Step 3 — Access citations and grounding metadata

Section titled “Step 3 — Access citations and grounding metadata”

Citations and grounding data are in response.raw — the unmodified provider response:

const { text, response } = await complete({
model: 'anthropic/claude-opus-4.8',
prompt: 'What is the current EUR/USD exchange rate?',
tools: [{ type: 'web_search' }],
maxTokens: 512,
});
// Provider-specific citation location:
// OpenAI: (response.raw as any).output_text or look for url_citation items
// Anthropic: (response.raw as any) -- look for `citations` blocks in content
// Google: (response.raw as any).candidates[0].groundingMetadata
console.log(text); // synthesised answer

The normalised response.text contains the final synthesised answer. Raw provider metadata (URL lists, grounding chunks, confidence scores) is in response.raw.

Step 4 — Mix web search with client-side tools

Section titled “Step 4 — Mix web search with client-side tools”

Web search and function tools can be combined. The model decides which to call:

import { complete, defineTool } from '@combycode/llm-sdk';
const lookupAccount = defineTool({
name: 'lookup_account',
description: 'Look up a customer account by email.',
params: { email: 'string' },
execute: async ({ email }) => JSON.stringify({ email, plan: 'pro', since: '2024-01' }),
});
const { text } = await complete({
model: 'openai/gpt-4o',
prompt: 'Check account user@example.com and also tell me the latest OpenAI news.',
tools: [
lookupAccount, // client-side: your server executes this
{ type: 'web_search' }, // server-side: OpenAI executes this
],
maxTokens: 1024,
});

When the model calls lookup_account, the SDK invokes your execute function. When it calls web search, nothing happens client-side.

Anthropic’s web_search_20250305 tool supports a max_uses field. The adapter hardcodes it to 5. If you need a different value, pass a custom providerOptions override (not yet wired — use the client option to pass raw body overrides for now, or open a PR to expose max_uses via params).

BuiltinTool for web search:

{ type: 'web_search', params?: Record<string, unknown> }
FieldNotes
typeMust be 'web_search'. Triggers the provider-specific mapping.
paramsPassed through to the provider’s native tool entry. Provider-specific — check each provider’s API docs for valid keys. Currently used by OpenAI built-ins (e.g. file_search vector store ids).

Provider support matrix:

ProviderMechanismWho searchesservedBy metadata
OpenAIweb_search_preview (Responses API)OpenAI infrastructureIn response.raw output items
Anthropicweb_search_20250305Brave/Tavily (Anthropic-managed)Citations in response.raw content blocks
Google{ googleSearch: {} } groundingGoogle SearchgroundingMetadata in response.raw
OpenRouter:online model suffixProvider-dependentVaries by routed model
xAINot supportedPass xAI-specific tool directly via providerOptions

Cost implications:

Each provider bills web search separately from token cost. OpenAI charges per web search call; Anthropic charges per search query within the max_uses limit; Google charges for grounded generation. Check provider pricing pages before deploying at scale.

Controlling whether the model uses search:

With tools: [{ type: 'web_search' }] and default toolChoice, the model decides when to search. There is no required mode for built-in tools (only function tools support toolChoice: 'required'). Prompt the model explicitly (“use web search”) to bias it toward searching.

import { complete } from '@combycode/llm-sdk';

// ONE unified web-search call across providers: the `web_search` builtin maps to
// each provider's grounding tool (openai web_search, anthropic web_search_2025…,
// google googleSearch, openrouter `:online` suffix). Citations live in the raw
// response under provider-specific keys — detect any of them.
const t0 = performance.now();
const { text, response } = await complete({
  model: process.env.LLM_MODEL!,
  apiKey: process.env.LLM_API_KEY,
  prompt: 'What is the latest stable Python version? Use web search and cite a source.',
  tools: [{ type: 'web_search' }],
  maxTokens: 1024,
});

const raw = JSON.stringify(response.raw ?? '');
const cited =
  /web_search_call|web_search_tool_result|groundingMetadata|url_citation|annotations|"citations"/i.test(
    raw,
  );
const result = cited ? 'cited' : text.trim() ? 'no-citation' : 'empty';
console.log(JSON.stringify({ result, ms: Math.round(performance.now() - t0) }));

OpenAI’s official SDK requires adding { type: 'web_search_preview' } directly; Anthropic requires { type: 'web_search_20250305', name: 'web_search', max_uses: 5 }; Google requires tools: [{ googleSearch: {} }]; OpenRouter requires appending :online to the model name. ORXA normalises all four to { type: 'web_search' } and handles the per-provider translation inside the adapter layer. The complete() call shape is identical regardless of provider.

OpenRouter uses :online, not a tool field. The adapter appends :online to the model id in the request body and removes the empty tools array. The model id in response.model will carry the :online suffix.

Citations are in response.raw, not response.text. The synthesised answer is in text. URL lists, source metadata, and grounding chunks are in response.raw under provider-specific keys — parse them according to the provider’s API response schema.

Google googleSearch is grounding, not a function call. Gemini does not use the tools parameter for grounding; instead it uses tools: [{ googleSearch: {} }] inside generateContent. The adapter handles this translation. Unlike function tools, there is no tool-call/tool-result round-trip — the grounding result is injected into the model’s context automatically.

Not available on every model. Anthropic’s web_search_20250305 requires a Claude 3.5+ model. OpenAI web_search_preview requires a supported GPT-4 / o-series model. Check provider docs for the current model list.

Combine with complete() options. tools: [{ type: 'web_search' }] is compatible with structured, maxTokens, temperature, and all other CompleteOptions. The agent loop runs internally when any tool is present.

Next steps: