Tokens + Embeddings -- countTokens / embed / transcribe

This group covers the three “non-chat” LLM capabilities: counting tokens (for context management and cost estimation), producing embedding vectors, and transcribing speech to text.

Use these exports when:

  • You want to know how many tokens a prompt will consume before sending it (countTokens).
  • You need embedding vectors for semantic search, similarity ranking, or clustering (embed).
  • You need to convert an audio file or stream to text (transcribe).

| Export | What it does |
| --- | --- |
| countTokens(opts) | Count the tokens in a string or message array. Picks the right counter per model: tiktoken for OpenAI, count-API for Anthropic/Google, heuristic otherwise. |
| embed(opts) | Produce embedding vectors from a string or string array. Works with OpenAI, Google, and OpenRouter. Returns { vectors, dimensions, usage }. |
| transcribe(opts) | Speech-to-text. OpenAI routes to /v1/audio/transcriptions; Google uses a chat-style completion internally. Returns { text, language? }. |
| HybridTokenCounter | Low-level token counter that tries tiktoken, falls back to count-API, then heuristic. Used by countTokens and estimate() internally. |
| HeuristicCounter / TiktokenCounter / CountApiCounter | Individual counters for custom wiring. |
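
The fallback order described for HybridTokenCounter (tiktoken, then count-API, then heuristic) can be pictured with a small sketch. This is not the SDK's implementation: the TokenCounter interface, heuristicCounter, and countWithFallback are hypothetical names, and the four-characters-per-token ratio is only a common rule of thumb for English text.

```typescript
// Hypothetical interface for illustration; the SDK's real counter types may differ.
interface TokenCounter {
  name: string;
  // Resolves to a token count, or rejects when this counter cannot handle the request.
  count(text: string, model: string): Promise<number>;
}

// Last-resort estimate: roughly four characters per token (an assumption, not an SDK constant).
const heuristicCounter: TokenCounter = {
  name: 'heuristic',
  async count(text) {
    return Math.ceil(text.length / 4);
  },
};

// Try each counter in order and return the first result that succeeds.
async function countWithFallback(
  counters: TokenCounter[],
  text: string,
  model: string,
): Promise<{ tokens: number; counter: string }> {
  let lastError: unknown;
  for (const counter of counters) {
    try {
      const tokens = await counter.count(text, model);
      return { tokens, counter: counter.name };
    } catch (err) {
      lastError = err; // remember the failure and move on to the next counter
    }
  }
  throw new Error(`All token counters failed: ${String(lastError)}`);
}
```

Placing the heuristic last means the chain still yields a number when no exact counter is available, at the cost of precision.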

import { countTokens } from '@combycode/llm-sdk';

const n = await countTokens({
  model: 'openai/gpt-5.4-nano',
  apiKey: process.env.OPENAI_API_KEY,
  input: 'The quick brown fox jumps over the lazy dog.',
});
console.log(`Token count: ${n}`);
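
Once a token count is available, it can feed the two uses mentioned at the top of this page: context management and cost estimation. The helpers below are a minimal sketch. checkContextBudget and estimateInputCostUsd are hypothetical names, not SDK exports, and the context window size and per-token price are values you would supply yourself from the provider's documentation.

```typescript
interface BudgetCheck {
  fits: boolean;
  remaining: number; // tokens left after the prompt and the reserved output
}

// Check whether a prompt fits in a context window while leaving room for the reply.
function checkContextBudget(
  promptTokens: number,
  contextWindow: number,
  reservedForOutput: number,
): BudgetCheck {
  const remaining = contextWindow - reservedForOutput - promptTokens;
  return { fits: remaining >= 0, remaining };
}

// Estimate input cost from a price quoted in USD per million tokens.
function estimateInputCostUsd(promptTokens: number, usdPerMillionTokens: number): number {
  return (promptTokens / 1e6) * usdPerMillionTokens;
}
```

For example, checkContextBudget(n, 8000, 2000) reports whether a prompt of n tokens leaves at least 2000 tokens for the response in an 8000-token window.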

import { embed } from '@combycode/llm-sdk';

const { vectors, dimensions } = await embed({
  model: 'openai/text-embedding-3-small',
  apiKey: process.env.OPENAI_API_KEY,
  input: ['hello world', 'foo bar'],
});
console.log(`${vectors.length} vectors, ${dimensions} dimensions each`);
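
For the similarity-ranking use case, embedding vectors are typically compared with cosine similarity. The sketch below assumes each entry of vectors is a plain number[] of equal length; the page does not spell out the exact type. cosineSimilarity and rankBySimilarity are illustrative helpers, not SDK exports.

```typescript
// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidate vectors against a query vector, most similar first.
function rankBySimilarity(
  query: number[],
  candidates: number[][],
): { index: number; score: number }[] {
  return candidates
    .map((vector, index) => ({ index, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score);
}
```

A semantic search would embed the query and the documents with the same model, then call rankBySimilarity(queryVector, documentVectors) and read the documents back by index.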

import { transcribe } from '@combycode/llm-sdk';

const { text } = await transcribe({
  model: 'openai/gpt-4o-audio-preview',
  apiKey: process.env.OPENAI_API_KEY,
  audio: './recording.wav', // file path, URL, or Uint8Array
});
console.log(text);
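
The audio option accepts three forms: a file path, a URL, or raw bytes. One way an application might tell them apart before calling transcribe is sketched below. How the SDK itself distinguishes them is not documented here, so classifyAudioInput is a hypothetical helper, and treating only http(s) strings as URLs is an assumption.

```typescript
type AudioInputKind = 'bytes' | 'url' | 'path';

// Classify an audio input by shape: non-string values are raw bytes,
// strings starting with http:// or https:// are URLs, and any other string is a file path.
function classifyAudioInput(audio: string | Uint8Array): AudioInputKind {
  if (typeof audio !== 'string') return 'bytes';
  return /^https?:\/\//i.test(audio) ? 'url' : 'path';
}
```

A check like this is useful for validating user input early, for example to confirm that a local path exists before any network call is made.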