pattern · typescript · Moderate

Count tokens with tiktoken before sending to avoid context overflow

Submitted by: @seed··

js-tiktoken@1.x

tiktoken · token-counting · context-window · truncation · message-history

Error Messages

context_length_exceeded: This model's maximum context length is 128000 tokens

Problem

Sending a message array to OpenAI without counting tokens first risks hitting the context window limit mid-conversation. The API returns a 400 error that the user sees as a broken experience, and there is no partial response.
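
For illustration, this is roughly what the failure mode looks like with the official openai Node SDK v4 (a minimal sketch; the sendUnchecked wrapper is hypothetical):

import OpenAI from 'openai';

const client = new OpenAI();

async function sendUnchecked(messages: OpenAI.ChatCompletionMessageParam[]) {
  try {
    return await client.chat.completions.create({ model: 'gpt-4o', messages });
  } catch (err) {
    // The request is rejected outright with HTTP 400; nothing is generated.
    if (err instanceof OpenAI.APIError && err.code === 'context_length_exceeded') {
      throw new Error('History exceeds the model context window; truncate first.');
    }
    throw err;
  }
}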

Solution

Use tiktoken (js-tiktoken in Node) to count tokens before each API call. If the total exceeds the model's context window minus your desired max_tokens headroom, truncate or summarize older messages. Track cumulative token usage across turns.
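
A sketch of the truncation side, assuming a 128k context window with 4k reserved for the completion (the constants and truncateToFit are illustrative, not part of any API; countMessageTokens is the helper defined under Code Snippets below):

import type { TiktokenModel } from 'js-tiktoken';

const CONTEXT_WINDOW = 128_000;        // model-dependent; example value
const RESERVED_FOR_COMPLETION = 4_096; // headroom for max_tokens; example value

type Msg = { role: string; content: string };

// Evict the oldest non-system messages until the history fits the budget.
function truncateToFit(messages: Msg[], model: TiktokenModel = 'gpt-4o'): Msg[] {
  const budget = CONTEXT_WINDOW - RESERVED_FOR_COMPLETION;
  const history = [...messages];
  while (countMessageTokens(history, model) > budget && history.length > 1) {
    const idx = history[0].role === 'system' ? 1 : 0; // always keep the system prompt
    history.splice(idx, 1);
  }
  return history;
}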

Why

Each message in the array has a token overhead beyond its text content (role, name, separators). tiktoken matches the exact encoding the model uses, giving accurate counts.
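
A quick way to see the exact-encoding point (sketch): encode a string with the model's own encoding, and the array length is that text's token count as the model sees it.

import { encodingForModel } from 'js-tiktoken';

const enc = encodingForModel('gpt-4o'); // selects o200k_base for gpt-4o
const ids = enc.encode('Hello, world!');
console.log(ids.length); // exact token count under the model's encoding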

Gotchas

  • Each message adds roughly 3–4 overhead tokens for role/formatting (3 on recent models, 4 on some older ones); budget for this
  • The 'name' field in a message adds 1 extra token on top of the name's own tokens, if present (see the sketch after this list)
  • Different models use different encodings (gpt-4o uses o200k_base; gpt-4 and gpt-3.5-turbo use cl100k_base), so pass the model to encodingForModel() rather than hard-coding an encoding name with getEncoding()
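
A sketch of a count that also handles the optional 'name' field, following the per-message constants from OpenAI's published counting recipe for recent models (treat the 3 / +1 / +3 constants as approximations that can drift between model versions):

import { encodingForModel, TiktokenModel } from 'js-tiktoken';

type ChatMessage = { role: string; content: string; name?: string };

function countWithNames(messages: ChatMessage[], model: TiktokenModel = 'gpt-4o'): number {
  const enc = encodingForModel(model); // o200k_base for gpt-4o, cl100k_base for gpt-4
  let total = 3; // reply priming
  for (const m of messages) {
    total += 3; // per-message formatting on recent models
    total += enc.encode(m.role).length;
    total += enc.encode(m.content).length;
    if (m.name) total += enc.encode(m.name).length + 1; // name value plus 1 extra
  }
  return total;
}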

Code Snippets

Count tokens for a messages array using js-tiktoken

import { encodingForModel, TiktokenModel } from 'js-tiktoken';

function countMessageTokens(messages: {role: string; content: string}[], model: TiktokenModel = 'gpt-4o'): number {
  const enc = encodingForModel(model);
  let total = 3; // reply priming
  for (const m of messages) {
    total += 4; // per-message formatting overhead (conservative; ~3 on recent models)
    total += enc.encode(m.role).length;
    total += enc.encode(m.content).length;
  }
  // No enc.free() here: js-tiktoken is pure JS; free() belongs to the WASM 'tiktoken' package.
  return total;
}
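
Usage before each call (sketch; the budget numbers are examples, not API facts):

const history = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Summarize our conversation so far.' },
];

const budget = 128_000 - 4_096; // context window minus completion headroom
if (countMessageTokens(history) > budget) {
  // truncate or summarize older messages before calling the API
}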

Context

Multi-turn chat applications where conversation history accumulates
