How it works
Modern LLMs don't see characters; they see tokens, the sub-word units produced by a Byte-Pair Encoding (BPE) tokenizer. Common English words are usually a single token, rarer words split into several, and non-Latin scripts often need one or more tokens per character.
This tool runs OpenAI's reference tokenizer in your browser via js-tiktoken, a JavaScript port whose API mirrors the Python original. The token IDs you see here match what OpenAI's tiktoken Python library produces, which is also what the model itself sees during inference.
The visualisation alternates background colours per token so you can spot how a tokenizer chunks text. Hover any token to see its numeric ID. · represents a space; ↵ represents a newline.
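If you want to poke at this yourself, here is a minimal sketch of what the visualisation does under the hood, using the same js-tiktoken API this page runs on (the sample string is arbitrary):

import { encodingForModel } from 'js-tiktoken';

const enc = encodingForModel('gpt-4o');
const ids = enc.encode('Tokenizers chunk text.');

// Decode each ID on its own to see exactly how the string was chunked.
// (A token covering part of a multi-byte character may print as a
// replacement character when decoded in isolation.)
for (const id of ids) {
  console.log(id, JSON.stringify(enc.decode([id])));
}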
Examples
- "Hello, world!" → 4 tokens (gpt-4o)
- "你好,世界" → 4 tokens (gpt-4o) · 11 tokens (gpt-3.5). o200k_base is much denser for CJK than cl100k_base.
- "const x = 42;" → 5 tokens
FAQ
- Why does the same string have different token counts across models?
- Different models use different tokenizers. GPT-4o uses o200k_base (≈200k vocab); GPT-4 and GPT-3.5 use cl100k_base (≈100k vocab). Larger vocabularies tend to fit more characters per token, especially for non-English text. A sketch after this FAQ makes the difference concrete.
- Does this work for Claude or Gemini?
- Not yet. Those tokenizers aren't publicly distributed in the same form. Anthropic's SDK has a count_tokens API, and Gemini exposes one too. A multi-provider counter is on the roadmap; for now, this tool covers OpenAI models exactly.
- Is this byte-for-byte accurate?
- Yes, for OpenAI. js-tiktoken is a JavaScript port of OpenAI's reference tokenizer and produces identical token IDs to the Python tiktoken library. We use it directly here.
- How big is the bundle?
- js-tiktoken ships the BPE rank tables for the encodings it supports. The full bundle is ~1 MB gzipped. We load it on demand only on this page (an Astro island), so the rest of the site stays light.
- What does "chars / token" tell me?
- A rough efficiency metric. English averages ~4 chars/token. CJK text can be ~1–1.5 chars/token (denser). Code with lots of symbols can be lower. Use it to sanity-check whether content is "tokenizer-friendly".
- Is anything I paste sent to a server?
- No. The BPE encoding runs entirely in your browser. Your text never leaves your machine.
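To make the first and fifth answers above concrete (tokenizer differences and chars / token), here is a sketch that loads both encodings by name with js-tiktoken's getEncoding and computes the efficiency metric; the sample string is the CJK example from earlier:

import { getEncoding } from 'js-tiktoken';

// Load encodings directly by name rather than by model.
const encodings = {
  o200k_base: getEncoding('o200k_base'),   // GPT-4o
  cl100k_base: getEncoding('cl100k_base'), // GPT-4 / GPT-3.5
};

const text = '你好,世界';
for (const [name, enc] of Object.entries(encodings)) {
  const n = enc.encode(text).length;
  console.log(`${name}: ${n} tokens, ${(text.length / n).toFixed(2)} chars/token`);
}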
Common pitfalls
- Estimating tokens as "chars / 4": it works for English, but badly under-counts for CJK and for symbol-heavy code, both of which run fewer chars per token.
- Counting input only when the model also bills output tokens at a different rate.
- Forgetting that chat requests also pay for the system prompt and message-format overhead (a few tokens per message; see the sketch after this list).
- Assuming Claude / Gemini use the same tokenizer as GPT — they don't.
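For the message-overhead pitfall, here is a rough sketch. The per-message constants follow OpenAI's cookbook heuristic for recent chat models (about 3 framing tokens per message plus 3 to prime the reply); they are estimates rather than billing-exact numbers, and estimateChatTokens is just an illustrative helper:

import { encodingForModel } from 'js-tiktoken';

const TOKENS_PER_MESSAGE = 3; // assumed framing overhead per message
const REPLY_PRIMING = 3;      // assumed tokens priming the assistant's reply

function estimateChatTokens(messages: { role: string; content: string }[]): number {
  const enc = encodingForModel('gpt-4o');
  let total = REPLY_PRIMING;
  for (const m of messages) {
    // Each message costs its framing plus the encoded role and content.
    total += TOKENS_PER_MESSAGE
      + enc.encode(m.role).length
      + enc.encode(m.content).length;
  }
  return total;
}

console.log(estimateChatTokens([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Hello, world!' },
]));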
In your code
npm i js-tiktoken

import { encodingForModel } from 'js-tiktoken';

const enc = encodingForModel('gpt-4o');
const tokens = enc.encode('Hello, world!');
console.log(tokens.length); // 4

pip install tiktoken

import tiktoken

enc = tiktoken.encoding_for_model('gpt-4o')
tokens = enc.encode('Hello, world!')
print(len(tokens))  # 4

Related tools
- Multi-model Token Compare
Side-by-side token counts for the same input across GPT-4o, GPT-4 Turbo, GPT-3.5, Claude, Gemini.
- Context Window Calculator
Pick a model, paste your prompt, see how much context you have left after reserving output tokens.
- RAG Chunk Estimator
Estimate chunk count and embedding token spend from chunk size + overlap + corpus size.
- Embedding Dimension Reference
Reference table of embedding model output dimensions, max input tokens, and pricing.