
Embedding Dimension Reference

Reference table of embedding model output dimensions, max input tokens, and pricing.

Model                   Provider   Dim    Max input tokens   $/M tokens   Released
text-embedding-3-large  OpenAI     3,072  8,191              $0.13        2024-01-25
text-embedding-3-small  OpenAI     1,536  8,191              $0.02        2024-01-25
text-embedding-ada-002  OpenAI     1,536  8,191              $0.10        2022-12-01
text-embedding-004      Google AI  768    2,048              $0.00        2024-04-09

Last reviewed 2026-05-10. Verify with the provider before basing infrastructure decisions on these values.

How dimensions affect cost

Vector dimensions are the size of each embedding's output array. A 1536-dim model produces vectors with 1536 floating-point numbers; a 3072-dim model doubles that. Storage and similarity-search compute scale linearly with dimension.

For a 1M-document corpus at float32: 1536-dim ≈ 6.1 GB, 3072-dim ≈ 12.3 GB. Add HNSW or IVF index overhead and the gap widens. Halving dimensions can also speed up brute-force cosine queries by roughly 2× — meaningful at scale.
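The sizing arithmetic is easy to check. A minimal helper for raw vector storage (float32 only; HNSW/IVF index overhead excluded):

```python
def index_size_gb(num_docs: int, dims: int, bytes_per_float: int = 4) -> float:
    """Raw vector storage in GB: docs * dims * bytes per float32."""
    return num_docs * dims * bytes_per_float / 1e9

print(index_size_gb(1_000_000, 1536))  # 6.144
print(index_size_gb(1_000_000, 3072))  # 12.288
```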

OpenAI's text-embedding-3-large supports Matryoshka truncation: ask the API for fewer dimensions and the model returns a prefix-truncated vector that retains most of the semantic signal. Common choices: 3072 (full), 1536, 1024, 512, 256.
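With the API you would simply pass `dimensions=512` to the embeddings endpoint. If you already hold full-length vectors, the equivalent client-side operation is a prefix truncation followed by L2 renormalization — a minimal sketch:

```python
import math

def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components of a Matryoshka-trained
    embedding, then re-normalize to unit length so cosine similarity
    still behaves."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

v = truncate_embedding([3.0, 4.0, 0.1, -0.2], 2)
print(v)  # [0.6, 0.8]
```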

FAQ

Why does dimension matter?
It dictates index size and query speed in your vector store. A 3072-dim embedding takes 2× the disk and ~2× the cosine-similarity compute of a 1536-dim one. Higher dims aren't universally better — they trade cost for marginal recall gains.
Can I shorten OpenAI's 3072-dim embeddings?
Yes. text-embedding-3-large supports the dimensions parameter to truncate the output. The model was trained with Matryoshka representations so the truncated vector retains most of the semantic signal. 1536, 1024, even 256 all work.
Are dimensions comparable across providers?
No. A 1536-dim OpenAI embedding and a 1536-dim Cohere embedding live in completely different vector spaces — you cannot compare them with cosine similarity. Stick to one provider per index.
How do I migrate from ada-002 to 3-small?
Re-embed everything. Even though both are 1536-dim, the underlying spaces differ. Migration is a one-time backfill: re-embed your corpus, build a new index, swap traffic.
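The backfill itself is just a batch loop. A sketch with a hypothetical `embed_batch` stub standing in for the real API call (swap in your provider's client and batch size limits):

```python
def embed_batch(texts: list[str]) -> list[list[float]]:
    # Placeholder: call your provider here, e.g. OpenAI's
    # embeddings.create(model='text-embedding-3-small', input=texts).
    return [[0.0] * 1536 for _ in texts]

def backfill(docs: list[str], batch_size: int = 100) -> list[list[float]]:
    """Re-embed the whole corpus in batches. Write the results into a
    NEW index and swap traffic only once it is fully built."""
    vectors: list[list[float]] = []
    for i in range(0, len(docs), batch_size):
        vectors.extend(embed_batch(docs[i:i + batch_size]))
    return vectors
```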

Common pitfalls

  • Mixing embeddings from different providers in the same index — vectors aren't comparable across vector spaces.
  • Treating ada-002 and 3-small as drop-in compatible just because both are 1536-dim — they're trained differently.
  • Defaulting to the largest dimension "for accuracy" without measuring recall — for many use cases 1536 matches 3072 within margin.
  • Forgetting the max input token limit. Embedding APIs truncate or reject input over the limit; chunk first.
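The last pitfall is the easiest to automate away. A naive pre-chunker using the rough four-characters-per-token heuristic — a real pipeline should count tokens with the provider's tokenizer (e.g. tiktoken for OpenAI models) instead:

```python
def chunk_for_embedding(text: str, max_tokens: int = 8191,
                        chars_per_token: int = 4) -> list[str]:
    """Split text into pieces that fit under the model's input limit,
    using a crude chars-per-token estimate."""
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```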

In your code

OpenAI Node SDK (JavaScript)
npm i openai
import OpenAI from 'openai';
const openai = new OpenAI();

const res = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'hello world',
});
console.log(res.data[0].embedding.length); // 1536
Voyage AI Python
pip install voyageai
import voyageai
vo = voyageai.Client()
result = vo.embed(['hello world'], model='voyage-3')
print(len(result.embeddings[0]))  # 1024 for voyage-3
