Embedding Models
Compare the leading embedding models across dimensions, context length, pricing, quality benchmarks, and specialized capabilities to choose the right one for your use case.
Model Comparison Table
| Model | Provider | Dimensions | Context Length | Pricing (per 1M tokens) | MTEB Avg |
|---|---|---|---|---|---|
| text-embedding-3-small | OpenAI | 1536 | 8,191 tokens | $0.02 | 62.3 |
| text-embedding-3-large | OpenAI | 3072 | 8,191 tokens | $0.13 | 64.6 |
| voyage-3 | Voyage AI | 1024 | 32,000 tokens | $0.06 | 67.1 |
| voyage-3-lite | Voyage AI | 512 | 32,000 tokens | $0.02 | 62.4 |
| voyage-code-3 | Voyage AI | 1024 | 32,000 tokens | $0.06 | — (code) |
| embed-english-v3.0 | Cohere | 1024 | 512 tokens | $0.10 | 64.5 |
| embed-multilingual-v3.0 | Cohere | 1024 | 512 tokens | $0.10 | — (multi) |
| text-embedding-004 | Google | 768 | 2,048 tokens | Free (limits apply) | 62.0 |
| all-MiniLM-L6-v2 | Open Source | 384 | 256 tokens | Free (local) | 56.3 |
| bge-large-en-v1.5 | BAAI (Open) | 1024 | 512 tokens | Free (local) | 63.6 |
| e5-large-v2 | Microsoft (Open) | 1024 | 512 tokens | Free (local) | 62.2 |
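To make the pricing column concrete, the following sketch estimates the one-time cost of embedding a corpus. The prices are copied from the table above and may have changed since; the corpus size (100,000 documents averaging 500 tokens each) and the function name `embedding_cost_usd` are illustrative assumptions, not part of any provider's API.

```python
# Prices per 1M tokens, copied from the comparison table above
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "voyage-3": 0.06,
    "embed-english-v3.0": 0.10,
}


def embedding_cost_usd(model: str, num_docs: int, avg_tokens_per_doc: int) -> float:
    """Estimate the cost in USD of embedding a corpus once."""
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]


# Assumed corpus: 100,000 documents, 500 tokens each (50M tokens in total)
for model in PRICE_PER_MILLION_TOKENS:
    cost = embedding_cost_usd(model, num_docs=100_000, avg_tokens_per_doc=500)
    print(f"{model:<24} ${cost:.2f}")
```

Under these assumptions the whole corpus costs about $1.00 with text-embedding-3-small and about $6.50 with text-embedding-3-large, so for many projects the price difference matters less than quality and context length.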
Choosing the Right Model
- For General Use: OpenAI text-embedding-3-small
  Best default choice. Widely supported by vector databases, good quality, and very affordable at $0.02 per 1M tokens. Start here unless you have a specific reason not to.
- For Maximum Quality: Voyage AI voyage-3
  Highest MTEB average in the table above (67.1). Supports 32K-token context windows for long documents. A strong fit for RAG applications where search quality is critical.
- For Code: Voyage AI voyage-code-3
  Optimized specifically for code search and understanding. A good fit for code search, documentation retrieval, or code review tools.
- For Multilingual: Cohere embed-multilingual-v3.0
  Supports 100+ languages in a single model. A query in English can find relevant documents in Japanese, Spanish, or any other supported language.
- For Zero Cost: Sentence Transformers (local)
  Run locally with no API costs. Best for prototyping, privacy-sensitive applications, or offline use. bge-large-en-v1.5 offers the best quality among the open-source models listed above.
- For Budget-Sensitive Production: OpenAI text-embedding-3-small with reduced dimensions
  Reduce dimensions from 1536 to 512 or 256 with the API's dimensions parameter. This saves storage and compute with minimal quality loss, giving the best cost-to-quality ratio.
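The storage saving from reduced dimensions can be estimated with simple arithmetic. The sketch below assumes vectors are stored as 32-bit floats (4 bytes per dimension) and a hypothetical collection of one million vectors. It also shows a manual alternative to requesting shorter vectors from the API: truncating a full-length vector and rescaling it to unit length. That only makes sense for models trained so that the leading dimensions carry most of the information, which is the case OpenAI describes for the text-embedding-3 family. The helper names are illustrative.

```python
import math

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def storage_mb(num_vectors: int, dims: int) -> float:
    """Raw vector storage in megabytes, ignoring any index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1_000_000


def truncate_and_normalize(vector: list[float], dims: int) -> list[float]:
    """Keep the first `dims` values and rescale the result to unit length."""
    shortened = vector[:dims]
    norm = math.sqrt(sum(x * x for x in shortened))
    if norm == 0:
        return shortened  # an all-zero vector cannot be normalized
    return [x / norm for x in shortened]


for dims in (1536, 512, 256):
    print(f"{dims:>4} dims: {storage_mb(1_000_000, dims):,.0f} MB per 1M vectors")

# Toy 4-dimensional vector standing in for a real embedding
short = truncate_and_normalize([3.0, 4.0, 1.0, 2.0], dims=2)
print(short)
```

Going from 1536 to 512 dimensions cuts raw vector storage to one third, and 256 dimensions cuts it to one sixth. Whether the quality loss is acceptable is worth checking on your own retrieval data.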
Multilingual Embeddings
Multilingual models encode text from different languages into the same vector space. This enables cross-lingual search and retrieval:
import cohere
import numpy as np

# ClientV2 reads the API key from the CO_API_KEY environment variable
co = cohere.ClientV2()

# Documents in different languages
docs = [
    "Machine learning is a subset of AI",  # English
    "El aprendizaje automático es un subconjunto de IA",  # Spanish
    "Das maschinelle Lernen ist ein Teilgebiet der KI",  # German
]

# Query in English
query = "What is machine learning?"

# Embed documents and query with the same multilingual model
doc_response = co.embed(
    texts=docs, model="embed-multilingual-v3.0",
    input_type="search_document", embedding_types=["float"],
)
query_response = co.embed(
    texts=[query], model="embed-multilingual-v3.0",
    input_type="search_query", embedding_types=["float"],
)

# All three documents should score high similarity, whatever their language
query_vec = np.array(query_response.embeddings.float_[0])
for doc, doc_emb in zip(docs, doc_response.embeddings.float_):
    doc_vec = np.array(doc_emb)
    # Cosine similarity: dot product divided by the product of the norms
    sim = np.dot(query_vec, doc_vec) / (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec))
    print(f"Similarity: {sim:.4f} | {doc[:50]}")
Multi-Modal Embeddings (CLIP)
CLIP (Contrastive Language-Image Pre-training) by OpenAI encodes both images and text into the same vector space, enabling cross-modal search:
from sentence_transformers import SentenceTransformer, util
from PIL import Image

# Load CLIP model
model = SentenceTransformer("clip-ViT-B-32")

# Embed text and images into the same space
text_embedding = model.encode("a photo of a golden retriever")
image_embedding = model.encode(Image.open("dog.jpg"))  # assumes dog.jpg exists locally

# Compare them with cosine similarity; the same approach lets you
# search images with text queries or find similar images
similarity = util.cos_sim(text_embedding, image_embedding)
print(f"Text-image similarity: {similarity.item():.4f}")
Model Versioning
- Embeddings from different models (or model versions) live in different vector spaces and cannot be compared.
- If you upgrade to a new model version, you must re-embed all your data.
- Store the model name and version alongside your vectors in metadata.
- Keep your raw text data so you can re-embed when needed.
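One way to follow the rules above is to keep the model name, version, and raw text on every stored record and refuse to compare vectors that come from different models. The sketch below is a minimal illustration in plain Python, not the schema of any particular vector database. The `StoredVector` class, its field names, and the version labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredVector:
    values: tuple[float, ...]
    model: str          # e.g. "text-embedding-3-small"
    model_version: str  # hypothetical version label that you assign
    source_text: str    # raw text kept so the record can be re-embedded


def dot_similarity(a: StoredVector, b: StoredVector) -> float:
    """Dot product of two vectors, refusing to mix vector spaces."""
    if (a.model, a.model_version) != (b.model, b.model_version):
        raise ValueError(
            f"Cannot compare {a.model}/{a.model_version} "
            f"with {b.model}/{b.model_version}"
        )
    return sum(x * y for x, y in zip(a.values, b.values))


def needs_reembedding(
    records: list[StoredVector], current_model: str, current_version: str
) -> list[StoredVector]:
    """Return the records produced by anything other than the current model."""
    return [
        r for r in records
        if (r.model, r.model_version) != (current_model, current_version)
    ]


# Toy records with made-up 2-dimensional vectors
old = StoredVector((1.0, 0.0), "all-MiniLM-L6-v2", "v1", "hello world")
new = StoredVector((0.6, 0.8), "text-embedding-3-small", "v1", "hello world")

stale = needs_reembedding([old, new], "text-embedding-3-small", "v1")
print([r.source_text for r in stale])  # texts that must be embedded again
```

Because each record carries its `source_text`, the stale records can be passed straight back to the new model without going to another data store.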
💡 Try It Yourself
Embed the same 10 sentences with two different models (e.g., OpenAI text-embedding-3-small and sentence-transformers all-MiniLM-L6-v2). Compare the pairwise similarity matrices. Do the models agree on which sentences are most similar?
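As a possible starting point for the comparison step, the sketch below builds a pairwise cosine similarity matrix for each model and measures how often the two models agree on each sentence's nearest neighbor. It uses only the standard library and tiny made-up vectors in place of real embeddings. The function names and the agreement metric are illustrative choices, one of several reasonable ways to compare the matrices. The two models may have different dimensions, since only the similarity matrices are compared.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: list[list[float]]) -> list[list[float]]:
    """Pairwise cosine similarities for one model's embeddings."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


def nearest_neighbors(matrix: list[list[float]]) -> list[int]:
    """For each row, the index of the most similar other sentence."""
    result = []
    for i, row in enumerate(matrix):
        candidates = [j for j in range(len(row)) if j != i]
        result.append(max(candidates, key=lambda j: row[j]))
    return result


def neighbor_agreement(matrix_a: list[list[float]], matrix_b: list[list[float]]) -> float:
    """Fraction of sentences whose nearest neighbor is the same in both models."""
    nn_a = nearest_neighbors(matrix_a)
    nn_b = nearest_neighbors(matrix_b)
    return sum(a == b for a, b in zip(nn_a, nn_b)) / len(nn_a)


# Made-up embeddings for three sentences from two hypothetical models
vectors_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
vectors_b = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.0, 0.9, 0.4]]

matrix_a = similarity_matrix(vectors_a)
matrix_b = similarity_matrix(vectors_b)
agreement = neighbor_agreement(matrix_a, matrix_b)
print(f"Nearest-neighbor agreement: {agreement:.0%}")
```

With real embeddings, replace `vectors_a` and `vectors_b` with the outputs of the two models for the same 10 sentences, in the same order.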