Intermediate

Embedding Models

Compare the leading embedding models across dimensions, context length, pricing, quality benchmarks, and specialized capabilities to choose the right one for your use case.

Model Comparison Table

| Model | Provider | Dimensions | Context Length | Pricing (per 1M tokens) | MTEB Avg |
|---|---|---|---|---|---|
| text-embedding-3-small | OpenAI | 1536 | 8,191 tokens | $0.02 | 62.3 |
| text-embedding-3-large | OpenAI | 3072 | 8,191 tokens | $0.13 | 64.6 |
| voyage-3 | Voyage AI | 1024 | 32,000 tokens | $0.06 | 67.1 |
| voyage-3-lite | Voyage AI | 512 | 32,000 tokens | $0.02 | 62.4 |
| voyage-code-3 | Voyage AI | 1024 | 32,000 tokens | $0.06 | N/A (code) |
| embed-english-v3.0 | Cohere | 1024 | 512 tokens | $0.10 | 64.5 |
| embed-multilingual-v3.0 | Cohere | 1024 | 512 tokens | $0.10 | N/A (multi) |
| text-embedding-004 | Google | 768 | 2,048 tokens | Free (limits apply) | 62.0 |
| all-MiniLM-L6-v2 | Open Source | 384 | 256 tokens | Free (local) | 56.3 |
| bge-large-en-v1.5 | BAAI (Open) | 1024 | 512 tokens | Free (local) | 63.6 |
| e5-large-v2 | Microsoft (Open) | 1024 | 512 tokens | Free (local) | 62.2 |
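💡 MTEB (Massive Text Embedding Benchmark) is the standard benchmark for evaluating embedding models across retrieval, classification, clustering, and other tasks. A higher average indicates better overall quality, but the average can hide large differences on individual task types, so look at the retrieval scores specifically if you are building search or RAG. Check the MTEB Leaderboard for the latest rankings.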
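
To make the pricing and dimension columns concrete, here is a rough back-of-the-envelope sketch. The workload (1,000,000 chunks of 500 tokens each) is hypothetical, prices and dimensions are copied from the table above, and storage assumes uncompressed float32 vectors (4 bytes per dimension) with no index overhead.

```python
# Prices (USD per 1M tokens) and dimensions copied from the comparison table.
MODELS = {
    "text-embedding-3-small": {"price_per_1m": 0.02, "dims": 1536},
    "text-embedding-3-large": {"price_per_1m": 0.13, "dims": 3072},
    "voyage-3": {"price_per_1m": 0.06, "dims": 1024},
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as raw float32


def embedding_cost_usd(total_tokens: int, price_per_1m: float) -> float:
    """One-time API cost to embed `total_tokens` tokens."""
    return total_tokens / 1_000_000 * price_per_1m


def storage_gb(num_vectors: int, dims: int) -> float:
    """Raw vector storage in gigabytes (10**9 bytes), excluding index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9


# Hypothetical workload: 1,000,000 chunks of 500 tokens each.
num_chunks = 1_000_000
tokens_per_chunk = 500
total_tokens = num_chunks * tokens_per_chunk

for name, spec in MODELS.items():
    cost = embedding_cost_usd(total_tokens, spec["price_per_1m"])
    size = storage_gb(num_chunks, spec["dims"])
    print(f"{name:<24} cost=${cost:,.2f}  storage={size:.2f} GB")
```

Under these assumptions the one-time embedding cost is modest for all three models, while raw storage differs by a factor of three between the 1024-dimension and 3072-dimension models. For large collections, dimension count can matter more to the ongoing bill than the per-token price.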

Choosing the Right Model

  1. For General Use: OpenAI text-embedding-3-small

    Best default choice. Widely supported by vector databases, good quality, very affordable at $0.02/1M tokens. Start here unless you have a specific reason not to.

  2. For Maximum Quality: Voyage AI voyage-3

    Highest MTEB average of the models compared above. Supports a 32K-token context window for long documents. A strong fit for RAG applications where search quality is critical.

  3. For Code: Voyage AI voyage-code-3

    Specifically optimized for code search and understanding. Best choice for code-related applications like code search, documentation retrieval, or code review tools.

  4. For Multilingual: Cohere embed-multilingual-v3.0

    Supports 100+ languages in a single model. A query in English can find relevant documents in Japanese, Spanish, or any other supported language.

  5. For Zero Cost: Sentence Transformers (local)

    Run locally with no API costs. Best for prototyping, privacy-sensitive applications, or offline use. Of the open-source models listed above, bge-large-en-v1.5 has the highest MTEB average.

  6. For Budget-Sensitive Production: OpenAI 3-small with reduced dimensions

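    The text-embedding-3 models accept a dimensions parameter, so you can request 512 or 256 dimensions from text-embedding-3-small instead of the default 1536. This saves storage and compute, typically with only a small quality loss, which makes it a strong cost-to-quality trade-off.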
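
If full-length vectors are already stored, one way to shorten them without calling the API again is to keep the leading values and rescale to unit length. The sketch below is a minimal illustration of that truncate-and-renormalize step, not the provider's implementation; the helper name `shorten_embedding` and the 8-value toy vector are made up for the example.

```python
import math


def shorten_embedding(vector: list[float], target_dims: int) -> list[float]:
    """Keep the first `target_dims` values and rescale to unit length."""
    if not 0 < target_dims <= len(vector):
        raise ValueError("target_dims must be between 1 and len(vector)")
    truncated = vector[:target_dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        return truncated  # all-zero vector: nothing to normalize
    return [x / norm for x in truncated]


# Toy 8-value vector standing in for a real 1536-dimensional embedding.
full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.1, 0.05, 0.0]
short = shorten_embedding(full, 4)

print(len(short))                 # number of dimensions kept
print(sum(x * x for x in short))  # squared length after rescaling
```

Truncation like this is only appropriate for models trained to tolerate it, such as the text-embedding-3 family. For other models, measure retrieval quality before and after rather than assuming the loss is small.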

Multilingual Embeddings

Multilingual models encode text from different languages into the same vector space. This enables cross-lingual search and retrieval:

Python - Cross-Lingual Search
import cohere
import numpy as np

# With no arguments, the client reads the API key from the CO_API_KEY environment variable
co = cohere.ClientV2()

# Documents in different languages
docs = [
    "Machine learning is a subset of AI",          # English
    "El aprendizaje automático es un subconjunto de IA",  # Spanish
    "Das maschinelle Lernen ist ein Teilgebiet der KI",  # German
]

# Query in English
query = "What is machine learning?"

# Embed everything with multilingual model
doc_response = co.embed(
    texts=docs, model="embed-multilingual-v3.0",
    input_type="search_document", embedding_types=["float"]
)
query_response = co.embed(
    texts=[query], model="embed-multilingual-v3.0",
    input_type="search_query", embedding_types=["float"]
)

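# All three documents should score well above unrelated text,
# even though only one of them is in the query's language.
query_vec = np.array(query_response.embeddings.float_[0])
for doc, doc_emb in zip(docs, doc_response.embeddings.float_):
    doc_vec = np.array(doc_emb)
    # Cosine similarity: dot product divided by the product of the norms
    sim = np.dot(query_vec, doc_vec) / (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec))
    print(f"Similarity: {sim:.4f} | {doc[:50]}")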

Multi-Modal Embeddings (CLIP)

CLIP (Contrastive Language-Image Pre-training) by OpenAI encodes both images and text into the same vector space, enabling cross-modal search:

Python - CLIP Embeddings
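from sentence_transformers import SentenceTransformer, util
from PIL import Image

# Load a CLIP model that maps text and images into one vector space
model = SentenceTransformer("clip-ViT-B-32")

# Embed text and an image (dog.jpg is a placeholder path)
text_embedding = model.encode("a photo of a golden retriever")
image_embedding = model.encode(Image.open("dog.jpg"))

# Compare the two with cosine similarity. The same comparison supports
# searching images with text queries or finding similar images.
similarity = util.cos_sim(text_embedding, image_embedding).item()
print(f"Text-image similarity: {similarity:.4f}")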

Model Versioning

Critical: Always track your model version.
  • Embeddings from different models (or model versions) live in different vector spaces and cannot be compared.
  • If you upgrade to a new model version, you must re-embed all your data.
  • Store the model name and version alongside your vectors in metadata.
  • Keep your raw text data so you can re-embed when needed.
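
The points above can be sketched as a small record type that carries the model name and version with every vector and refuses to compare vectors from different models. The `StoredEmbedding` layout, the helper names, and the "model-a" / "v1" labels are hypothetical; in practice this metadata would live in your vector database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredEmbedding:
    """Hypothetical record layout: vector plus the metadata needed to re-embed."""
    text: str                   # raw text, kept so the record can be re-embedded
    vector: tuple[float, ...]
    model: str                  # e.g. "text-embedding-3-small"
    model_version: str          # version label from the provider or your own


def dot_similarity(a: StoredEmbedding, b: StoredEmbedding) -> float:
    """Dot product, refusing to compare vectors from different models."""
    if (a.model, a.model_version) != (b.model, b.model_version):
        raise ValueError(
            f"cannot compare {a.model}@{a.model_version} "
            f"with {b.model}@{b.model_version}; re-embed first"
        )
    return sum(x * y for x, y in zip(a.vector, b.vector))


def needs_reembedding(records, model, model_version):
    """Return the records that were not produced by the current model."""
    return [r for r in records if (r.model, r.model_version) != (model, model_version)]


# Toy 2-dimensional vectors; real embeddings have hundreds of dimensions.
old = StoredEmbedding("hello", (1.0, 0.0), "model-a", "v1")
new = StoredEmbedding("world", (0.0, 1.0), "model-a", "v2")
same = StoredEmbedding("hi", (0.6, 0.8), "model-a", "v1")

print(dot_similarity(old, same))
print([r.text for r in needs_reembedding([old, new, same], "model-a", "v2")])
```

Failing loudly on a mismatch is a deliberate choice here: a similarity score computed across two vector spaces looks like a valid number, so a silent comparison would be hard to notice later.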

💡 Try It Yourself

Embed the same 10 sentences with two different models (e.g., OpenAI text-embedding-3-small and sentence-transformers all-MiniLM-L6-v2). Compare the pairwise similarity matrices. Do the models agree on which sentences are most similar?

Different models should agree on the general similarity rankings (most similar / least similar) but may differ on exact scores and subtle distinctions.
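
One way to quantify that agreement is to rank every sentence pair by similarity under each model and compute the Spearman rank correlation between the two rankings. The sketch below uses small made-up vectors in place of real model output and a simplified Spearman formula that assumes no tied similarity values; swap in your own embeddings to run the exercise.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def pairwise_similarities(vectors):
    """Cosine similarity for every unordered pair (i, j) with i < j."""
    return [cosine(vectors[i], vectors[j]) for i, j in combinations(range(len(vectors)), 2)]


def ranks(values):
    """Rank positions (0 = smallest); assumes no exact ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation using the no-ties formula."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Toy stand-ins for the same 4 sentences embedded by two different models.
# The models may use different dimensions; only the pair rankings are compared.
model_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.6, 0.4]]
model_b = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.0, 1.0, 0.1], [0.4, 0.6, 0.0]]

sims_a = pairwise_similarities(model_a)
sims_b = pairwise_similarities(model_b)
print(f"Rank agreement (Spearman): {spearman(sims_a, sims_b):.3f}")
```

A value near 1.0 means the two models order the sentence pairs almost identically, while a value near 0 means their rankings are unrelated. Comparing ranks rather than raw scores sidesteps the fact that different models produce similarity values on different scales.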