Intermediate

Vector Search

Choose and set up the right vector database for your RAG system. Compare Pinecone, ChromaDB, Weaviate, Qdrant, Milvus, and pgvector.

Vector Database Comparison

| Database | Type | Hosting | Free Tier | Best For |
|---|---|---|---|---|
| ChromaDB | Embedded | Local / Cloud | Open source | Prototyping, small datasets |
| Pinecone | Managed cloud | Cloud only | Yes (limited) | Production, managed infrastructure |
| Weaviate | Self-hosted / Cloud | Both | Open source | Hybrid search, multi-modal |
| Qdrant | Self-hosted / Cloud | Both | Open source | High performance, filtering |
| Milvus | Self-hosted / Cloud | Both | Open source | Large scale, enterprise |
| pgvector | PostgreSQL extension | Any PostgreSQL | Open source | Existing PostgreSQL setups |

ChromaDB (Quick Start)

Python - ChromaDB
pip install chromadb langchain-chroma

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# Create embeddings model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create vector store and add documents
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db",
    collection_name="my_docs"
)

# Search for similar documents
results = vectorstore.similarity_search(
    "How do I reset my password?",
    k=5
)

for doc in results:
    print(doc.page_content[:200])

Pinecone

Python - Pinecone
pip install pinecone langchain-pinecone

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from pinecone import Pinecone

# Initialize the Pinecone client (the index must already exist;
# create it in the Pinecone console or with pc.create_index first)
pc = Pinecone(api_key="your-api-key")

# Create vector store in an existing index
# (langchain-pinecone reads the PINECONE_API_KEY environment variable)
vectorstore = PineconeVectorStore.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    index_name="my-rag-index"
)

# Search with metadata filtering
results = vectorstore.similarity_search(
    "deployment guide",
    k=5,
    filter={"department": "engineering"}
)

Similarity Metrics

| Metric | Range | Best For | Notes |
|---|---|---|---|
| Cosine Similarity | -1 to 1 | Most text embeddings | Default choice; ignores magnitude |
| Dot Product | -inf to inf | Normalized embeddings | Faster than cosine; same result when normalized |
| Euclidean (L2) | 0 to inf | Spatial similarity | Measures absolute distance; lower is more similar |
Default choice: Use cosine similarity for text embeddings. Most embedding models (OpenAI, Cohere, BGE) are optimized for cosine similarity.
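The "same result when normalized" note in the table can be verified directly: once vectors are scaled to unit length, cosine similarity and dot product agree. A minimal pure-Python check (the vectors here are made up for illustration, not real embeddings):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product divided by the product of the magnitudes
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    # Scale a vector to unit length
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

a, b = [3.0, 4.0], [1.0, 2.0]

# On raw vectors the two metrics disagree...
print(cosine_similarity(a, b))  # ~0.9839
print(dot(a, b))                # 11.0

# ...but after normalization they are identical
an, bn = normalize(a), normalize(b)
print(abs(cosine_similarity(an, bn) - dot(an, bn)) < 1e-9)  # True
```

This is why databases can use the cheaper dot product internally when embeddings are stored pre-normalized.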

Indexing Strategies

📈 HNSW

Hierarchical Navigable Small World. The most popular index type. Fast approximate search with high recall. Good default for most use cases.

📊 IVF

Inverted File Index. Partitions vectors into clusters. Good for very large datasets. Requires training on sample data.

📋 Flat (Brute Force)

Exact search comparing every vector. 100% recall but slow for large datasets. Use for small collections or testing.
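Flat search is simple enough to sketch in a few lines; this is essentially what a flat index does, minus the low-level optimizations. The document vectors below are hypothetical toy 2-D embeddings:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def flat_search(query, vectors, k=2):
    """Exact k-NN: score every stored vector, return the top-k (id, score) pairs."""
    scored = [(vec_id, cosine(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 2-D "embeddings" keyed by document id
vectors = {
    "doc1": [1.0, 0.0],
    "doc2": [0.9, 0.1],
    "doc3": [0.0, 1.0],
}

print(flat_search([1.0, 0.05], vectors, k=2))  # doc1 and doc2 rank highest
```

Because every vector is compared, recall is perfect, but the cost grows linearly with collection size, which is why approximate indexes like HNSW and IVF exist.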

Hybrid Search (Vector + Keyword)

Combines semantic vector search with traditional keyword (BM25) search for better results:

Python - Hybrid Search with Weaviate
from langchain_weaviate import WeaviateVectorStore

vectorstore = WeaviateVectorStore(
    client=weaviate_client,
    index_name="Documents",
    text_key="content",
    embedding=embeddings
)

# Hybrid search: langchain-weaviate issues a hybrid (vector + BM25) query;
# alpha balances the two signals
results = vectorstore.similarity_search(
    "How do I reset my password?",
    k=5,
    alpha=0.5  # 0 = pure keyword, 1 = pure vector
)
📚 When to use hybrid search: Hybrid search excels when queries contain specific terms (product names, error codes, IDs) that pure vector search might miss. It combines the precision of keyword matching with the semantic understanding of vector search.
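Under the hood, the alpha parameter boils down to blending two per-document score lists. A hedged sketch of that fusion step, assuming both score sets are already normalized to 0–1 (real engines such as Weaviate also handle score normalization and rank fusion for you; the scores below are made-up numbers):

```python
def hybrid_scores(vector_scores, keyword_scores, alpha=0.5):
    """Blend per-document scores: alpha=1 -> pure vector, alpha=0 -> pure keyword."""
    doc_ids = set(vector_scores) | set(keyword_scores)
    return {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
                + (1 - alpha) * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }

vector_scores = {"doc1": 0.9, "doc2": 0.4}   # semantic similarity
keyword_scores = {"doc2": 0.8, "doc3": 0.7}  # exact-term match (e.g. an error code)

fused = hybrid_scores(vector_scores, keyword_scores, alpha=0.5)
best = max(fused, key=fused.get)
print(best, fused[best])  # doc2 wins by scoring well on both signals
```

Note how doc2 tops the fused ranking despite leading neither list alone; documents that score moderately on both signals often beat documents that excel on only one.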

Metadata Filtering

Python - Filtered Search
# Search only in specific documents or categories.
# The operator syntax ($in, $gte) shown here is Pinecone-style;
# other databases use their own filter DSLs.
results = vectorstore.similarity_search(
    "API authentication",
    k=5,
    filter={
        "source_type": "documentation",
        "department": {"$in": ["engineering", "security"]},
        "last_updated": {"$gte": "2025-01-01"}
    }
)
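The filter operators above map to straightforward predicate checks on each chunk's metadata. A minimal pure-Python sketch of `$in` / `$gte` semantics, assuming Pinecone-style operators and hypothetical metadata records:

```python
def matches(metadata, filter_spec):
    """Evaluate a Pinecone-style filter against one document's metadata."""
    for field, condition in filter_spec.items():
        value = metadata.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"$in": [...]} or {"$gte": ...}
            if "$in" in condition and value not in condition["$in"]:
                return False
            if "$gte" in condition and (value is None or value < condition["$gte"]):
                return False
        elif value != condition:
            # Bare value means exact equality
            return False
    return True

docs = [
    {"source_type": "documentation", "department": "engineering"},
    {"source_type": "documentation", "department": "marketing"},
]
f = {"source_type": "documentation", "department": {"$in": ["engineering", "security"]}}

print([matches(d, f) for d in docs])  # [True, False]
```

The vector database applies these predicates during (or before) the nearest-neighbor search, so only chunks that pass the filter compete for the top-k slots.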

What's Next?

The next lesson covers advanced retrieval strategies and reranking to further improve the quality of retrieved results.