Vector Search
Choose and set up the right vector database for your RAG system. Compare Pinecone, ChromaDB, Weaviate, Qdrant, Milvus, and pgvector.
Vector Database Comparison
| Database | Type | Hosting | Pricing | Best For |
|---|---|---|---|---|
| ChromaDB | Embedded | Local / Cloud | Open source | Prototyping, small datasets |
| Pinecone | Managed cloud | Cloud only | Free tier (limited) | Production, managed infrastructure |
| Weaviate | Self-hosted / Cloud | Both | Open source | Hybrid search, multi-modal |
| Qdrant | Self-hosted / Cloud | Both | Open source | High performance, filtering |
| Milvus | Self-hosted / Cloud | Both | Open source | Large scale, enterprise |
| pgvector | PostgreSQL extension | Any PostgreSQL | Open source | Existing PostgreSQL setups |
ChromaDB (Quick Start)
Python - ChromaDB
```python
# pip install chromadb langchain-chroma
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# Create the embeddings model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create the vector store and add documents
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db",
    collection_name="my_docs",
)

# Search for similar documents
results = vectorstore.similarity_search(
    "How do I reset my password?",
    k=5,
)
for doc in results:
    print(doc.page_content[:200])
```
Pinecone
Python - Pinecone
```python
# pip install pinecone langchain-pinecone
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from pinecone import Pinecone

# Initialize the Pinecone client
pc = Pinecone(api_key="your-api-key")

# Create the vector store (the index must already exist in Pinecone)
vectorstore = PineconeVectorStore.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    index_name="my-rag-index",
)

# Search with metadata filtering
results = vectorstore.similarity_search(
    "deployment guide",
    k=5,
    filter={"department": "engineering"},
)
```
Similarity Metrics
| Metric | Range | Best For | Notes |
|---|---|---|---|
| Cosine Similarity | -1 to 1 | Most text embeddings | Default choice; ignores magnitude |
| Dot Product | -inf to inf | Normalized embeddings | Faster than cosine; same result when normalized |
| Euclidean (L2) | 0 to inf | Spatial similarity | Measures absolute distance; lower is more similar |
Default choice: Use cosine similarity for text embeddings. Most embedding models (OpenAI, Cohere, BGE) are optimized for cosine similarity.
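The relationship between the three metrics is easy to see in a few lines of plain Python (no library assumed): vectors pointing in the same direction have cosine similarity 1.0 even when their magnitudes, and therefore their Euclidean distance, differ, and for unit-length vectors the dot product reduces to cosine similarity.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    # Cosine ignores magnitude: only the angle between vectors matters
    return dot(a, b) / (norm(a) * norm(b))

def euclidean(a, b):
    # Absolute distance: lower means more similar
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude

print(cosine_similarity(a, b))  # 1.0 -- identical direction
print(euclidean(a, b))          # 5.0 -- yet far apart in absolute terms

# For normalized (unit-length) vectors, dot product equals cosine similarity
a_hat = [x / norm(a) for x in a]
b_hat = [x / norm(b) for x in b]
print(abs(dot(a_hat, b_hat) - cosine_similarity(a, b)) < 1e-9)  # True
```

This is why dot product is "the same result when normalized": once every vector has length 1, the division in the cosine formula becomes a no-op, and the cheaper dot product can be used directly.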
Indexing Strategies
HNSW
Hierarchical Navigable Small World. The most popular index type. Fast approximate search with high recall. Good default for most use cases.
IVF
Inverted File Index. Partitions vectors into clusters. Good for very large datasets. Requires training on sample data.
Flat (Brute Force)
Exact search comparing every vector. 100% recall but slow for large datasets. Use for small collections or testing.
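A flat index is simple enough to sketch directly; this toy version (illustrative only, 2-dimensional vectors for readability) scores the query against every stored vector, which is exactly why it gives 100% recall but costs O(n) per query:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def flat_search(query, vectors, k=2):
    # Brute force: compare the query against every stored vector
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

vectors = [
    [1.0, 0.0],   # index 0
    [0.9, 0.1],   # index 1
    [0.0, 1.0],   # index 2
]
print(flat_search([1.0, 0.05], vectors, k=2))  # [0, 1]
```

HNSW and IVF trade this exhaustive scan for an approximate one: HNSW walks a layered proximity graph, and IVF only scans the clusters nearest the query, so both visit a fraction of the vectors at the cost of occasionally missing a true neighbor.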
Hybrid Search (Vector + Keyword)
Combines semantic vector search with traditional keyword (BM25) search for better results:
Python - Hybrid Search with Weaviate
```python
from langchain_weaviate import WeaviateVectorStore

vectorstore = WeaviateVectorStore(
    client=weaviate_client,
    index_name="Documents",
    text_key="content",
    embedding=embeddings,
)

# Hybrid search: Weaviate combines vector and keyword (BM25) scores,
# weighted by alpha
retriever = vectorstore.as_retriever(
    search_kwargs={
        "k": 5,
        "alpha": 0.5,  # 0 = pure keyword, 1 = pure vector
    }
)
```
When to use hybrid search: Hybrid search excels when queries contain specific terms (product names, error codes, IDs) that pure vector search might miss. It combines the precision of keyword matching with the semantic understanding of vector search.
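Weaviate performs the score fusion server-side, but the idea behind the alpha parameter can be sketched in plain Python. This is a simplified illustration, not Weaviate's actual algorithm (which uses its own fusion strategies): normalize each ranker's scores to [0, 1], then blend them with alpha as the weight.

```python
def normalize(scores):
    # Min-max normalize a {doc_id: score} map into [0, 1]
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_fuse(vector_scores, keyword_scores, alpha=0.5):
    # alpha = 0 -> pure keyword (BM25), alpha = 1 -> pure vector
    v = normalize(vector_scores)
    kw = normalize(keyword_scores)
    docs = set(v) | set(kw)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0)
             for d in docs}
    return sorted(docs, key=fused.get, reverse=True)

# Hypothetical scores: cosine similarities vs. BM25 scores
vector_scores = {"doc_a": 0.92, "doc_b": 0.80, "doc_c": 0.10}
keyword_scores = {"doc_b": 12.4, "doc_c": 9.1, "doc_d": 2.0}

print(hybrid_fuse(vector_scores, keyword_scores, alpha=0.5))
```

Note that doc_b, which scores well on both rankers, outranks doc_a, which only the vector search liked; that reinforcement of agreement between rankers is the core benefit of fusion.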
Metadata Filtering
Python - Filtered Search
```python
# Search only in specific documents or categories
results = vectorstore.similarity_search(
    "API authentication",
    k=5,
    filter={
        "source_type": "documentation",
        "department": {"$in": ["engineering", "security"]},
        "last_updated": {"$gte": "2025-01-01"},
    },
)
```
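Conceptually, the store evaluates each filter clause as a predicate over a document's metadata before (or while) running the vector search. A minimal sketch of that evaluation, supporting just the equality, `$in`, and `$gte` operators used above (operator coverage and semantics vary by database, so treat this as illustrative):

```python
def matches(metadata, filter_spec):
    # Evaluate a MongoDB-style filter against one document's metadata
    for key, cond in filter_spec.items():
        value = metadata.get(key)
        if isinstance(cond, dict):
            for op, operand in cond.items():
                if op == "$in" and value not in operand:
                    return False
                if op == "$gte" and not (value is not None and value >= operand):
                    return False
        elif value != cond:
            # Bare values mean equality
            return False
    return True

docs = [
    {"source_type": "documentation", "department": "engineering", "last_updated": "2025-03-01"},
    {"source_type": "documentation", "department": "sales", "last_updated": "2025-02-01"},
    {"source_type": "wiki", "department": "security", "last_updated": "2024-12-01"},
]
f = {
    "source_type": "documentation",
    "department": {"$in": ["engineering", "security"]},
    "last_updated": {"$gte": "2025-01-01"},
}
print([matches(d, f) for d in docs])  # [True, False, False]
```

Note the `$gte` comparison on dates works here only because ISO-8601 strings sort lexicographically; some databases instead require numeric timestamps for range filters.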
What's Next?
The next lesson covers advanced retrieval strategies and reranking to further improve the quality of retrieved results.