AI Vector Databases

Master vector databases end-to-end. Covers 18 vector DBs (Pinecone, Weaviate, Qdrant, Milvus, pgvector, Chroma, LanceDB, Vespa, Elasticsearch, Redis, MongoDB, OpenSearch, and more), every major index algorithm, embedding strategies, search patterns, and production operations.

50 Topics
300 Lessons
18 Vector DBs
100% Free

All Topics

50 topics organized into 5 categories spanning the full vector DB stack — databases, algorithms, embeddings, search patterns, and operations.

Major Vector Databases

🌲

Pinecone Mastery

Master Pinecone serverless and pod-based deployments. Learn indexes, namespaces, metadata filtering, hybrid search, and production patterns for the leading managed vector DB.

6 Lessons
📁

Weaviate Mastery

Master Weaviate's GraphQL-first vector database. Learn collections, multi-tenancy, modules (vectorizers, generative), and the patterns for building agentic search.

6 Lessons
📊

Qdrant Mastery

Master Qdrant: open-source vector DB written in Rust. Learn collections, payload filtering, sharding, and the patterns for self-hosted vector search at scale.

6 Lessons
🌈

Chroma Mastery

Master Chroma: developer-friendly embedding database for LLM apps. Learn collections, persistent storage, embedding functions, and the patterns for fast prototyping.

6 Lessons
🌏

Milvus Mastery

Master Milvus: cloud-native, distributed vector DB built for billion-scale workloads. Learn the architecture, CRUD operations, indexes, and the patterns for scaling.

6 Lessons
🐘

pgvector for PostgreSQL

Add vector search to PostgreSQL with pgvector. Master vector columns, HNSW and IVFFlat indexes, hybrid SQL+vector queries, and the patterns for using your existing DB.

6 Lessons
🏹

LanceDB Mastery

Master LanceDB: serverless, embedded vector DB on Apache Arrow. Learn its file-based architecture, multi-modal support, and the patterns for AI app developers.

6 Lessons
😀

Vespa for AI Search

Master Vespa: the open-source big-data serving engine that originated at Yahoo. Learn schemas, ranking expressions, and the patterns for tensor-based ranking and retrieval.

6 Lessons
🔍

Elasticsearch kNN Search

Add vector search to Elasticsearch. Master dense_vector fields, kNN search, hybrid retrieval, and the patterns for combining BM25 and vectors in Elastic.

6 Lessons
👋

Redis Vector Search

Add vector search to Redis with RediSearch. Master FT.CREATE with vector fields, KNN, hybrid queries, and the patterns for sub-millisecond vector search.

6 Lessons
🌿

MongoDB Atlas Vector Search

Add vector search to MongoDB with Atlas Vector Search. Master vector indexes, $vectorSearch aggregation, and the patterns for combining documents with vectors.

6 Lessons
🌍

OpenSearch Vector Search

Add vector search to OpenSearch. Master the k-NN plugin, neural search, hybrid queries, and the patterns for AWS-native vector workloads.

6 Lessons
🌾

Marqo Vector Search

Master Marqo: end-to-end vector search engine with built-in inference. Learn indexes, multi-modal search, and the patterns for tensor search without managing embeddings.

6 Lessons
💰

SingleStore Vector

Add vector search to SingleStore. Master VECTOR data types, DOT_PRODUCT and EUCLIDEAN_DISTANCE functions, and the patterns for SQL-native vector workloads.

6 Lessons
🍭

Couchbase Vector Search

Add vector search to Couchbase. Master Search Service vector indexes, FTS-vector queries, and the patterns for combining JSON documents with vectors.

6 Lessons
🚀

Turbopuffer

Master Turbopuffer: object-storage-native vector DB built for cost. Learn its architecture, namespaces, and the patterns for cheap, scalable vector search.

6 Lessons

Vald Vector Search

Master Vald: cloud-native distributed ANN engine on Kubernetes. Learn its agent-based architecture, CRD-based config, and the patterns for K8s-native vector ops.

6 Lessons
🔬

Cassandra Vector Search

Add vector search to Apache Cassandra and its managed counterpart, DataStax Astra DB. Master VECTOR types, ANN with SAI indexes, and the patterns for highly distributed vector workloads.

6 Lessons

Index Algorithms

📖

Vector Index Fundamentals

Understand the fundamentals of vector indexes. Learn the tradeoffs between recall, latency, memory, and build time across the major ANN algorithm families.

6 Lessons
🔗

HNSW Algorithm Deep Dive

Master HNSW (Hierarchical Navigable Small World) — the dominant ANN index in modern vector DBs. Learn the layer structure, M and efConstruction, and tuning for your workload.

6 Lessons
📁

IVF (Inverted File) Index

Master IVF: cluster-then-search ANN that scales to billions of vectors. Learn nlist, nprobe, IVF+PQ, and the patterns for very large vector DBs.

6 Lessons
📋

Product Quantization (PQ)

Master Product Quantization: 32-100x memory savings via vector compression. Learn subspaces, codebooks, asymmetric distance, and the precision tradeoffs.

6 Lessons
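
As a rough sketch of where a figure like 32-100x can come from, the arithmetic below compares raw float32 storage with the size of PQ codes for a single vector. The 768-dimension embedding, the subvector counts, and the function name are illustrative assumptions rather than values from the course, and codebook storage is ignored.

```python
def pq_compression_ratio(dim: int, num_subvectors: int, bits_per_code: int = 8) -> float:
    """Ratio of raw float32 bytes to PQ code bytes for one vector.

    Assumes one code of `bits_per_code` bits per subvector and ignores
    the (shared) codebook storage.
    """
    if dim % num_subvectors != 0:
        raise ValueError("dim must be divisible by num_subvectors")
    raw_bytes = dim * 4                              # float32 = 4 bytes per dimension
    code_bytes = num_subvectors * bits_per_code / 8  # one code per subvector
    return raw_bytes / code_bytes


# Hypothetical 768-dimensional embedding, 8-bit codes.
ratio_fine = pq_compression_ratio(768, num_subvectors=96)    # 3072 bytes -> 96 bytes
ratio_coarse = pq_compression_ratio(768, num_subvectors=32)  # 3072 bytes -> 32 bytes
print(ratio_fine, ratio_coarse)
```

Fewer subvectors give a higher ratio but a coarser approximation, which is the precision tradeoff the blurb refers to.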
🔭

ScaNN by Google

Master ScaNN: Google's state-of-the-art ANN library. Learn anisotropic vector quantization, asymmetric hashing, and the patterns for max recall at min latency.

6 Lessons
🔭

FAISS Library

Master FAISS: the similarity search library from Meta AI (formerly Facebook AI). Learn IndexFlat, IndexIVF, IndexHNSW, IndexPQ, GPU acceleration, and the patterns for self-built vector pipelines.

6 Lessons
🍃

Annoy Library

Master Annoy: Spotify's tree-based ANN library. Learn random projections, n_trees vs search_k, memory-mapping, and when Annoy beats HNSW for your workload.

6 Lessons
🔒

LSH (Locality Sensitive Hashing)

Master LSH: hashing-based ANN with strong theoretical guarantees. Learn random hyperplane LSH, MinHash, banding, and when LSH beats graph-based methods.

6 Lessons

Embeddings & Vector Quality

Search Patterns

🔍

Hybrid Search (Vector + BM25)

Combine vector and BM25 keyword search to beat either alone. Master Reciprocal Rank Fusion, weighted scoring, and the framework-specific implementations across vector DBs.

6 Lessons
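
A minimal sketch of Reciprocal Rank Fusion, assuming each retriever returns an ordered list of document IDs. The function name, the sample IDs, and the constant k=60 (a commonly used default) are assumptions for illustration, not the course's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by doc ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector index and a BM25 index.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Because RRF uses only ranks, it sidesteps the problem that cosine similarities and BM25 scores live on different scales. Weighted scoring, by contrast, needs the scores normalized first.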
🏷

Filtered Vector Search

Combine vector search with metadata filters: dates, tenants, permissions. Master pre-filtering vs post-filtering, filter index design, and performance at scale.

6 Lessons
📙

Multi-Vector Search (ColBERT)

Beat single-vector retrieval with token-level multi-vector search. Master ColBERT, late interaction, and the patterns for high-precision retrieval at moderate cost.

6 Lessons
📋

Reranking After Vector Recall

Boost retrieval precision 10-30% with a second-stage reranker. Master Cohere Rerank, BGE-Reranker, ColBERT-as-reranker, and the latency/quality tradeoffs.

6 Lessons
💾

Semantic Caching with Vector DBs

Cut LLM calls 30-80% with semantic caching. Master vector-similarity-based cache lookups, similarity thresholds, cache eviction, and stale cache detection.

6 Lessons
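
A minimal sketch of a similarity-threshold cache lookup, assuming embeddings are plain lists of floats and using a brute-force scan in place of a real vector DB. The class name, the 0.9 threshold, and the toy vectors are hypothetical. Eviction and staleness checks are left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Toy in-memory cache keyed by embedding similarity."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached_response)

    def put(self, embedding, response):
        self.entries.append((list(embedding), response))

    def get(self, embedding):
        """Return the closest cached response, or None if below the threshold."""
        best_score, best_response = -1.0, None
        for cached_embedding, response in self.entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "cached answer")
hit = cache.get([0.99, 0.05])   # nearly the same direction: cache hit
miss = cache.get([0.0, 1.0])    # orthogonal query: falls through to the LLM
```

The threshold is the main knob: set it too low and the cache returns answers to questions that only look similar, set it too high and the hit rate drops.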
📚

Vector Search for RAG

Use vector search effectively in RAG. Master chunking-aware retrieval, k tuning, MMR for diversity, contextual retrieval, and the patterns that make RAG actually work.

6 Lessons
🌟

Vector Search for Recommendations

Build recommendations on top of vector DBs. Master item-item, user-item, two-tower models, and the patterns for production recommendation systems.

6 Lessons
🔎

Anomaly Detection with Vectors

Detect outliers, fraud, and novelty with vector DBs. Master nearest-neighbor distance, density-based scoring, and the patterns for production anomaly detection.

6 Lessons

Operations

📦

Vector DB Sharding Strategies

Scale vector DBs to billions of vectors with sharding. Master hash-based, range-based, and tenant-based sharding, and the rebalancing patterns.

6 Lessons
🔄

Vector DB Replication

Make vector DBs highly available with replication. Master leader-follower, multi-leader, eventual vs strong consistency, and the failover patterns.

6 Lessons
💾

Vector DB Backup and Restore

Don't lose your vectors. Master snapshot strategies, incremental backups, restore-time tradeoffs, and the testing patterns for production-grade backup of vector DBs.

6 Lessons
🏠

Multi-Tenancy in Vector DBs

Serve thousands of tenants from one vector DB. Master namespace-per-tenant, collection-per-tenant, hard isolation patterns, and the tradeoffs of each approach.

6 Lessons
📣

Vector DB Migration Patterns

Migrate between vector DBs without downtime. Master dual-write, backfill, cut-over, and rollback patterns for zero-downtime vector DB migrations.

6 Lessons
🔎

Vector DB Monitoring and Metrics

Monitor vector DBs in production. Master the metrics that matter (recall, latency p99, QPS, error rate), Prometheus integration, and alerting patterns.

6 Lessons
💰

Vector DB Cost Optimization

Cut vector DB bills 50-90%. Master object-storage-backed DBs, quantization, tiered storage, dimension reduction, and the cost levers that compound at scale.

6 Lessons

Vector DB Latency Tuning

Drive p99 latency below 50 ms. Master batch query optimization, connection pooling, async clients, and the patterns that keep vector search from dominating your latency budget.

6 Lessons
🚀

Vector DB Throughput Scaling

Push vector DB QPS into the tens of thousands. Master read replicas, horizontal scaling, query parallelism, and the load-test patterns that prove your scale claims.

6 Lessons

Self-Hosted vs Managed Vector DBs

Decide between self-hosted (Qdrant, Milvus, Weaviate, Chroma) and managed (Pinecone, Astra DB, MongoDB Atlas, Turbopuffer). Learn the cost, operational, and feature tradeoffs.

6 Lessons
🔒

Vector DB Security and Access Control

Lock down vector DBs in production. Master authentication, RBAC, encryption at rest and in transit, audit logging, and the patterns for tenant data isolation.

6 Lessons

Why a Vector Databases Track?

Vector DBs are now the backbone of RAG, search, recs, and caching. The leading edge moves fast, and the knowledge is scattered rather than documented in one place.

🌲

18 Vector DBs Compared

Pinecone, Weaviate, Qdrant, Chroma, Milvus, pgvector, LanceDB, Vespa, Elasticsearch, Redis, MongoDB, OpenSearch, Marqo, SingleStore, Couchbase, Turbopuffer, Vald, Cassandra.

🔗

Every Major Algorithm

HNSW, IVF, PQ, ScaNN, FAISS, Annoy, LSH — with the math behind each and the tuning knobs that matter.

🔍

Search Patterns

Hybrid search, filtering, multi-vector (ColBERT), reranking, semantic caching, RAG retrieval, recs, anomaly detection.

Production Operations

Sharding, replication, backup, multi-tenancy, migration, monitoring, cost, latency, throughput, self-hosted vs managed, security.