Learn RAG
Master Retrieval-Augmented Generation (RAG) — the technique that grounds AI responses in real, up-to-date data. Build the full pipeline: data ingestion, chunking, embedding, vector search, retrieval, and generation.
Your Learning Path
Follow these lessons in order, or jump to any topic that interests you.
1. Introduction
What is RAG? Why it matters for reducing hallucinations, using private data, and keeping AI responses current.
2. RAG Architecture
Offline and online pipelines, components, and architecture patterns: naive RAG, advanced RAG, modular RAG.
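To make the naive-RAG pattern concrete, here is a minimal pure-Python sketch of the online pipeline: retrieve, augment the prompt, then generate. Keyword overlap stands in for real embedding search, and the LLM call is left out; the function names and sample documents are illustrative, not from any particular library.

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for vector search)."""
    q_words = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user question with the retrieved context."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"

docs = [
    "RAG grounds LLM answers in retrieved documents.",
    "Chunking splits long documents into retrievable pieces.",
    "Bananas are yellow.",
]
# The resulting prompt would be sent to an LLM in a real pipeline.
prompt = build_prompt("What does RAG do?", retrieve("What does RAG do?", docs))
```

Advanced and modular RAG keep this same retrieve-augment-generate skeleton but swap in better retrievers, rerankers, and routing logic.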
3. Data Ingestion
Load data from PDFs, web pages, databases, APIs, Slack, Notion, and Confluence. Clean and preprocess text.
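Whatever the source, loaded text usually needs cleanup before chunking. A minimal sketch of the kind of preprocessing the lesson covers (collapse whitespace, strip control characters, drop empty lines); the exact rules are illustrative assumptions, not a fixed recipe:

```python
import re

def clean_text(raw: str) -> str:
    """Basic preprocessing: remove control characters, collapse whitespace, drop empty lines."""
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", raw)  # strip control chars
    lines = [re.sub(r"\s+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)

# Example: messy extraction output from a PDF or web page.
raw = "Title\x0c\n\n   Some   body   text.  \n\n"
cleaned = clean_text(raw)  # "Title\nSome body text."
```

Real loaders (for PDFs, Slack exports, Confluence dumps) add source-specific steps on top of this, such as stripping boilerplate navigation or markup.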
4. Chunking Strategies
Fixed-size, recursive, sentence-based, semantic, and hierarchical chunking. Chunk size selection and overlap.
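The simplest of these strategies, fixed-size chunking with overlap, can be sketched in a few lines. Character counts stand in for token counts here; the size and overlap values are illustrative:

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Fixed-size character chunking; consecutive chunks share `overlap` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step) if text[i:i + size]]

doc = "a" * 500
chunks = chunk_fixed(doc, size=200, overlap=50)
# 4 chunks: text[0:200], text[150:350], text[300:500], text[450:500]
```

The overlap keeps sentences that straddle a boundary retrievable from either chunk; recursive and semantic strategies refine where the boundaries fall.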
5. Vector Search
Vector databases compared: Pinecone, ChromaDB, Weaviate, Qdrant, pgvector. Indexing, similarity metrics, hybrid search.
6. Retrieval & Reranking
Similarity search, MMR, reranking with cross-encoders, multi-query retrieval, HyDE, and ensemble strategies.
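Maximal Marginal Relevance (MMR) is the most self-contained of these to sketch: greedily pick documents that are relevant to the query but not redundant with what is already selected. The similarity values below are made-up illustrations:

```python
def mmr(query_sim: dict, doc_sims: dict, k: int = 2, lam: float = 0.5) -> list:
    """Greedy MMR: score(d) = lam * sim(d, query) - (1 - lam) * max sim(d, selected)."""
    selected, candidates = [], set(query_sim)
    while candidates and len(selected) < k:
        def score(d):
            redundancy = max(
                (doc_sims.get((d, s), doc_sims.get((s, d), 0.0)) for s in selected),
                default=0.0,
            )
            return lam * query_sim[d] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query_sim = {"a": 0.9, "b": 0.85, "c": 0.5}            # similarity to the query
doc_sims = {("a", "b"): 0.95, ("a", "c"): 0.1}          # pairwise similarities
picked = mmr(query_sim, doc_sims, k=2)
# Plain similarity would return ["a", "b"]; MMR penalizes b's redundancy with a
# and picks the more diverse ["a", "c"].
```

Cross-encoder reranking and ensemble strategies plug into the same slot: re-score the candidate list before it reaches the generator.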
7. Generation
Construct prompts with context, manage token windows, add citations, stream responses, and handle multi-turn RAG.
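A minimal sketch of the prompt-assembly step: number each retrieved chunk for citations and enforce a crude character budget as a stand-in for token counting. The instruction wording, file names, and budget are illustrative assumptions:

```python
def build_cited_prompt(question: str, chunks: list[tuple[str, str]], max_chars: int = 600) -> str:
    """Assemble a context-augmented prompt with [n] citation markers,
    dropping chunks once a character budget is exhausted."""
    context, used = [], 0
    for i, (source, text) in enumerate(chunks, start=1):
        entry = f"[{i}] ({source}) {text}"
        if used + len(entry) > max_chars:
            break  # stay within the (proxy) token window
        context.append(entry)
        used += len(entry)
    return (
        "Answer the question using only the numbered context. Cite sources like [1].\n\n"
        + "\n".join(context)
        + f"\n\nQuestion: {question}"
    )

chunks = [("faq.md", "RAG retrieves documents before generating."),
          ("guide.md", "Citations let users verify each claim.")]
prompt = build_cited_prompt("How does RAG reduce hallucinations?", chunks)
```

Real pipelines swap the character budget for a tokenizer count and keep prior turns in the budget for multi-turn RAG.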
8. Evaluation
Measure faithfulness, relevancy, precision, and recall. Use RAGAS, TruLens, and LangSmith frameworks.
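Retrieval precision and recall are simple enough to compute by hand, which helps before reaching for a framework. A sketch over document ids (the ids and relevance judgments below are invented for illustration):

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Context precision: fraction of retrieved chunks that are relevant.
    Context recall: fraction of relevant chunks that were retrieved."""
    hits = sum(1 for doc in retrieved if doc in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# 4 chunks retrieved; ground truth says d1, d3, d9 were relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], {"d1", "d3", "d9"})
# p = 0.5 (2 of 4 retrieved are relevant), r = 2/3 (d9 was missed)
```

Faithfulness and answer relevancy need an LLM or human judge, which is what RAGAS, TruLens, and LangSmith automate.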
9. Best Practices
Optimization checklist, failure modes, production deployment, cost optimization, scaling, and multi-modal RAG.
What You'll Learn
By the end of this course, you'll be able to:
Ingest Any Data
Load and process documents from PDFs, web pages, databases, Slack, Notion, and other common data sources.
Build Vector Search
Embed documents, store them in vector databases, and implement efficient semantic search with reranking.
Generate Grounded Answers
Augment LLM prompts with retrieved context to produce accurate, cited responses with reduced hallucinations.
Evaluate & Optimize
Measure RAG quality with industry-standard metrics and continuously improve retrieval and generation.
Lilly Tech Systems