Week 4 — RAG & Hybrid RAG

"Your LLM is only as good as what you put in its context."
— This week you take control of that context.

What You'll Build This Week

By the end of Week 4, you will have built a production-grade RAG pipeline that can answer questions over a document corpus of your choice. You'll go from "stuff everything in the prompt" to a proper two-stage retrieve-then-generate system, with evaluation metrics that measure how well it works.

Topics at a Glance

#	Topic	What You Learn
01	Chunking Strategies	How to split documents intelligently
02	Vector Databases	FAISS, ChromaDB, Qdrant, PGVector — when to use what
03	Hybrid Search	Dense + Sparse retrieval, BM25, RRF fusion
04	Query Augmentation	HyDE, query rewriting, step-back prompting
05	Reranking	Cross-encoders, Cohere Rerank, ColBERT
06	RAGAS Evaluation	Measuring your RAG pipeline with faithfulness, context precision, context recall
07	Multimodal Embeddings	ColPali, CLIP, embedding images + text together
08	LLM Grounding	Source attribution, citations, hallucination control
09	Late Chunking	Jina AI's token-level pooling trick
10	GraphRAG	Microsoft's entity graph approach
11	Contextual Retrieval	Anthropic's chunk-level context injection
12	Semantic Caching	Avoid redundant LLM calls with cosine-threshold caching
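
As a small preview of topic 03, here is a minimal sketch of Reciprocal Rank Fusion (RRF), which merges a dense ranking and a sparse (for example BM25) ranking by summing 1 / (k + rank) for each document across the lists. The document IDs and the `rrf_fuse` helper are hypothetical, and k = 60 is a commonly used default rather than a value this course prescribes.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is an assumed, commonly used default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical retriever outputs, best match first.
dense_ranking = ["doc_a", "doc_b", "doc_c"]   # e.g. from a vector index
sparse_ranking = ["doc_b", "doc_d", "doc_a"]  # e.g. from BM25

fused = rrf_fuse([dense_ranking, sparse_ranking])
print(fused)
```

Because RRF uses only rank positions, it needs no score normalization between the dense and sparse retrievers, which is one reason it is a common first choice for hybrid search.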

Labs & Capstones

Type	Name	Key Tech
`CAPSTONE`	BS Degree Chatbot	Hybrid RAG · RAGAS · FastAPI
`CAPSTONE`	Policy Chatbot	NeMo Guardrails · Google Auth
`LAB`	RAGAS Evaluation Dashboard	Naive RAG vs Hybrid RAG vs Contextual Retrieval

Why RAG?

LLMs have a fixed knowledge cutoff and a finite context window. RAG addresses both, and helps with hallucination and cost as well:

Knowledge cutoff → pull fresh documents at query time
Hallucination → ground answers in retrieved evidence
Context limits → retrieve only what's relevant (not the full corpus)
Cost → cheaper than fine-tuning for every new document set

The naive approach (pasting every document into the prompt) breaks at scale: the corpus outgrows the context window, and every query pays for tokens it doesn't need. This week you build the proper pipeline.
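
To make the retrieve-then-generate idea concrete, here is a minimal sketch of the two stages using only the standard library. It rests on loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the sample chunks and the prompt wording are invented for illustration, and no LLM is actually called; the script stops at building the prompt you would send.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Stage 1: keep only the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Stage 2 input: a grounded prompt containing only retrieved chunks."""
    sources = "\n".join(f"[{i}] {c}" for i, c in enumerate(context, start=1))
    return (
        "Answer using only the sources below. Cite them as [n].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


# Hypothetical corpus, already split into chunks.
CHUNKS = [
    "The refund policy allows returns within 30 days of purchase.",
    "Shipping takes 3 to 5 business days within the country.",
    "Refund requests require the original receipt.",
    "Our office is closed on public holidays.",
]

query = "What is the refund policy?"
top_chunks = retrieve(query, CHUNKS, k=2)
prompt = build_prompt(query, top_chunks)
print(prompt)  # this string is what you would send to the LLM
```

Only the two refund-related chunks reach the prompt; the shipping and holiday chunks are left out. The rest of the week replaces each toy piece with a real one: a proper embedding model, a vector database, hybrid search, and reranking.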

Prerequisites

Make sure you've done Week 3. You'll need:

Basic LLM API calls (OpenAI / Anthropic)
Understanding of embeddings and cosine similarity (Week 3 → Vector Embeddings)
Python + FastAPI basics
