
Week 03 — LLM Engineering

Goal: Stop treating LLMs as magic boxes. Learn to talk to them precisely, structure their outputs, control costs, and trace every call — so you ship reliable AI features.

Week 3 bridges the gap between "I can call the API" and "I'm building a production AI system". Every skill here gets used again in RAG (Week 4), Agents (Week 5), and Fine-tuning (Week 8).


The Week 3 Stack

                   User Query
                        │
                        ▼
Prompt Engineering ──→ Context Engineering ──→ Prompt Cache (save up to 90%)
                        │
                        ▼
      LLM (Claude / GPT / Gemini / Ollama)
                        │
            ┌───────────┼────────────┐
            ▼           ▼            ▼
       Structured  Multimodal    Embeddings
         Output     (img/PDF)    + Similarity
      (Instructor)               Search
                        │
                        └──→ LangSmith traces + LiteLLM cost tracking
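
The cache savings quoted in the diagram are an upper bound: how much you actually save depends on what share of each prompt is served from the cache. Below is a rough sketch of that arithmetic. It assumes cached input tokens are billed at 10% of the normal input price; the `cache_read_multiplier` default and the function name are illustrative assumptions, not any provider's API, so check current pricing before relying on the numbers. The sketch also ignores any surcharge for writing to the cache and all output-token costs.

```python
def input_cost_savings(total_tokens: int, cached_tokens: int,
                       cache_read_multiplier: float = 0.1) -> float:
    """Fraction of input cost saved when part of the prompt is read from cache.

    cache_read_multiplier is an assumed price ratio (cached / uncached token).
    """
    if total_tokens <= 0 or not 0 <= cached_tokens <= total_tokens:
        raise ValueError("need total_tokens > 0 and 0 <= cached_tokens <= total_tokens")
    fresh_tokens = total_tokens - cached_tokens
    # Cost measured in "uncached input tokens", so the per-token price cancels out.
    cost_with_cache = fresh_tokens + cached_tokens * cache_read_multiplier
    return 1 - cost_with_cache / total_tokens


# A 10,000-token prompt with a growing cached prefix.
for cached in (0, 5_000, 9_000, 10_000):
    print(cached, f"{input_cost_savings(10_000, cached):.0%}")
```

Under these assumptions, a prompt that is half cached saves 45% of input cost, a 90%-cached prompt saves 81%, and only a fully cached prompt reaches 90%. This is why long, stable prefixes (system prompt, shared documents) benefit most from caching.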

Topics This Week

#   Topic                         Core Skill
01  Prompt Engineering            Zero-shot, few-shot, CoT, ToT, Self-Consistency
02  Context Engineering           AGENTS.md, CLAUDE.md, system prompt design
03  LLM CLI Tools                 llm CLI, aichat, shell pipelines
04  AI Coding Assistants          Claude Code, Cursor, Copilot, Gemini CLI
05  Pydantic & Structured Output  Instructor, JSON mode, auto-retries
06  Multimodal Inputs             Images, PDFs, audio via API
07  Vector Embeddings             BGE-M3, cosine similarity, dimensionality
08  Similarity Search             HNSW, IVF, FAISS, hnswlib
09  Prompt Caching                Anthropic/OpenAI cache, up to 90% cost savings
10  LangSmith & LiteLLM           Tracing, cost analytics, AI gateway
11  LLM Architecture Survey       Transformer, MoE, CLIP, SoTA 2025 models
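
As a preview of topics 07 and 08, here is a minimal sketch of cosine similarity and a brute-force nearest-neighbour lookup in plain Python. The three-dimensional vectors and document names are made up for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and index structures such as HNSW or IVF exist to avoid the exhaustive scan shown here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2):
    """Exhaustive search: score every document, return the k best (score, name) pairs."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in corpus.items()]
    return sorted(scored, reverse=True)[:k]


# Toy "embeddings" (hypothetical values, not produced by any real model).
corpus = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax law": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]

for score, name in top_k(query, corpus):
    print(f"{name}: {score:.3f}")
```

The exhaustive scan costs one similarity computation per stored vector, which is fine for a few thousand documents but is the bottleneck that approximate indexes are designed to remove.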

Labs This Week

Lab      Title                                                      Time
Lab 3.1  YouTube → Subtitles → Topics → Timestamps JSON pipeline    ~3 hrs
Lab 3.2  Cost-Tracking Dashboard via LangSmith + prompt benchmarks  ~2 hrs

Project 1 — Due End of This Week

Project 1 covers all of Weeks 1–3: development tooling, deployment, and LLM engineering. See the Projects section when the specification is released.


Mental Model: Four Layers

┌──────────────────────────────────────┐
│ Layer 4: OBSERVABILITY               │  LangSmith traces · LiteLLM spend
├──────────────────────────────────────┤
│ Layer 3: OUTPUT SHAPING              │  Instructor · Pydantic · Multimodal
├──────────────────────────────────────┤
│ Layer 2: PROMPT & CONTEXT            │  Engineering · Caching · CLI tools
├──────────────────────────────────────┤
│ Layer 1: MODEL & EMBEDDINGS          │  LLMs · BGE-M3 · FAISS · Architecture
└──────────────────────────────────────┘

Build bottom-up. Get reliable outputs first, then optimize cost, then observe.
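
To make the layering concrete, here is a toy sketch that wires all four layers together using only the standard library. Everything in it is a stand-in: `make_stub_model` fakes an LLM that replies with prose once before returning JSON, the `Topic` fields are hypothetical, and the `trace` list only imitates what a tracing tool would record. In the course itself, Instructor with Pydantic handles validation and retries, and LangSmith handles tracing; none of their APIs are used here.

```python
import json
from dataclasses import dataclass


@dataclass
class Topic:
    title: str
    start_seconds: int


def make_stub_model():
    """Layer 1 stand-in: first reply is prose, later replies are valid JSON."""
    calls = 0

    def call(prompt: str) -> str:
        nonlocal calls
        calls += 1
        if calls == 1:
            return "Sure! The first topic is the intro."
        return '{"title": "Intro", "start_seconds": 0}'

    return call


def build_prompt(transcript: str) -> str:
    """Layer 2: state the expected output shape explicitly."""
    return ("Return one JSON object with keys 'title' (string) and "
            f"'start_seconds' (integer).\nTranscript:\n{transcript}")


def parse_topic(raw: str) -> Topic:
    """Layer 3: validate the reply; raises ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    title, start = data.get("title"), data.get("start_seconds")
    if not isinstance(title, str) or not isinstance(start, int):
        raise ValueError("missing or wrongly typed fields")
    return Topic(title, start)


def extract_topic(model, transcript: str, trace: list, max_attempts: int = 3) -> Topic:
    prompt = build_prompt(transcript)
    for attempt in range(1, max_attempts + 1):
        raw = model(prompt)
        try:
            topic = parse_topic(raw)
        except ValueError as err:
            trace.append({"attempt": attempt, "ok": False})  # Layer 4: record the failure
            prompt += f"\nYour previous reply was invalid ({err}). Reply with JSON only."
            continue
        trace.append({"attempt": attempt, "ok": True})  # Layer 4: record the success
        return topic
    raise RuntimeError(f"no valid reply after {max_attempts} attempts")


trace = []
topic = extract_topic(make_stub_model(), "00:00 Welcome to the course...", trace)
print(topic)
print(trace)
```

The retry loop feeds the validation error back into the prompt, which is the same idea behind automatic retries in structured-output libraries. The trace makes the wasted first call visible, which is the kind of signal you need before optimizing cost.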