Week 3 — Large Language Models

Large Language Models (LLMs) have transformed data science. This week, you will learn how to interact with LLMs effectively — from crafting better prompts to extracting structured data and calling external tools. These skills are the foundation for building AI-powered applications.

Overview

This week covers the practical aspects of working with LLMs. You will learn prompt engineering techniques, how to get structured JSON output from LLMs, how to extract data from documents, how to give LLMs access to external tools, and how to use embeddings for semantic search.
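As a first taste of the prompt-engineering material, a few-shot prompt is simply a chat message list that sandwiches worked examples between the system instruction and the real query. A minimal sketch (the message format follows the OpenAI-style chat convention; the sentiment task and examples are purely illustrative):

```python
# Build a few-shot prompt as an OpenAI-style chat message list.
# The sentiment-classification task and the two examples are illustrative.

def few_shot_messages(query: str) -> list[dict]:
    examples = [
        ("The film was a delight from start to finish.", "positive"),
        ("Two hours of my life I will never get back.", "negative"),
    ]
    messages = [{
        "role": "system",
        "content": "Classify the sentiment of each review as positive or negative.",
    }]
    for review, label in examples:  # worked examples teach the model the format
        messages.append({"role": "user", "content": review})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": query})  # the real query goes last
    return messages

msgs = few_shot_messages("An instant classic.")
print(len(msgs))  # 1 system + 2 examples x 2 turns + 1 query = 6
```

The same list can be passed directly as the `messages` argument of an OpenAI-compatible chat API; a zero-shot prompt is the degenerate case with the `examples` list empty.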

Pages

  1. Prompt Engineering: Zero-shot, few-shot, chain-of-thought, role prompting
  2. Structured Output: JSON mode, Pydantic + LLM, instructor library
  3. LLM Extraction: Named entities, tables from PDFs
  4. Function Calling: Tool use, parallel tools, tool choice
  5. LLM CLI: llm CLI tool, plugins, templates
  6. Embeddings: Cosine similarity, embedding models
  7. Prompt Caching: Cost + latency reduction via cache-friendly prompts
  8. Structured Output with instructor: Schema-first extraction and validation loops
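The structured-output pages revolve around one idea: parse the model's JSON reply against a schema and reject anything that does not fit. The course uses Pydantic and instructor for this; the sketch below uses only the standard library as a stand-in, with a hard-coded string playing the role of an LLM response:

```python
import json
from dataclasses import dataclass

# Validate a (simulated) JSON reply from an LLM against a fixed schema.
# Pydantic automates this pattern; a stdlib-only stand-in looks like:

@dataclass
class Person:
    name: str
    age: int

def parse_person(llm_output: str) -> Person:
    data = json.loads(llm_output)  # raises ValueError on malformed JSON
    person = Person(name=str(data["name"]), age=int(data["age"]))
    if person.age < 0:
        raise ValueError("age must be non-negative")
    return person

# A well-formed model reply parses cleanly; a bad one raises,
# which is the signal to re-prompt the model with the error message.
print(parse_person('{"name": "Ada", "age": 36}'))
```

That "raise, then re-prompt with the validation error" loop is exactly what the instructor library automates in page 8.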

Learning Outcomes

By the end of this week, you will be able to:

  • Apply prompt engineering techniques (zero-shot, few-shot, chain-of-thought) to get better LLM outputs
  • Extract structured JSON data from LLM responses using Pydantic schemas
  • Use LLMs to extract named entities and structured data from unstructured text and PDFs
  • Implement function calling to give LLMs access to external tools and APIs
  • Use the llm CLI tool for quick LLM interactions and template-based workflows
  • Generate and compare text embeddings for semantic similarity search
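The last outcome rests on one small formula: cosine similarity, the dot product of two embedding vectors divided by the product of their magnitudes. A self-contained sketch with toy two-dimensional vectors (real embedding models return hundreds or thousands of dimensions):

```python
import math

# Cosine similarity between two embedding vectors: dot product over
# the product of magnitudes. Values near 1 mean the underlying texts
# are semantically similar; near 0 means unrelated.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```

Semantic search is then just computing this score between a query embedding and every document embedding, and ranking by the result.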

Time estimate

Expect to spend 8-10 hours on this week's material: ~3 hours reading, ~3 hours on walkthroughs, and ~3 hours on the lab.