
Advanced Docker

You already know the basics of Docker from Week 2. Now we level up to production-grade containerization: smaller images, faster builds, better security, and reliable health checks. These techniques are essential for deploying LLM applications that are secure, efficient, and observable.

Multi-Stage Builds

Multi-stage builds separate the build environment from the runtime environment, producing dramatically smaller images by discarding build tools and intermediate artifacts.

Python Multi-Stage Example

```dockerfile
# Stage 1: Build dependencies
FROM python:3.12-slim AS builder

WORKDIR /build

# Install build tools (gcc, etc.) needed for compiled packages
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: Runtime (no build tools)
FROM python:3.12-slim AS runtime

WORKDIR /app

# Copy only the installed packages from builder
COPY --from=builder /install /usr/local

# Copy application code
COPY app/ ./app/

# Run as non-root user
RUN useradd --create-home appuser
USER appuser

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Result: The builder stage may be 800 MB, but the final image is often under 150 MB because build tools and pip cache are excluded.

Copy only what you need

Each COPY instruction creates a new layer. Be specific — `COPY app/ ./app/` instead of `COPY . .` — to avoid leaking secrets, tests, and dev files into the image.
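A `.dockerignore` file complements specific COPY instructions by keeping unwanted paths out of the build context entirely. A minimal sketch — the entries below are illustrative, so tailor them to your repository:

```text
# .dockerignore — keep the build context small and secret-free
.git
.env
__pycache__/
*.pyc
tests/
docs/
node_modules/
Dockerfile
docker-compose.yml
```

Excluding the Dockerfile and compose file themselves is a common touch: the daemon never needs them inside the context, and it avoids cache invalidation when they change.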

Node.js Multi-Stage Example

```dockerfile
# Stage 1: Install and build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Serve with a minimal server
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
EXPOSE 3000
CMD ["node", "dist/server.js"]
```

BuildKit Cache Mounts

BuildKit is Docker's modern build engine. It supports cache mounts that persist package manager caches across builds — even in CI.

```dockerfile
# syntax=docker/dockerfile:1

FROM python:3.12-slim AS builder

WORKDIR /build
COPY requirements.txt .

# Mount pip cache as a persistent cache volume
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

COPY . .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -e .
```

The --mount=type=cache directive keeps the pip cache between builds without bloating the image. In CI, this can cut build times by 50-80%.
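The same pattern works for apt when the builder stage installs system packages. A sketch, assuming a Debian-based image (`sharing=locked` serializes concurrent builds against the shared cache; removing the `docker-clean` hook is needed so downloaded packages actually persist):

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim AS builder

# The slim image ships an apt hook that deletes the package cache
# after every install; remove it so the cache mount is useful.
RUN rm -f /etc/apt/apt.conf.d/docker-clean

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends build-essential
```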

Enable BuildKit in your CI pipeline (BuildKit has been the default builder since Docker Engine 23.0, but older runners may still need the flag):

```yaml
# GitHub Actions
env:
  DOCKER_BUILDKIT: 1
```

```bash
# Local development
DOCKER_BUILDKIT=1 docker build -t myapp .
```

Distroless Images

Google's distroless images contain only your application and its runtime dependencies — no shell, no package manager, no utilities. This eliminates entire classes of vulnerabilities.

```dockerfile
# Stage 1: Build
FROM python:3.12 AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
COPY app/ ./app/

# Stage 2: Distroless runtime
FROM gcr.io/distroless/python3-debian12
COPY --from=builder /install /usr/local
COPY --from=builder /build/app /app
WORKDIR /app
CMD ["main.py"]
```

Distroless vs. Slim vs. Alpine

| Image | Size | Shell | Package Manager | Attack Surface |
| --- | --- | --- | --- | --- |
| python:3.12 | ~1 GB | Yes | apt | High |
| python:3.12-slim | ~150 MB | Yes | apt | Medium |
| python:3.12-alpine | ~50 MB | Yes | apk | Low |
| distroless/python3 | ~30 MB | No | No | Minimal |

No shell means no debug

Distroless images have no `/bin/sh`. You cannot `docker exec -it <container> bash` to debug. Use `docker cp` or structured logging instead. For development, use slim images; switch to distroless for production.
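With no shell in the image, debugging shifts to the Docker CLI on the host. A sketch of the usual workarounds (container and path names are illustrative):

```bash
# Read stdout/stderr — structured logs are your primary signal
docker logs --tail 100 mycontainer

# Copy a file out of the running container for inspection
docker cp mycontainer:/app/app.log ./app.log

# List the container filesystem without a shell by exporting it
docker export mycontainer | tar -tvf - | less
```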

Health Checks

Docker health checks let the orchestrator (Docker Compose, Kubernetes) know if your application is actually working — not just running.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
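The HEALTHCHECK assumes the app exposes a /health route. A minimal sketch of such an endpoint using only the standard library — in the real app this would be a FastAPI route in `app.main`; the `probe` helper mirrors what the HEALTHCHECK command does:

```python
import http.server
import json
import threading
import urllib.request

class HealthHandler(http.server.BaseHTTPRequestHandler):
    """Answers GET /health with 200 and a small JSON body."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

def probe(url: str) -> bool:
    """Return True on HTTP 200 — the same pass/fail logic as the HEALTHCHECK."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, or HTTP error
        return False

if __name__ == "__main__":
    server = http.server.HTTPServer(("127.0.0.1", 0), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    print(probe(f"http://127.0.0.1:{port}/health"))  # prints True
    server.shutdown()
```

A good health endpoint checks real dependencies (database connection, model loaded) rather than just returning 200 unconditionally; otherwise the orchestrator only learns that the process is alive.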

Parameters explained:

| Parameter | Purpose | Typical Value |
| --- | --- | --- |
| --interval | Time between checks | 30s |
| --timeout | Max time for a check to complete | 5s |
| --start-period | Grace period during startup | 10-30s |
| --retries | Consecutive failures before unhealthy | 3 |

Docker Compose Profiles

Profiles let you define optional services that only start when explicitly requested — perfect for separating dev tools from production services:

```yaml
# docker-compose.yml
services:
  app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  # Dev-only tools (only started with --profile dev)
  pgadmin:
    image: dpage/pgadmin4
    profiles: ["dev"]
    ports:
      - "5050:80"
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@example.com
      PGADMIN_DEFAULT_PASSWORD: admin

  redis-insight:
    image: redis/redisinsight
    profiles: ["dev"]
    ports:
      - "5540:5540"

volumes:
  pgdata:
```

```bash
# Start only core services
docker compose up -d

# Start core + dev tools
docker compose --profile dev up -d
```

Security Scanning with Docker Scout

Vulnerability scanning should be part of every CI pipeline. Docker Scout analyzes your image for known CVEs:

```bash
# Scan a local image
docker scout cves myapp:latest

# Scan and get remediation advice
docker scout recommendations myapp:latest

# Compare two images (before and after update)
docker scout compare --to myapp:v2 myapp:v1
```
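To make a pipeline actually fail on findings, Scout can gate on severity. A sketch — flag names are from the current Docker Scout CLI, so verify them against your installed version:

```bash
# Exit non-zero (failing the CI job) if critical or high CVEs are found
docker scout cves --exit-code --only-severity critical,high myapp:latest
```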

Integrating Scout in CI

```yaml
# GitHub Actions integration
- name: Docker Scout
  uses: docker/scout-action@v1
  with:
    command: cves
    image: myapp:latest
    sarif-file: scout-results.sarif
    summary: true
```

Zero-day vulnerabilities

Scanning catches known CVEs but not zero-day exploits. Layer your defenses: distroless images + non-root user + read-only filesystem + network policies.
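Those layered defenses translate directly into runtime flags. A sketch of a hardened docker run invocation (image name and port are illustrative):

```bash
# --read-only: immutable root filesystem; --tmpfs provides scratch space
# --cap-drop ALL: drop Linux capabilities; no-new-privileges blocks setuid escalation
docker run -d \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -p 8000:8000 \
  myapp:latest
```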

Production Dockerfile Checklist

Before shipping an image to production, verify:

- Uses multi-stage builds (builder + runtime)
- Runs as a non-root user (`USER appuser`)
- No secrets baked into the image (use runtime env vars or secret mounts)
- Health check defined (`HEALTHCHECK` or Compose `healthcheck`)
- `.dockerignore` excludes tests, docs, `.git`, `.env`
- Image passes `docker scout cves` with no critical/high CVEs
- Base image pinned by digest (`FROM python:3.12-slim@sha256:abc...`)
- Read-only filesystem where possible (`--read-only` flag)
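For the digest-pinning item, the digest for a tag can be looked up with standard Docker commands:

```bash
# Pull the tag, then read its sha256 digest
docker pull python:3.12-slim
docker images --digests python:3.12-slim

# Then pin it in the Dockerfile:
# FROM python:3.12-slim@sha256:<digest>
```

Pinning by digest makes builds reproducible and immune to a tag being silently repointed, at the cost of having to update the digest deliberately when you want base-image patches.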