
Advanced Docker

You already know the basics of Docker from Week 2. Now we level up to production-grade containerization: smaller images, faster builds, better security, and reliable health checks. These techniques are essential for deploying LLM applications that are secure, efficient, and observable.

Multi-Stage Builds

Multi-stage builds separate the build environment from the runtime environment, producing dramatically smaller images by discarding build tools and intermediate artifacts.

Python Multi-Stage Example

```dockerfile
# Stage 1: Build dependencies
FROM python:3.12-slim AS builder

WORKDIR /build

# Install build tools (gcc, etc.) needed for compiled packages
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: Runtime (no build tools)
FROM python:3.12-slim AS runtime

WORKDIR /app

# Copy only the installed packages from builder
COPY --from=builder /install /usr/local

# Copy application code
COPY app/ ./app/

# Run as non-root user
RUN useradd --create-home appuser
USER appuser

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Result: The builder stage may be 800 MB, but the final image is often under 150 MB because build tools and pip cache are excluded.

Copy only what you need

Each COPY instruction creates a new layer. Be specific — `COPY app/ ./app/` instead of `COPY . .` — to avoid leaking secrets, tests, and dev files into the image.
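A `.dockerignore` file complements specific COPY instructions by keeping unwanted paths out of the build context entirely. A minimal sketch — the entries below are illustrative, so tailor them to your repository:

```text
# .dockerignore — keep the build context small and secret-free
.git
.env
__pycache__/
*.pyc
tests/
docs/
node_modules/
Dockerfile
docker-compose.yml
```

Excluding the Dockerfile and compose file themselves is a common touch: the daemon never needs them inside the context, and it avoids cache invalidation when they change.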

Node.js Multi-Stage Example

```dockerfile
# Stage 1: Install and build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Serve with a minimal server
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
EXPOSE 3000
CMD ["node", "dist/server.js"]
```

BuildKit Cache Mounts

BuildKit is Docker's modern build engine. It supports cache mounts that persist package manager caches across builds — even in CI.

```dockerfile
# syntax=docker/dockerfile:1

FROM python:3.12-slim AS builder

WORKDIR /build
COPY requirements.txt .

# Mount pip cache as a persistent cache volume
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

COPY . .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -e .
```

The --mount=type=cache directive keeps the pip cache between builds without bloating the image. In CI, this can cut build times by 50-80%.
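The same pattern works for apt when the builder stage installs system packages. A sketch, assuming a Debian-based image (`sharing=locked` serializes concurrent builds against the shared cache; removing the `docker-clean` hook is needed so downloaded packages actually persist):

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim AS builder

# The slim image ships an apt hook that deletes the package cache
# after every install; remove it so the cache mount is useful.
RUN rm -f /etc/apt/apt.conf.d/docker-clean

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends build-essential
```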

Enable BuildKit in your CI pipeline (BuildKit has been the default builder since Docker Engine 23.0, but older runners may still need the flag):

```yaml
# GitHub Actions
env:
  DOCKER_BUILDKIT: 1
```

```bash
# Local development
DOCKER_BUILDKIT=1 docker build -t myapp .
```

Distroless Images

Google's distroless images contain only your application and its runtime dependencies — no shell, no package manager, no utilities. This eliminates entire classes of vulnerabilities.

```dockerfile
# Stage 1: Build
FROM python:3.12 AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
COPY app/ ./app/

# Stage 2: Distroless runtime
FROM gcr.io/distroless/python3-debian12
COPY --from=builder /install /usr/local
COPY --from=builder /build/app /app
WORKDIR /app
CMD ["main.py"]
```

Distroless vs. Slim vs. Alpine

| Image | Size | Shell | Package Manager | Attack Surface |
| --- | --- | --- | --- | --- |
| python:3.12 | ~1 GB | Yes | apt | High |
| python:3.12-slim | ~150 MB | Yes | apt | Medium |
| python:3.12-alpine | ~50 MB | Yes | apk | Low |
| distroless/python3 | ~30 MB | No | No | Minimal |

No shell means no debug

Distroless images have no `/bin/sh`. You cannot `docker exec -it <container> bash` to debug. Use `docker cp` or structured logging instead. For development, use slim images; switch to distroless for production.
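With no shell in the image, debugging shifts to the Docker CLI on the host. A sketch of the usual workarounds (container and path names are illustrative):

```bash
# Read stdout/stderr — structured logs are your primary signal
docker logs --tail 100 mycontainer

# Copy a file out of the running container for inspection
docker cp mycontainer:/app/app.log ./app.log

# List the container filesystem without a shell by exporting it
docker export mycontainer | tar -tvf - | less
```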

Health Checks

Docker health checks let the orchestrator (Docker Compose, Kubernetes) know if your application is actually working — not just running.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
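The HEALTHCHECK assumes the app exposes a /health route. A minimal sketch of such an endpoint using only the standard library — in the real app this would be a FastAPI route in `app.main`; the `probe` helper mirrors what the HEALTHCHECK command does:

```python
import http.server
import json
import threading
import urllib.request

class HealthHandler(http.server.BaseHTTPRequestHandler):
    """Answers GET /health with 200 and a small JSON body."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

def probe(url: str) -> bool:
    """Return True on HTTP 200 — the same pass/fail logic as the HEALTHCHECK."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, or HTTP error
        return False

if __name__ == "__main__":
    server = http.server.HTTPServer(("127.0.0.1", 0), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    print(probe(f"http://127.0.0.1:{port}/health"))  # prints True
    server.shutdown()
```

A good health endpoint checks real dependencies (database connection, model loaded) rather than just returning 200 unconditionally; otherwise the orchestrator only learns that the process is alive.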

Parameters explained:

| Parameter | Purpose | Typical Value |
| --- | --- | --- |
| --interval | Time between checks | 30s |
| --timeout | Max time for a check to complete | 5s |
| --start-period | Grace period during startup | 10-30s |
| --retries | Consecutive failures before unhealthy | 3 |

Docker Compose Profiles

Profiles let you define optional services that only start when explicitly requested — perfect for separating dev tools from production services:

```yaml
# docker-compose.yml
services:
  app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  # Dev-only tools (only started with --profile dev)
  pgadmin:
    image: dpage/pgadmin4
    profiles: ["dev"]
    ports:
      - "5050:80"
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@example.com
      PGADMIN_DEFAULT_PASSWORD: admin

  redis-insight:
    image: redis/redisinsight
    profiles: ["dev"]
    ports:
      - "5540:5540"

volumes:
  pgdata:
```

```bash
# Start only core services
docker compose up -d

# Start core + dev tools
docker compose --profile dev up -d
```

Security Scanning with Docker Scout

Vulnerability scanning should be part of every CI pipeline. Docker Scout analyzes your image for known CVEs:

```bash
# Scan a local image
docker scout cves myapp:latest

# Scan and get remediation advice
docker scout recommendations myapp:latest

# Compare two images (before and after update)
docker scout compare --to myapp:v2 myapp:v1
```
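To make a pipeline actually fail on findings, Scout can gate on severity. A sketch — flag names are from the current Docker Scout CLI, so verify them against your installed version:

```bash
# Exit non-zero (failing the CI job) if critical or high CVEs are found
docker scout cves --exit-code --only-severity critical,high myapp:latest
```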

Integrating Scout in CI

```yaml
# GitHub Actions integration
- name: Docker Scout
  uses: docker/scout-action@v1
  with:
    command: cves
    image: myapp:latest
    sarif-file: scout-results.sarif
    summary: true
```

Zero-day vulnerabilities

Scanning catches known CVEs but not zero-day exploits. Layer your defenses: distroless images + non-root user + read-only filesystem + network policies.
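Those layered defenses translate directly into runtime flags. A sketch of a hardened docker run invocation (image name and port are illustrative):

```bash
# --read-only: immutable root filesystem; --tmpfs provides scratch space
# --cap-drop ALL: drop Linux capabilities; no-new-privileges blocks setuid escalation
docker run -d \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -p 8000:8000 \
  myapp:latest
```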

Production Dockerfile Checklist

Before shipping an image to production, verify:

- Uses multi-stage builds (builder + runtime)
- Runs as a non-root user (`USER appuser`)
- No secrets baked into the image (use runtime env vars or secret mounts)
- Health check defined (`HEALTHCHECK` or Compose `healthcheck`)
- `.dockerignore` excludes tests, docs, `.git`, `.env`
- Image passes `docker scout cves` with no critical/high CVEs
- Base image pinned by digest (`FROM python:3.12-slim@sha256:abc...`)
- Read-only filesystem where possible (`--read-only` flag)
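For the digest-pinning item, the digest for a tag can be looked up with standard Docker commands:

```bash
# Pull the tag, then read its sha256 digest
docker pull python:3.12-slim
docker images --digests python:3.12-slim

# Then pin it in the Dockerfile:
# FROM python:3.12-slim@sha256:<digest>
```

Pinning by digest makes builds reproducible and immune to a tag being silently repointed, at the cost of having to update the digest deliberately when you want base-image patches.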