Skip to main content

Lab 16: Guardrails + Red-Team Regression

Difficulty: Advanced · Estimated time: ~3–6 hours

Objective

Harden an LLM app (RAG chatbot or agent) with guardrails and prove the hardening works:

  • implement guardrails (NeMo or equivalent)
  • build a red-team suite
  • run it in CI as regression tests

Requirements

  • At least 20 attacks (prompt injection, exfiltration, tool misuse)
  • A pass/fail report artifact
  • Document 3 mitigations and show before/after

Deliverables

  • guardrails/ config
  • redteam/ harness + attacks
  • REPORT.md with before/after results