Lab 16: Guardrails + Red-Team Regression
Difficulty: Advanced · Estimated time: ~3–6 hours
Objective
Harden an LLM app (RAG chatbot or agent) with guardrails and prove the hardening works:
- implement guardrails (NeMo or equivalent)
- build a red-team suite
- run it in CI as regression tests
Requirements
- At least 20 attacks (prompt injection, exfiltration, tool misuse)
- A pass/fail report artifact
- Document 3 mitigations and show before/after
Deliverables
guardrails/configredteam/harness + attacksREPORT.mdwith before/after results