Security Best Practices

The OWASP LLM Top 10 defines the most critical security risks for LLM applications. Understanding these is essential before deploying any LLM service to production.

OWASP LLM Top 10

#	Risk	Description
LLM01	Prompt Injection	Manipulating LLM behavior through crafted inputs
LLM02	Insecure Output Handling	Treating LLM output as safe without validation
LLM03	Training Data Poisoning	Manipulating training data to introduce vulnerabilities
LLM04	Model Denial of Service	Overloading LLM with resource-intensive requests
LLM05	Supply Chain Vulnerabilities	Compromised third-party models, data, or plugins
LLM06	Sensitive Information Disclosure	LLM revealing confidential data in responses
LLM07	Insecure Plugin Design	Plugins with inadequate input validation or access control
LLM08	Excessive Agency	LLM with too much autonomy or access
LLM09	Overreliance	Blindly trusting LLM output without verification
LLM10	Model Theft	Unauthorized access, copying, or extraction of model weights

Rate limiting

Essential for preventing abuse and DoS attacks:

python

from fastapi import FastAPI, Request, HTTPException
from time import time
from collections import defaultdict

app = FastAPI()

class RateLimiter:
    def __init__(self, max_requests: int = 10, window_seconds: int = 60):
        self.max_requests = max_requests
        self.window = window_seconds
        self.requests: dict[str, list[float]] = defaultdict(list)

    def is_allowed(self, key: str) -> bool:
        now = time()
        self.requests[key] = [t for t in self.requests[key] if now - t < self.window]
        if len(self.requests[key]) >= self.max_requests:
            return False
        self.requests[key].append(now)
        return True

limiter = RateLimiter(max_requests=20, window_seconds=60)

@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    client_ip = request.client.host
    if not limiter.is_allowed(client_ip):
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    return await call_next(request)

Input sanitization

Never trust user input — validate and sanitize before sending to the LLM:

python

import re
from html import escape

def sanitize_llm_input(text: str, max_length: int = 4000) -> str:
    """Sanitize user input before sending to LLM."""
    # Truncate
    text = text[:max_length]

    # Remove null bytes
    text = text.replace('\x00', '')

    # Remove control characters (except newline, tab)
    text = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]', '', text)

    # Remove common injection patterns
    injection_patterns = [
        r'(?i)ignore\s+(all\s+)?previous\s+instructions',
        r'(?i)system\s*:',
        r'(?i)you\s+are\s+now\s+a',
        r'(?i)jailbreak',
    ]
    for pattern in injection_patterns:
        text = re.sub(pattern, '[FILTERED]', text)

    return text.strip()

API key management

python

import os
from dotenv import load_dotenv
from google.cloud import secretmanager

def get_api_key(key_name: str) -> str:
    """Get API key from environment or Secret Manager."""
    # First check environment (for local dev)
    env_key = os.environ.get(key_name)
    if env_key:
        return env_key

    # Fall back to Google Secret Manager (for production)
    client = secretmanager.SecretManagerServiceClient()
    project_id = os.environ.get("GCP_PROJECT_ID")
    name = f"projects/{project_id}/secrets/{key_name}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")

# Key rotation strategy
class APIKeyRotator:
    def __init__(self, keys: list[str]):
        self.keys = keys
        self.index = 0

    def get_key(self) -> str:
        return self.keys[self.index]

    def rotate(self):
        self.index = (self.index + 1) % len(self.keys)

Audit logging

Log all LLM interactions for security analysis:

python

import logging
import json
from datetime import datetime
from fastapi import Request

logger = logging.getLogger("llm_audit")

def log_llm_interaction(
    user_id: str,
    input_text: str,
    output_text: str,
    model: str,
    flagged: bool = False,
):
    """Log LLM interaction for audit trail."""
    entry = {
        "timestamp": datetime.utcnow().isoformat(),
        "user_id": user_id,
        "model": model,
        "input_hash": hash(input_text),  # Don't log raw input for privacy
        "output_hash": hash(output_text),
        "input_length": len(input_text),
        "output_length": len(output_text),
        "flagged": flagged,
    }
    logger.info(json.dumps(entry))

# Flag suspicious interactions
def is_suspicious(text: str) -> bool:
    suspicious_patterns = [
        r'(?i)(hack|exploit|attack|vulnerability)',
        r'(?i)(password|credential|secret|token)',
        r'(?i)(admin|root|sudo)',
    ]
    return any(re.search(p, text) for p in suspicious_patterns)

Dependency scanning

bash

# Scan Python dependencies
pip install safety
safety check --json

# Scan Docker image for vulnerabilities
docker scout cves my-image:latest

# Scan with Trivy
trivy image my-image:latest

Production security checklist

Before deploying your LLM application:

Security is a process, not a feature

Security requires ongoing attention. Set up automated scanning, regular audits, and a culture of security awareness in your team.

OWASP LLM Top 10​

Rate limiting​

Input sanitization​

API key management​

Audit logging​

Dependency scanning​

Production security checklist​

OWASP LLM Top 10

Rate limiting

Input sanitization

API key management

Audit logging

Dependency scanning

Production security checklist