How do I prevent prompt injection in a production LLM app?

Question

Accepted Answer

You cannot fully "prevent" prompt injection; you reduce its blast radius. A current (2026) defense-in-depth stack has five layers:

1. **Input classification** to catch obvious and semantic injection before the model sees it.
2. **Least-privilege tool access** so that a compromised model cannot exfiltrate data or make payments.
3. **Context isolation**: never put untrusted documents in the same scope as system instructions. Use structured channels instead (e.g., a separate `documents` array and a dedicated user role).
4. **Output validation**: schema- or regex-check anything that leaves the model en route to a tool, a database, or a user.
5. **Monitoring**: log injection-classifier verdicts and alert on spikes.

Practical deployment: put InjectShield (or any classifier) in front of the model as a gate. Run heuristics on every request (~1 ms, effectively free), escalate ambiguous traffic to a Haiku semantic check (~$0.0002/req), block or sanitize on a positive verdict, and feed the verdict into your SIEM.

Combine this with platform defenses (Claude's constitutional training, OpenAI's instruction hierarchy), but do not rely on them alone. Red-team quarterly with adversarial suites such as garak and Promptmap.
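The gate and output-validation layers described above can be sketched roughly as follows. This is a minimal illustration, not InjectShield's API or any vendor's implementation. The regex patterns, the `semantic_check` callable (standing in for the Haiku call), the `ALLOWED_TOOLS` allowlist, and the `{"tool": ..., "args": ...}` JSON shape are all assumptions made for the example.

```python
import json
import logging
import re
from dataclasses import asdict, dataclass
from typing import Callable

logger = logging.getLogger("injection_gate")

# Hypothetical patterns; a real deployment would maintain a larger, tested list.
BLOCK_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"reveal (your|the) system prompt", re.I),
]
# Weaker signals that are not conclusive on their own.
AMBIGUOUS_PATTERNS = [
    re.compile(r"\bsystem prompt\b", re.I),
    re.compile(r"\bdisregard\b", re.I),
]

# Hypothetical least-privilege allowlist: tool name -> permitted argument names.
ALLOWED_TOOLS = {"search_docs": {"query"}}


@dataclass(frozen=True)
class Verdict:
    label: str  # "allow" or "block"
    stage: str  # "heuristic" or "semantic"


def heuristic_check(text: str) -> str:
    """Cheap first pass: returns "block", "ambiguous", or "allow"."""
    if any(p.search(text) for p in BLOCK_PATTERNS):
        return "block"
    if any(p.search(text) for p in AMBIGUOUS_PATTERNS):
        return "ambiguous"
    return "allow"


def gate(text: str, semantic_check: Callable[[str], bool]) -> Verdict:
    """Run heuristics on every request; escalate only ambiguous ones.

    semantic_check is a placeholder for the model-based classifier and
    should return True when it judges the text to be an injection.
    """
    label, stage = heuristic_check(text), "heuristic"
    if label == "ambiguous":
        label = "block" if semantic_check(text) else "allow"
        stage = "semantic"
    verdict = Verdict(label, stage)
    # Structured log line that a SIEM could ingest and alert on.
    logger.info(json.dumps({"event": "injection_verdict", **asdict(verdict)}))
    return verdict


def validate_tool_call(raw: str) -> dict:
    """Validate model output before it reaches a tool; raises ValueError."""
    call = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name, args = call.get("tool"), call.get("args")
    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        raise ValueError(f"tool not allowed: {name!r}")
    if not isinstance(args, dict) or set(args) - ALLOWED_TOOLS[name]:
        raise ValueError("unexpected or missing arguments")
    return call
```

In use, the request handler would call `gate(user_text, semantic_check=...)` before invoking the model and `validate_tool_call(model_output)` before executing anything the model asks for. Keeping the semantic check as an injected callable means the cheap heuristic path can be tested without network access.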