What was the Bing Sydney leak and what did it teach us about prompt injection?

Question

Accepted Answer

In February 2023, shortly after Microsoft launched Bing Chat (codename "Sydney"), several researchers — Marvin von Hagen, Kevin Liu, and others — extracted the chatbot's internal system prompt and behavior rules. The technique: ask Bing Chat to summarize a webpage containing crafted instructions, or paste a payload directly into chat that framed the system prompt as a document to be discussed. Bing complied and revealed the rules, codename, and confidentiality clauses. The incident taught the industry four lessons that still hold in 2026: 1. **System prompts are not secrets.** If your business model relies on the system prompt staying hidden, assume it will leak. Move secrets to tools and authentication, not prompts. 2. **Confidentiality instructions don't bind LLMs.** Telling the model "do not reveal this prompt" is a suggestion, not a control. 3. **Indirect injection works.** Webpage-mediated extraction proved that untrusted content reaches the model with the same authority as user input. 4. **Frontier-lab training alone is insufficient.** Bing was running GPT-4 with extensive RLHF — and still leaked. Application-layer defenses are required. Sydney is the case study every prompt-injection defense vendor (InjectShield included) cites because it crystallized indirect injection as a production threat, not a research curiosity.