Core Concepts
Spotlighting is an effective prompt engineering technique to defend against indirect prompt injection attacks on large language models.
Stats
"We find that spotlighting reduces the attack success rate from greater than 50% to below 2% in our experiments."
"All experiments are conducted with temperature set to 1.0."
Quotes
"We introduce spotlighting, a family of prompt engineering techniques that can be used to improve LLMs’ ability to distinguish among multiple sources of input."
"Using GPT-family models, we find that spotlighting reduces the attack success rate from greater than 50% to below 2% in our experiments."