Prompt Injection Attacks on Large Language Models

Iniciar sesión

Información - Prompt Injection Attacks on Large Language Models

Automatic and Universal Prompt Injection Attacks against Large Language Models: Understanding, Framework, and Defense

The author introduces a unified framework for prompt injection attacks on Large Language Models (LLMs) and presents an automated gradient-based method to generate effective and universal prompt injection data. The core thesis is the importance of understanding and defending against prompt injection attacks in LLM-integrated applications.

Scaling Behavior of Large Language Models in Machine Translation with Prompt Injection Attacks

Large language models can exhibit inverse scaling behavior under prompt injection attacks, affecting machine translation tasks.

Automated Prompt Injection Testing for Robust Large Language Model Evaluation

Leveraging fuzzing techniques to systematically assess the robustness of large language models against prompt injection attacks and uncover vulnerabilities, even in the presence of strong defense mechanisms.

Acerca de

Productos | Recursos

Perspectivas