Exploiting Vulnerabilities in Large Language Models: A Framework for Bypassing Content Security Measures through Intent Obfuscation
Large language models (LLMs) are vulnerable to prompt-based jailbreak attacks that bypass their content security measures by obfuscating the malicious intent behind user prompts.