AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models (ICLR 2024)
AutoDAN introduces an approach for automatically generating stealthy jailbreak prompts against aligned large language models (LLMs). The authors report stronger attack performance than prior methods and show that the generated prompts can bypass defense mechanisms.