SATBA proposes an imperceptible backdoor attack that uses spatial attention to generate its triggers, avoiding the perceptible artifacts of existing methods while maintaining a high attack success rate.
VLMs are vulnerable to ImgTrojan, an attack that uses poisoned images to compromise their safety barriers.
Gradient Cuff proposes a two-step method that detects jailbreak attempts on large language models by exploring the refusal loss landscape.
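The two-step idea can be illustrated with a minimal sketch; this is not the paper's implementation. The caller supplies a `refusal_probability` callable (e.g., the fraction of sampled model responses that refuse the query), and the word-dropping perturbation and all thresholds are placeholder assumptions.

```python
import random
from statistics import mean
from typing import Callable

def perturb(query: str, drop_rate: float = 0.05) -> str:
    """Toy text perturbation: randomly drop a small fraction of words."""
    words = query.split()
    kept = [w for w in words if random.random() > drop_rate]
    return " ".join(kept) if kept else query

def looks_like_jailbreak(query: str,
                         refusal_probability: Callable[[str], float],
                         n_perturbations: int = 10,
                         p_threshold: float = 0.5,
                         sharpness_threshold: float = 0.3) -> bool:
    """Two-step check in the spirit of a refusal-loss-landscape detector.

    Step 1: if the model already refuses the query most of the time, flag it.
    Step 2: otherwise, probe nearby perturbed queries; a sharp change in
    refusal behaviour around the query is treated as suspicious.
    """
    p_refuse = refusal_probability(query)
    if p_refuse > p_threshold:
        return True
    diffs = [abs(refusal_probability(perturb(query)) - p_refuse)
             for _ in range(n_perturbations)]
    return mean(diffs) > sharpness_threshold
```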
Decomposing prompts into fragments and then reconstructing them can effectively jailbreak LLMs, concealing malicious intent and increasing attack success rates.
AerisAI is proposed for secure decentralized AI collaboration, combining differential privacy with homomorphic encryption.
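As a rough illustration of the differential-privacy ingredient only (not AerisAI's actual protocol; the homomorphic-encryption layer is omitted, and function names and noise parameters are assumptions), each participant could clip and noise its local model update before sharing it:

```python
import numpy as np

def privatize_update(update: np.ndarray,
                     clip_norm: float = 1.0,
                     noise_multiplier: float = 1.1) -> np.ndarray:
    """Clip a local model update and add calibrated Gaussian noise
    (DP-SGD style) before it leaves the participant's machine."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = np.random.normal(0.0, noise_multiplier * clip_norm,
                             size=update.shape)
    return clipped + noise

# Example: three participants share noised updates; the aggregator averages them.
updates = [np.random.randn(4) for _ in range(3)]
aggregated = np.mean([privatize_update(u) for u in updates], axis=0)
```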
A defense method based on backtranslation is proposed to protect LLMs from jailbreaking attacks.
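One reading of "backtranslation" here, and the assumption behind the sketch below, is to infer a plain prompt from the model's initial response and check whether the model would refuse that inferred prompt before returning the answer. The `generate` and `is_refusal` callables and the prompt wording are hypothetical.

```python
from typing import Callable

def backtranslation_guard(user_prompt: str,
                          generate: Callable[[str], str],
                          is_refusal: Callable[[str], bool],
                          refusal_message: str = "Sorry, I can't help with that.") -> str:
    """Sketch of a backtranslation-style defense.

    1. Produce an initial response to the (possibly adversarial) prompt.
    2. Backtranslate: ask the model which plain prompt would elicit that response.
    3. If the model refuses the backtranslated prompt, withhold the original answer.
    """
    initial_response = generate(user_prompt)
    if is_refusal(initial_response):
        return initial_response  # model already refused; nothing to guard

    backtranslated = generate(
        "Guess the single question or instruction that the following "
        "response is answering. Reply with only that prompt.\n\n"
        f"Response:\n{initial_response}"
    )
    recheck = generate(backtranslated)
    if is_refusal(recheck):
        return refusal_message   # hidden intent surfaced by backtranslation
    return initial_response
```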
This work bridges the gap between academic threat models and practical AI security by studying the threat models encountered in real-world deployments.
Watermark-based detectors of AI-generated images are vulnerable to transfer evasion attacks, even without access to the detection API.
Bergeron introduces a framework that improves the robustness of AI models against adversarial attacks without requiring additional training.
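The summary does not spell out the mechanism; one common training-free pattern, assumed here rather than taken from Bergeron itself, is to wrap the primary model with a secondary overseer model that screens both the prompt and the response. A minimal sketch under that assumption, with hypothetical callables:

```python
from typing import Callable

def guarded_chat(prompt: str,
                 primary: Callable[[str], str],
                 overseer: Callable[[str], str],
                 refusal: str = "I can't help with that request.") -> str:
    """Training-free guard: a secondary model screens the prompt before the
    primary model sees it, and screens the answer before it is returned."""
    prompt_verdict = overseer(
        "Does the following user request ask for something unsafe? "
        f"Answer YES or NO.\n\n{prompt}"
    )
    if prompt_verdict.strip().upper().startswith("YES"):
        return refusal

    answer = primary(prompt)

    answer_verdict = overseer(
        "Does the following response contain unsafe content? "
        f"Answer YES or NO.\n\n{answer}"
    )
    if answer_verdict.strip().upper().startswith("YES"):
        return refusal
    return answer
```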
Robust defense strategies are crucial for countering adversarial patch attacks on object detection models.