Comprehensive Benchmark for Assessing Safety Risks in Large Language Models through Adversarial Red Teaming
Introducing ALERT, a comprehensive benchmark for assessing the safety of large language models via adversarial red teaming, organized around a novel fine-grained safety risk taxonomy.