Curiosity-Driven Red-Teaming for Large Language Models: Enhancing Safety and Diversity
The authors argue that curiosity-driven red teaming can enhance safety: by rewarding novelty alongside toxicity, a red-team model elicits toxic outputs from large language models while generating a more diverse set of test cases than standard reward-maximizing approaches.
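The core idea can be sketched as a combined reward: the red-team model is scored both on how toxic the target model's response is and on how different the new test case is from those already tried. The sketch below is illustrative only, not the paper's implementation: `toxicity` stands in for a learned toxicity classifier, and the novelty bonus is a simple bag-of-words cosine distance to past prompts (all names here are hypothetical).

```python
# Hypothetical sketch of a curiosity-style red-teaming reward.
# The novelty bonus discourages the red-team model from repeating itself.
from collections import Counter
import math

def _bow(text):
    """Bag-of-words vector for a prompt."""
    return Counter(text.lower().split())

def _cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def novelty_bonus(prompt, history):
    """1 minus similarity to the closest past prompt (1.0 if history is empty)."""
    if not history:
        return 1.0
    return 1.0 - max(_cosine(_bow(prompt), _bow(h)) for h in history)

def curiosity_reward(prompt, toxicity, history, novelty_weight=0.5):
    """Combined objective: provoke toxic output AND stay novel."""
    return toxicity + novelty_weight * novelty_bonus(prompt, history)

history = ["tell me a secret"]
# A repeated prompt earns no novelty bonus; a fresh one does,
# so it scores higher even at the same toxicity level.
r_repeat = curiosity_reward("tell me a secret", toxicity=0.3, history=history)
r_fresh = curiosity_reward("describe a dangerous recipe", toxicity=0.3, history=history)
```

In the paper's setting the novelty signal would come from embedding- or n-gram-based similarity over generated test cases, but the design choice is the same: diversity is part of the objective, not a post-hoc filter.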