Probing and Mitigating Intersectional Social Biases in Vision-Language Models using Synthetic Counterfactual Examples
Leveraging text-to-image diffusion models, we generate a large-scale dataset of synthetic counterfactual image-text pairs to probe and mitigate intersectional social biases in state-of-the-art vision-language models.
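To make the idea of an intersectional counterfactual set concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a counterfactual set is a group of captions that share a fixed subject (here an occupation) and differ only in a combination of social attributes. The attribute names and values, the caption template, and the function `build_counterfactual_set` are all hypothetical. Only the caption side is shown; in practice each caption would be passed to a text-to-image diffusion model to produce the paired image.

```python
from itertools import product

# Hypothetical caption template; the source does not specify one.
TEMPLATE = "A photo of a {age} {gender} working as a {occupation}."


def build_counterfactual_set(occupation, attributes):
    """Return one caption per combination of attribute values.

    All captions share the same occupation and differ only in the
    intersecting social attributes, so each pair of captions is a
    counterfactual of the other.
    """
    keys = list(attributes)
    captions = []
    # Cartesian product enumerates every intersectional combination.
    for combo in product(*(attributes[k] for k in keys)):
        fields = dict(zip(keys, combo))
        captions.append(TEMPLATE.format(occupation=occupation, **fields))
    return captions


# Illustrative attribute values only (assumed, not taken from the source).
attributes = {
    "age": ["young", "middle-aged"],
    "gender": ["man", "woman"],
}

captions = build_counterfactual_set("doctor", attributes)
for caption in captions:
    print(caption)
```

Because every caption in a set differs only in the attribute combination, any systematic difference in a model's outputs across the set can be attributed to those attributes rather than to the depicted subject.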