Core Concepts
The Bias-to-Text (B2T) framework interprets visual biases as keywords, aiding bias discovery and debiasing in image classifiers.
Abstract
This work introduces the Bias-to-Text (B2T) framework for identifying and mitigating visual biases in computer vision models. B2T interprets biases as keywords extracted from the captions of mispredicted images, providing clear group names for bias discovery and enabling debiasing based on those names. The paper covers related work, the B2T framework, bias discovery, and applications of the inferred keywords: identifying known biases, discovering novel biases, debiased training, CLIP zero-shot prompting, model comparison, and label diagnosis.
1. Introduction
- Addressing biases in computer vision models is crucial for real-world AI deployments.
- Visual biases are challenging to mitigate because they are difficult to identify and explain.
2. Related Work
- Previous research has focused on recognizing and addressing biases in models.
- Studies have attempted to identify visual biases by analyzing problematic samples or attributes.
3. Bias-to-Text (B2T) Framework
- B2T interprets visual biases as keywords extracted from image captions.
- The framework validates bias keywords using a vision-language scoring model like CLIP.
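The scoring step can be illustrated with a small sketch: a candidate keyword counts as a bias keyword if it is more similar to captions of mispredicted images than to captions of correctly predicted ones. A toy bag-of-words embedding stands in for the CLIP encoders here, and all function names and captions are illustrative, not the paper's implementation.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words embedding standing in for a CLIP text/image encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def b2t_score(keyword, wrong_captions, correct_captions):
    """B2T-style score: similarity to mispredicted samples minus similarity
    to correctly predicted ones. A positive score flags a likely bias keyword."""
    kw = embed(keyword)
    sim_wrong = sum(cosine(kw, embed(c)) for c in wrong_captions) / len(wrong_captions)
    sim_right = sum(cosine(kw, embed(c)) for c in correct_captions) / len(correct_captions)
    return sim_wrong - sim_right

wrong = ["a bird over the ocean", "a bird flying above ocean waves"]
right = ["a bird in a forest", "a bird perched on a branch"]
print(b2t_score("ocean", wrong, right))   # positive: "ocean" marks the failure group
print(b2t_score("forest", wrong, right))  # negative: "forest" marks correct samples
```

In the actual framework, the similarity is computed between the keyword's CLIP text embedding and the image embeddings of each group, but the sign convention is the same.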
4. Discovering Biases in Image Classifiers
- B2T identifies known biases in benchmark datasets and uncovers novel biases in larger datasets.
- The bias keywords inferred by B2T can be used for debiased training and model comparison.
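One simple way to use the inferred keywords for debiased training is to split the data into bias-aligned and bias-conflicting groups by keyword presence and reweight them inversely to group size. This is a simplified reweighting sketch (the paper uses DRO-style training); the keyword-matching heuristic and names here are assumptions for illustration.

```python
def group_weights(captions, bias_keyword):
    """Assign each sample a weight inversely proportional to the size of its
    group, where groups are inferred by whether the caption contains a B2T
    bias keyword. Rare (bias-conflicting) samples get larger weights."""
    groups = [bias_keyword in c.lower() for c in captions]
    n = len(captions)
    n_biased = sum(groups)
    sizes = {True: n_biased, False: n - n_biased}
    # Each group contributes equal total weight, summing to n overall.
    return [n / (2 * sizes[g]) for g in groups]

captions = ["bird on ocean", "bird on ocean", "bird on ocean", "bird in forest"]
print(group_weights(captions, "ocean"))  # minority "forest" sample is upweighted
```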
5. Applications of the B2T Keywords
- B2T keywords can be utilized for debiased training, CLIP zero-shot prompting, model comparison, and label diagnosis.
- The framework offers various applications to assist in responsible image recognition.
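The CLIP zero-shot prompting application can be sketched as building a prompt ensemble that appends each bias keyword to the base class prompt, so the averaged text embedding covers bias-conflicting contexts. The template wording below is illustrative, not the paper's exact prompt.

```python
def debiased_prompts(class_name, bias_keywords):
    """Build a zero-shot prompt ensemble: the base prompt plus one variant
    per B2T bias keyword, to be embedded and averaged by a CLIP text encoder."""
    base = f"a photo of a {class_name}"
    return [base] + [f"{base} {kw}" for kw in bias_keywords]

print(debiased_prompts("waterbird", ["in the forest", "on land"]))
```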
6. Ablation Study
- The study evaluates the effect of different captioning and scoring models on the B2T framework.
- Results show consistent rankings and reliable performance across various models.
7. Conclusion
- The B2T framework offers a practical approach to identifying and mitigating biases in image classifiers.
- The framework aims to assist humans in making decisions based on bias keywords.
Stats
To tackle this issue, we propose the Bias-to-Text (B2T) framework, which interprets visual biases as keywords.
Our experiments demonstrate that B2T can identify known biases, such as gender bias in CelebA, background bias in Waterbirds, and distribution shifts in ImageNet-R/C.
For example, we discovered a contextual bias between “bee” and “flower” in ImageNet.