Core Concepts
This paper addresses the robust fairness issue in adversarial training by leveraging distributionally robust optimization (DRO) to learn class-wise distributionally adversarial weights, enhancing the model's robustness and fairness simultaneously.
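To make the weight-learning idea concrete, below is a minimal sketch of how class-wise adversarial weights could be derived from per-sample losses, assuming a KL-divergence uncertainty set around the uniform class distribution (which admits a closed-form softmax solution). The function name `classwise_dro_weights` and the temperature `lam` are illustrative assumptions, not details from the paper.

```python
import torch

def classwise_dro_weights(losses, labels, num_classes, lam=1.0):
    """Sketch: class-wise adversarial weights via a KL-regularized DRO step.

    Under a KL-divergence uncertainty set around the uniform class
    distribution, the inner maximization over class weights has the
    closed-form solution w_c ∝ exp(L_c / lam), where L_c is the mean
    loss of class c. `lam` is a hypothetical temperature knob, not a
    value taken from the paper.
    """
    class_losses = torch.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            class_losses[c] = losses[mask].mean()
    # Softmax over scaled class losses: harder classes get larger weights.
    weights = torch.softmax(class_losses / lam, dim=0)
    return weights * num_classes  # scale so uniform weighting equals 1
```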
Summary
The paper investigates the robust fairness issue in adversarial training, where robust accuracy can vary significantly across classes. The authors propose a novel learning paradigm called Fairness-Aware Adversarial Learning (FAAL) to address this challenge.
The key highlights are:
- The authors analyze the robust fairness issue from the perspective of group/class distributional shift, and propose to leverage distributionally robust optimization (DRO) to handle this challenge.
- FAAL extends the conventional min-max adversarial training framework into a min-max-max formulation, where the intermediate maximization step learns the class-wise distributionally adversarial weights (see the sketch after this list).
- Comprehensive experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that FAAL can fine-tune a robust but unfair model into one that is both fair and robust within only two epochs, without compromising overall clean and robust accuracy.
- The authors also show that combining FAAL with stronger adversarial training techniques such as Adversarial Weight Perturbation (AWP) can further boost the worst-class robust accuracy without sacrificing average performance.
- When training the model from scratch, FAAL also outperforms state-of-the-art methods like CFA and WAT in terms of achieving high worst-class robust accuracy while maintaining comparable average robustness.
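As referenced above, the min-max-max structure can be pictured as a standard adversarial training loop with one extra step between attack generation and the model update. The sketch below is illustrative only: `pgd_attack` is a placeholder for any standard attack generator, the step reuses the hypothetical `classwise_dro_weights` from the earlier sketch, and none of it should be read as the authors' implementation.

```python
import torch
import torch.nn.functional as F

def faal_style_step(model, optimizer, x, y, num_classes, lam=1.0):
    """One illustrative min-max-max training step (not the official FAAL code).

    1) Inner max:        craft adversarial examples (e.g., with PGD).
    2) Intermediate max: compute class-wise DRO weights from per-sample losses.
    3) Outer min:        update the model on the reweighted adversarial loss.
    """
    # (1) Inner maximization -- `pgd_attack` is a placeholder, assumed
    # to return perturbed inputs within the allowed threat model.
    x_adv = pgd_attack(model, x, y)

    logits = model(x_adv)
    per_sample_loss = F.cross_entropy(logits, y, reduction="none")

    # (2) Intermediate maximization: class-wise adversarial weights
    # (see classwise_dro_weights above); no gradient flows through them.
    with torch.no_grad():
        w = classwise_dro_weights(per_sample_loss, y, num_classes, lam)

    # (3) Outer minimization on the reweighted loss.
    loss = (w[y] * per_sample_loss).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```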
Statistics
The robust accuracy on the "cat" class is the lowest among all classes in the adversarially trained Wide-ResNet34-10 model on CIFAR-10.
The most significant disparity between clean and robust accuracy arises in the "deer" class.
Quotes
"The robust fairness issue in the conventional AT is due to the unknown group (class) distribution shift induced by the generated adversarial perturbations, which results in the overfitting problem."
"Rather than assuming a fixed uniform data distribution, DRO acknowledges the inherent distributional uncertainty in real-world data, offering a more resilient and adaptable model structure."