Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression in Deep Neural Networks
Deep neural networks can learn and rely on spurious correlations in training data, which can have fatal consequences in high-risk applications. The proposed reactive model correction approach applies post-hoc bias suppression only when necessary, minimizing unintended harm to task-relevant features.
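The idea of suppressing a bias only when necessary can be illustrated with a minimal sketch. The text above does not specify the suppression operation or the triggering condition, so everything here is an assumption made for illustration: the bias is modeled as a single direction in a latent activation space, suppression is an orthogonal projection that removes that direction, and the condition is a simple threshold on each sample's projection onto it. The names `suppress_bias`, `reactive_suppress`, `bias_direction`, and `threshold` are hypothetical.

```python
import numpy as np


def suppress_bias(activations, bias_direction):
    """Remove the component of each activation along the (assumed) bias direction."""
    v = bias_direction / np.linalg.norm(bias_direction)
    # Subtract the projection onto v from every row.
    return activations - np.outer(activations @ v, v)


def reactive_suppress(activations, bias_direction, threshold):
    """Suppress the bias only for samples whose bias score exceeds the threshold.

    The thresholded projection is an assumed stand-in for whatever
    condition decides that correction is "necessary".
    """
    v = bias_direction / np.linalg.norm(bias_direction)
    scores = activations @ v          # per-sample bias score
    needs_fix = scores > threshold    # boolean mask of samples to correct
    corrected = activations.copy()
    corrected[needs_fix] = suppress_bias(activations[needs_fix], v)
    return corrected, needs_fix


if __name__ == "__main__":
    # Two toy samples in a 2-D latent space; the first axis plays the bias direction.
    acts = np.array([[3.0, 1.0], [0.2, 2.0]])
    corrected, mask = reactive_suppress(acts, np.array([1.0, 0.0]), threshold=1.0)
    print(mask)
    print(corrected)
```

In this toy setup only the first sample is modified, while the second passes through unchanged. That is the intended contrast with unconditional suppression, which would alter every sample and could remove task-relevant information that happens to lie along the same direction.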