DeNetDM: A Depth-Modulated Approach for Debiasing Neural Networks without Bias Annotations
DeNetDM is a debiasing method that exploits how the linear decodability of bias and core attributes varies with network depth, using that difference to separate the two and mitigate unwanted bias without requiring bias annotations or prior knowledge of the bias.
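
To make "linear decodability" concrete, the following is a minimal sketch of a linear probe, not the DeNetDM training procedure itself. It assumes that decodability is measured as the training accuracy of a least-squares linear classifier on a feature matrix. The function name `linear_probe_accuracy` and the two synthetic feature sets are hypothetical stand-ins for activations taken from networks of different depth.

```python
import numpy as np


def linear_probe_accuracy(features, labels):
    """Fit a least-squares linear probe and return its training accuracy.

    Illustrative assumption: accuracy of this probe is used as a proxy
    for how linearly decodable a binary attribute is from `features`.
    """
    # Append a constant column so the probe can learn an intercept.
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    # Map {0, 1} labels to {-1, +1} regression targets.
    targets = 2.0 * labels - 1.0
    weights, *_ = np.linalg.lstsq(X, targets, rcond=None)
    predictions = (X @ weights > 0).astype(int)
    return float(np.mean(predictions == labels))


rng = np.random.default_rng(0)
n = 400
labels = rng.integers(0, 2, size=n)
signs = 2.0 * labels - 1.0

# Hypothetical features where the attribute lies along a single axis,
# so a linear probe can read it out directly.
feats_linear = np.column_stack(
    [signs + 0.1 * rng.standard_normal(n), rng.standard_normal(n)]
)

# Hypothetical features where the attribute is the product of the two
# coordinate signs (XOR-like), which no single hyperplane separates.
a = rng.choice([-1.0, 1.0], size=n)
feats_xor = np.column_stack([a, a * signs]) + 0.1 * rng.standard_normal((n, 2))

acc_linear = linear_probe_accuracy(feats_linear, labels)
acc_xor = linear_probe_accuracy(feats_xor, labels)
print(f"linearly encoded attribute: {acc_linear:.2f}")
print(f"XOR-encoded attribute:      {acc_xor:.2f}")
```

Both feature sets carry the attribute in full, yet the probe recovers it only from the first. Running the same probe on features from networks of different depth, once for the bias attribute and once for the core attribute, is one way to observe the kind of variation the summary refers to.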