Core Concepts
Addressing the potential introduction of bias during the machine unlearning process through causal intervention and the use of counterfactual examples.
Abstract
The paper proposes a method to mitigate bias in machine unlearning, the process of selectively removing specific knowledge from a trained model without full retraining. The authors identify two main sources of bias in unlearning: data-level bias, arising from uneven data removal, and algorithm-level bias, which contaminates the remaining dataset.
To address data-level bias, the authors adopt a causal intervention approach: they break the spurious causal correlation by intervening directly on the causal factors, which mitigates both shortcut bias and label bias.
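As a rough illustration of what "intervening on causal factors" means (a generic backdoor-adjustment sketch, not the paper's exact method), the interventional distribution P(Y | do(X)) averages over the confounder's marginal P(Z) instead of its X-dependent conditional, severing the spurious path through Z. The joint distribution below is hypothetical.

```python
import numpy as np

# Hypothetical joint distribution P(Z, X, Y) over binary variables,
# indexed as joint[z, x, y]; Z is a confounder of X and Y.
joint = np.array([
    [[0.20, 0.05], [0.05, 0.10]],   # Z = 0
    [[0.05, 0.10], [0.05, 0.40]],   # Z = 1
])

p_z = joint.sum(axis=(1, 2))                             # P(Z)
p_y_given_zx = joint / joint.sum(axis=2, keepdims=True)  # P(Y | Z, X)

def p_y_do_x(x):
    """Backdoor adjustment: P(Y | do(X=x)) = sum_z P(Y | Z=z, X=x) P(Z=z)."""
    return np.einsum("z,zy->y", p_z, p_y_given_zx[:, x, :])

def p_y_given_x(x):
    """Observational P(Y | X=x), which still carries the confounding path."""
    p_xy = joint.sum(axis=0)   # P(X, Y)
    return p_xy[x] / p_xy[x].sum()

# The two distributions differ, exposing the spurious correlation via Z.
print(p_y_do_x(1))      # interventional
print(p_y_given_x(1))   # observational
```

Here the interventional P(Y=1 | do(X=1)) is 0.8 while the observational P(Y=1 | X=1) is about 0.83; the gap is exactly the confounding the intervention removes.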
To address algorithm-level bias, the authors leverage counterfactual examples (CFs) as pivotal points to encompass forgotten samples into semantically similar classes. The goal is to make forgotten samples and their CFs indistinguishable to the model, effectively broadening the local decision boundary and minimizing the impact of forgetting on adjacent remaining samples.
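One plausible way to operationalize "indistinguishable to the model" is a divergence penalty between the model's predictive distributions on a forgotten sample and on its CF. The sketch below is a hypothetical loss of that shape (a symmetric KL between softmax outputs), not the paper's actual objective.

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kl(p, q):
    """KL divergence KL(p || q) for strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))

def cf_indistinguishability_loss(logits_forgotten, logits_cf):
    """Hypothetical penalty: symmetric KL between the model's predictions on
    a forgotten sample and on its counterfactual. Driving this to zero makes
    the two indistinguishable to the model."""
    p = softmax(logits_forgotten)
    q = softmax(logits_cf)
    return 0.5 * (kl(p, q) + kl(q, p))

# Identical predictions -> zero loss; diverging predictions -> positive loss.
same = cf_indistinguishability_loss(np.array([2.0, 0.5, -1.0]),
                                    np.array([2.0, 0.5, -1.0]))
diff = cf_indistinguishability_loss(np.array([2.0, 0.5, -1.0]),
                                    np.array([-1.0, 0.5, 2.0]))
print(same, diff)
```

In practice such a term would be added to the unlearning objective so that gradient updates pull the forgotten sample's prediction toward its CF's class region.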
The authors validate their approach in both uniform and non-uniform deletion setups, demonstrating that their method outperforms existing unlearning baselines on various evaluation metrics, including remaining accuracy, forgetting accuracy, resistance to membership inference attacks, and bias metrics such as disparate impact and equal opportunity difference.
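The two bias metrics have standard definitions worth making concrete: disparate impact is the ratio of positive-prediction rates between unprivileged and privileged groups (1.0 = parity), and equal opportunity difference is the gap in true-positive rates (0.0 = parity). The sketch below uses these standard definitions with made-up example data; the group encoding is an assumption, not taken from the paper.

```python
import numpy as np

# Hypothetical example data: binary predictions, ground-truth labels,
# and a binary group attribute (1 = privileged, 0 = unprivileged).
y_true = np.array([1, 1, 0, 1, 1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1, 0, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def disparate_impact(y_pred, group):
    """DI = P(y_hat=1 | unprivileged) / P(y_hat=1 | privileged)."""
    return y_pred[group == 0].mean() / y_pred[group == 1].mean()

def equal_opportunity_difference(y_true, y_pred, group):
    """EOD = TPR(unprivileged) - TPR(privileged)."""
    def tpr(g):
        mask = (group == g) & (y_true == 1)
        return y_pred[mask].mean()
    return tpr(0) - tpr(1)

print(disparate_impact(y_pred, group))                    # 0.5 / 0.75 ~ 0.667
print(equal_opportunity_difference(y_true, y_pred, group))  # 2/3 - 2/3 = 0.0
```

In the unlearning setting, values of DI drifting away from 1.0 (or EOD away from 0.0) after deletion would indicate that forgetting has introduced group-level bias.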
Stats
"The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine unlearning techniques."
"Specifically, we introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset."
"Experimental results demonstrate that our method outperforms existing machine unlearning baselines on evaluation metrics."
Quotes
"To mitigate bias during the unlearning process, we examined the impact of retraining an unbiased model without including the samples to be forgotten."
"We leverage CFs as pivotal points to encompass forgotten samples into semantically similar classes."
"Our main contributions are summarized as follows: (i) We propose a causal framework to formulate the machine unlearning procedure and analyze the potential source of bias induced."