Core Concepts
Understanding and mitigating the threats and attacks that target machine unlearning systems is crucial to making AI safer and more reliable.
Abstract
This survey examines the threats, attacks, and defenses that arise in machine unlearning systems. It analyzes existing methodologies, builds a taxonomy organized by threat model, examines how unlearning itself can serve as a defense, and shows how attacks can double as tools for auditing and improving unlearning systems. It also identifies open challenges and outlines future research directions for improving the safety, reliability, and privacy compliance of machine unlearning.
Directory:
Abstract
Machine Unlearning's Importance in AI Safety
Introduction
Knowledge Removal Concerns
Machine Unlearning Systems Structure
Roles of Participants in MU Systems
Threats in Unlearning
Information Leakage from Model Discrepancy and Knowledge Dependency
Malicious Unlearning Attacks
Direct vs Preconditioned Attacks
Defense Through Unlearning
Model Recovery Strategies
Value Alignment with Unlearning
Aligning AI Operations with Ethical Standards
Evaluating Unlearning Through Attacks
Audit of Privacy Leakage, Assessment of Model Robustness, Proof of Unlearning
Challenges and Promising Directions
Defenses against Malicious Unlearning, Federated Unlearning Challenges, Privacy Preservation Concerns, Large Models Exploration
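The "Evaluating Unlearning Through Attacks" entries above can be made concrete with a toy privacy audit. A simple loss-threshold membership inference test checks whether supposedly-unlearned points still look like training data to the model; the function name, inputs, and threshold below are illustrative assumptions, not an interface from the survey.

```python
import numpy as np

def loss_threshold_mia(losses_unlearned, losses_nonmember, tau):
    """Toy membership-inference audit of unlearning (illustrative sketch).

    A point whose loss falls below the threshold tau behaves as if the
    model still remembers it. We report the fraction of supposedly
    unlearned points flagged this way (leak rate), alongside the
    false-positive rate on genuine non-members for calibration.
    """
    leak_rate = float(np.mean([l < tau for l in losses_unlearned]))
    fp_rate = float(np.mean([l < tau for l in losses_nonmember]))
    return leak_rate, fp_rate

# Usage: effective unlearning should drive the leak rate down toward
# the non-member false-positive rate.
leak, fp = loss_threshold_mia([0.1, 0.2, 0.9], [0.8, 0.9, 1.0], tau=0.5)
```

In practice, real audits of this kind use shadow models or calibrated per-example thresholds rather than a single global cutoff; this sketch only shows the shape of the comparison.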
Stats
"Recently, Machine Unlearning (MU) has gained considerable attention for its potential to improve AI safety by removing the influence of specific data from trained Machine Learning (ML) models."
"Efforts have been made to design efficient unlearning approaches..."
"Exact unlearning techniques typically involve retraining but limit the scope of data involved to enhance efficiency over naive retraining approaches."
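The idea of "limiting the scope of data involved" can be sketched as sharded training in the style of SISA: one sub-model per data shard, so deleting a point only retrains the shard that held it. Everything below (the class name, the per-shard class-mean "model", the hashing scheme) is an illustrative assumption, not the survey's own implementation.

```python
import numpy as np

class ShardedUnlearner:
    """Exact unlearning via sharding (SISA-style toy sketch):
    each shard trains its own tiny model, so unlearning a point
    retrains one shard instead of the whole dataset."""

    def __init__(self, n_shards=4):
        self.n_shards = n_shards
        self.shards = [[] for _ in range(n_shards)]
        self.models = [None] * n_shards  # toy model: per-class feature means

    def _shard_of(self, x_id):
        # Deterministic assignment of a record id to a shard.
        return hash(x_id) % self.n_shards

    def fit(self, data):
        # data: iterable of (record_id, feature_vector, label)
        for x_id, x, y in data:
            self.shards[self._shard_of(x_id)].append((x_id, x, y))
        for i in range(self.n_shards):
            self._train_shard(i)

    def _train_shard(self, i):
        # "Training" here is just computing per-class feature means.
        grouped = {}
        for _, x, y in self.shards[i]:
            grouped.setdefault(y, []).append(x)
        self.models[i] = {y: np.mean(xs, axis=0) for y, xs in grouped.items()}

    def unlearn(self, x_id):
        # Exact unlearning: drop the record and retrain ONLY its shard,
        # which is far cheaper than naive retraining on all data.
        i = self._shard_of(x_id)
        self.shards[i] = [r for r in self.shards[i] if r[0] != x_id]
        self._train_shard(i)
```

The design choice is the usual exact-unlearning trade-off: sharding bounds the retraining cost of a deletion, at the price of training an ensemble instead of a single model.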
Quotes
"Model developer: responsible for conducting model training based on the training data."
"Data contributors: responsible for providing data to construct the training dataset."