Core Concepts
Developing efficient methods to remove knowledge of specific document categories from well-trained document classification models upon user request, while maintaining high performance on the retained data.
Abstract
The paper explores machine unlearning techniques for document classification models, which aim to efficiently remove the knowledge of specific document categories from a well-trained model upon user request, while preserving high performance on the retained data.
Key highlights:
- Proposes machine unlearning methods for document classification, the first work to address this task.
- Constrains the training data usage to 10% or less, making the study more practical for real-world use cases.
- Develops a label-guided sample generator to create a synthetic forget set, allowing unlearning without storing the real forget data.
- Comprehensive experiments validate the effectiveness of the proposed unlearning methods, including scenarios with and without access to the real forget set.
- Finds that random labeling offers a good trade-off between accuracy and efficiency for unlearning, and that generated samples can effectively replace the real forget set.
- Visualizes the feature space changes during unlearning to provide insights into the underlying mechanisms.
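The random-labeling idea from the highlights can be illustrated with a minimal sketch. This is not the paper's implementation: it stands in a toy multinomial logistic regression for the document classifier and synthetic 2-D Gaussian clusters for document features, and assumes unlearning is done by relabeling the forget-class samples uniformly among the retained classes and fine-tuning on the mixed data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for document features: three well-separated Gaussian classes.
def make_data(n_per_class=100):
    centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
    X = np.vstack([c + rng.normal(scale=0.5, size=(n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, W=None, n_classes=3, lr=0.1, epochs=500):
    """Multinomial logistic regression trained by full-batch gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias feature
    if W is None:
        W = np.zeros((Xb.shape[1], n_classes))
    Y = np.eye(n_classes)[y]
    for _ in range(epochs):
        P = softmax(Xb @ W)
        W -= lr * Xb.T @ (P - Y) / len(X)
    return W

def accuracy(W, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float((np.argmax(Xb @ W, axis=1) == y).mean())

X, y = make_data()
W = train(X, y)          # the "well-trained" baseline model

forget = y == 2          # category to unlearn on user request
retain = ~forget

# Random-labeling unlearning: relabel the forget samples uniformly among the
# retained classes {0, 1}, then fine-tune the trained weights on the mixture.
y_rand = y.copy()
y_rand[forget] = rng.integers(0, 2, size=int(forget.sum()))
W_unlearned = train(X, y_rand, W=W.copy())
```

Because the random labels carry no coherent signal for the forgotten class, fine-tuning collapses the model's predictions in that region onto the retained classes, while the retained samples mixed into the fine-tuning data preserve accuracy on the classes that should be kept.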
Stats
The RVL-CDIP dataset contains 400,000 grayscale document images across 16 categories, with 25,000 images per class.
The baseline document classification model achieves 93.53% accuracy on the training set and 84.29% on the test set.
Quotes
"Machine unlearning is a new research line aimed at facilitating user requests for the removal of sensitive data."
"Privacy issues have emerged as a prominent issue within the broader field of deep learning models."