Core Concepts
Misalignment in shadow models, caused primarily by different weight initializations, significantly affects the performance of white-box membership inference attacks.
Abstract
This study examines how misalignment in shadow models affects white-box membership inference attacks (MIAs). It considers several possible causes: dataset differences and randomness in weight initialization, batch ordering, and dropout selection. The results show that misalignment affects the internal-layer features that white-box attacks use, so re-alignment techniques are needed to reduce misalignment and improve attack accuracy.
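To make the idea of re-alignment concrete, the following is a minimal sketch of one common approach from the model-matching literature: permuting the hidden units of a shadow model so that they line up with a reference model. It is an illustration under stated assumptions, not necessarily the technique used in the study. The toy one-hidden-layer network, the dot-product similarity, and the names `align_hidden_units` and `forward` are all hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def align_hidden_units(W1_ref, W1, b1, W2):
    """Permute the hidden units of (W1, b1, W2) to best match W1_ref.

    Assumed shapes: W1 is (hidden, in), b1 is (hidden,), W2 is (out, hidden).
    """
    # Similarity between every reference unit (rows) and every shadow unit (columns).
    similarity = W1_ref @ W1.T
    # Hungarian algorithm: one-to-one assignment that maximizes total similarity.
    _, perm = linear_sum_assignment(similarity, maximize=True)
    # Reorder incoming weights and biases by row, outgoing weights by column.
    return W1[perm], b1[perm], W2[:, perm], perm


def forward(x, W1, b1, W2):
    """Return the hidden activations (the white-box features) and the output."""
    hidden = np.maximum(W1 @ x + b1, 0.0)  # ReLU
    return hidden, W2 @ hidden


rng = np.random.default_rng(0)
W1_ref = rng.normal(size=(8, 4))
b1_ref = rng.normal(size=8)
W2_ref = rng.normal(size=(3, 8))

# Simulate a misaligned shadow model: the same function with reordered hidden units.
shuffle = np.roll(np.arange(8), 3)
W1_s, b1_s, W2_s = W1_ref[shuffle], b1_ref[shuffle], W2_ref[:, shuffle]

W1_a, b1_a, W2_a, perm = align_hidden_units(W1_ref, W1_s, b1_s, W2_s)

x = rng.normal(size=4)
h_ref, y_ref = forward(x, W1_ref, b1_ref, W2_ref)
h_shadow, y_shadow = forward(x, W1_s, b1_s, W2_s)
h_aligned, y_aligned = forward(x, W1_a, b1_a, W2_a)

print("outputs match before alignment:", np.allclose(y_shadow, y_ref))
print("hidden features match after alignment:", np.allclose(h_aligned, h_ref))
```

In this toy case the shadow model is an exact reordering of the reference, so the outputs already agree while the hidden units appear in a different order. That is the situation in which an attack that reads internal-layer features would see inconsistent inputs across models. Independently trained models are only approximately matched by a permutation, so real alignment is expected to be imperfect.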
Stats
On the CIFAR10 dataset, at a false positive rate of 1%, a white-box MIA that uses re-aligned shadow models improves the true positive rate by 4.5%.
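For readers unfamiliar with the metric, the sketch below shows one way to compute the true positive rate at a fixed false positive rate from attack scores. It assumes that higher scores mean "more likely a member" and that the threshold is chosen on non-member scores. The function name `tpr_at_fpr` and the synthetic scores are hypothetical and are not the study's data.

```python
import numpy as np


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01):
    """TPR at a threshold whose FPR does not exceed target_fpr (target_fpr < 1)."""
    member_scores = np.asarray(member_scores, dtype=float)
    # Sort non-member scores from highest to lowest.
    nonmember_desc = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]
    # Number of non-members that may be flagged without exceeding target_fpr.
    allowed_fp = int(np.floor(target_fpr * len(nonmember_desc)))
    # Flag only scores strictly above the (allowed_fp + 1)-th highest non-member score.
    threshold = nonmember_desc[allowed_fp]
    return float(np.mean(member_scores > threshold))


# Synthetic attack scores, for illustration only.
rng = np.random.default_rng(0)
members = rng.normal(loc=1.0, size=1000)
nonmembers = rng.normal(loc=0.0, size=1000)
tpr = tpr_at_fpr(members, nonmembers, target_fpr=0.01)
print(f"TPR at 1% FPR: {tpr:.3f}")
```

Reporting the true positive rate at a low false positive rate focuses the evaluation on how many members an attack identifies while rarely flagging non-members.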