Evaluating the Effectiveness of Distribution Inference Attacks and Defenses on Machine Learning Models
Distribution inference attacks aim to infer statistical properties of the data used to train machine learning models. The authors develop a new black-box attack that outperforms the best-known white-box attack in most settings, and they evaluate the impact of relaxing assumptions about the adversary's knowledge. They also find that while noise-based defenses provide little mitigation, a simple re-sampling defense can be highly effective.
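To make the re-sampling idea concrete, here is a minimal sketch (not the authors' implementation) of one plausible form of such a defense: subsampling the training set so a sensitive binary attribute appears at a fixed target ratio, regardless of its true prevalence. The function name `resample_to_fixed_ratio` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def resample_to_fixed_ratio(X, y, attribute, target_ratio=0.5, rng=None):
    """Illustrative sketch of a re-sampling defense (hypothetical helper).

    Subsample (X, y) so that the binary `attribute` appears at `target_ratio`
    (assumed to be strictly between 0 and 1), masking the attribute's true
    distribution in the released training data.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx_pos = np.flatnonzero(attribute == 1)
    idx_neg = np.flatnonzero(attribute == 0)

    # Largest sample achievable at the target ratio given the available data.
    n_pos = min(len(idx_pos),
                int(len(idx_neg) * target_ratio / (1 - target_ratio)))
    n_neg = int(n_pos * (1 - target_ratio) / target_ratio)

    keep = np.concatenate([
        rng.choice(idx_pos, size=n_pos, replace=False),
        rng.choice(idx_neg, size=n_neg, replace=False),
    ])
    rng.shuffle(keep)
    return X[keep], y[keep]
```

A model trained on data re-sampled this way reflects the fixed ratio rather than the true one, which is the intuition behind why re-sampling can blunt distribution inference while adding noise to the model itself may not.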