Core Concepts
The authors present Private Evolution (PE), a framework that generates differentially private (DP) synthetic data using only black-box API access to foundation models, achieving promising results without any model training.
Abstract
The paper discusses the privacy challenges of data-driven approaches and introduces PE, a framework that uses foundation model APIs to generate differentially private synthetic data. PE produces high-quality synthetic images while maintaining privacy guarantees. The paper argues that API-based solutions could democratize the deployment of DP synthetic data, and it addresses ethical considerations related to privacy and model usage.
Key points include:
- Introduction to differential privacy and the importance of generating differentially private synthetic data.
- Proposal of the Private Evolution (PE) framework for generating DP synthetic data via APIs.
- Experimental results demonstrating the effectiveness of PE on several datasets, including ones with a large distribution shift from the public (pre-training) data.
- Ablation studies on the choice of pre-trained network and on hyperparameters, plus an evaluation of PE's ability to generate an unlimited number of samples.
- Future work suggestions including exploring applications beyond images and addressing other privacy concerns.
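To make the idea of generating DP synthetic data through APIs alone more concrete, here is a minimal toy sketch of what such an evolution loop might look like. It is not the authors' implementation, and this summary does not spell out the algorithm, so every detail below is an assumption: `random_api` and `variation_api` are hypothetical stand-ins for foundation model endpoints, the data are 2-D points rather than images, the selection step is a noisy nearest-neighbor vote, and `noise_std` is not calibrated to any particular ε.

```python
import random


def random_api(n, rng):
    """Hypothetical stand-in for a generation API: returns n random 2-D points."""
    return [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(n)]


def variation_api(sample, rng, scale=0.5):
    """Hypothetical stand-in for a variation API: returns a perturbed copy."""
    return (sample[0] + rng.gauss(0, scale), sample[1] + rng.gauss(0, scale))


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def dp_nn_histogram(private, synthetic, noise_std, rng):
    """Each private point votes for its nearest synthetic point.

    Gaussian noise is added to every count, and negative results are
    clipped to zero. The noise scale is illustrative only (assumption).
    """
    counts = [0.0] * len(synthetic)
    for p in private:
        nearest = min(range(len(synthetic)),
                      key=lambda i: sq_dist(p, synthetic[i]))
        counts[nearest] += 1.0
    return [max(c + rng.gauss(0, noise_std), 0.0) for c in counts]


def private_evolution_sketch(private, n_syn=50, iterations=5,
                             noise_std=1.0, seed=0):
    """Toy loop: generate, score with noisy votes, resample, vary."""
    rng = random.Random(seed)
    synthetic = random_api(n_syn, rng)
    for _ in range(iterations):
        hist = dp_nn_histogram(private, synthetic, noise_std, rng)
        # Fall back to uniform weights if noise wiped out every vote.
        weights = hist if sum(hist) > 0 else [1.0] * len(synthetic)
        parents = rng.choices(synthetic, weights=weights, k=n_syn)
        synthetic = [variation_api(s, rng) for s in parents]
    return synthetic


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Toy "private" data: a cluster around (3, 3).
    private_points = [(data_rng.gauss(3, 0.3), data_rng.gauss(3, 0.3))
                      for _ in range(200)]
    result = private_evolution_sketch(private_points)
    print(len(result), "synthetic points generated")
```

The property the sketch tries to capture is that the private data are touched only in the noisy voting step, and nothing is trained: the loop only calls the two API stand-ins. A real deployment would need the noise scale derived from a proper privacy accounting, which is omitted here.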
Stats
For example, on CIFAR10 (with ImageNet as the public data), we achieve FID ≤ 7.9 with privacy cost ε = 0.67, significantly improving the previous SOTA from ε = 32.
We create a DP synthetic version (with ε = 7.58) of Camelyon17, a medical dataset for classification of breast cancer metastases, using the same ImageNet-pre-trained model.
Quotes
"Generating differentially private (DP) synthetic data that closely resembles the original private data is a scalable way to mitigate privacy concerns in the current data-driven world."
"In this paper, we present a new framework called Private Evolution (PE) to solve this problem and show its initial promise on synthetic images."
"PE can match or even outperform state-of-the-art methods without any model training."