Detecting Leaked Datasets Used for Training Classification Models through Synthetic Data Injection and Model Querying
Local Distribution Shifting Synthesis (LDSS) is a novel method for detecting whether a classification model was trained on a dataset leaked from its original owner. The owner injects a small volume of synthetic data that locally shifts the class distribution into the dataset before release. Later, querying a suspect model on those synthetic points reveals whether it has learned the shifted labels: a model trained on the injected dataset will tend to reproduce them, while an independently trained model will not.
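To make the mechanism concrete, here is a minimal sketch of the LDSS idea, not the authors' reference implementation. The cyclic label shift, the noise-perturbation synthesis, the dataset sizes, and all function names below are illustrative assumptions; the paper's actual synthesis procedure may differ.

```python
# Illustrative sketch of LDSS-style leakage detection (assumptions only,
# not the paper's reference implementation).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# 1. The owner's original dataset.
X, y = make_classification(n_samples=2000, n_features=10, n_classes=3,
                           n_informative=6, random_state=0)

def synthesize_shifted(X, y, n_synth=100, noise=0.05):
    """Create synthetic points near real samples but with *shifted* labels.

    Each synthetic point copies a real point plus small noise and receives
    a different class label, deliberately distorting the local class
    distribution around it. (A cyclic label shift stands in here for the
    paper's local-distribution-shifting synthesis.)
    """
    idx = rng.choice(len(X), size=n_synth, replace=False)
    X_synth = X[idx] + rng.normal(scale=noise, size=(n_synth, X.shape[1]))
    n_classes = len(np.unique(y))
    y_synth = (y[idx] + 1) % n_classes
    return X_synth, y_synth

X_synth, y_synth = synthesize_shifted(X, y)

# 2. The owner releases the dataset with the synthetic points injected.
X_released = np.vstack([X, X_synth])
y_released = np.concatenate([y, y_synth])

# 3. A suspect model trained on the leaked (released) dataset ...
leaked_model = RandomForestClassifier(random_state=0).fit(X_released, y_released)
# ... versus an innocent model trained on independent data.
X2, y2 = make_classification(n_samples=2000, n_features=10, n_classes=3,
                             n_informative=6, random_state=1)
clean_model = RandomForestClassifier(random_state=0).fit(X2, y2)

# 4. Detection by model querying: check how often each model reproduces the
# shifted labels on the synthetic points. Agreement far above chance
# suggests the model was trained on the leaked data.
for name, model in [("leaked", leaked_model), ("clean", clean_model)]:
    hit_rate = (model.predict(X_synth) == y_synth).mean()
    print(f"{name} model: shifted-label agreement = {hit_rate:.2f}")
```

In this sketch the leaked model should show markedly higher shifted-label agreement than the clean model, since only it has seen the injected points; the detection needs only query access to the suspect model, not its parameters.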