Core Concept
DIAGNOSIS is a method for detecting unauthorized data usage in text-to-image diffusion models: it plants injected memorization in any model trained on the protected dataset.
Summary
The paper proposes DIAGNOSIS, a method for detecting unauthorized data usage in the training or fine-tuning of text-to-image diffusion models. The key idea is to modify the protected dataset so that any model trained on it acquires a unique behavior, called injected memorization. Concretely, a stealthy transformation (the signal function) is applied to a subset of the protected images. A model trained or fine-tuned on the modified dataset memorizes the signal function, and that memorization can then be detected with a binary classifier.
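The dataset modification described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `coat_dataset` and `toy_signal` are hypothetical, images are represented as plain nested lists, and the cyclic one-pixel shift is only a stand-in for the paper's stealthy signal function.

```python
import random


def toy_signal(image):
    """Hypothetical stand-in for the signal function.

    Shifts every row of the image cyclically by one pixel. The real
    signal function in the paper is a stealthy transformation; this toy
    version only marks where such a transformation would be applied.
    """
    return [row[1:] + row[:1] for row in image]


def coat_dataset(images, signal_fn, coating_rate, seed=0):
    """Apply `signal_fn` to a random subset of `images`.

    `coating_rate` is the fraction of images to modify (0.0 to 1.0).
    Returns the new image list and the set of coated indices.
    """
    if not 0.0 <= coating_rate <= 1.0:
        raise ValueError("coating_rate must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the subset is reproducible
    n_coated = round(len(images) * coating_rate)
    coated_idx = set(rng.sample(range(len(images)), n_coated))
    coated = [
        signal_fn(img) if i in coated_idx else img
        for i, img in enumerate(images)
    ]
    return coated, coated_idx
```

With ten images and a coating rate of 0.2, two images are transformed and the other eight are passed through unchanged.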
The paper defines two types of injected memorization: unconditional and trigger-conditioned. The former is always activated, while the latter is activated only when a specific text trigger appears in the prompt. The paper then describes the overall pipeline, which consists of a dataset coating phase and a detection phase.
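The difference between the two types can be illustrated at the level of a single training sample. The sketch below is an assumption about how coating might look, not the paper's code: `coat_sample` is a hypothetical helper, and placing the trigger at the start of the caption is an illustrative choice.

```python
def coat_sample(image, caption, signal_fn, trigger=None):
    """Coat one (image, caption) pair.

    With `trigger=None` this models unconditional injected memorization:
    only the image is transformed. With a trigger string it models the
    trigger-conditioned case: the image is transformed and the trigger
    is added to the caption (prefix position is an assumption).
    """
    coated_image = signal_fn(image)
    if trigger is None:
        return coated_image, caption
    return coated_image, f"{trigger} {caption}"
```

In the trigger-conditioned case the model is expected to reproduce the signal only for prompts that contain the trigger, which matches the description of the second memorization type above.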
Experiments are conducted on mainstream text-to-image diffusion models (Stable Diffusion and VQ Diffusion) with different training or fine-tuning methods (LoRA, DreamBooth, and standard training). The results show that DIAGNOSIS detects unauthorized data usage with 100% accuracy while having only a small effect on the models' generation quality.
The paper also discusses the influence of different warping strengths and coating rates on the injected memorization and the generation quality. It compares DIAGNOSIS to an existing method and demonstrates its superior performance.
Statistics
The average memorization strength for models with unauthorized data usage is 91.2%, while it is only 5.1% for models without unauthorized data usage.
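One way to read the memorization strength figures is as the fraction of a model's generated images that the binary classifier flags as containing the signal function. The sketch below assumes that reading; `memorization_strength` and `flag_unauthorized` are hypothetical names, and the fixed 0.5 threshold is an illustrative assumption rather than the paper's decision procedure.

```python
def memorization_strength(flags):
    """Fraction of generated images flagged as containing the signal.

    `flags` is an iterable of booleans, one per generated image, as
    produced by some binary signal classifier (not implemented here).
    """
    flags = list(flags)
    if not flags:
        raise ValueError("need at least one classifier output")
    return sum(1 for f in flags if f) / len(flags)


def flag_unauthorized(flags, threshold=0.5):
    """Report unauthorized usage when the strength reaches `threshold`.

    The threshold value is an assumption chosen for illustration.
    """
    return memorization_strength(flags) >= threshold
```

Given the reported averages (91.2% versus 5.1%), a wide range of thresholds would separate the two groups under this reading.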
The FID for the model with unconditional injected memorization is 218.28, compared to 199.29 for the standard model without any injected memorization.
The FID for the model with trigger-conditioned injected memorization is 239.03 when the text trigger is added, compared to 209.16 without the text trigger.
Quotes
"Recent text-to-image diffusion models have shown surprising performance in generating high-quality images. However, concerns have arisen regarding the unauthorized data usage during the training or fine-tuning process."
"Existing work such as Glaze (Shan et al., 2023) prevents unauthorized usage of data by adding carefully calculated perturbations to safeguarded artworks, causing text-to-image diffusion models to learn significantly different image styles. While it prevents the unauthorized usages, it also makes authorized training impossible."
"Different from the sample-level memorization, in this work, we focus on diffusion models' memorization on specific elements in the training data and propose an approach for detecting unauthorized data usages via planting the injected element-level memorizations into the model trained or fine-tuned on the protected dataset by modifying the protected training data."