Core Concepts
Introduces a dataset and benchmark for protecting copyrighted content from text-to-image diffusion models.
Abstract
This work introduces a dataset and benchmark for copyright protection from text-to-image diffusion models. It addresses the challenges that advances in text-to-image generation pose to copyright protection. The work provides a standardized dataset, evaluation metrics, and benchmarks for assessing potential copyright infringement in generated content. The dataset comprises anchor images, prompts, and images generated by Stable Diffusion models across four categories: style, portrait, artistic creation figure, and licensed illustration. The paper also discusses unlearning methods that make a model forget copyrighted images, using gradient ascent-based and weight pruning-based approaches.
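To make the idea of gradient ascent-based unlearning concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation: the one-parameter linear model, the squared-error loss, the sample values, and names such as `unlearn_by_gradient_ascent` are all assumptions chosen for illustration. It only shows the general principle that stepping along the gradient of the loss (rather than against it) on a "forget" set makes the model fit that set worse.

```python
def squared_error(w, x, y):
    """Loss of a one-parameter linear model w * x on a single sample."""
    return (w * x - y) ** 2


def grad_squared_error(w, x, y):
    """Derivative of the squared error with respect to w."""
    return 2.0 * (w * x - y) * x


def unlearn_by_gradient_ascent(w, forget_samples, lr=0.01, steps=20):
    """Take ascent steps (w += lr * grad) on the forget set only.

    Ordinary training would subtract the gradient; adding it instead
    pushes the loss on the forget samples upward.
    """
    for _ in range(steps):
        for x, y in forget_samples:
            w += lr * grad_squared_error(w, x, y)
    return w


# Hypothetical toy data: the model fits this sample well before unlearning.
forget_set = [(2.0, 4.1)]
w_before = 2.0
w_after = unlearn_by_gradient_ascent(w_before, forget_set)

loss_before = sum(squared_error(w_before, x, y) for x, y in forget_set)
loss_after = sum(squared_error(w_after, x, y) for x, y in forget_set)
print("loss on forget set before:", loss_before)
print("loss on forget set after: ", loss_after)
```

In a real diffusion model the same ascent step would be applied to the denoising loss over the network's weights, usually with safeguards so that performance on non-copyrighted content is preserved; those details are outside this sketch.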
Stats
"The CPDM dataset contains 21,000 images with 2,100 anchor images and 18,900 generated images."
"Anchor images include 1,500 in the style category, 200 in portrait category, 200 in artistic creation figure category, and 200 in licensed illustration category."
"Public access link: http://149.104.22.83/unlearning.tar.gz"
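The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the quotes; the variable names are illustrative, and the per-anchor ratio is an average that assumes nothing about how generated images are actually distributed across anchors.

```python
# Anchor image counts per category, as quoted in the stats.
anchors_by_category = {
    "style": 1500,
    "portrait": 200,
    "artistic creation figure": 200,
    "licensed illustration": 200,
}

total_anchors = sum(anchors_by_category.values())
generated_images = 18_900
total_images = total_anchors + generated_images

# Average number of generated images per anchor image.
generated_per_anchor = generated_images / total_anchors

print("anchor images:", total_anchors)
print("total images:", total_images)
print("generated per anchor (average):", generated_per_anchor)
```

The category counts sum to the stated 2,100 anchor images, the overall total matches the stated 21,000, and the figures work out to an average of nine generated images per anchor.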