Core Concepts
Neural networks are vulnerable to data poisoning, and crafting the base samples themselves with guided diffusion, rather than perturbing randomly chosen clean images, yields especially potent poisons and backdoors.
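As a rough illustration of the idea (not the paper's exact algorithm), the guidance signal in a diffusion sampler can be pointed at a poisoning objective instead of a class label: at each reverse-diffusion step, the predicted clean image is nudged along the gradient of the attack loss. The denoiser interface, noise schedule, and poison_loss_fn below are assumptions made for this sketch.

```python
import torch

def guided_poison_sample(denoiser, poison_loss_fn, shape, alphas_cumprod,
                         guidance_scale=1.0):
    """Sketch of guided diffusion sampling where the guidance signal is a
    poisoning objective. `denoiser(x, t)` predicts the noise in x at step t;
    `poison_loss_fn(x0)` scores a candidate clean image for the attack.
    Both are hypothetical stand-ins, as is the DDIM-style update."""
    x = torch.randn(shape)
    for t in reversed(range(len(alphas_cumprod))):
        a_t = alphas_cumprod[t]
        eps_pred = denoiser(x, t).detach()                        # predicted noise
        x0_pred = (x - (1 - a_t).sqrt() * eps_pred) / a_t.sqrt()  # predicted clean image
        # Guidance step: move the predicted clean image downhill on the
        # poisoning loss, so the generated sample is optimized for the attack.
        x0_pred = x0_pred.detach().requires_grad_(True)
        grad, = torch.autograd.grad(poison_loss_fn(x0_pred), x0_pred)
        x0_pred = (x0_pred - guidance_scale * grad).detach()
        # Deterministic (eta = 0) DDIM-style move to the next noise level.
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
        x = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * eps_pred
    return x.clamp(0, 1)
```

Because the base samples are generated rather than fixed, the attacker optimizes over the entire image rather than a small perturbation of a given one.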
Stats
Modern neural networks are often trained on massive web-scraped datasets that receive minimal human inspection. This insecure curation pipeline lets an adversary poison or backdoor the resulting model simply by uploading malicious data to the internet and waiting for a victim to scrape it and train on it.
Existing approaches to crafting poisons and backdoors start from randomly sampled clean data, called base samples, and then perturb those samples into the final poisons; a minimal sketch of this pattern follows.
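The sketch below assumes a PGD-style optimizer, an L-infinity budget, and a generic poison_loss_fn (e.g., a gradient-matching objective); none of these specifics are taken from the paper.

```python
import torch

def craft_poisons(model, base_samples, poison_loss_fn, eps=8 / 255,
                  steps=250, lr=0.1):
    """Sketch of conventional poison crafting: optimize a small additive
    perturbation of clean base samples under an L-infinity budget.
    `poison_loss_fn(model, x)` is a hypothetical attack objective."""
    delta = torch.zeros_like(base_samples, requires_grad=True)
    opt = torch.optim.SGD([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        poison_loss_fn(model, base_samples + delta).backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # stay inside the perturbation budget
            # Keep the poisoned images in the valid pixel range [0, 1].
            delta.copy_((base_samples + delta).clamp(0, 1) - base_samples)
    return (base_samples + delta).detach()
```

The contrast with the guided-diffusion sketch above is the search space: here the attacker can move at most eps away from fixed clean images, whereas generating the base samples removes that constraint.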
In our experiments on CIFAR-10, injecting only 25-50 poisoned samples was enough for the attack to be effective. On ImageNet, modifying only 0.004%-0.008% of the training images was sufficient to poison the model (see the back-of-the-envelope count below).
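For perspective, assuming the standard ImageNet-1k training set of 1,281,167 images (an assumption; the paper's exact split may differ), those percentages work out to roughly 51 and 102 images:

```python
# Back-of-the-envelope: how many images the ImageNet percentages represent.
IMAGENET_TRAIN_SIZE = 1_281_167  # assumed ImageNet-1k training-set size
for pct in (0.004, 0.008):
    n = round(IMAGENET_TRAIN_SIZE * pct / 100)
    print(f"{pct}% of the training set is about {n} images")
# Prints roughly 51 and 102 images, the same order of magnitude
# as the 25-50 poisoned samples on CIFAR-10.
```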
Quotes
"Crafting base samples from scratch allows us to optimize them specifically for the poisoning objective."
"Our approach amplifies the effects of state-of-the-art targeted data poisoning and backdoor attacks across multiple datasets."
"GDP outperforms all existing backdoor attacks in our experiments."