Backdoor attacks can mislead contrastive learning feature extractors into associating a trigger pattern with a target class, causing triggered inputs to be misclassified. The authors propose a bi-level optimization approach to find a resilient backdoor trigger design that maintains high similarity between triggered and target-class data in the embedding space, even under contrastive learning mechanisms such as data augmentation and the uniformity objective.
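A minimal sketch of the trigger-side objective, assuming a PyTorch setup with a stand-in encoder; the paper's full method wraps an objective like this in a bi-level loop that also simulates the victim's contrastive training, and all names and hyperparameters below are illustrative:

```python
# Hedged sketch, not the authors' code: optimize a trigger so that *augmented*
# poisoned images align with target-class data in the encoder's embedding space.
import torch
import torch.nn.functional as F
from torchvision import transforms

encoder = torch.nn.Sequential(               # stand-in for a pretrained SSL encoder
    torch.nn.Conv2d(3, 16, 3, stride=2, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(16, 64),
)
augment = transforms.Compose([               # augmentations the trigger must survive
    transforms.RandomResizedCrop(32, scale=(0.5, 1.0)),
    transforms.RandomHorizontalFlip(),
])

base_imgs = torch.rand(16, 3, 32, 32)        # images the attacker will poison
target_imgs = torch.rand(16, 3, 32, 32)      # reference images from the target class
trigger = torch.zeros(3, 32, 32, requires_grad=True)
opt = torch.optim.Adam([trigger], lr=0.01)

for _ in range(100):
    poisoned = torch.clamp(base_imgs + trigger, 0, 1)
    z_poison = F.normalize(encoder(augment(poisoned)), dim=1)
    z_target = F.normalize(encoder(augment(target_imgs)), dim=1).mean(0)
    z_target = F.normalize(z_target, dim=0)
    loss = -(z_poison @ z_target).mean()     # maximize cosine similarity to target class
    opt.zero_grad()
    loss.backward()
    opt.step()
    trigger.data.clamp_(-8 / 255, 8 / 255)   # keep the trigger imperceptible
```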
The authors present a framework for secure and efficient private inference of deep neural networks that partitions model layers between a trusted execution environment (TEE) and a GPU accelerator, balancing privacy preservation with computational efficiency.
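A toy sketch of the partitioning idea; the split point, the model, and the CPU stand-in for the TEE are assumptions for illustration, not the paper's actual system:

```python
# Illustrative only: run privacy-critical early layers "inside the TEE"
# (simulated here as CPU execution) and offload the rest to the GPU.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),          # early layers see data closest to raw inputs
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
SPLIT = 2                                    # first SPLIT modules stay in the enclave
use_gpu = torch.cuda.is_available()

tee_part = model[:SPLIT]                     # trusted execution environment
gpu_part = model[SPLIT:].cuda() if use_gpu else model[SPLIT:]

def private_inference(x: torch.Tensor) -> torch.Tensor:
    """Raw inputs never leave the TEE; only intermediate features reach the GPU."""
    with torch.no_grad():
        h = tee_part(x)                      # sensitive computation in the TEE
        h = h.cuda() if use_gpu else h
        return gpu_part(h).cpu()             # heavy computation on the accelerator

print(private_inference(torch.rand(1, 784)).shape)   # torch.Size([1, 10])
```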
This work provides a sound mathematical formulation proving the existence of an optimal explanation-variance threshold that an adversary can exploit to launch membership inference attacks against machine learning models.
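A hedged sketch of the attack such a threshold enables, assuming a gradient-based explanation; the model, the threshold value tau, and the decision rule are placeholders rather than the paper's formulation:

```python
# Sketch only: threshold the variance of an input-gradient explanation to
# predict membership. Threshold and decision direction are illustrative.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

def explanation_variance(x: torch.Tensor) -> float:
    """Variance of the input-gradient explanation for a single sample."""
    x = x.clone().requires_grad_(True)
    model(x).max().backward()                # explain the top predicted class
    return x.grad.var().item()

def infer_membership(x: torch.Tensor, tau: float = 1e-3) -> bool:
    # Training points tend to lie in flatter regions of the loss surface,
    # so low explanation variance is taken as evidence of membership.
    return explanation_variance(x) < tau

print(infer_membership(torch.rand(20)))
```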
Unlabeled data can be maliciously poisoned to inject backdoors into self-supervised learning models, even without any label information.
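As a hedged illustration (the paper's poisoning strategy is more involved), the core primitive is simple: stamp a small trigger onto a fraction of the unlabeled images the attacker wants tied to the target concept, without touching any labels. Patch size, value, and poisoning rate below are example choices.

```python
# Sketch: label-free poisoning of an unlabeled corpus with a trigger patch.
import torch

def add_trigger(img: torch.Tensor, size: int = 4) -> torch.Tensor:
    """Stamp a white square in the bottom-right corner of a CHW image."""
    img = img.clone()
    img[:, -size:, -size:] = 1.0
    return img

def poison_unlabeled(images: torch.Tensor, rate: float = 0.01) -> torch.Tensor:
    """Poison a `rate` fraction of an unlabeled image batch; no labels needed."""
    images = images.clone()
    idx = torch.randperm(len(images))[: max(1, int(rate * len(images)))]
    for i in idx:
        images[i] = add_trigger(images[i])
    return images

unlabeled = torch.rand(1000, 3, 32, 32)      # stand-in for the unlabeled corpus
poisoned_corpus = poison_unlabeled(unlabeled)
```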
Distribution inference attacks aim to infer statistical properties of data used to train machine learning models. The authors develop a new black-box attack that outperforms the best known white-box attack in most settings, and evaluate the impact of relaxing assumptions about the adversary's knowledge. They also find that while noise-based defenses provide little mitigation, a simple re-sampling defense can be highly effective.
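A minimal sketch of the re-sampling defense; the attribute encoding, target ratio, and sampling with replacement are assumptions for illustration. The data owner re-samples the training data to a fixed attribute ratio so the released model carries no signal about the true ratio.

```python
# Sketch: train on data re-sampled to a fixed sensitive-attribute ratio
# (here 0.5), hiding the true training distribution.
import numpy as np

def resample_to_ratio(X, attr, target_ratio=0.5, n_out=None, seed=0):
    """Return a dataset whose attribute ratio equals target_ratio."""
    rng = np.random.default_rng(seed)
    pos, neg = np.where(attr == 1)[0], np.where(attr == 0)[0]
    n_out = n_out or len(X)
    n_pos = int(target_ratio * n_out)
    idx = np.concatenate([
        rng.choice(pos, n_pos, replace=True),            # with replacement, so
        rng.choice(neg, n_out - n_pos, replace=True),    # any ratio is reachable
    ])
    return X[rng.permutation(idx)]

X = np.random.rand(1000, 20)
attr = (np.random.rand(1000) < 0.8).astype(int)   # true ratio ~0.8, to be hidden
X_defended = resample_to_ratio(X, attr)           # trained-on ratio is now 0.5
```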
Quantization increases the average distance of points to the decision boundary, making it harder for attacks to optimize over the loss surface; depending on the noise magnitude, it can act as a noise attenuator or amplifier, and it also causes gradient misalignment. Training-based defenses improve adversarial robustness by further increasing the average distance to the decision boundary, but they still need to address quantization shift and gradient misalignment.
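A toy numerical illustration of the attenuator/amplifier behavior; the step size and perturbation values are arbitrary examples. Perturbations below half the quantization step are rounded away, while perturbations that push a value past a rounding boundary get magnified to a full step.

```python
# Toy example: uniform quantization attenuates small perturbations but
# amplifies ones that cross a rounding boundary (quantization shift).
import numpy as np

def quantize(x, step=0.1):
    """Uniform quantizer with the given step size."""
    return np.round(np.asarray(x) / step) * step

x = 0.30
for eps in (0.02, 0.06):                               # below / above half a step
    dq = float(quantize(x + eps) - quantize(x))        # noise seen after quantization
    print(f"input noise {eps:.2f} -> post-quantization noise {dq:.2f}")
# input noise 0.02 -> post-quantization noise 0.00   (attenuated)
# input noise 0.06 -> post-quantization noise 0.10   (amplified)
```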
Unlearning inversion attacks can reveal the features and labels of unlearned data by exploiting the difference between the original model and the model produced by machine unlearning.
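A hedged sketch of the inversion step; the models, sizes, and the KL-divergence objective are placeholders, not the paper's exact attack. The idea is to optimize a synthetic input on which the pre- and post-unlearning models disagree the most, then read off a label guess from the original model.

```python
# Sketch only: recover information about unlearned data by searching for an
# input that maximizes disagreement between the two model versions.
import torch
import torch.nn.functional as F

original = torch.nn.Sequential(torch.nn.Linear(784, 128), torch.nn.ReLU(),
                               torch.nn.Linear(128, 10))    # model before unlearning
unlearned = torch.nn.Sequential(torch.nn.Linear(784, 128), torch.nn.ReLU(),
                                torch.nn.Linear(128, 10))   # model after unlearning

x = torch.rand(1, 784, requires_grad=True)   # candidate reconstruction
opt = torch.optim.Adam([x], lr=0.05)

for _ in range(200):
    # Maximize the divergence between the two predictive distributions:
    # the unlearned sample is where the models' behavior differs the most.
    div = F.kl_div(F.log_softmax(unlearned(x), dim=1),
                   F.softmax(original(x), dim=1), reduction="batchmean")
    loss = -div
    opt.zero_grad()
    loss.backward()
    opt.step()
    x.data.clamp_(0, 1)                      # keep the input in a valid range

recovered_label = original(x).argmax(dim=1)  # guess the unlearned sample's label
```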
Integrating foundation models into federated learning systems introduces new vulnerabilities that can be exploited by adversaries through a novel attack strategy, highlighting the critical need for enhanced security measures.
Adversaries can poison pre-trained models to significantly increase the success rate of membership inference attacks, even when victims fine-tune the models using their own private datasets.
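A hedged sketch of the downstream measurement rather than the poisoning procedure itself, with a placeholder model, data, and threshold: poisoning the pre-trained weights widens the member/non-member loss gap that a standard loss-threshold membership inference test exploits against the fine-tuned model.

```python
# Sketch of a loss-threshold membership inference test on the victim's model.
import torch
import torch.nn.functional as F

def membership_score(model, x, y):
    """Lower loss on (x, y) -> more likely it was in the fine-tuning set."""
    with torch.no_grad():
        return -F.cross_entropy(model(x), y, reduction="none")

def infer_members(model, x, y, threshold=-0.5):
    return membership_score(model, x, y) > threshold

victim = torch.nn.Linear(32, 4)              # stand-in for the fine-tuned model
x, y = torch.rand(8, 32), torch.randint(0, 4, (8,))
print(infer_members(victim, x, y))           # boolean membership guesses
```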
An attacker can tamper with the weights of a pre-trained machine learning model to create "privacy backdoors" that enable the reconstruction of individual training samples used to fine-tune the model.
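A simplified weight-trap sketch in the spirit of this attack; the architecture, trap parameters, and one-step fine-tuning loop are illustrative. One first-layer neuron is tuned so that essentially only one fine-tuning sample activates it; since that row's gradient is the captured input scaled by the bias gradient, dividing the weight delta by the bias delta reconstructs the sample.

```python
# Hedged sketch: a planted "trap" neuron captures one fine-tuning sample,
# which the attacker recovers from the published weight difference.
import copy
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, k, classes = 64, 16, 4
model = torch.nn.Sequential(torch.nn.Linear(d, k), torch.nn.ReLU(),
                            torch.nn.Linear(k, classes))

secret = torch.randn(d)                       # the sample the trap should capture
with torch.no_grad():                         # plant the trap in neuron 0
    model[0].weight[0] = secret / secret.norm()
    model[0].bias[0] = -0.5 * secret.norm()   # fires (almost) only for `secret`
backdoored = copy.deepcopy(model)             # attacker keeps the released weights

# Victim fine-tunes on private data that happens to contain `secret`.
data = torch.randn(32, d)
data[7] = secret
labels = torch.randint(0, classes, (32,))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss = F.cross_entropy(model(data), labels)
opt.zero_grad()
loss.backward()
opt.step()

# For the trap row, grad_W = g * x and grad_b = g for the single firing
# sample x, so the ratio of the weight and bias deltas recovers x.
dW = model[0].weight[0] - backdoored[0].weight[0]
db = model[0].bias[0] - backdoored[0].bias[0]
reconstruction = dW / db
print(F.cosine_similarity(reconstruction, secret, dim=0))  # ~1.0
```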