Core Concepts
SCOOTER provides a human evaluation framework for statistically sound assessment of the imperceptibility of unrestricted adversarial attacks on machine learning models.
Abstract
The paper proposes SCOOTER, a human evaluation framework for assessing the imperceptibility of unrestricted adversarial examples (AEs) in the image domain. Unrestricted AEs are maliciously perturbed data points that appear natural to human observers yet reliably mislead state-of-the-art machine learning models.
The key highlights of the framework are:
Online study design: The framework outlines a 13-minute online study on the Prolific platform, with carefully designed prescreening, colorblindness, and comprehension checks to ensure high-quality participant data.
Continuous rating scale: Instead of a binary "modified" or "unmodified" choice, participants rate the degree of modification on a continuous scale from -100 (100% certain unmodified) to +100 (100% certain modified). This captures finer nuances of attack perceptibility.
Empirical sample size estimation: Rather than relying on a priori estimates alone, the authors conduct preliminary studies to empirically determine the sample size needed for statistically significant results.
Modular web application: The authors provide a ready-to-use web application with a modular design, allowing researchers to easily integrate their own AEs and conduct the human evaluation studies.
Leaderboard and image database: The framework includes an online leaderboard for comparing the imperceptibility of different unrestricted attacks across target models, as well as a database of the generated AEs for further analysis.
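The continuous rating scale and the empirical sample-size estimation described above can be combined into a simple analysis loop. The paper's exact statistical procedure is not detailed in this summary, so the sketch below is an illustrative assumption: it summarizes ratings on the -100..+100 scale with a mean and a normal-approximation 95% confidence interval, and uses a hypothetical CI-width stopping rule to decide whether more participants are needed. The function names and the precision threshold are invented for illustration.

```python
import math
import statistics

def rating_summary(ratings):
    """Mean rating and a normal-approximation 95% CI
    for ratings on the -100 (unmodified) .. +100 (modified) scale."""
    n = len(ratings)
    mean = statistics.fmean(ratings)
    sem = statistics.stdev(ratings) / math.sqrt(n)
    half = 1.96 * sem  # normal-approximation critical value (illustrative)
    return mean, (mean - half, mean + half)

def need_more_samples(ratings, max_ci_halfwidth=10.0):
    """Hypothetical empirical stopping rule: keep recruiting participants
    until the CI half-width falls below a preset precision target."""
    _, (lo, hi) = rating_summary(ratings)
    return (hi - lo) / 2 > max_ci_halfwidth

# Hypothetical pilot ratings for one attack on one image set.
pilot = [12, -4, 30, 8, -15, 22, 5, 18, -2, 10]
mean, (lo, hi) = rating_summary(pilot)
print(f"mean={mean:+.1f}, 95% CI=({lo:.1f}, {hi:.1f})")
# If the whole CI sits above 0, participants lean toward "modified"
# (the attack is perceptible); if it straddles 0 and is still wide,
# the empirical rule says to collect more ratings.
```

The continuous scale is what makes this kind of analysis possible: a binary "modified/unmodified" choice would only support proportion tests, whereas graded ratings carry information about how confident participants are.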
The proposed SCOOTER framework aims to facilitate rigorous research into unrestricted adversarial examples by giving researchers a human evaluation protocol designed to yield statistically significant results, along with the supporting tools to run it.