Core Concepts
Incorporating human-perceptual saliency into model training can improve generalization in biometric presentation attack detection and synthetic face detection tasks. The optimal granularity and source of saliency vary across biometric modalities and model architectures.
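Saliency-guided training of this kind is typically implemented by adding, to the usual classification loss, a term that penalizes divergence between the model's attention map (e.g., a class activation map) and a human-annotated saliency mask. The sketch below is a minimal, hypothetical NumPy illustration of such a combined loss, not the paper's exact formulation; the function name, the L2 alignment term, and the `alpha` weighting are assumptions for illustration.

```python
import numpy as np

def saliency_guided_loss(logits, label, model_cam, human_saliency, alpha=0.5):
    """Hypothetical combined loss: softmax cross-entropy on the class logits
    plus an L2 penalty aligning the model's attention map with a human
    saliency mask. `model_cam` and `human_saliency` are same-shape 2D maps
    normalized to [0, 1]; `alpha` trades off the two terms (assumed form)."""
    # Classification term: numerically stable softmax cross-entropy
    exps = np.exp(logits - logits.max())
    probs = exps / exps.sum()
    ce = -np.log(probs[label] + 1e-12)
    # Alignment term: mean squared difference between the attention maps
    align = np.mean((model_cam - human_saliency) ** 2)
    return (1 - alpha) * ce + alpha * align
```

With identical maps the alignment term vanishes and the loss reduces to the (scaled) cross-entropy; a mismatch between model attention and human saliency increases the loss, pushing the model toward human-perceived regions during training.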
Abstract
The paper explores the impact of saliency granularity and source on the generalization performance of models for iris presentation attack detection (iris-PAD) and synthetic face detection tasks.
For iris-PAD, the authors find that saliency at the Area of Interest (AOI) granularity, whether sourced from human subjects or from models trained to mimic human saliency, yields the best generalization across different CNN architectures. Fine-grained saliency at the Features of Interest (FOI) level provides no additional benefit over the simpler AOI saliency.
For synthetic face detection, the optimal saliency granularity depends more on the model architecture. ResNet models perform best with Boundary of Interest (BOI) saliency, while DenseNet and Inception models benefit more from FOI saliency. Saliency from models mimicking human subjects also provides substantial gains over the baseline for some architectures.
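To make the three granularity levels concrete: starting from a fine-grained FOI mask, a coarser AOI mask covers the whole salient region, while a BOI mask keeps only the region's boundary. A minimal NumPy sketch of one plausible way to derive the coarser levels from an FOI mask (the bounding-box fill for AOI and the 4-neighbor boundary test for BOI are illustrative assumptions, not the paper's procedure):

```python
import numpy as np

def coarsen_saliency(foi, mode="AOI"):
    """Hypothetical coarsening of a fine-grained FOI mask (binary 2D array).
    AOI: fill the bounding box of all salient pixels (one coarse region).
    BOI: keep only boundary pixels, i.e. salient pixels with at least one
    non-salient 4-neighbor."""
    foi = foi.astype(bool)
    if mode == "AOI":
        ys, xs = np.nonzero(foi)
        aoi = np.zeros_like(foi)
        if ys.size:
            aoi[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = True
        return aoi
    # BOI: pad by one pixel, then require all four neighbors to be salient
    p = np.pad(foi, 1)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return foi & ~interior
```

For a solid 3x3 salient block inside a 5x5 image, AOI reproduces the filled block (9 pixels) while BOI keeps only its 8-pixel ring, illustrating how much spatial detail each granularity retains.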
The authors also explore saliency from domain-specific segmentation models but find it less effective than saliency sourced from humans or from models mimicking human saliency. The results suggest that human involvement in the saliency generation process is crucial for achieving the best generalization performance.
Stats
The paper reports Area Under the Curve (AUC) scores for model generalization performance on iris-PAD and synthetic face detection tasks.
Quotes
"Our results suggest that the quantity of saliency contributes to model generalization more than its quality (depending on the biometric modality)."
"We find that substantial performance gains can be made within saliency-based training by using optimal salience granularity with no additional overhead."