Efficient Illicit Object Detection in X-Ray Scans Using Vision Transformers and Hybrid Architectures
This paper systematically evaluates Vision Transformers and hybrid architectures for illicit item detection in X-ray images. It finds that the DINO Transformer detector is highly accurate in the low-data regime, that YOLOv8 achieves real-time performance, and that NextViT is an effective backbone.