Core Concepts
A novel Feature Corrective Transfer Learning (FCTL) approach that leverages transfer learning and a custom loss function to enable end-to-end object detection in challenging non-ideal visual conditions, without the need for image preprocessing.
Abstract
The paper introduces a Feature Corrective Transfer Learning (FCTL) framework to address the challenge of robust object detection under non-ideal visual conditions, such as rain, fog, low illumination, or raw Bayer images without ISP processing.
The key aspects of the methodology are:
Initial training of a comprehensive object detection model (Faster R-CNN) on a pristine RGB dataset to establish a strong baseline.
Generation of non-ideal image versions (e.g., rainy, foggy, low-light, raw Bayer) from the original dataset.
Fine-tuning of the same model on the non-ideal images using a novel loss function, the Extended Area Novel Structural Discrepancy Loss (EANSDL), which compares the feature maps produced from ideal images with those produced from the corresponding non-ideal images. This allows direct feature-map correction without modifying the underlying model architecture.
The EANSDL loss function adaptively balances fine-grained, pixel-level feature discrepancies against broader spatial-pattern alignment, dynamically adjusting how gradient consistency is evaluated across the hierarchical levels of the feature pyramid.
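The paper's exact EANSDL formulation is not reproduced in this summary; the sketch below is only an illustrative PyTorch approximation of the behaviour described above, combining a pixel-level feature discrepancy with a Sobel-based gradient consistency term and a per-level weighting across FPN feature maps. The function names and the `level_weights` scheme are assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def sobel_gradients(fmap):
    """Approximate spatial gradients of a feature map with Sobel kernels."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=fmap.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    c = fmap.shape[1]
    gx = F.conv2d(fmap, kx.repeat(c, 1, 1, 1), padding=1, groups=c)
    gy = F.conv2d(fmap, ky.repeat(c, 1, 1, 1), padding=1, groups=c)
    return gx, gy

def eansdl_like_loss(ideal_feats, nonideal_feats,
                     level_weights=(1.0, 0.8, 0.6, 0.4, 0.2)):
    """Compare FPN feature maps from the ideal-trained reference model with those
    from the model being fine-tuned on non-ideal images.

    ideal_feats / nonideal_feats: dicts mapping level name -> (N, C, H, W) tensors,
    e.g. the OrderedDict returned by torchvision's FPN backbone.
    """
    loss = 0.0
    for w, key in zip(level_weights, ideal_feats.keys()):
        f_ideal, f_nonideal = ideal_feats[key], nonideal_feats[key]
        # Pixel-level discrepancy between the two feature maps.
        pixel_term = F.l1_loss(f_nonideal, f_ideal)
        # Structural term: consistency of spatial gradients.
        gx_i, gy_i = sobel_gradients(f_ideal)
        gx_n, gy_n = sobel_gradients(f_nonideal)
        grad_term = F.l1_loss(gx_n, gx_i) + F.l1_loss(gy_n, gy_i)
        # Level-dependent weight shifts emphasis from pixel detail at finer levels
        # to broader spatial patterns at coarser levels (illustrative choice).
        loss = loss + w * pixel_term + (1.0 - w) * grad_term
    return loss
```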
The proposed Non-Ideal Image Transfer Faster R-CNN (NITF-RCNN) model, which incorporates the FCTL approach, demonstrates significant improvements in mean Average Precision (mAP) compared to the baseline Faster R-CNN model, with relative gains of 3.8-8.1% under various non-ideal conditions. The model's performance on non-ideal datasets also approaches that of the baseline on the original ideal dataset, showcasing its robustness and versatility.
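To make the fine-tuning step described above concrete, here is a minimal sketch of how it could be wired up with torchvision's Faster R-CNN, assuming paired ideal/non-ideal images and the illustrative `eansdl_like_loss` above. The checkpoint path, `paired_loader`, and `LAMBDA` weighting are assumptions for illustration, not the paper's settings, and the sketch skips the normalization that the model's internal transform would normally apply before the backbone.

```python
import torch
import torchvision

# Reference model trained on pristine RGB images (weights frozen).
reference = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
reference.load_state_dict(torch.load("faster_rcnn_ideal.pth"))  # hypothetical checkpoint
reference.eval()
for p in reference.parameters():
    p.requires_grad_(False)

# Model fine-tuned on the non-ideal (rainy/foggy/dark/raw) images,
# initialised from the same ideal-trained weights.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
model.load_state_dict(torch.load("faster_rcnn_ideal.pth"))
model.train()

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
LAMBDA = 1.0  # weight of the feature-correction term (assumed value)

for ideal_imgs, nonideal_imgs, targets in paired_loader:  # hypothetical paired dataloader
    # Standard detection losses on the non-ideal images.
    det_losses = model(nonideal_imgs, targets)
    det_loss = sum(det_losses.values())

    # FPN feature maps from both models for the feature-correction term.
    with torch.no_grad():
        ideal_feats = reference.backbone(torch.stack(ideal_imgs))
    nonideal_feats = model.backbone(torch.stack(nonideal_imgs))
    fc_loss = eansdl_like_loss(ideal_feats, nonideal_feats)

    loss = det_loss + LAMBDA * fc_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```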
Stats
The Rainy-KITTI dataset has 7 different rain intensity levels, and the Foggy-KITTI dataset has 7 different visibility conditions due to fog.
The Dark-KITTI dataset was generated using the UNIT unsupervised image-to-image translation algorithm to create realistic night-time images from the KITTI and BDD100K datasets.
The Raw-KITTI dataset was generated by applying a method from prior work to create synthetic color Bayer images from the original KITTI dataset.
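The exact raw-synthesis pipeline follows prior work and is not detailed in this summary; the snippet below is only a minimal illustration, assuming an RGGB pattern, of how a color Bayer mosaic can be sampled from an RGB image.

```python
import numpy as np

def rgb_to_bayer_rggb(rgb):
    """Sample an RGGB Bayer mosaic from an (H, W, 3) RGB image.
    Illustrative only; Raw-KITTI itself follows a prior-work pipeline.
    """
    h, w, _ = rgb.shape
    bayer = np.zeros((h, w), dtype=rgb.dtype)
    bayer[0::2, 0::2] = rgb[0::2, 0::2, 0]  # R at even rows, even cols
    bayer[0::2, 1::2] = rgb[0::2, 1::2, 1]  # G at even rows, odd cols
    bayer[1::2, 0::2] = rgb[1::2, 0::2, 1]  # G at odd rows, even cols
    bayer[1::2, 1::2] = rgb[1::2, 1::2, 2]  # B at odd rows, odd cols
    return bayer
```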
Quotes
"Feature Corrective Transfer Learning (FCTL), a novel approach that leverages transfer learning and a bespoke loss function to facilitate the end-to-end detection of objects in these challenging scenarios without the need to convert non-ideal images into their RGB counterparts."
"By prioritizing direct feature map correction over traditional preprocessing, this process iteratively enhances the model's ability to detect objects under adverse conditions."