Core Concepts
Visual description regularization significantly improves zero-shot detection accuracy for unseen object classes in aerial images.
Abstract
Existing object detection models rely on large-scale labeled datasets, which makes them hard to extend to novel aerial object classes because annotation is expensive.
The proposed method, DescReg, addresses the weak semantic-visual correlation of aerial objects by incorporating textual descriptions of their visual appearance.
Extensive experiments on the DIOR, xView, and DOTA datasets show that DescReg outperforms state-of-the-art zero-shot detection (ZSD) methods.
DescReg integrates structural regularization so that the inter-class similarity conveyed by the descriptions is reflected in the learned representations, which helps transfer knowledge from seen to unseen classes.
Contributions include an in-depth analysis of zero-shot aerial object detection, the design of the method, and validation on three challenging datasets.
Future research directions include exploring non-uniform spatial processing and incorporating label-efficient methods.
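The structural regularization idea mentioned in the abstract can be illustrated with a small sketch. This is not the loss used by DescReg, which this summary does not specify. It is a generic similarity-matching penalty, written under the assumption that each class has a description vector and a learned class embedding. The function names and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(vectors):
    """Pairwise cosine similarities between all classes."""
    return [[cosine(u, v) for v in vectors] for u in vectors]

def structure_penalty(desc_vecs, class_embs):
    """Mean squared gap between description-space and embedding-space
    similarities, taken over all distinct class pairs.
    Illustrative only; not the DescReg objective."""
    s_desc = similarity_matrix(desc_vecs)
    s_emb = similarity_matrix(class_embs)
    n = len(desc_vecs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum((s_desc[i][j] - s_emb[i][j]) ** 2 for i, j in pairs) / len(pairs)

# Hypothetical description vectors: classes 0 and 1 look alike, class 2 differs.
desc = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

# Embeddings that keep the same similarity structure (scaled copies).
aligned = [[2.0, 0.0], [1.8, 0.2], [0.0, 2.0]]

# Embeddings that place the two look-alike classes far apart.
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

print(structure_penalty(desc, aligned))
print(structure_penalty(desc, misaligned))
```

The penalty is close to zero when the embeddings preserve the similarity structure of the descriptions, and it grows when classes that are described as visually similar end up far apart in the embedding space.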
Stats
"DescReg significantly outperforms the best reported ZSD method on DIOR by 4.5 mAP on unseen classes and 8.1 in HM."
"DescReg achieves nearly two-fold improvement in unseen mAP compared to the best-performing ContrastZSD method on xView and DOTA datasets."
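The HM figure quoted above can be read with the help of a short sketch. It assumes HM denotes the harmonic mean of seen-class and unseen-class mAP, as is common in generalized zero-shot detection evaluation. The scores used below are hypothetical and are not results from the paper.

```python
def harmonic_mean(seen_map: float, unseen_map: float) -> float:
    """Harmonic mean of seen-class and unseen-class mAP.
    Returns 0.0 when both scores are zero to avoid dividing by zero."""
    total = seen_map + unseen_map
    if total == 0:
        return 0.0
    return 2.0 * seen_map * unseen_map / total

# Hypothetical scores with the same arithmetic mean (35.0) but different balance.
print(harmonic_mean(60.0, 10.0))  # strong on seen classes, weak on unseen
print(harmonic_mean(35.0, 35.0))  # balanced across seen and unseen
```

Because the harmonic mean is dominated by the smaller of the two values, a detector cannot score well on HM by excelling on seen classes alone, which is why it is reported alongside unseen mAP.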
Quotes
"Our method is extensively validated on three challenging aerial object detection datasets and shows significantly improved performance to the prior ZSD methods."
"We hope our method and newly established experimental setups provide a baseline for future research."