Core Concepts
The Lower Biased Teacher model improves the accuracy of pseudo-label generation in semi-supervised object detection tasks by integrating a localization loss into the teacher model, addressing key issues such as class imbalance and bounding box precision.
Abstract
This research proposes the Lower Biased Teacher (LBT) model, an enhancement of the Unbiased Teacher (UBT) model, for semi-supervised object detection tasks. The key innovation of the LBT model is the integration of a localization loss into the teacher model, which significantly improves the accuracy of pseudo-label generation.
The paper first provides background on the challenges of semi-supervised object detection, including dataset imbalances, unclear differentiation between foreground and background, and the discrepancy between classification and object detection tasks. It then reviews relevant literature on semi-supervised learning methods, such as pseudo-labeling, consistency regularization, and the Mean Teacher framework.
The LBT model builds upon the UBT and Consistency-based Semi-Supervised Learning for Object Detection (CSD) models. During the initial "burn-in" phase, the LBT incorporates the CSD method to enable the model to learn more precise and robust feature representations from labeled data using both original and flipped images. It then introduces the Teacher-Student Mutual Learning regimen, where the Student is optimized using the pseudo-labels generated by the Teacher, and the Teacher is updated by gradually transferring the weights from the continually learned Student model.
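The "gradual weight transfer" from Student to Teacher described above is commonly realized as an exponential moving average (EMA) of the parameters. A minimal sketch, assuming parameters are stored as name-to-value dicts and that `alpha=0.999` is a typical smoothing factor (real implementations update full model tensors in place):

```python
def ema_update(teacher, student, alpha=0.999):
    """Per-parameter update: teacher <- alpha * teacher + (1 - alpha) * student.

    A high alpha means the Teacher changes slowly, smoothing out noise in the
    continually learned Student's weights and stabilizing pseudo-label quality.
    """
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}

# Toy example with a single scalar "parameter".
teacher = {"w": 1.0}
student = {"w": 0.0}
teacher = ema_update(teacher, student, alpha=0.9)  # w becomes 0.9
```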
To address the issues of duplicated box predictions and imbalanced predictions, the LBT applies class-wise non-maximum suppression (NMS) and replaces cross-entropy loss with focal loss in the ROI head classifier. Additionally, it adds a Consistency Localization Loss to the supervised loss to enhance the model's generalization ability on unlabeled data.
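The two mechanisms above can be sketched in plain Python. This is illustrative only: the paper's versions run inside the detector itself, and the IoU threshold and the focal-loss `alpha`/`gamma` values here are assumed defaults, not values taken from the paper.

```python
import math

def iou(a, b):
    """Intersection-over-union for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def classwise_nms(boxes, scores, labels, iou_thresh=0.5):
    """Run NMS independently per class, so overlapping boxes of *different*
    classes never suppress each other; returns kept indices in sorted order."""
    keep = []
    for cls in set(labels):
        idxs = sorted((i for i, l in enumerate(labels) if l == cls),
                      key=lambda i: scores[i], reverse=True)
        while idxs:
            best = idxs.pop(0)          # highest-scoring remaining box
            keep.append(best)
            idxs = [i for i in idxs if iou(boxes[best], boxes[i]) < iou_thresh]
    return sorted(keep)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction: the (1 - p_t)^gamma factor
    down-weights easy, well-classified examples, easing class imbalance."""
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For example, two heavily overlapping boxes of the same class collapse to one detection, while the same pair with different class labels both survive class-wise NMS.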
Extensive experiments on the MS-COCO and PASCAL VOC datasets demonstrate that the LBT model outperforms the UBT and CSD models, especially when the amount of labeled data is limited (0.5% to 10%). The improvements are attributed to the LBT's ability to generate more accurate pseudo-labels, address class imbalance, and mitigate errors from incorrect bounding boxes.
Statistics
The COCO dataset contains over 2.5 million labeled instances in over 328,000 images, covering 91 object types.
The PASCAL VOC dataset consists of tens of thousands of images and covers 20 different object classes.
Quotations
"The primary innovation of this model is the integration of a localization loss into the teacher model, which significantly improves the accuracy of pseudo-label generation."
"By addressing key issues such as class imbalance and the precision of bounding boxes, the Lower Biased Teacher model demonstrates superior performance in object detection tasks."
"Extensive experiments on multiple semi-supervised object detection datasets show that the Lower Biased Teacher model not only reduces the pseudo-labeling bias caused by class imbalances but also mitigates errors arising from incorrect bounding boxes."