Lin, S., Kou, Y., Li, B., Hu, W., & Gao, J. (2024). HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision. arXiv preprint arXiv:2411.06780.
This paper addresses the challenge of optimizing end-to-end camera-based 3D multi-object tracking (MOT) systems, particularly the competition between track queries (for tracking existing objects) and object queries (for detecting new objects) within the popular tracking-by-query-propagation paradigm.
The researchers propose HSTrack, a plug-and-play method that introduces a parallel decoder alongside the standard transformer decoder in the tracking model. This parallel decoder shares weights with the standard decoder but lacks self-attention layers, mitigating the competition between query types. HSTrack employs hybrid supervision, using one-to-one label assignment for track queries and one-to-many assignment for object queries in the parallel decoder. Additionally, it incorporates associative supervision based on an affinity matrix to enhance the learning of discriminative representations for both query types.
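The contrast between the two label-assignment schemes can be illustrated with a toy example. The sketch below is not the authors' implementation: the cost matrix values, the function names `one_to_one_assign` and `one_to_many_assign`, the brute-force matching, and the top-k rule for one-to-many assignment are all assumptions made for illustration, since the summary does not say how HSTrack computes either assignment.

```python
from itertools import permutations


def one_to_one_assign(cost):
    """Return {gt_index: query_index} with minimum total cost.

    cost[g][q] is a hypothetical matching cost between ground-truth object g
    and query q. Each query is used at most once. Brute force over
    permutations is fine for a toy example but not for real workloads.
    """
    num_gt = len(cost)
    num_queries = len(cost[0])
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(num_queries), num_gt):
        total = sum(cost[g][q] for g, q in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return {g: q for g, q in enumerate(best_perm)}


def one_to_many_assign(cost, k):
    """Return {gt_index: [query indices]} with the k cheapest queries per object.

    Assumed top-k rule: a query may be assigned to more than one ground truth.
    """
    return {
        g: sorted(range(len(row)), key=lambda q: row[q])[:k]
        for g, row in enumerate(cost)
    }


# Toy costs: 2 ground-truth objects (rows) against 4 queries (columns).
cost = [
    [0.9, 0.1, 0.4, 0.8],
    [0.2, 0.3, 0.7, 0.6],
]

track_targets = one_to_one_assign(cost)         # one positive query per object
object_targets = one_to_many_assign(cost, k=2)  # several positive queries per object
print(track_targets)
print(object_targets)
```

With one-to-one assignment each object supervises a single query, whereas one-to-many assignment gives each object several positive queries, which is the denser supervision the summary attributes to the object queries in the parallel decoder.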
According to the authors, HSTrack is a simple yet effective way to improve the optimization and performance of end-to-end 3D MOT systems. By mitigating competition between query types and applying hybrid supervision, it is reported to improve accuracy in both object detection and tracking.
This research contributes to the field of computer vision and autonomous driving by advancing the development of more accurate and efficient 3D MOT systems. The proposed method has the potential to enhance the performance of perception systems in self-driving vehicles and other applications that rely on robust object tracking.
The study primarily focuses on the nuScenes dataset and a specific tracking paradigm. Future research could explore the generalizability of HSTrack to other datasets and tracking paradigms. Additionally, investigating the impact of different training sample lengths and label assignment strategies could further optimize the method's performance.
Key insights distilled from: Shubo Lin, Y... at arxiv.org, 11-12-2024
https://arxiv.org/pdf/2411.06780.pdf