Core Concepts
A spatio-temporal tracker that combines bi-directional memory with Gaussian mask filtering improves the accuracy of 3D single object tracking in LiDAR point clouds.
Abstract
The paper introduces the STMD-Tracker, a method for 3D single object tracking in LiDAR point clouds. It targets a known weakness of existing methods: tracker drift caused by similar-looking objects or occlusions. The approach combines a multi-frame spatio-temporal graph convolution backbone, a bi-directional cross-frame memory module, and Gaussian mask filtering to improve tracking precision and reduce errors caused by distractors. Extensive experiments on the KITTI, nuScenes, and Waymo datasets demonstrate superior performance compared to state-of-the-art methods.
Abstract:
- 3D single object tracking in LiDAR point clouds is crucial for computer vision applications.
- Existing methods face challenges like tracker drift due to similar objects or occlusions.
- The STMD-Tracker addresses these issues with a spatio-temporal backbone, a cross-frame memory module, and Gaussian mask filtering.
Introduction:
- Deep learning approaches have advanced 2D single object tracking but face challenges in 3D point cloud tracking.
- Siamese trackers primarily use matching-based or motion-based methods but overlook historical frame contextual information.
Methodology:
- The STMD-Tracker integrates a multi-frame spatio-temporal graph convolution backbone with a bi-directional cross-frame memory module.
- A Gaussian mask is applied to filter out distractor points for accurate localization.
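The idea of Gaussian mask filtering can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an isotropic Gaussian centered on an estimated target center, and the function names, `sigma`, and `threshold` are hypothetical choices made for illustration. Points far from the center receive weights near zero and are dropped as likely distractors.

```python
import numpy as np

def gaussian_mask(points, center, sigma=1.0):
    """Weight each 3D point by an isotropic Gaussian of its distance to `center`.

    Illustrative assumption: the summary does not specify the mask's exact form.
    """
    sq_dist = np.sum((points - center) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def filter_distractors(points, center, sigma=1.0, threshold=0.5):
    """Keep points whose Gaussian weight is at least `threshold` (hypothetical rule)."""
    weights = gaussian_mask(points, center, sigma)
    keep = weights >= threshold
    return points[keep], weights

# Toy point cloud: two points near the target center, one distant distractor.
points = np.array([
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [5.0, 0.0, 0.0],
])
center = np.zeros(3)
kept, weights = filter_distractors(points, center)
print(kept.shape, weights)
```

In this toy input the point 5 m from the center gets a weight close to zero and is removed, while the two nearby points are kept. A smaller `sigma` makes the mask stricter; a learned or anisotropic mask would be a natural variation.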
Results:
- Extensive experiments on the KITTI, nuScenes, and Waymo datasets show significant improvements over state-of-the-art methods.
- Visualization of tracking results demonstrates the effectiveness of the proposed method.
Stats
"Our method surpasses previous state-of-the-art method MBPTrack of 0.3/0.3 in average performance."
"STMD-Tracker outperforms MBPTrack (Xu et al. 2023b) in the Pedestrian category by 1.54 in Success and 1.24 in Precision."
Quotes
"Our method can track the target through intermittent occlusions and clearances."
"Our approach achieves best tracking outcomes that surpass all other methods across various degrees of point sparsity."