Key Idea
OneTracker unifies RGB and RGB+X tracking in a single framework: a Foundation Tracker is pretrained on large-scale RGB tracking data and then adapted to downstream RGB+X tasks with parameter-efficient prompt tuning, achieving state-of-the-art performance.
Abstract
OneTracker introduces a general framework for visual object tracking, combining a Foundation Tracker, pretrained on large-scale RGB tracking datasets, with a Prompt Tracker for efficient adaptation to downstream RGB+X tracking tasks. The approach involves large-scale pretraining, parameter-efficient finetuning, and the integration of multimodal information through CMT Prompters and TTP Transformer layers. Extensive experiments across 6 popular tracking tasks demonstrate superior performance compared to existing models.
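This summary does not include the authors' code, so the following is a minimal, hypothetical PyTorch sketch of the prompt-tuning idea it describes: a frozen Foundation Tracker backbone wrapped with lightweight cross-modality prompters that inject the auxiliary X modality (depth, thermal, event, language, or mask tokens), with only the prompter parameters trained downstream. The class names `CMTPrompter` and `PromptTracker`, the bottleneck size, and the token shapes are illustrative assumptions, not the paper's implementation; the TTP Transformer layers are omitted for brevity.

```python
import torch
import torch.nn as nn

class CMTPrompter(nn.Module):
    """Hypothetical cross-modality prompter: a small bottleneck that maps
    auxiliary-modality tokens into a residual prompt on the RGB tokens."""
    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, rgb_tokens: torch.Tensor, x_tokens: torch.Tensor) -> torch.Tensor:
        # Inject auxiliary-modality information as a residual on the RGB stream.
        return rgb_tokens + self.up(self.act(self.down(x_tokens)))

class PromptTracker(nn.Module):
    """Wraps frozen Foundation Tracker blocks with per-layer prompters.
    Only the prompters are trainable (parameter-efficient finetuning)."""
    def __init__(self, foundation_blocks: nn.ModuleList, dim: int):
        super().__init__()
        self.blocks = foundation_blocks
        for p in self.blocks.parameters():
            p.requires_grad = False  # keep the pretrained RGB backbone frozen
        self.prompters = nn.ModuleList(
            [CMTPrompter(dim) for _ in range(len(foundation_blocks))]
        )

    def forward(self, rgb_tokens: torch.Tensor, x_tokens: torch.Tensor) -> torch.Tensor:
        for block, prompter in zip(self.blocks, self.prompters):
            rgb_tokens = prompter(rgb_tokens, x_tokens)  # fuse modality X
            rgb_tokens = block(rgb_tokens)               # frozen RGB block
        return rgb_tokens

# Toy usage: 12 frozen transformer-style blocks, 768-dim tokens.
blocks = nn.ModuleList(
    [nn.TransformerEncoderLayer(768, 8, batch_first=True) for _ in range(12)]
)
tracker = PromptTracker(blocks, dim=768)
out = tracker(torch.randn(2, 196, 768), torch.randn(2, 196, 768))
print(out.shape)  # torch.Size([2, 196, 768])
```

In this sketch the number of prompters matches the number of backbone blocks, which mirrors the ablation observation below that inserting prompters at more layers helps.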
Statistics
OneTracker achieves 70.5 AUC on LaSOT and 69.7 AUC on TrackingNet.
Prompt Tracker outperforms all existing RGB+N (natural-language) trackers by at least 1.7 AUC and 2.5 precision on OTB99.
Prompt Tracker surpasses all other trackers on the DepthTrack, LasHeR, VisEvent, OTB, and DAVIS17 benchmarks.
Inserting CMT Prompters at more transformer layers consistently improves tracking performance.
Quotes
"Our contributions are summarized as follows: We present a unified tracking architecture termed as OneTracker."
"OneTracker achieves state-of-the-art performance on 11 benchmarks from 6 tracking tasks."
"Our results demonstrate the effectiveness of CMT Prompters and TTP Transformer layers in enhancing tracking performance."