Key Concepts
A novel progressive representation learning framework, PRL-Track, is proposed to learn robust fine object representations for real-time UAV tracking by leveraging the complementary strengths of CNNs and Vision Transformers.
Summary
The proposed PRL-Track framework consists of two main components: coarse representation learning and fine representation learning.
Coarse Representation Learning:
- The CNN-based backbone is used to extract multi-scale features.
- An appearance-aware regulator is designed to mitigate appearance interference and extract useful information from shallow features.
- A semantic-aware regulator is developed to capture semantic information and promote the concentration of deep features.
Fine Representation Learning:
- A hierarchical modeling generator (HMG) is proposed to fuse the interaction information between coarse object representations.
- The HMG decomposes the coarse object representations into query, key, and value pairings at different hierarchy levels, and then performs cross-attention to capture the relationships between them.
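The cross-attention step described above can be illustrated with a minimal NumPy sketch. This is not the PRL-Track implementation: the function name `cross_attention`, the token count, the feature dimension, the random projection matrices, and the choice to draw queries from the deep (semantic) features and keys/values from the shallow (appearance) features are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feat, context_feat, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    query_feat:   (n_q, d) tokens that ask the questions
    context_feat: (n_c, d) tokens that supply keys and values
    w_q, w_k, w_v: (d, d) projection matrices (hypothetical, random here)
    """
    q = query_feat @ w_q
    k = context_feat @ w_k
    v = context_feat @ w_v
    # Similarity of every query token to every context token.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Each row becomes a probability distribution over context tokens.
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
dim = 16
# Hypothetical coarse representations: 49 tokens (a flattened 7x7 map) each.
shallow = rng.standard_normal((49, dim))  # stands in for appearance-level features
deep = rng.standard_normal((49, dim))     # stands in for semantic-level features
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))

# Assumed pairing: deep features query the shallow features.
fused, weights = cross_attention(deep, shallow, w_q, w_k, w_v)
```

Here `fused` has one row per query token, and each row of `weights` sums to 1. A hierarchical design would repeat this step with other query/key/value pairings across levels; which pairings PRL-Track actually uses is not stated in this summary.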
The progressive learning process empowers PRL-Track to generate robust object representations, enabling it to better address the challenges in complex UAV scenarios, such as occlusion and aspect ratio change. Extensive experiments, including challenging real-world tests, demonstrate that PRL-Track achieves outstanding performance compared to other state-of-the-art trackers.
Statistics
The proposed PRL-Track achieves a precision of 0.786 and a success rate of 0.602 on the UAVTrack112 benchmark.
PRL-Track surpasses the average precision and success rate of 14 state-of-the-art trackers by 7.8% and 14.1%, respectively, on the combination of UAV tracking benchmarks.
PRL-Track reaches a tracking speed of 42.6 frames per second on a typical UAV platform equipped with an edge smart camera.
Quotes
"PRL-Track delivers exceptional performance on three authoritative UAV tracking benchmarks."
"Real-world tests indicate that the proposed PRL-Track realizes superior tracking performance with 42.6 frames per second on the typical UAV platform equipped with an edge smart camera."