Core Concepts
The authors propose a Selective Fusion method to address two challenges in Unsupervised Gait Recognition: sequences of the same person in different clothes tend to cluster separately, and sequences taken from front/back views lack walking postures and do not cluster well with sequences from other views.
Summary
The authors focus on the task of Unsupervised Gait Recognition (UGR), which aims to train gait recognition models without labeled datasets. They first establish a baseline using a cluster-based method with contrastive learning.
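To make the baseline idea more concrete, here is a minimal sketch of one common form of cluster-based contrastive learning: each sequence feature is pulled toward the centroid of its pseudo-labeled cluster and pushed away from the other centroids with an InfoNCE-style loss. The summary does not specify the authors' exact loss, so the cosine similarity, the temperature value, and the function names below are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_contrastive_loss(feature, centroids, pseudo_label, temperature=0.05):
    """InfoNCE-style loss of one feature against all cluster centroids.

    pseudo_label is the index of the cluster that an (unspecified)
    clustering step assigned to this sequence. The temperature value
    is an assumption, not a figure taken from the paper.
    """
    logits = [cosine(feature, c) / temperature for c in centroids]
    # Numerically stable log-sum-exp over all centroids.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    # Negative log-probability of the assigned cluster.
    return log_sum_exp - logits[pseudo_label]


# Toy usage: two clusters in a 2-D feature space.
toy_centroids = [[1.0, 0.0], [0.0, 1.0]]
loss_when_matching = cluster_contrastive_loss([1.0, 0.0], toy_centroids, 0)
loss_when_mismatched = cluster_contrastive_loss([1.0, 0.0], toy_centroids, 1)
```

In this toy case the loss is near zero when the feature sits on its own centroid and large when it is assigned to the other cluster, which is the behavior a cluster-level contrastive objective is meant to have.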
The authors identify two main challenges in UGR:
- Sequences of the same person in different clothes tend to cluster separately due to significant appearance changes.
- Sequences taken from front/back views (0°/180°) lack walking postures and do not cluster well with sequences taken from other views.
To address these challenges, the authors propose a Selective Fusion method, which includes:
- Selective Cluster Fusion (SCF): This module generates a support set for each cluster to find potential candidate clusters of the same person in different clothes, and uses a multi-cluster update strategy to pull these candidate clusters closer.
- Selective Sample Fusion (SSF): This module uses a view classifier to identify sequences captured from front/back views, and then employs curriculum learning to gradually incorporate these sequences with those captured from other views.
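The two modules above can be sketched roughly as follows. This is not the authors' implementation: the summary does not say how support sets are built, how centroids are updated, or how the curriculum is scheduled. The sketch assumes that a support set is the top-k most similar centroids, that the multi-cluster update is a momentum update applied to several centroids at once, and that the curriculum admits front/back sequences in order of distance to the nearest cluster with a linearly growing fraction. All function names and parameters (`k`, `momentum`, `total_epochs`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def support_set(cluster_id, centroids, k=2):
    """SCF, step 1 (assumed rule): pick the k clusters whose centroids are
    most similar to this cluster's centroid as candidate clusters of the
    same person in different clothes."""
    anchor = centroids[cluster_id]
    scored = [(cosine(anchor, c), i) for i, c in enumerate(centroids) if i != cluster_id]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


def multi_cluster_update(centroids, feature, cluster_id, support, momentum=0.9):
    """SCF, step 2 (assumed rule): momentum-update the sample's own centroid
    and every centroid in its support set with the same feature, which
    moves the candidate clusters toward each other."""
    for i in [cluster_id] + list(support):
        centroids[i] = [momentum * c + (1.0 - momentum) * f
                        for c, f in zip(centroids[i], feature)]
    return centroids


def curriculum_admit(distances, epoch, total_epochs):
    """SSF (assumed schedule): distances[i] is the distance from the i-th
    front/back sequence (as flagged by a view classifier) to its nearest
    cluster of other views. The closest sequences are admitted first, and
    the admitted fraction grows linearly with the epoch."""
    fraction = min(1.0, (epoch + 1) / total_epochs)
    n_admit = math.ceil(fraction * len(distances))
    ranked = sorted(range(len(distances)), key=lambda i: distances[i])
    return ranked[:n_admit]


# Toy usage with three clusters and four front/back sequences.
toy_centroids = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
candidates = support_set(0, toy_centroids, k=1)
toy_centroids = multi_cluster_update(toy_centroids, [1.0, 0.0], 0, candidates)
admitted_early = curriculum_admit([0.9, 0.1, 0.5, 0.3], epoch=0, total_epochs=4)
admitted_late = curriculum_admit([0.9, 0.1, 0.5, 0.3], epoch=3, total_epochs=4)
```

The design intent in both cases is selectivity: only clusters in the support set are fused, and only the front/back sequences that currently look easiest are merged with other views, so that early mistakes in pseudo-labels are less likely to be amplified.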
Extensive experiments on the CASIA-BN, Outdoor-Gait, and GREW datasets show that the proposed Selective Fusion method brings consistent improvement over the baseline, especially in the walking-in-different-coats condition.
Statistics
"Sequences of the same person in different clothes tend to cluster separately due to significant appearance changes."
"Sequences taken from front/back views (0°/180°) lack walking postures and do not cluster well with sequences taken from other views."
Quotes
"Sequences of the same person in different clothes tend to cluster separately due to the significant appearance changes."
"Sequences taken from 0° and 180° views lack walking postures and do not cluster with sequences taken from other views."