Core Concepts
A novel framework for domain generalizable person search that uses an automatically labeled unreal dataset for training, alleviating the need for time-consuming and labor-intensive data labeling as well as privacy issues in real datasets.
Abstract
The proposed method introduces a framework for domain generalizable person search that uses an automatically labeled unreal dataset (JTA*) as the only source for training. To address the domain gap between the unreal and real datasets, the method employs two key components:
-
Fidelity Adaptive Training (FAT):
- Estimates the fidelity of person instances in the unreal dataset using deep features.
- Adaptively computes the detection and confidence losses based on the estimated fidelity to suppress the influence of degraded instances.
- Updates the ID lookup table using the fidelity-weighted features to improve robustness.
-
Domain Invariant Feature Learning (DIL):
- Extracts ID-specific and domain-specific features from person instances.
- Applies domain-guided normalization to the ID-specific features to suppress domain-related information.
- Introduces a domain separation loss to encourage the network to learn distinct ID-specific and domain-specific representations.
The proposed method achieves competitive performance compared to existing supervised, weakly-supervised, and unsupervised domain adaptation methods on real-world datasets, despite being trained solely on the unreal dataset and without any additional training on the target datasets.
Stats
The unreal JTA* dataset has 10,049 training images with 175,035 person instances and 175,035 identities, and 4,426 test images with 74,382 person instances and 1,480 identities.
The real CUHK-SYSU dataset has 11,206 training images with 55,272 person instances and 5,532 identities, and 6,978 test images with 40,871 person instances and 2,900 identities.
The real PRW dataset has 5,134 training images with 16,243 person instances and 482 identities, and 6,112 test images with 25,062 person instances and 450 identities.
Quotes
"To reduce the burden of data labeling, attempts have been made such as weakly supervised learning and unsupervised domain adaptation, whose concepts are compared in Figure 1."
"We propose a fully generalizable person search framework based on domain generalization (DG) from unreal dataset to arbitrary real datasets."
"To alleviate the domain gaps of annotation between the unreal and real datasets, we estimate the fidelity of each person instance using the deep features, which is used for fidelity adaptive training."