Key Concepts
This paper aims to make domain generalization more practical in two ways: it proposes a perturbation distillation method that transfers knowledge from large-scale vision-language models to lightweight vision models, and it introduces Hybrid Domain Generalization (HDG), a new benchmark for comprehensively evaluating the robustness of algorithms.
Summary
The paper addresses the limitations of existing domain generalization (DG) and open set domain generalization (OSDG) methods, which often rely on complex architectures and extensive training strategies, and which assume identical label sets across source domains. To make DG more practical, the authors make the following key contributions:
Perturbation Distillation (PD) Method:
The authors propose a novel PD method called SCI-PD that transfers knowledge from large-scale vision-language models (VLMs) to lightweight vision models.
SCI-PD introduces perturbations from three perspectives (Score, Class, and Instance, hence the name) to effectively distill semantics from the VLMs.
This approach avoids the high computational costs of fine-tuning or re-training VLMs, making it more practical for real-world applications.
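The summary above names the three perturbation perspectives but not their exact formulations. As an illustration only, the sketch below shows one plausible shape for such a pipeline: the teacher VLM's outputs are perturbed at the score, class, and instance level, and a lightweight student is trained against the perturbed soft labels with a standard distillation loss. All function names and hyperparameters here are assumptions, not the paper's definitions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def perturb_scores(logits, sigma=0.1):
    # Score-level perturbation (hypothetical): jitter the teacher's
    # raw similarity scores with Gaussian noise.
    return logits + rng.normal(0.0, sigma, size=logits.shape)

def perturb_classes(logits, p=0.2):
    # Class-level perturbation (hypothetical): randomly suppress a
    # subset of class scores so the student cannot over-rely on them.
    mask = rng.random(logits.shape[1]) < p
    out = logits.copy()
    out[:, mask] = -1e9  # suppressed classes get ~zero probability
    return out

def perturb_instances(features, alpha=0.3):
    # Instance-level perturbation (hypothetical): mix each instance's
    # features with those of a randomly chosen batch partner.
    idx = rng.permutation(features.shape[0])
    return (1 - alpha) * features + alpha * features[idx]

def kl_distill(student_logits, teacher_logits, tau=2.0):
    # Standard soft-label distillation: KL(teacher || student) at
    # temperature tau, averaged over the batch.
    t = softmax(teacher_logits / tau)
    s = softmax(student_logits / tau)
    return float(np.mean(np.sum(t * (np.log(t + 1e-12) - np.log(s + 1e-12)), axis=1)))
```

Because only the teacher's outputs are perturbed and distilled, the VLM itself stays frozen, which is consistent with the paper's goal of avoiding fine-tuning or re-training.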
Hybrid Domain Generalization (HDG) Benchmark:
The authors introduce a new HDG benchmark that considers diverse and disparate label sets across source domains, which is more representative of real-world scenarios.
HDG comprises four different splits with varying degrees of hybridness (H) to comprehensively evaluate the robustness of algorithms.
A novel evaluation metric, H2-CV, is proposed to measure the comprehensive robustness of algorithms across different H settings.
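The source does not spell out the H2-CV formula, only that it aggregates robustness across the H settings. As a purely hypothetical illustration of such a metric, one could discount the mean H-score by its coefficient of variation across the four splits, so that volatile methods score lower than stable ones; the function below is that assumption, not the paper's definition.

```python
import statistics

def robustness_summary(h_scores):
    """Hypothetical cross-split robustness score: mean H-score across the
    HDG splits, discounted by its coefficient of variation (CV).
    This is an illustrative stand-in, not the paper's exact H2-CV."""
    mean = statistics.fmean(h_scores)
    if mean <= 0:
        return 0.0
    cv = statistics.pstdev(h_scores) / mean
    return mean * (1 - cv)
```

With identical means, a method with stable per-split H-scores receives a higher summary than one that fluctuates between splits, which matches the stated intent of measuring comprehensive robustness.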
Experimental Evaluation:
Extensive experiments on three datasets (PACS, OfficeHome, and DomainNet) demonstrate that the proposed SCI-PD method outperforms state-of-the-art DG and OSDG methods in terms of accuracy, H-score, and the new H2-CV metric.
The authors also show the transferability of SCI-PD by applying it to various lightweight vision backbones, achieving superior performance with significantly fewer parameters.
Ablation studies and visualizations further validate the effectiveness of the individual components of SCI-PD and its ability to learn domain-invariant representations.
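The H-score reported in these experiments is the standard OSDG metric: the harmonic mean of closed-set accuracy on known classes and open-set accuracy on unknown classes, which is high only when a method does well on both.

```python
def h_score(acc_known, acc_unknown):
    # Harmonic mean of closed-set (known-class) and open-set
    # (unknown-class) accuracy, the standard H-score in OSDG.
    if acc_known + acc_unknown == 0:
        return 0.0
    return 2 * acc_known * acc_unknown / (acc_known + acc_unknown)
```

Unlike a plain average, the harmonic mean collapses to zero if either accuracy is zero, so a model cannot inflate its H-score by ignoring unknown-class rejection.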
Overall, the paper presents a practical and robust solution for domain generalization by leveraging VLMs through perturbation distillation, and introduces a new benchmark and evaluation metric to better assess the real-world applicability of DG and OSDG algorithms.