Core Concepts
Exploiting an off-the-shelf vision-language (ViL) foundation model such as CLIP as an auxiliary source of supervision significantly improves Source-Free Domain Adaptation (SFDA) performance.
Abstract
SFDA adapts a source-pretrained model to a target domain using only unlabeled target data.
Conventional methods rely on self-generated pseudo-labels and/or auxiliary supervision, both of which are error-prone and let mistakes accumulate during adaptation.
Introducing an off-the-shelf ViL model such as CLIP as an external source of supervision enhances adaptation performance.
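To make the failure mode concrete, here is a minimal, self-contained sketch (not from the paper) of naive pseudo-label self-training: the model supervises itself with its own argmax predictions, so early mistakes are reinforced rather than corrected. All names and shapes are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    model = nn.Linear(32, 10)                 # stand-in source-pretrained classifier
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(64, 32)                   # a batch of unlabeled target data

    for step in range(5):
        with torch.no_grad():
            pseudo = model(x).argmax(dim=1)   # self-generated labels, possibly wrong
        loss = F.cross_entropy(model(x), pseudo)
        opt.zero_grad(); loss.backward(); opt.step()
        # The loss shrinks even when `pseudo` was wrong to begin with, so
        # initial errors are reinforced rather than corrected.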
Introduction:
SFDA addresses the practical restriction that source training data is inaccessible (e.g., for privacy or storage reasons) by adapting a source-pretrained model to the target domain.
Methodology:
The DIFO framework alternates between customizing the frozen ViL model to the target task (in a prompt-learning manner, guided by the target model's predictions) and distilling the customized ViL model's knowledge into the target model.
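Below is a minimal, self-contained sketch of this alternating scheme, assuming a CLIP-style frozen encoder pair with a learnable prompt context. The stand-in modules and the plain KL objectives are illustrative simplifications, not the paper's exact mutual-information-based losses.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    num_classes, feat_dim = 10, 32
    target_model = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                 nn.Linear(64, num_classes))  # model being adapted
    class_embed = torch.randn(num_classes, feat_dim)          # frozen "text" features
    prompt_ctx = torch.zeros(feat_dim, requires_grad=True)    # learnable prompt context
    image_proj = nn.Linear(feat_dim, feat_dim)                # frozen "image encoder"
    for p in image_proj.parameters():
        p.requires_grad_(False)

    def vil_logits(x):
        # Frozen ViL encoders; only the prompt context added to each class
        # embedding is trainable (prompt learning).
        img = F.normalize(image_proj(x), dim=1)
        txt = F.normalize(class_embed + prompt_ctx, dim=1)
        return 100.0 * img @ txt.t()

    opt_prompt = torch.optim.SGD([prompt_ctx], lr=0.01)
    opt_model = torch.optim.SGD(target_model.parameters(), lr=0.01)
    x = torch.randn(16, feat_dim)                             # unlabeled target batch

    for step in range(3):
        # Step 1: customize the frozen ViL model -- pull its predictions toward
        # the current target model (the paper uses a mutual-information objective;
        # KL to the target model's distribution is a simple surrogate here).
        with torch.no_grad():
            p_target = F.softmax(target_model(x), dim=1)
        loss_prompt = F.kl_div(F.log_softmax(vil_logits(x), dim=1), p_target,
                               reduction="batchmean")
        opt_prompt.zero_grad(); loss_prompt.backward(); opt_prompt.step()

        # Step 2: distill the customized ViL predictions into the target model.
        with torch.no_grad():
            p_teacher = F.softmax(vil_logits(x), dim=1)
        loss_distill = F.kl_div(F.log_softmax(target_model(x), dim=1), p_teacher,
                                reduction="batchmean")
        opt_model.zero_grad(); loss_distill.backward(); opt_model.step()

Note the key property this sketch mirrors: the ViL model itself stays frozen throughout, and only the small prompt context is trained in step 1.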
Experiments:
Evaluation on standard benchmarks shows DIFO outperforms state-of-the-art alternatives in closed-set, partial-set, and open-set settings.
Model Analysis:
Feature-distribution visualization and an ablation study confirm that DIFO's gains come from task-specific knowledge adaptation.
Stats
Reliance on pseudo-labeling and/or auxiliary supervision is the main source of error in conventional SFDA methods.
Extensive experiments show that DIFO significantly outperforms state-of-the-art alternatives.
Quotes
"Directly applying the ViL model to the target domain in a zero-shot fashion is unsatisfactory."
"We propose a novel Distilling multImodal Foundation mOdel (DIFO) approach."