Domain-Agnostic Mutual Prompting (DAMP) addresses unsupervised domain adaptation (UDA) by aligning visual and textual embeddings, and it reports better performance than existing methods. The approach combines mutual prompting through cross-attention with auxiliary regularizations, so that the transferred knowledge is both domain-agnostic and instance-conditioned.
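To make the idea of mutual prompting with cross-attention more concrete, the following is a minimal sketch, not the authors' implementation. It assumes single-head scaled dot-product attention with no learned projections, a residual update, and random stand-in embeddings. The names `cross_attention` and `mutual_prompt` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Single-head scaled dot-product attention.

    Assumption: no learned query/key/value projections, to keep the sketch short.
    Returns the attended output and the attention weights.
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ context, weights

def mutual_prompt(text_emb, visual_tokens):
    """Hypothetical mutual prompting step: each modality attends to the other.

    Assumption: the attended signal is added back as a residual update.
    """
    # Text embeddings query the image, giving an instance-conditioned update.
    text_update, _ = cross_attention(text_emb, visual_tokens)
    # Visual tokens query the text, giving a language-guided update.
    visual_update, _ = cross_attention(visual_tokens, text_emb)
    return text_emb + text_update, visual_tokens + visual_update

# Stand-in data: 3 class text embeddings and 5 visual tokens, dimension 8.
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(3, 8))
visual_tokens = rng.normal(size=(5, 8))

new_text, new_visual = mutual_prompt(text_emb, visual_tokens)
print(new_text.shape, new_visual.shape)
```

In this sketch the text side changes per image because it attends to that image's tokens, which is one plausible reading of "instance-conditioned". The auxiliary regularizations mentioned above are not modeled here.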
Conventional UDA methods focus on minimizing distribution discrepancies between the source and target domains, whereas DAMP leverages large-scale pre-trained Vision-Language Models to guide the adaptation. By aligning visual and textual embeddings through mutual prompting, DAMP shows notable gains over state-of-the-art approaches on three UDA benchmarks.
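As an illustration of what "minimizing distribution discrepancies" can mean in conventional UDA, here is a small sketch of one common measure, the linear-kernel maximum mean discrepancy (MMD). The summary does not name a specific measure, so the choice of MMD and the function name `linear_mmd` are assumptions made for illustration.

```python
import numpy as np

def linear_mmd(source_feats, target_feats):
    """Squared distance between the mean feature vectors of two domains.

    This is MMD with a linear kernel. It is an illustrative stand-in for the
    discrepancy terms used by conventional UDA methods, not a specific method
    from the source.
    """
    delta = source_feats.mean(axis=0) - target_feats.mean(axis=0)
    return float(delta @ delta)

# Stand-in features: the target domain is the source shifted by a constant.
rng = np.random.default_rng(0)
source = rng.normal(size=(100, 4))
target = source + 1.0

print(linear_mmd(source, source))  # identical domains: zero discrepancy
print(linear_mmd(source, target))  # shifted domain: positive discrepancy
```

A conventional method would train a feature extractor so that a term like this becomes small, whereas DAMP, as described above, relies on guidance from a pre-trained Vision-Language Model.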
Large-scale pre-trained Vision-Language Models have succeeded on a variety of downstream tasks and offer rich semantics that adaptation methods can build on. DAMP's mutual prompting improves adaptability across domains, which helps it handle complex domain shifts.