This paper proposes a novel motion-guided unsupervised domain adaptation (MoDA) method for semantic segmentation. The key contributions are:
MoDA utilizes self-supervised object motion information learned from unlabeled video frames as cues to guide cross-domain alignment, without requiring any target domain annotations. This is in contrast to existing domain adaptation methods that rely on adversarial learning or self-training using noisy target pseudo-labels.
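To make the idea of motion cues guiding alignment concrete, here is a minimal sketch of one plausible mechanism: using a self-supervised object-motion mask to recover low-confidence target pseudo-labels during self-training. All names (`refine_pseudo_labels`, `moving_class_id`, the thresholds) are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def refine_pseudo_labels(pseudo_labels, confidence, motion_mask,
                         moving_class_id, conf_threshold=0.5):
    """Illustrative refinement (hypothetical, not MoDA's exact method):
    keep confident predictions, and use a self-supervised object-motion
    mask to relabel low-confidence pixels that are clearly moving as a
    dynamic-object class.

    pseudo_labels : (H, W) int array of predicted class ids
    confidence    : (H, W) float array of prediction confidence in [0, 1]
    motion_mask   : (H, W) bool array, True where object motion is detected
    """
    refined = pseudo_labels.copy()
    ignore_id = 255  # common "ignore" label in segmentation benchmarks

    low_conf = confidence < conf_threshold
    # Low-confidence pixels are normally ignored during self-training...
    refined[low_conf] = ignore_id
    # ...but motion supplies an extra, annotation-free cue: a moving
    # region is likely a dynamic foreground object, so recover it.
    refined[low_conf & motion_mask] = moving_class_id
    return refined
```

In this toy setting, motion acts as a denoiser for the pseudo-labels that adversarial learning or plain self-training would otherwise have to trust blindly.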
MoDA is organized into two key modules that apply this motion guidance during adaptation.
Experiments on domain-adaptive video and image segmentation benchmarks show that MoDA outperforms existing methods that use optical flow for temporal consistency, and that it complements state-of-the-art unsupervised domain adaptation approaches.
The key insight is that self-supervised object motion provides stronger guidance for domain alignment compared to optical flow, as it can capture 3D motion patterns that are crucial for real-world dynamic scenes with multiple moving objects.
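The contrast between optical flow and object motion can be illustrated with a toy decomposition: a moving camera induces 2D flow everywhere in the image, but subtracting the flow explained by ego-motion leaves a residual that highlights independently moving objects. This is a generic motion-segmentation sketch under assumed inputs, not the paper's formulation.

```python
import numpy as np

def object_motion_residual(total_flow, ego_flow, threshold=0.5):
    """Toy decomposition (illustrative assumption): subtract the flow
    explained by camera ego-motion from the observed optical flow.
    Pixels with a large residual belong to independently moving objects.

    total_flow, ego_flow : (H, W, 2) flow fields in pixels
    returns              : (H, W) bool mask of moving-object pixels
    """
    residual = total_flow - ego_flow
    magnitude = np.linalg.norm(residual, axis=-1)
    return magnitude > threshold
```

Raw optical flow would flag the entire frame as "moving" under camera motion; the residual isolates the dynamic objects that matter for aligning real-world driving scenes.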
Source: Fei Pan, Xu Y..., arxiv.org, 04-16-2024
https://arxiv.org/pdf/2309.11711.pdf