The paper introduces a novel approach for Unsupervised Domain Adaptation (UDA) in sparse Temporal Action Localization (TAL), called Semantic Adversarial unsupervised Domain Adaptation (SADA). The key contributions are:
SADA is presented as the first UDA method for sparse detection scenarios in TAL, addressing a limitation of existing works, which focus on dense action segmentation.
SADA introduces a novel adversarial loss that factorizes standard global alignment into independent class-wise and background alignments, providing a more sensitive and semantically meaningful adaptation.
The paper presents new comprehensive benchmarks based on EpicKitchens100 and CharadesEgo to evaluate multiple domain shifts in sparse TAL, showing that SADA outperforms both fully supervised methods and alternative UDA methods.
The paper first defines the problem of UDA for TAL, where a labeled source domain and an unlabeled target domain need to be aligned. It then presents the overall framework, which consists of a feature pyramid and a classification/localization head, coupled with the proposed SADA loss.
The SADA loss aims to align the feature embeddings of the source and target domains in a semantically meaningful way. It does this by adversarially training a domain classifier on the embeddings, but conditioning it on the class labels (obtained via pseudo-labeling for the target domain). This allows aligning the distributions of each action class independently, rather than just globally aligning the overall feature distributions.
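The class-wise factorization described above can be illustrated with a small sketch. It is not the paper's implementation: the function names, the `BACKGROUND` id, the use of binary cross-entropy as the domain-classifier loss, and the equal-weight average over groups are all assumptions made for illustration. It shows only how a domain-classification loss can be computed separately for each action class and for background instead of once over all embeddings. The adversarial part of training (updating the feature extractor to confuse the domain classifier) is left out.

```python
import math
from collections import defaultdict

BACKGROUND = -1  # hypothetical id for snippets that contain no action


def bce(p_target, domain):
    """Binary cross-entropy of a domain prediction.

    p_target: predicted probability that the embedding comes from the target domain.
    domain:   true domain, 0 = source, 1 = target.
    """
    eps = 1e-12
    p = min(max(p_target, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(domain * math.log(p) + (1 - domain) * math.log(1.0 - p))


def classwise_domain_loss(domains, labels, p_target):
    """Domain loss factorized per class and background.

    labels holds ground-truth classes for source samples and pseudo-labels
    for target samples. Each group (one per class, plus background) gets its
    own mean loss, and the groups are averaged with equal weight (an
    assumption of this sketch).
    """
    per_group = defaultdict(list)
    for d, c, p in zip(domains, labels, p_target):
        per_group[c].append(bce(p, d))
    group_means = {c: sum(v) / len(v) for c, v in per_group.items()}
    total = sum(group_means.values()) / len(group_means)
    return total, group_means


# Toy batch: two action classes plus background, drawn from both domains.
domains = [0, 0, 1, 1, 0, 1]
labels = [0, 1, 0, 1, BACKGROUND, BACKGROUND]
p_target = [0.5] * 6  # a maximally confused domain classifier
total, groups = classwise_domain_loss(domains, labels, p_target)
```

Because every group contributes its own term, a class that is poorly aligned across domains stays visible in `groups` even when the batch is dominated by background or by well-aligned classes, which a single global loss would average away.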
The paper then introduces the new benchmarks based on EpicKitchens100 and CharadesEgo, which evaluate different types of domain shifts, including appearance, acquisition, and viewpoint changes. Experiments show SADA consistently outperforms fully supervised baselines and alternative UDA methods on these benchmarks.