Core Concepts
The proposed Deep Hierarchical Optimal Transport (DeepHOT) framework learns domain-invariant yet category-discriminative representations for unsupervised domain adaptation by combining domain-level and image-level optimal transport in a single hierarchical formulation.
Abstract
This summary covers Deep Hierarchical Optimal Transport (DeepHOT), an unsupervised domain adaptation (UDA) method that aims to learn domain-invariant yet category-discriminative representations.
The key ideas are:
- DeepHOT incorporates both domain-level optimal transport (OT) and image-level OT into a unified OT framework to capture hierarchical structural relations between domains and images.
- The image-level OT serves as the ground distance metric for the domain-level OT, allowing the method to learn discriminative features by modeling the structural associations among local regions of images.
- To address the high computational complexity of OT, DeepHOT uses sliced Wasserstein distance for image-level OT and mini-batch unbalanced OT for domain-level OT, making it efficient and scalable.
- Extensive experiments on benchmark datasets show the superiority of DeepHOT over state-of-the-art UDA methods.
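The hierarchical structure described above can be sketched as follows: each image is represented as a set of local-region feature vectors, an image-level sliced-OT distance is computed between every source/target image pair, and those pairwise distances form the ground-cost matrix that a domain-level OT solver would consume. This is a minimal illustrative sketch, not the paper's implementation; `hierarchical_cost`, the toy shapes, and the assumption of a fixed number of regions per image are all assumptions made here.

```python
import numpy as np

def sw_distance(A, B, theta):
    """Sliced 1D-OT distance between two sets of local-region features.

    A, B : (R, d) region-feature arrays with equal region count R.
    theta : (M, d) unit projection directions shared across images.
    """
    Ap = np.sort(A @ theta.T, axis=0)  # project regions, sort per direction
    Bp = np.sort(B @ theta.T, axis=0)
    return np.mean(np.abs(Ap - Bp))   # closed-form 1D OT, averaged

def hierarchical_cost(source_imgs, target_imgs, M=32, seed=0):
    """Build the domain-level ground-cost matrix: entry (i, j) is an
    image-level OT distance between the local-region features of
    source image i and target image j (illustrative helper)."""
    rng = np.random.default_rng(seed)
    d = source_imgs[0].shape[1]
    theta = rng.normal(size=(M, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # unit directions
    return np.array([[sw_distance(s, t, theta) for t in target_imgs]
                     for s in source_imgs])

# Toy mini-batch: 4 images per domain, each with 9 region features of dim 8.
rng = np.random.default_rng(0)
src = [rng.normal(size=(9, 8)) for _ in range(4)]
tgt = [rng.normal(size=(9, 8)) for _ in range(4)]
C = hierarchical_cost(src, tgt)
print(C.shape)  # (4, 4) ground-cost matrix for domain-level OT
```

The resulting matrix `C` would then play the role of the cost matrix in the domain-level OT problem, replacing the plain Euclidean ground distance of conventional domain-level OT.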
Stats
"The computational complexity of domain-level OT over mini-batch solved by the generalized Sinkhorn-Knopp matrix scaling algorithm is O(n^2), where n is the size of mini-batch, which is much smaller than O(N^3 log(N)) for the original problem with N samples."
"The computational complexity of image-level OT formulation using sliced Wasserstein distance is O(MNd), where M is the number of projections, N is the number of samples, and d is the feature dimension."
Quotes
"Compared with the ground distance of the conventional domain-level OT, the image-level OT captures structural associations among local regions of images that are beneficial to classification."
"The benefit of image-level OT is that it is capable of capturing correspondences between local regions of two images for enhancing discriminative features."