Core Concepts
This work proposes FD-GLGAN, an unsupervised domain adaptation (UDA) framework that combines frequency decomposition with global-local context modeling to improve the cross-domain transferability and generalization of semantic segmentation models for remote sensing images.
Abstract
The proposed FD-GLGAN framework consists of three key components:
- High/Low-Frequency Decomposition (HLFD) Module:
  - Decomposes the feature maps into high- and low-frequency components before performing domain alignment in the corresponding subspaces.
  - Aims to retain cross-domain local spatial details and global contextual semantics simultaneously, which is crucial for remote sensing image semantic segmentation.
- Global-Local Generative Adversarial Network (GLGAN):
  - Employs global-local transformer blocks (GLTBs) in both the generator and discriminator to effectively capture global contexts and local details.
  - Facilitates domain alignment by leveraging global-local context modeling between the source and target domains.
- Integrated FD-GLGAN Framework:
  - Combines the HLFD module and the GLGAN to improve the cross-domain transferability and generalization capability of semantic segmentation models.
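The idea of splitting a feature map into low- and high-frequency parts can be illustrated with a minimal sketch. The summary does not say which operator the HLFD module uses, so the code below is an assumption, not the authors' implementation: a mean (box) filter gives the low-frequency part, and the residual is taken as the high-frequency part. The names `box_blur`, `decompose`, and `radius` are hypothetical.

```python
def box_blur(feature, radius=1):
    """Low-pass a 2-D map with a mean filter, averaging only in-bounds neighbours."""
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total, count = 0.0, 0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += feature[y][x]
                        count += 1
            out[i][j] = total / count
    return out


def decompose(feature, radius=1):
    """Return (low, high), where high is the residual feature - low.

    Assumption: a box filter stands in for whatever low-pass operator
    the actual HLFD module applies.
    """
    low = box_blur(feature, radius)
    high = [[f - l for f, l in zip(f_row, l_row)]
            for f_row, l_row in zip(feature, low)]
    return low, high


# A single sharp spike: most of its energy lands in the high-frequency part.
spike = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
low, high = decompose(spike)
print(low[1][1], high[1][1])
```

Because the high-frequency part is defined as a residual, the two components sum back to the original map. In a UDA setting, each component would then be aligned across domains in its own subspace, as the list above describes.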
Experiments on two benchmark datasets, ISPRS Potsdam and ISPRS Vaihingen, show that FD-GLGAN outperforms state-of-the-art UDA methods for remote sensing image semantic segmentation.
Stats
The authors report the following key metrics for the adaptation from Potsdam IRRG (P-IRRG) to Vaihingen IRRG (V-IRRG):
Overall Accuracy (OA): FD-GLGAN achieved the highest OA, 83.66%, outperforming the baseline Advent by 6.03%.
Mean F1 Score (mF1): FD-GLGAN attained the highest mF1, 80.30%, improving upon Advent by 7.79%.
Mean Intersection over Union (mIoU): FD-GLGAN reached the highest mIoU, 68.09%, surpassing Advent by 9.41%.
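For readers unfamiliar with these three metrics, the sketch below shows how OA, mF1, and mIoU are commonly derived from a confusion matrix. It uses the standard definitions. Which classes the authors average over is not stated here, so averaging over all classes is an assumption, and the function name `segmentation_metrics` is hypothetical.

```python
def segmentation_metrics(conf):
    """Compute (OA, mF1, mIoU) from a square confusion matrix.

    conf[t][p] counts pixels with true class t predicted as class p.
    Assumption: every class contributes equally to the two means.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[k][k] for k in range(n))
    oa = correct / total

    f1_scores, ious = [], []
    for k in range(n):
        tp = conf[k][k]
        fp = sum(conf[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(conf[k]) - tp                       # truly k, predicted another class
        f1_denom = 2 * tp + fp + fn
        iou_denom = tp + fp + fn
        f1_scores.append(2 * tp / f1_denom if f1_denom else 0.0)
        ious.append(tp / iou_denom if iou_denom else 0.0)

    return oa, sum(f1_scores) / n, sum(ious) / n


# Toy two-class example (made-up counts, not data from the paper).
oa, mf1, miou = segmentation_metrics([[3, 1],
                                      [1, 5]])
print(f"OA={oa:.4f}  mF1={mf1:.4f}  mIoU={miou:.4f}")
```

Per class, IoU is never larger than F1, which is consistent with the reported mIoU (68.09%) sitting below the reported mF1 (80.30%).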
Quotes
"The core idea of UDA methods is to learn domain-invariant features across domains based on domain alignment, including discrepancy-based, reconstruction-based and adversarial-based optimization principles."
"Notably, UDA on semantic segmentation of remote sensing images presents unique challenges. For example, the ground objects and their spatial relationships are complex in fine-resolution remote sensing images."
"To address these problems, we propose a frequency decomposition-driven UDA method based on a global-local GAN model, namely FD-GLGAN, considering alignment in low-frequency global representations and high-frequency local information."