The paper introduces StegoGAN, a novel model for non-bijective image-to-image translation tasks. Existing GAN-based translation methods assume a one-to-one correspondence between classes in the source and target domains. However, this assumption does not always hold in real-world scenarios, leading to the hallucination of spurious features in the generated images.
To address this challenge, StegoGAN leverages steganography: the known tendency of cycle-consistent translation networks to hide information as low-amplitude signals in their outputs so that the cycle-consistency loss can be satisfied even when the domains do not fully correspond. Rather than suppressing this behavior, StegoGAN makes the steganographic process explicit and disentangles matchable from unmatchable information in feature space. This lets the model avoid generating spurious instances of unmatchable classes without additional post-processing or supervision.
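To make the disentanglement idea concrete, here is a minimal PyTorch sketch: a predicted matchability mask splits the latent features, so the translated image is decoded from matchable features only, while the unmatchable residual stays available for the cycle reconstruction. All module and variable names are hypothetical illustrations, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class DisentanglingTranslator(nn.Module):
    """Toy sketch: split latent features into matchable / unmatchable parts.

    Hypothetical architecture for illustration only, not StegoGAN's networks.
    """
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Predicts a pixel-wise matchability mask m in [0, 1].
        self.mask_net = nn.Sequential(
            nn.Conv2d(channels, 1, 3, padding=1), nn.Sigmoid(),
        )
        self.decoder = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x: torch.Tensor):
        f = self.encoder(x)          # latent features of the input image
        m = self.mask_net(f)         # 1 = matchable, 0 = unmatchable
        f_match = m * f              # content with a counterpart in the other domain
        f_hidden = (1.0 - m) * f     # domain-specific "steganographic" residual
        y = self.decoder(f_match)    # translation ignores unmatchable content
        return y, f_hidden, m

# The cycle reconstruction can re-inject f_hidden, so cycle consistency is
# satisfied without forcing y to hallucinate unmatchable classes.
x = torch.randn(1, 3, 64, 64)
y, f_hidden, mask = DisentanglingTranslator()(x)
```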
The paper evaluates StegoGAN on three datasets with non-bijective class mappings: PlanIGN (aerial photos to maps, where the maps contain toponyms), Google Maps (aerial photos to maps, where the maps contain highways), and BraTS MRI (T1 scans to FLAIR scans, where the FLAIR scans contain tumors). Across these tasks, StegoGAN outperforms existing GAN-based models in reconstruction fidelity, pixel accuracy, and false-positive rate, demonstrating its effectiveness at handling semantic misalignment between domains.
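The false-positive statistic can be read as pixel-level detection of the unmatchable class in the generated output: when that class is absent from the target image, every predicted pixel of it is a hallucination. A minimal NumPy sketch of this metric, assuming binary class masks (the function name and inputs are hypothetical, not the paper's evaluation code):

```python
import numpy as np

def spurious_class_rates(pred: np.ndarray, gt: np.ndarray):
    """Pixel accuracy and false-positive rate for one unmatchable class.

    pred, gt: boolean arrays, True where the class (e.g. a highway pixel)
    is present in the generated map and the ground-truth map, respectively.
    """
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    tn = np.sum(~pred & ~gt)
    pixel_accuracy = (tp + tn) / pred.size
    # If the class is absent from the target, gt is all False and every
    # predicted pixel of the class is counted here as a false positive.
    false_positive_rate = fp / (fp + tn) if (fp + tn) > 0 else 0.0
    return pixel_accuracy, false_positive_rate
```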
Source: Sidi Wu, Yizi… (arxiv.org, 04-01-2024): https://arxiv.org/pdf/2403.20142.pdf