Key Concepts
A novel method for adapting a pre-trained learned image compression model to multiple target domains by plugging domain-specific adapter modules into the decoder, without compromising performance on the source domain.
Summary
The paper proposes a method for domain adaptation in learned image compression (LIC) models. The key ideas are:
- Plug K+1 adapter modules into the decoder of a pre-trained LIC model: K for the target domains and one for the source domain.
- Train a gate network that predicts a probability distribution over the K+1 domains, which is used to blend the outputs of the adapters during decoding.
- Train the adapters and gate jointly, while freezing the pre-trained encoder and decoder parameters.
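The gated blending step above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the adapter modules are stand-in callables, and the function name `gated_adapter_blend` is assumed for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logits vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def gated_adapter_blend(x, adapters, gate_logits):
    """Blend the K+1 adapter outputs using the gate's domain distribution.

    x           : decoder feature map, shape (C, H, W)
    adapters    : list of K+1 callables, one per domain (K targets + 1 source)
    gate_logits : raw gate scores, shape (K+1,)
    """
    probs = softmax(gate_logits)
    # weighted sum of all adapter outputs, not a hard selection:
    # even a predominant domain leaves nonzero weight on the others
    blended = sum(p * a(x) for p, a in zip(probs, adapters))
    # residual connection: the frozen decoder path is preserved and
    # the adapters only add a domain-specific correction
    return x + blended
```

With uniform gate logits each adapter contributes equally; a gate trained to identify the predominant domain would skew `probs` toward that domain's adapter while still exploiting the rest to a lesser extent.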
This approach improves rate-distortion performance on the target domains without catastrophic forgetting on the source domain. It also enhances reconstruction quality for unseen image domains by leveraging the learned adapters.
The authors experiment with two state-of-the-art LIC models (Zou et al. and Cheng et al.) and demonstrate significant BD-Rate and BD-PSNR gains on the target sketch and comic domains compared to the reference pre-trained models. They also show improvements on unseen domains like infographics, drawings, and documents.
The proposed method is effective, efficient, and practical, as the adapters and gate do not modify the original pre-trained model parameters, allowing the original model to be used even if the adapters are unavailable during decoding.
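The training setup this implies can be sketched in PyTorch. This is a hypothetical stand-in, not the paper's code: the layer shapes and module names are assumptions chosen only to show how the backbone stays frozen while the adapters and gate train.

```python
import torch
import torch.nn as nn

# Stand-ins for the pre-trained decoder, the K+1 plug-in adapters
# (here K=2 target domains + 1 source), and the gate network.
decoder = nn.Sequential(nn.Conv2d(8, 8, kernel_size=3, padding=1))
adapters = nn.ModuleList(nn.Conv2d(8, 8, kernel_size=1) for _ in range(3))
gate = nn.Linear(8, 3)  # predicts a distribution over the K+1 domains

# Freeze the pre-trained parameters: the original model is never modified,
# so it remains usable even without the adapters at decoding time.
for p in decoder.parameters():
    p.requires_grad = False

# Only the new modules receive gradients during adaptation.
trainable = list(adapters.parameters()) + list(gate.parameters())
optimizer = torch.optim.Adam(trainable, lr=1e-4)
```

Because the optimizer only sees the adapter and gate parameters, source-domain behavior is preserved by construction rather than by regularization.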
Statistics
Results for the Zou et al. model:

| Dataset | BD-Rate (%) | BD-PSNR (dB) |
|---------|-------------|--------------|
| Kodak   | 0.0012      | ~0           |
| CLIC    | 0.038       | ~0           |
| Sketch  | -2.45       | 0.1718       |
| Comic   | -4.93       | 0.28         |
Quotes
"Our method visibly improves the performance for both the target domains to the pre-trained model."
"Our method strikes rate reductions of 5% and 10% over the comic and sketch domains. Yet, we achieve some gain also over the source domain(Kodak and Clic datasets) for Cheng et al., proving our domain adaptation method does not incur in any catastrophic forgetting."
"For all domains the adopted training policy induces the exploitation of all the available adapters to increase the performance since; even though the gate is capable of identifying the predominant domain for an image, it still utilizes the others to a lesser extent."