Core Concepts
The Enhanced Swin Transformer improves image super-resolution by aggregating local and global features.
Abstract
The paper introduces an Enhanced Swin Transformer network (ESTN) for image super-resolution reconstruction. It addresses limitations of conventional models by aggregating local and global features. The proposed network outperforms state-of-the-art models on publicly available benchmark datasets. Key components include shift convolution, a block sparse global-awareness module (BSGM), multi-scale self-attention, and a low-parameter residual channel attention block (LRCAB). Ablation studies demonstrate that each component improves performance metrics, and local attribution maps visualize which input pixels influence the reconstruction, showing the model's ability to restore accurate textures.
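Of the components listed, shift convolution is the simplest to illustrate: channel groups are shifted spatially in different directions so that a subsequent 1x1 convolution mixes information from neighboring pixels. The sketch below is a minimal NumPy illustration of that general idea, assuming a five-way channel split and one-pixel shifts; the grouping, shift amounts, and shapes are assumptions, not the paper's exact design.

```python
import numpy as np

def shift_conv(x, w):
    """Hedged sketch of a shift convolution.
    x: (C, H, W) feature map; w: (C_out, C) weights of a 1x1 conv.
    Channels are split into five groups: four are shifted by one pixel
    (up, down, left, right), the last is left in place, then a 1x1
    convolution mixes channels at each pixel (zero padding at borders)."""
    c, h, width = x.shape
    shifted = np.zeros_like(x)
    g = c // 5
    shifted[:g, :-1, :] = x[:g, 1:, :]            # group 0: shift up
    shifted[g:2*g, 1:, :] = x[g:2*g, :-1, :]      # group 1: shift down
    shifted[2*g:3*g, :, :-1] = x[2*g:3*g, :, 1:]  # group 2: shift left
    shifted[3*g:4*g, :, 1:] = x[3*g:4*g, :, :-1]  # group 3: shift right
    shifted[4*g:] = x[4*g:]                       # remainder: no shift
    # 1x1 convolution == per-pixel channel mixing
    return np.einsum('oc,chw->ohw', w, shifted)
```

After this shifted mixing, a plain 1x1 convolution effectively sees a small spatial neighborhood, which is how shift-based layers approximate a 3x3 receptive field at lower cost.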
Stats
With the BSGM module, the network's PSNR improves by 0.12 dB over ELAN-light.
With the LRCAB module, the network's PSNR improves by 0.09 dB over ELAN-light.
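The LRCAB referenced above is a low-parameter residual channel attention block. Its general mechanism follows squeeze-and-excitation-style channel attention: pool each channel to a scalar, pass the result through a small bottleneck, and gate the channels with a sigmoid before a residual add. The following is a minimal NumPy sketch under those assumptions; the layer sizes and the exact residual placement are illustrative, not taken from the paper.

```python
import numpy as np

def residual_channel_attention(x, w1, w2):
    """Hedged SE-style sketch of residual channel attention.
    x: (C, H, W) feature map; w1: (R, C) bottleneck weights (R < C);
    w2: (C, R) expansion weights. Keeping R small is what makes the
    block low-parameter."""
    s = x.mean(axis=(1, 2))               # squeeze: per-channel mean, (C,)
    z = np.maximum(w1 @ s, 0.0)           # bottleneck + ReLU, (R,)
    g = 1.0 / (1.0 + np.exp(-(w2 @ z)))   # sigmoid gate per channel, (C,)
    return x + x * g[:, None, None]       # rescale channels + residual add
```

Because the gate is computed from pooled statistics, the block adds only the two small weight matrices regardless of spatial resolution.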
Quotes
"The proposed ESTN achieves a state-of-the-art performance in super-resolution reconstruction for all five test sets."
"The reconstructed SR image via the ESTN is closer to the HR image than the ones obtained by other networks."