Key Concepts
The paper proposes a novel method for arbitrary-scale image generation and super-resolution using latent diffusion models and implicit neural decoders.
Summary
The proposed method combines an auto-encoder, a latent diffusion model, and an implicit neural decoder to generate images at arbitrary scales with high fidelity, diversity, and fast inference speed. Existing methods suffer from over-smoothing, artifacts, a lack of diversity in output images, and poor scale consistency. The model runs its diffusion process in the latent space, which keeps it efficient, while remaining aligned with the output image space. Extensive experiments show that the proposed method outperforms relevant methods on metrics of image quality, diversity, and scale consistency.
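To make the "implicit neural decoder" idea concrete, here is a minimal NumPy sketch of how a fixed-size latent map can be decoded at any output resolution by querying an MLP at continuous pixel coordinates. It is an illustration under stated assumptions, not the paper's decoder: the nearest-cell feature lookup, the two-layer MLP, the random weights, and the names `make_coord_grid`, `init_weights`, and `decode_at_scale` are all hypothetical, and the auto-encoder and diffusion stages are omitted.

```python
import numpy as np


def make_coord_grid(height, width):
    """Return pixel-center coordinates in [-1, 1] as an (height*width, 2) array of (y, x)."""
    ys = (np.arange(height) + 0.5) / height * 2.0 - 1.0
    xs = (np.arange(width) + 0.5) / width * 2.0 - 1.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=-1)


def init_weights(in_dim, hidden_dim, rng):
    """Random weights for a toy two-layer MLP (assumption: untrained, for shape checks only)."""
    w1 = rng.standard_normal((in_dim, hidden_dim)) * 0.1
    b1 = np.zeros(hidden_dim)
    w2 = rng.standard_normal((hidden_dim, 3)) * 0.1
    b2 = np.zeros(3)
    return w1, b1, w2, b2


def decode_at_scale(latent, out_h, out_w, weights):
    """Decode a (C, h, w) latent map into an (out_h, out_w, 3) image of any size."""
    _, h, w = latent.shape
    coords = make_coord_grid(out_h, out_w)  # (N, 2)

    # Index of the latent cell that contains each query coordinate.
    iy = np.clip(np.floor((coords[:, 0] + 1.0) / 2.0 * h).astype(int), 0, h - 1)
    ix = np.clip(np.floor((coords[:, 1] + 1.0) / 2.0 * w).astype(int), 0, w - 1)
    feats = latent[:, iy, ix].T  # (N, C)

    # Offset of each query from the center of its latent cell.
    cy = (iy + 0.5) / h * 2.0 - 1.0
    cx = (ix + 0.5) / w * 2.0 - 1.0
    rel = np.stack([coords[:, 0] - cy, coords[:, 1] - cx], axis=-1)  # (N, 2)

    # The MLP sees the local feature plus the relative coordinate.
    x = np.concatenate([feats, rel], axis=-1)  # (N, C + 2)
    w1, b1, w2, b2 = weights
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU
    rgb = hidden @ w2 + b2
    return rgb.reshape(out_h, out_w, 3)


rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 8, 8))  # stand-in for a sampled latent
weights = init_weights(in_dim=4 + 2, hidden_dim=16, rng=rng)

small = decode_at_scale(latent, 16, 16, weights)
odd = decode_at_scale(latent, 37, 53, weights)  # non-integer scale factors
```

The same latent and the same weights serve both calls; only the coordinate grid changes, which is what lets a decoder of this kind produce outputs at non-integer scale factors.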
Statistics
The most relevant prior work is IDM [9].
FID scores: CIPS-256 (450), CIPS-1024 (400), MS-PE (350).
LIIF shows good PSNR scores but lower perceptual quality.
Our model outperforms LIIF at larger scales.
Quotes
"The proposed method adopts diffusion processes in a latent space, thus efficient yet aligned with output image space decoded by MLPs at arbitrary scales."
"Our model not only achieves good FID scores on all scales but also shows high consistency."
"Our model is significantly faster compared to IDM while showing better output quality."