Core Concepts
The paper introduces zero-shot, text-guided exploration for open-domain image super-resolution. The goal is to produce diverse solutions that all remain consistent with the low-resolution input. Two approaches are proposed, one built on pretrained text-to-image (T2I) diffusion models and one on CLIP guidance, and they show advantages in restoration quality, diversity, and explorability.
Abstract
The paper addresses the challenging task of zero-shot, open-domain, extreme super-resolution guided by text prompts. It explores two approaches to zero-shot image restoration: one uses pretrained diffusion-based T2I models, the other uses CLIP guidance. The methods improve adherence to the input text prompt while keeping the result consistent with the low-resolution observation, and they yield significantly more diverse solutions.
Key points include:
Introduction of zero-shot text-guided exploration for image super-resolution.
Proposal of two approaches using T2I models and CLIP guidance.
Improvement in adherence to text prompts while maintaining data consistency.
Demonstration of enhanced diversity in solutions through the proposed methods.
User study results indicating better performance of T2I model-based methods over CLIP-guided restoration.
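The "data consistency" requirement in the points above can be made concrete with a small sketch. It assumes the low-resolution image is produced by block averaging, which is an illustrative choice and not necessarily the degradation used in the paper. Under that assumption, any candidate image can be projected onto the set of images that downsample exactly to the observation. The function names, the 16x factor, and the random arrays are all hypothetical stand-ins.

```python
import numpy as np

def downsample(x, s):
    """Block-average an (H, W) image by an integer factor s (assumed degradation)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(y, s):
    """Replicate each pixel into an s x s block.
    This is the pseudo-inverse of block averaging."""
    return np.repeat(np.repeat(y, s, axis=0), s, axis=1)

def project_consistent(x, y, s):
    """Return the image closest to x whose downsampled version equals y."""
    return x + upsample(y - downsample(x, s), s)

rng = np.random.default_rng(0)
s = 16                        # illustrative "extreme" scale factor
y = rng.random((4, 4))        # stand-in low-resolution observation
x = rng.random((64, 64))      # stand-in candidate, e.g. from a generative model

x_proj = project_consistent(x, y, s)
residual = np.abs(downsample(x_proj, s) - y).max()
print(f"max LR residual after projection: {residual:.2e}")
```

The projection changes only the per-block mean of the candidate, so the high-frequency detail supplied by the generative model is left untouched. This is why many different solutions can all agree with the same low-resolution input.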
Stats
LR PSNR (dB): 50.42, 75.40, 51.68, 67.02, 50.16, 51.08 (Faces)
NIQE: 5.59, 8.41, 6.17, 5.54, 6.12, 6.86 (Faces)
LR PSNR (dB): 47.01, 72.94, 50.34, 66.33 (Nocaps)
NIQE: 9.66, 10.27, 4.62, 4.88 (Nocaps)
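For readers unfamiliar with the LR PSNR figures above, the following is a minimal sketch of how such a number is typically computed: the super-resolved output is downsampled and compared with the low-resolution input, so a higher value means better data consistency. The block-averaging downsampler, the peak value of 1.0, and the helper name lr_psnr are assumptions for illustration; the paper's exact evaluation protocol may differ.

```python
import numpy as np

def downsample(x, s):
    """Block-average an (H, W) image by an integer factor s (assumed degradation)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def lr_psnr(sr, lr, s, peak=1.0):
    """PSNR in dB between the downsampled SR output and the LR input."""
    mse = np.mean((downsample(sr, s) - lr) ** 2)
    if mse == 0:
        return float("inf")
    return float(10 * np.log10(peak ** 2 / mse))

rng = np.random.default_rng(0)
s = 16
lr = rng.random((4, 4))                                  # stand-in LR input in [0, 1]
consistent = np.repeat(np.repeat(lr, s, axis=0), s, axis=1)  # downsamples back to lr
shifted = consistent + 0.01                              # uniform 0.01 brightness error

psnr_consistent = lr_psnr(consistent, lr, s)
psnr_shifted = lr_psnr(shifted, lr, s)
print(f"shifted output LR PSNR: {psnr_shifted:.2f} dB")  # roughly 40 dB
```

A uniform error of 0.01 on a unit intensity range gives a mean squared error of 1e-4, which corresponds to about 40 dB. NIQE, by contrast, is a no-reference quality score for which lower is better.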
Quotes
"We propose for the first time zero-shot open-domain image super-resolution using simple and intuitive text prompts."
"Our work opens up a promising direction of developing efficient tools for text-guided exploration of image recovery."
"The use of powerful T2I models in zero-shot restoration can recover data consistent solutions matching complex text prompts."