Text-to-Image Diffusion Models for Zero-Shot Sketch-Based Image Retrieval
The paper argues that pre-trained text-to-image diffusion models are well suited to zero-shot sketch-based image retrieval: their robust cross-modal capabilities and their shape bias help bridge the domain gap between sketches and photos. Leveraging these pre-trained models effectively yields significant performance improvements on the task.
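
To make the retrieval setting concrete, here is a minimal sketch of the final ranking step only. It assumes that a feature vector has already been extracted for the query sketch and for every gallery photo (for example from a frozen diffusion backbone), and that similarity is measured with cosine similarity. The function name `rank_gallery`, the toy vectors, and the choice of cosine similarity are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def rank_gallery(sketch_feat, photo_feats):
    """Rank gallery photos by cosine similarity to a query sketch.

    sketch_feat: shape (d,)   -- hypothetical feature of the query sketch
    photo_feats: shape (n, d) -- hypothetical features of n gallery photos
    Returns (order, scores), where order lists gallery indices from most
    to least similar and scores holds the cosine similarity per photo.
    """
    # L2-normalise both sides so the dot product equals cosine similarity.
    q = sketch_feat / np.linalg.norm(sketch_feat)
    g = photo_feats / np.linalg.norm(photo_feats, axis=1, keepdims=True)
    scores = g @ q
    # Negate so that argsort returns the highest scores first.
    order = np.argsort(-scores)
    return order, scores


# Toy stand-ins for real features (assumption: real ones would come from
# a pre-trained model, which is not shown here).
sketch = np.array([1.0, 0.0, 0.0])
photos = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the sketch
    [0.9, 0.1, 0.0],   # nearly aligned with the sketch
    [0.5, 0.5, 0.0],   # partially aligned
])

order, scores = rank_gallery(sketch, photos)
print(order.tolist())
```

In the zero-shot setting the query and gallery categories are unseen during training, so any real system would depend entirely on how well the extracted features generalise; the ranking step itself stays this simple.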