The Spatial-Semantic Map Guided (SSMG) diffusion model addresses limitations of token-guided and image-guided layout-to-image (L2I) methods. It uses feature maps as guidance to provide both spatial and semantic controllability, and it introduces two attention mechanisms: Relation-Sensitive Attention (RSA) and Location-Sensitive Attention (LSA). The model accepts free-form textual descriptions and supports various positional representations of the layout. According to the authors' extensive experiments, SSMG achieves state-of-the-art results across fidelity, diversity, and controllability metrics, generating high-quality images with precise control over semantics and spatial layout.
Key insights distilled from:
by Chengyou Jia... at arxiv.org, 03-14-2024
https://arxiv.org/pdf/2308.10156.pdf