Improving Multi-Subject Generation in Text-to-Image Diffusion Models
Diffusion models struggle to generate images containing multiple subjects, often neglecting some subjects or blending them together. This work proposes an approach that addresses these issues by manipulating cross-attention maps and the latent space to obtain layouts favorable to multi-subject generation.
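To illustrate the general idea of steering the latent with cross-attention signals, the sketch below shows one possible scheme (not necessarily the paper's exact method): a loss that penalizes subjects whose attention maps have weak peak activation, and a single gradient step on the latent to strengthen the most-neglected subject. The function names, the loss, and the toy `fake_attn` stand-in for the denoiser's cross-attention are all illustrative assumptions.

```python
# Minimal sketch of attention-guided latent refinement (assumptions, not the paper's code).
import torch
import torch.nn.functional as F

def subject_neglect_loss(attn_maps, subject_token_ids):
    """Largest shortfall in peak attention across the subject tokens.

    attn_maps: (num_tokens, H, W) cross-attention maps for one layer/timestep.
    subject_token_ids: prompt-token indices that name each subject.
    """
    peaks = torch.stack([attn_maps[t].max() for t in subject_token_ids])
    return (1.0 - peaks).max()  # focus on the most-neglected subject

def refine_latent(latent, compute_attn_maps, subject_token_ids, step_size=0.1):
    """One gradient step on the latent that strengthens the most-neglected subject.

    compute_attn_maps: callable latent -> (num_tokens, H, W) maps, standing in for
    a forward pass through the denoiser's cross-attention layers.
    """
    latent = latent.detach().requires_grad_(True)
    loss = subject_neglect_loss(compute_attn_maps(latent), subject_token_ids)
    grad, = torch.autograd.grad(loss, latent)
    return (latent - step_size * grad).detach()

if __name__ == "__main__":
    # Toy usage: a random "attention" function over a 4x64x64 latent, two subject tokens.
    torch.manual_seed(0)
    proj = torch.randn(8, 4)  # fake token-projection weights (8 prompt tokens)
    def fake_attn(z):         # hypothetical stand-in for UNet cross-attention
        return F.softmax((proj @ z.reshape(4, -1)).reshape(8, 64, 64), dim=0)
    z = torch.randn(4, 64, 64)
    z = refine_latent(z, fake_attn, subject_token_ids=[2, 5])
    print(z.shape)  # torch.Size([4, 64, 64])
```

In a real pipeline, `compute_attn_maps` would be replaced by hooks on the model's cross-attention layers, and the update would typically be applied at selected denoising timesteps rather than once.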