Object-Centric Attention Map Alignment for Improved Text-to-Image Diffusion Models
This paper introduces an object-conditioned Energy-Based Attention Map Alignment (EBAMA) method that addresses two failure modes of text-to-image diffusion models: incorrect attribute binding and catastrophic object neglect.