MULAN: A Comprehensive Dataset for Multi-Layer Controllable Text-to-Image Generation
This paper introduces MuLAn, a novel dataset of over 44K multi-layer annotations of decomposed RGB images. By providing comprehensive scene decomposition information and scene instance consistency, it aims to open new avenues for text-to-image generative AI research.
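To make the idea of a multi-layer decomposition concrete, the sketch below flattens a stack of layers back into a single image using the standard "over" alpha-compositing operator. It rests on assumptions that the summary above does not confirm: that each layer is stored as RGBA with straight (non-premultiplied) alpha, and that layers are ordered back to front with the background first. The names `Pixel`, `Layer`, `over`, and `flatten` are hypothetical and are not part of any MuLAn tooling.

```python
from typing import List, Tuple

# Assumed representation (not confirmed by the source):
# a pixel is (r, g, b, a) with every channel in [0, 1].
Pixel = Tuple[float, float, float, float]
Layer = List[List[Pixel]]  # rows of pixels


def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Composite `top` over `bottom` using straight (non-premultiplied) alpha."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t: float, b: float) -> float:
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers: List[Layer]) -> Layer:
    """Flatten layers ordered back to front (index 0 is the background)."""
    result = [row[:] for row in layers[0]]
    for layer in layers[1:]:
        result = [
            [over(t, b) for t, b in zip(top_row, bottom_row)]
            for top_row, bottom_row in zip(layer, result)
        ]
    return result


# Toy 1x2 scene: an opaque blue background and one instance layer that
# covers only the left pixel with opaque red.
background: Layer = [[(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0)]]
instance: Layer = [[(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, 0.0)]]
image = flatten([background, instance])
```

In this toy scene the left pixel takes the instance's color while the right pixel keeps the background, which shows how a layered representation lets an instance be moved or removed without disturbing the rest of the scene.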