Core Concepts
This work proposes a self-discovery approach for finding interpretable latent directions in a diffusion model's internal representation. These directions can then be used to make text-to-image generation more responsible, covering fair generation, safe generation, and responsible text-enhancing generation.
Abstract
The paper presents a self-discovery approach that identifies interpretable latent directions in a diffusion model's internal representation, specifically the bottleneck layer of the U-Net architecture. The key highlights are:
The authors demonstrate that the diffusion model's internal representation, particularly the h-space (the U-Net bottleneck layer), exhibits semantic structure that can be associated with specific concepts in the generated images. However, existing approaches cannot easily discover latent directions for arbitrary concepts, such as those related to inappropriate content.
The proposed self-discovery method uses the semantic knowledge the diffusion model has already acquired to learn a latent vector that represents a given concept. The model first generates images from prompts that contain the concept. The latent vector is then optimized to reconstruct those images while the model is conditioned on a modified prompt that omits the concept, so the vector itself has to supply the missing concept information.
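The optimization described above can be illustrated with a deliberately small toy, which is not the authors' implementation. It assumes a frozen linear map `W` standing in for the part of the network after the bottleneck, hypothetical activations `h_minus` for the concept-free prompt, and targets produced with the concept present. Only the vector `c` is updated, here by plain gradient descent on a squared reconstruction error. All names and values are illustrative assumptions.

```python
def matvec(W, v):
    """Return W @ v for a matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, v):
    """Return W.T @ v for a matrix given as a list of rows."""
    return [sum(W[i][j] * v[i] for i in range(len(W))) for j in range(len(W[0]))]


def learn_concept_vector(W, pairs, steps=500, lr=0.1):
    """Fit a vector c that is added to the concept-free activation.

    W     : frozen toy "decoder" (never updated, mirroring a frozen model).
    pairs : (h_minus, target) tuples; h_minus is the activation for the
            prompt without the concept, target is the output obtained
            when the concept was present.
    """
    c = [0.0] * len(W[0])
    for step in range(steps):
        h_minus, target = pairs[step % len(pairs)]
        steered = [h + ci for h, ci in zip(h_minus, c)]
        residual = [p - t for p, t in zip(matvec(W, steered), target)]
        # Gradient of 0.5 * ||W (h_minus + c) - target||^2 with respect to c.
        grad = matvec_t(W, residual)
        c = [ci - lr * g for ci, g in zip(c, grad)]
    return c


# Toy data: the "concept" shifts the activation by a fixed direction (assumption).
W = [[1.0, 0.5], [0.0, 1.0]]
true_direction = [1.0, -2.0]
h_minus_samples = [[0.3, 0.1], [-0.4, 0.8], [0.9, -0.2]]
pairs = [
    (h, matvec(W, [a + b for a, b in zip(h, true_direction)]))
    for h in h_minus_samples
]
concept_vector = learn_concept_vector(W, pairs)
```

In this toy the learned `concept_vector` recovers the direction that separates the concept prompt from the concept-free prompt. In the actual method the frozen component would be the diffusion model itself and the reconstruction would go through its denoising process, which this sketch does not model.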
The discovered latent vectors can be utilized to enhance responsible text-to-image generation in three ways:
a. Fair generation: Sampling from learned concept vectors (e.g., gender) with equal probability to generate images with balanced attributes.
b. Safe generation: Incorporating a safety-related concept vector (e.g., anti-sexual) to suppress the generation of inappropriate content.
c. Responsible text-enhancing generation: Extracting responsible concepts from the text prompt and activating the corresponding learned vectors to reinforce the expression of desired visual features.
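The three uses listed above can be sketched as a small selection routine. This is a minimal illustration under stated assumptions: the vectors in `CONCEPT_VECTORS` are made-up 3-D values rather than learned ones, the trigger words in `PROMPT_TRIGGERS` are hypothetical, keyword matching is a stand-in for however concepts are actually extracted from the prompt, and combining vectors by addition to the bottleneck activation is assumed for simplicity.

```python
import random

# Hypothetical concept vectors (toy values, not outputs of a real model).
CONCEPT_VECTORS = {
    "female": [1.0, 0.0, 0.0],
    "male": [-1.0, 0.0, 0.0],
    "anti-sexual": [0.0, 1.0, 0.0],
    "anti-violence": [0.0, 0.0, 1.0],
}

# Hypothetical mapping from words in the prompt to concepts to reinforce.
PROMPT_TRIGGERS = {"peaceful": "anti-violence"}

GENDER_GROUP = ["female", "male"]


def sample_fair_concept(group, rng):
    """Fair generation: pick one concept from the group with equal probability."""
    return rng.choice(group)


def concepts_from_prompt(prompt):
    """Text-enhancing generation: find concepts named by words in the prompt."""
    words = prompt.lower().split()
    return [PROMPT_TRIGGERS[w] for w in words if w in PROMPT_TRIGGERS]


def select_concepts(prompt, rng):
    """Collect the concepts to activate for one generated image."""
    names = [sample_fair_concept(GENDER_GROUP, rng)]  # fair generation
    names.append("anti-sexual")                       # safe generation
    names.extend(concepts_from_prompt(prompt))        # text-enhancing generation
    return names


def steer(h, vectors):
    """Add every selected concept vector to the activation h."""
    for v in vectors:
        h = [a + b for a, b in zip(h, v)]
    return h


rng = random.Random(0)
names = select_concepts("a peaceful street scene", rng)
steered = steer([0.0, 0.0, 0.0], [CONCEPT_VECTORS[n] for n in names])
```

Sampling the fairness concept independently for each image is what balances the attribute across a batch, while the safety vector is applied to every image regardless of the prompt.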
Extensive experiments demonstrate the effectiveness of the proposed approach in promoting fairness, safety, and responsible text guidance, outperforming existing methods. The authors also showcase the generalization capability and compositionality of the discovered concept vectors.