Key Concepts
Selectively Informative Descriptions (SID) can mitigate undesired embedding entanglements in text-to-image personalization, improving both subject preservation and alignment.
Summary
The study addresses biases in text-to-image personalization and proposes Selectively Informative Descriptions (SID) as a remedy. It categorizes the biases, reports experiments showing that SID reduces undesired embedding entanglements, and compares different models and measures to evaluate the effect of SID on subject preservation and alignment.
Outline:
- Introduction
  - Text-to-image diffusion models have shown remarkable capabilities.
  - Recent works focus on personalized image generation.
- Related Work
  - Overview of text-to-image diffusion models and vision-language models.
- Method
  - Proposal of SID to reduce embedding entanglements.
- Experiments
  - Comprehensive experiments verifying the enhancement from SID.
- Analysis of cross-attention map
  - Visualization of cross-attention maps highlighting embedding focus.
- Analysis of three key measures
  - Introduction of customized measures for evaluating subject preservation and non-subject disentanglement.
- Discussion
  - Comparison with negative prompts and segmentation, limitations, and potential enhancements.
Statistics
In text-to-image personalization, overfitting is addressed by optimization-based or encoder-based approaches [12, 15, 24, 45, 57].
DreamBooth [45] fine-tunes pre-trained models with few reference images using specific text descriptions like "a [v] [class name]" or "photo of a [v] [class name]."
SID (Selectively Informative Description) deviates from traditional approaches by adding informative specifications of undesired (non-subject) objects to the training descriptions, which reduces undesired embedding entanglements.
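The contrast between a class-only description and a SID-style description can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the non-subject details are hand-written placeholders, and this summary does not say how such details are actually obtained or formatted.

```python
from typing import Sequence


def dreambooth_description(identifier: str, class_name: str) -> str:
    """Class-only training description, following the quoted
    "photo of a [v] [class name]" template."""
    return f"photo of a {identifier} {class_name}"


def sid_description(
    identifier: str,
    class_name: str,
    non_subject_details: Sequence[str],
) -> str:
    """Hypothetical SID-style description.

    Only non-subject content (background, other objects, pose, and so on)
    is specified. The subject itself gets no description beyond its class
    name, which is what makes the description "selective".
    The comma-separated layout is an assumption made for this sketch.
    """
    base = dreambooth_description(identifier, class_name)
    # Drop empty or whitespace-only entries so they do not leave stray commas.
    details = [d.strip() for d in non_subject_details if d.strip()]
    if not details:
        return base
    return base + ", " + ", ".join(details)
```

For example, `sid_description("[v]", "dog", ["sitting on a red sofa", "a window in the background"])` returns `"photo of a [v] dog, sitting on a red sofa, a window in the background"`, whereas the class-only version yields `"photo of a [v] dog"`. The idea is that details already named in the text are less likely to be absorbed into the subject embedding [v].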
Quotes
"SID significantly diminishes the probability of undesired entanglement between subject embedding [v] and non-subject information."
"Our method is selective because we deliberately avoid incorporating informative specifications of the 'subject' itself into the train descriptions."