The paper presents a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM). The key idea is to reestablish the training objective by integrating the guidance of the observation process with the Markov chain in a principled way. This is achieved by introducing an additional loss term derived from the observation based on a conditional discriminator on noise level, which employs a Bernoulli distribution indicating whether its input lies on the (noisy) real manifold or not.
The authors show that this strategy allows them to optimize the more accurate negative log-likelihood induced in the inference stage, especially when the number of function evaluations is limited. The proposed training scheme is also advantageous even when incorporated only into the fine-tuning process, and it is compatible with various fast inference strategies since their method yields better denoising networks using the exactly the same inference procedure without incurring extra computational cost.
The authors demonstrate the effectiveness of their training algorithm using diverse inference techniques on strong diffusion model baselines. The results show that the proposed approach outperforms the baseline models in terms of both FID and recall scores, especially when the number of function evaluations (NFEs) is small.
To Another Language
from source content
arxiv.org
Key Insights Distilled From
by Junoh Kang,J... at arxiv.org 04-02-2024
https://arxiv.org/pdf/2310.04041.pdfDeeper Inquiries