Core Concepts
CTM introduces a generative model that combines the advantages of score-based diffusion models and distillation models, achieving state-of-the-art single-step sampling quality with an efficient training scheme.
Abstract
Abstract:
Consistency Trajectory Model (CTM) bridges the gap between score-based diffusion models and distillation models.
CTM enables efficient joint use of adversarial training and the denoising score matching loss (see the loss sketch after this list).
Achieves new state-of-the-art FIDs for single-step diffusion model sampling on CIFAR-10 and ImageNet at 64 × 64 resolution.
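To make the training recipe above concrete, here is a minimal sketch (in PyTorch) of how a denoising score matching term can be combined with consistency and adversarial terms into one weighted objective. The helper callables, weightings, and function names below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def dsm_term(denoiser, x0, sigma):
    """Denoising score matching (DSM): noise clean data and regress the
    network's denoised estimate back onto it. `denoiser(x, sigma)` is an
    assumed callable, not the paper's exact interface."""
    noise = torch.randn_like(x0)
    x_noisy = x0 + sigma * noise          # forward perturbation at level sigma
    return F.mse_loss(denoiser(x_noisy, sigma), x0)

def total_loss(consistency_loss, dsm_loss, adv_loss,
               lambda_dsm=1.0, lambda_gan=0.1):
    """Weighted sum of the trajectory-consistency, DSM, and adversarial terms
    (the weights here are placeholders)."""
    return consistency_loss + lambda_dsm * dsm_loss + lambda_gan * adv_loss
```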
Introduction:
Deep generative models face challenges like posterior collapse in VAEs and training instability in GANs.
Diffusion models (DMs) address these issues by learning the score function, but their gradual denoising makes sampling slow.
Preliminary:
The DM forward (noising) process is formulated with continuous-time random variables.
A reverse-time process is established whose marginal distributions match those of the forward process (a minimal solver sketch follows).
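As a rough illustration of this preliminary setup, the sketch below integrates the reverse-time probability-flow ODE with plain Euler steps under an EDM-style parameterization, dx/dσ = (x − D(x, σ))/σ, where D is a learned denoiser. The `denoiser` callable and the noise schedule `sigmas` are assumptions for illustration, not the paper's exact interfaces.

```python
import torch

def euler_pf_ode_sampler(denoiser, x_init, sigmas):
    """Integrate the probability-flow ODE dx/dsigma = (x - D(x, sigma)) / sigma
    from sigma_max down toward 0 with simple Euler steps.
    `denoiser` and the decreasing schedule `sigmas` are assumed inputs."""
    x = x_init
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        drift = (x - denoiser(x, sigma)) / sigma     # ODE integrand at sigma
        x = x + (sigma_next - sigma) * drift         # Euler step to sigma_next
    return x
```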
CTM: An Unification of Score-Based and Distillation Models:
Introduces CTM as a unified framework that gives access to both the integrand (the score function) and the integral (a trajectory jump) of the PF ODE trajectory.
Enables anytime-to-anytime jumps along the PF ODE, providing increased flexibility at inference time (see the sketch below).
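As a minimal sketch of the anytime-to-anytime jump, the function below mixes the current state with the network output g_θ, following the mixing form described in the paper; treat the exact coefficients and the `g_theta` interface here as assumptions.

```python
def jump(g_theta, x_t, t, s):
    """Jump from time t to time s (0 <= s <= t) along the PF ODE trajectory.
    Sketch of G(x_t, t, s) = (s/t) * x_t + (1 - s/t) * g_theta(x_t, t, s):
    s = t recovers g_theta(x_t, t, t) (score/denoiser access), while s = 0
    performs a single long jump toward clean data."""
    ratio = s / t
    return ratio * x_t + (1.0 - ratio) * g_theta(x_t, t, s)
```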
Sampling with CTM:
CTM enables exact score evaluation through g_θ(x_t, t, t), supporting standard score-based sampling with ODE/SDE solvers.
Introduces the γ-sampling method, which allows deterministic or stochastic long jumps along the solution trajectory (a minimal sketch follows).
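Below is a hedged sketch of γ-sampling using the `jump` function from the previous sketch: each iteration takes a deterministic jump part of the way down the trajectory and then re-injects noise, with γ = 0 giving fully deterministic jumps and larger γ adding more stochasticity. The time schedule and interfaces are illustrative assumptions.

```python
import torch

def gamma_sampler(g_theta, x_start, times, gamma=0.5):
    """gamma-sampling sketch over a decreasing schedule t_0 > t_1 > ... > t_N = 0.
    At each step: jump deterministically from t down to sqrt(1 - gamma^2) * t_next,
    then add Gaussian noise to return to level t_next."""
    x = x_start
    for t, t_next in zip(times[:-1], times[1:]):
        s = (1.0 - gamma ** 2) ** 0.5 * t_next            # intermediate jump target
        x = jump(g_theta, x, t, s)                        # deterministic long jump
        if t_next > 0 and gamma > 0:
            x = x + gamma * t_next * torch.randn_like(x)  # partial re-noising
    return x
```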
Experiments:
CTM surpasses previous models in FID and likelihood for few-step diffusion model sampling on CIFAR-10 and ImageNet 64 × 64.
Stats
Recent developments focus on distillation models that directly estimate the integral along the Probability Flow ODE sample trajectory.
Quotes
"CTM bridges the gap between score-based diffusion models and distillation models."
"CTM achieves new state-of-the-art FIDs for single-step diffusion model sampling."