Core Concepts
The Sub-path Linear Approximation Model (SLAM) accelerates diffusion models while maintaining high-quality image generation. It treats the probability flow ODE (PF-ODE) trajectory as a series of sub-paths and uses sub-path linear (SL) ODEs to form a progressive and continuous error estimation along each sub-path.
Summary
The paper proposes the Sub-path Linear Approximation Model (SLAM) to accelerate diffusion models while maintaining high-quality image generation.
Key highlights:
- Diffusion models have advanced the state-of-the-art in image generation, but their slow inference speed hinders practical applications.
- SLAM treats the PF-ODE trajectory as a series of sub-paths and uses sub-path linear (SL) ODEs to form a progressive and continuous error estimation along each sub-path.
- The optimization on such SL-ODEs allows SLAM to construct denoising mappings with smaller cumulative approximated errors.
- An efficient distillation method is developed to facilitate the incorporation of more advanced diffusion models, such as latent diffusion models.
- Extensive experiments demonstrate that SLAM achieves efficient training, requiring only 6 A100 GPU days, and surpasses existing acceleration methods in few-step generation tasks, achieving state-of-the-art performance on FID and image quality.
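The sub-path idea in the highlights above can be illustrated with a toy sketch. It is not the paper's training or distillation procedure: the curve `toy_trajectory` stands in for a PF-ODE trajectory, and the helper names `split_into_subpaths`, `linear_subpath_point`, and `piecewise_linear_error` are hypothetical. The sketch only shows that replacing one long path with several shorter linear sub-paths reduces the approximation error.

```python
import math

def split_into_subpaths(t_start, t_end, num_subpaths):
    """Return the time boundaries of equal-length sub-paths from t_start to t_end."""
    step = (t_end - t_start) / num_subpaths
    return [t_start + i * step for i in range(num_subpaths + 1)]

def linear_subpath_point(x_a, x_b, gamma):
    """Point on the straight line between two sub-path endpoints, gamma in [0, 1]."""
    return tuple((1.0 - gamma) * a + gamma * b for a, b in zip(x_a, x_b))

def toy_trajectory(t):
    """Assumed stand-in for a curved PF-ODE trajectory (a circular arc in 2-D)."""
    return (math.cos(t), math.sin(t))

def piecewise_linear_error(trajectory_fn, boundaries, samples_per_subpath=50):
    """Largest coordinate-wise gap between the curve and its linear sub-paths."""
    worst = 0.0
    for t_a, t_b in zip(boundaries[:-1], boundaries[1:]):
        x_a, x_b = trajectory_fn(t_a), trajectory_fn(t_b)
        for k in range(samples_per_subpath + 1):
            gamma = k / samples_per_subpath
            t = (1.0 - gamma) * t_a + gamma * t_b
            approx = linear_subpath_point(x_a, x_b, gamma)
            exact = trajectory_fn(t)
            worst = max(worst, max(abs(p - q) for p, q in zip(approx, exact)))
    return worst

# Denoising runs from t = 1 (noise) toward t = 0 (data) in this toy setup.
coarse_error = piecewise_linear_error(toy_trajectory, split_into_subpaths(1.0, 0.0, 1))
fine_error = piecewise_linear_error(toy_trajectory, split_into_subpaths(1.0, 0.0, 4))
print(coarse_error, fine_error)
```

With four sub-paths instead of one, the printed error drops, which is the intuition behind estimating the error along each sub-path rather than across the whole trajectory.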
Statistics
SLAM requires only 6 A100 GPU days to produce a high-quality generative model capable of 2 to 4-step generation.
SLAM achieves FID scores of 10.09, 10.06, and 20.77 on the LAION, MS COCO 2014, and MS COCO 2017 datasets, respectively.
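For readers unfamiliar with the metric, FID (Fréchet Inception Distance) is the Fréchet distance between two Gaussians fitted to image features, where lower is better. The sketch below implements only the distance formula with NumPy. It assumes the feature means and covariances are already available, and the function name `frechet_distance` is hypothetical. It is not the evaluation code behind the numbers above.

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    d^2 = ||mu1 - mu2||^2 + Tr(sigma1) + Tr(sigma2) - 2 * Tr((sigma1 sigma2)^(1/2))
    """
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)
    # For positive semi-definite covariances, the eigenvalues of sigma1 @ sigma2
    # are real and non-negative, so the trace of the matrix square root equals
    # the sum of their square roots. Clip tiny negatives caused by round-off.
    eigenvalues = np.linalg.eigvals(np.asarray(sigma1) @ np.asarray(sigma2))
    trace_sqrt = np.sqrt(np.clip(eigenvalues.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)

# Toy statistics: same covariance, means shifted by (3, 4).
identity = np.eye(2)
same = frechet_distance([0.0, 0.0], identity, [0.0, 0.0], identity)
shifted = frechet_distance([0.0, 0.0], identity, [3.0, 4.0], identity)
print(same, shifted)
```

In practice the means and covariances would be estimated from feature vectors of real and generated images, for example with `np.mean(features, axis=0)` and `np.cov(features, rowvar=False)`.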
Quotes
"SLAM adheres to the foundational concept of cumulative approximation of PF-ODE trajectories but innovates through its sustained learning from Sub-path Linear (SL) ODEs."
"The optimization on such SL-ODEs allows SLAM to construct denoising mappings with smaller cumulative approximated errors."
"Extensive experimental results demonstrate that SLAM achieves an efficient training regimen, requiring only 6 A100 GPU days to produce a high-quality generative model capable of 2 to 4-step generation with high performance."