Core Concepts
The proposed Energy-Calibrated Variational Autoencoder (EC-VAE) utilizes a conditional Energy-Based Model (EBM) to calibrate the generative direction of a Variational Autoencoder (VAE) during training, enabling it to generate high-quality samples without requiring expensive Markov Chain Monte Carlo (MCMC) sampling at test time. The energy-based calibration can also be extended to enhance variational learning and normalizing flows, and applied to zero-shot image restoration tasks.
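To make the calibration idea concrete, here is a minimal PyTorch sketch of one training step, assuming the conditional EBM refines decoder samples with a few Langevin steps and the decoder is then pulled toward the refined samples. The names `langevin_refine` and `calibration_step`, the squared-error loss, and all hyperparameters are illustrative assumptions, not the paper's implementation (which also includes the standard ELBO terms and a training objective for the EBM itself).

```python
# Minimal sketch of energy-calibrated training; module names and
# hyperparameters are hypothetical, not the authors' implementation.
import torch

def langevin_refine(energy, x, z, n_steps=10, step_size=0.01):
    """Refine decoder samples x with a few Langevin steps under a
    conditional energy E(x | z). Used only during training."""
    x = x.detach().clone().requires_grad_(True)
    for _ in range(n_steps):
        grad = torch.autograd.grad(energy(x, z).sum(), x)[0]
        x = x - 0.5 * step_size * grad \
              + torch.randn_like(x) * step_size ** 0.5
        x = x.detach().requires_grad_(True)
    return x.detach()

def calibration_step(decoder, energy, opt, z):
    """One training step: pull the decoder's output toward the
    energy-refined target, so no MCMC is needed at test time."""
    x0 = decoder(z)                          # raw generative-direction sample
    x_ref = langevin_refine(energy, x0, z)   # calibrated target (detached)
    loss = ((x0 - x_ref) ** 2).mean()        # illustrative calibration loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

At test time only the decoder is evaluated; the Langevin loop exists purely to produce training targets, which is what allows EC-VAE to drop MCMC during sampling.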
Abstract
The paper proposes a novel generative model called Energy-Calibrated Variational Autoencoder (EC-VAE) that addresses the limitations of traditional VAEs and EBMs.
Key highlights:
- VAEs often suffer from blurry generated samples due to the lack of explicit training on the generative direction. EBMs can generate high-quality samples but require expensive MCMC sampling.
- EC-VAE introduces a conditional EBM to calibrate the generative direction of the VAE during training, without requiring MCMC sampling at test time.
- The energy-based calibration can also be extended to enhance variational learning and normalizing flows.
- EC-VAE is applied to zero-shot image restoration tasks, leveraging the neural transport prior and range-null space theory (see the sketch after this list).
- Extensive experiments show that EC-VAE outperforms state-of-the-art VAEs, EBMs, and GANs on various image generation benchmarks, while being significantly more efficient in sampling.
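To illustrate the range-null space idea behind the zero-shot restoration results, the sketch below specializes the decomposition x̂ = A†y + (I − A†A)x̄ to an inpainting mask, for which the pseudo-inverse A† is the mask itself; `x_bar` stands in for a sample from the calibrated generator, and all names are hypothetical rather than the authors' code.

```python
# Hedged sketch of range-null space consistency for zero-shot restoration.
# For a linear degradation y = A x, any estimate splits into a range part
# (fixed by the observation y) and a null-space part (free, filled by the
# generative model). Here A is binary masking (inpainting), where A† = A.
import torch

def range_null_restore(y, mask, x_bar):
    """x_hat = A† y + (I - A† A) x_bar, specialized to a masking operator:
    keep observed pixels from y, fill missing pixels from the generator."""
    return mask * y + (1.0 - mask) * x_bar

# Usage: restore a 1x3x64x64 image with roughly half its pixels observed.
x_true = torch.rand(1, 3, 64, 64)
mask = (torch.rand(1, 3, 64, 64) > 0.5).float()
y = mask * x_true                          # degraded observation
x_bar = torch.rand(1, 3, 64, 64)           # sample from the generative model
x_hat = range_null_restore(y, mask, x_bar)
assert torch.allclose(mask * x_hat, y)     # data consistency holds exactly
```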
Statistics
The proposed EC-VAE achieves competitive performance among single-step, non-adversarial generative models on image generation benchmarks.
EC-VAE outperforms advanced GANs and Score-based Models on various datasets, including CIFAR-10, STL-10, ImageNet 32, LSUN Church 64, CelebA 64, and CelebA-HQ-256.
EC-VAE is hundreds to thousands of times faster than NCSN and VAEBM in sampling, while requiring much less training time.
EC-VAE achieves competitive performance on zero-shot image restoration tasks compared to strong baselines.
Quotes
"VAEs often suffer from blurry generated samples due to the lack of a tailored training on the samples generated in the generative direction."
"EBMs can generate high-quality samples but require expensive Markov Chain Monte Carlo (MCMC) sampling."
"We demonstrate that it is possible to drop MCMC steps during test time sampling without compromising the quality of generation."