Core Concepts
t3VAE learns heavy-tailed data more effectively than a Gaussian VAE by using Student’s t-distributions in the model and a γ-power divergence in the training objective.
Abstract
The paper introduces t3VAE, a modified VAE framework that uses heavy-tailed models for better data representation. It combines Student’s t-distributions with a power divergence to fit real-world datasets, and it outperforms other models on CelebA and imbalanced CIFAR-100. The summary covers the following points:
- Introduction to VAE and its components.
- Issues with Gaussian VAE and the need for heavy-tailed models.
- Introduction of t3VAE with Student’s t-distributions.
- Derivation of the new objective function, γ-loss.
- Superior performance of t3VAE on synthetic and real datasets.
- Comparison with other VAE models and their limitations.
- Theoretical background on information geometry and γ-power divergence.
- Structure of t3VAE and its components.
- Application of t3VAE to hierarchical architecture, t3HVAE.
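To make the contrast between Gaussian and heavy-tailed models concrete, here is a minimal sketch that draws samples from a diagonal multivariate Student's t-distribution and compares its tail mass with a Gaussian of the same location and scale. It is not the paper's code: the function names `sample_student_t` and `tail_fraction`, the choice ν = 3, and the radius of 4 are illustrative assumptions. The sampler uses the standard Gaussian scale-mixture construction of the t-distribution.

```python
import numpy as np


def sample_student_t(rng, mu, sigma, nu, n):
    """Draw n samples from a diagonal multivariate Student's t-distribution.

    Uses the Gaussian scale mixture: x = mu + sigma * eps * sqrt(nu / u),
    with eps ~ N(0, I) and u ~ chi-square(nu), one shared u per sample.
    (Hypothetical helper for illustration, not taken from the paper.)
    """
    d = mu.shape[0]
    eps = rng.standard_normal((n, d))
    u = rng.chisquare(nu, size=(n, 1))
    return mu + sigma * eps * np.sqrt(nu / u)


def tail_fraction(x, radius=4.0):
    """Fraction of samples whose Euclidean norm exceeds the given radius."""
    return float(np.mean(np.linalg.norm(x, axis=1) > radius))


rng = np.random.default_rng(0)
mu = np.zeros(2)
sigma = np.ones(2)
n = 100_000

t_samples = sample_student_t(rng, mu, sigma, nu=3.0, n=n)
g_samples = mu + sigma * rng.standard_normal((n, 2))

t_tail = tail_fraction(t_samples)
g_tail = tail_fraction(g_samples)
print(f"Student's t tail mass: {t_tail:.4f}")
print(f"Gaussian tail mass:    {g_tail:.4f}")
```

With the same location and scale, the t-distributed samples land far from the origin much more often than the Gaussian ones. This is the property that motivates replacing the Gaussian prior when the data contain many points in low-density regions.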
Stats
t3VAE generates samples in low-density regions more faithfully than the models it is compared against.
t3VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets.
Quotes
"The Gaussian VAE encodes many points in low-density regions of the prior."
"t3VAE requires a single hyperparameter ν which is coupled to the degrees of freedom of the t-distributions."
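As background for the role of ν in the quote above, the standard density of a d-dimensional Student's t-distribution is given below. The symbols μ (location), Σ (scale matrix), and d (dimension) are notation assumed here for illustration and are not taken from the source.

```latex
t_d(x \mid \mu, \Sigma, \nu)
  = \frac{\Gamma\!\left(\frac{\nu + d}{2}\right)}
         {\Gamma\!\left(\frac{\nu}{2}\right) (\nu\pi)^{d/2} \, |\Sigma|^{1/2}}
    \left[ 1 + \frac{1}{\nu} (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right]^{-\frac{\nu + d}{2}}
```

The density decays polynomially in the distance from μ, so smaller ν gives heavier tails. As ν → ∞ it converges to the Gaussian density N(μ, Σ).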