Core Concepts
t3VAE learns heavy-tailed data effectively by combining Student’s t-distributions with a power-divergence objective.
Abstract
The paper introduces t3VAE, a modified VAE framework that uses heavy-tailed models to represent data better. It uses Student’s t-distributions and the γ-power divergence to fit real-world datasets. t3VAE outperforms the models it is compared against on the CelebA and imbalanced CIFAR-100 datasets.
Introduction to VAE and its components.
Issues with Gaussian VAE and the need for heavy-tailed models.
Introduction of t3VAE with Student’s t-distributions.
Derivation of the new objective function, γ-loss.
Superior performance of t3VAE on synthetic and real datasets.
Comparison with other VAE models and their limitations.
Theoretical background on information geometry and γ-power divergence.
Structure of t3VAE and its components.
Application of t3VAE to hierarchical architecture, t3HVAE.
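To make the motivation for heavy-tailed models concrete, the following is a minimal sketch, not taken from the paper, that compares how often a standard Gaussian and a Student's t-distribution produce samples far from the origin. The function names `sample_student_t` and `tail_fraction`, the threshold of 4, and the choice ν = 3 are illustrative assumptions. The t sample is drawn with the standard construction of a Gaussian divided by the square root of a scaled chi-square variable.

```python
import math
import random


def sample_student_t(nu, rng):
    """Draw one univariate Student's t sample with nu degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square variable with nu degrees of freedom is Gamma(shape=nu/2, scale=2).
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return z / math.sqrt(chi2 / nu)


def tail_fraction(samples, threshold):
    """Fraction of samples whose absolute value exceeds the threshold."""
    return sum(abs(x) > threshold for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
gauss_samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
t_samples = [sample_student_t(3.0, rng) for _ in range(n)]  # nu = 3 is an illustrative choice

gauss_tail = tail_fraction(gauss_samples, 4.0)
t_tail = tail_fraction(t_samples, 4.0)
print(f"P(|x| > 4): Gaussian ~ {gauss_tail:.5f}, Student's t (nu=3) ~ {t_tail:.5f}")
```

Under these assumptions the t-distribution places far more mass beyond the threshold than the Gaussian does, which is the property that makes it a candidate for modelling data with rare but extreme values.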
Stats
t3VAE generates samples in low-density regions better than the models it is compared against.
t3VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets.
Quotes
"The Gaussian VAE encodes many points in low-density regions of the prior."
"t3VAE requires a single hyperparameter ν which is coupled to the degrees of freedom of the t-distributions."
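The second quote ties ν to the degrees of freedom of the t-distributions. As a rough illustration of what that parameter controls, the sketch below evaluates the standard univariate Student's t density at a point far from the origin and compares it with the standard Gaussian density. The helper names, the evaluation point x = 5, and the list of ν values are assumptions made for this example and do not reproduce the paper's experiments.

```python
import math


def student_t_pdf(x, nu):
    """Density of the standard univariate Student's t with nu degrees of freedom."""
    log_norm = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
    )
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def normal_pdf(x):
    """Density of the standard Gaussian."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


x = 5.0  # a point far out in the tail (illustrative choice)
ratios = {nu: student_t_pdf(x, nu) / normal_pdf(x) for nu in (3, 10, 100, 10_000)}
for nu, ratio in ratios.items():
    print(f"nu = {nu:>6}: t density / Gaussian density at x = {x} is {ratio:.3f}")
```

Small ν gives a tail density orders of magnitude above the Gaussian one, while large ν brings the ratio close to 1, consistent with the t-distribution approaching the Gaussian as ν grows.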