Learning Flexible Multi-Modal Generative Models with Permutation-Invariant Encoders and Tighter Variational Bounds
This paper proposes a new variational bound that can tightly approximate the multi-modal data log-likelihood, and develops more flexible aggregation schemes based on permutation-invariant neural networks to encode latent variables from different modality subsets.
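
To make the idea of a permutation-invariant encoder over modality subsets concrete, the following is a minimal sketch and not the paper's actual architecture. It assumes a Deep Sets-style sum-pooling aggregator, which is one common way to obtain permutation invariance. The feature dimensions, the weight matrices `W_phi` and `W_rho`, the function `encode_subset`, and the modality names are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: per-modality feature dim, hidden dim, latent dim.
D_FEAT, D_HID, D_LATENT = 4, 8, 2

# Hypothetical shared weights (randomly initialised; a real model would learn them).
W_phi = rng.normal(size=(D_FEAT, D_HID))        # per-modality embedding
W_rho = rng.normal(size=(D_HID, 2 * D_LATENT))  # maps pooled embedding to Gaussian parameters


def encode_subset(features):
    """Encode any non-empty subset of modality features into Gaussian parameters.

    `features` is a list of arrays of shape (D_FEAT,), one per observed modality.
    Sum pooling makes the output independent of the order of the list.
    """
    embedded = [np.tanh(f @ W_phi) for f in features]  # embed each modality separately
    pooled = np.sum(embedded, axis=0)                  # permutation-invariant aggregation
    out = pooled @ W_rho
    mean, log_var = out[:D_LATENT], out[D_LATENT:]
    return mean, log_var


# Illustrative stand-ins for already-extracted modality features.
feats = {name: rng.normal(size=D_FEAT) for name in ("image", "text", "audio")}

# The same function handles the full set or any subset of modalities.
mean_all, log_var_all = encode_subset([feats["image"], feats["text"], feats["audio"]])
mean_sub, log_var_sub = encode_subset([feats["text"]])
```

Because the aggregation is a sum, reordering the input list leaves the encoded mean and log-variance unchanged up to floating-point error, and the same encoder can be applied to any subset of modalities without a separate network per subset.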