Liu, S., Ramadge, P.J., Adams, R.P. (2024). Generative Marginalization Models. Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235.
This paper introduces Marginalization Models (MAMs), a new family of generative models designed to overcome a key limitation of autoregressive models: the inability to efficiently estimate marginal probabilities for high-dimensional discrete data. The research aims to demonstrate the effectiveness of MAMs under both maximum likelihood estimation (MLE) and energy-based (EB) training.
The authors propose a model architecture that directly models the marginal distribution p(x_S) for any subset x_S of the variables in a discrete data point x. To ensure the model defines a valid distribution, they introduce "marginalization self-consistency," a constraint that enforces the sum rule of probability. Building on this principle, they develop scalable training objectives that enable efficient learning of both marginal and conditional probabilities. They evaluate MAMs on a range of discrete data distributions, including images, text, physical systems, and molecules, comparing against state-of-the-art models in both MLE and EB settings.
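The sum-rule constraint underlying marginalization self-consistency can be illustrated on a toy exact distribution. The sketch below is a hypothetical, minimal example using a small tabular joint rather than the paper's neural parameterization: it checks that the marginal over a subset of variables equals the joint summed over the remaining variable.

```python
import numpy as np

# Hypothetical toy joint distribution over 3 binary variables
# (NOT the paper's neural network; just a tabular stand-in).
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 2, 2))
joint = np.exp(logits) / np.exp(logits).sum()  # normalized p(x1, x2, x3)

def marginal(joint, keep):
    """Marginal p(x_S): sum the joint over every axis not in `keep`."""
    drop = tuple(i for i in range(joint.ndim) if i not in keep)
    return joint.sum(axis=drop)

# Marginalization self-consistency (sum rule): for a subset S and a
# variable i not in S, p(x_S) must equal sum over x_i of p(x_S, x_i).
p_12 = marginal(joint, keep=(0, 1))   # p(x1, x2) queried directly
p_12_via_sum = joint.sum(axis=2)      # sum over x3 of p(x1, x2, x3)
print(np.allclose(p_12, p_12_via_sum))  # True for an exact distribution
```

In a MAM, a network outputs approximate marginals directly, so this identity holds only approximately; training adds a penalty on the discrepancy between the two sides so that learned marginals remain mutually consistent.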
MAMs present a significant advancement in generative modeling of discrete data by enabling efficient and scalable approximation of arbitrary marginal probabilities. Their ability to handle high-dimensional problems in both MLE and EB settings makes them a powerful tool for various applications, including image generation, text modeling, molecule design, and physical system simulation.
Methodologically, the contribution is a model architecture and training procedure that address key limitations of existing generative models for discrete data. MAMs offer a more efficient and scalable approach to learning and querying complex discrete distributions, with potential impact on any domain requiring accurate and efficient probabilistic modeling.
While MAMs demonstrate promising results, further research could explore their application to continuous data and investigate the potential benefits of incorporating more sophisticated neural network architectures. Additionally, exploring alternative sampling strategies for the REINFORCE gradient estimator could further improve the model's performance and scalability in energy-based training settings.