Core Concepts
This article proposes a novel particle-based algorithm, called "variational transport", for solving distributional optimization problems, in which the goal is to minimize a functional defined over a family of probability distributions.
Abstract
The article first reviews the necessary background on metric spaces, Riemannian manifolds, and Wasserstein spaces for studying optimization over probability distributions. It then presents the variational transport algorithm, which solves distributional optimization problems whose objective functional admits a variational form.
The key highlights and insights are:
Distributional optimization problems arise in various machine learning and statistics applications, such as Bayesian inference, distributionally robust optimization, and generative adversarial networks.
Existing approaches that parameterize the probability distribution by a finite-dimensional parameter suffer from issues like approximation bias, non-convexity, and tension between optimization and sampling.
The variational transport algorithm directly optimizes the functional on the Wasserstein space of probability distributions by approximating the Wasserstein gradient descent using a set of particles.
The algorithm utilizes the variational form of the objective functional, which enables efficient estimation of the Wasserstein gradient and convenient sampling from the iterates.
When the objective functional satisfies the Polyak-Lojasiewicz condition and the Wasserstein gradient is approximated using kernel methods, the authors prove that variational transport converges linearly to the global minimum, up to a statistical error that decays sublinearly in the number of particles.
Variational transport provides a unified algorithmic framework that can be applied to a broad class of distributional optimization problems, without suffering from the approximation bias of finite-dimensional parameterization.
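To make the "variational form" concrete: writing $F^*$ for the convex conjugate of the objective functional $F$ (a sketch in standard notation, which may differ slightly from the paper's), the variational form is

$$F(p) \;=\; \sup_{f}\;\Big\{\, \mathbb{E}_{x\sim p}\big[f(x)\big] \;-\; F^*(f) \,\Big\}.$$

If $f^*$ attains the supremum at the current distribution $p$, then $f^*$ serves as the first variation of $F$ at $p$, and the gradient field $\nabla f^*$ gives the direction of Wasserstein gradient descent: each particle is transported a small step along $-\nabla f^*$. Solving the inner maximization over a function class (e.g., an RKHS) is what allows the Wasserstein gradient to be estimated from samples.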
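The particle-based update can be illustrated on a toy instance where the first variation is known in closed form. Below, a minimal sketch assuming the objective is the potential-energy functional $F(p) = \mathbb{E}_{x\sim p}[V(x)]$ with $V(x) = \tfrac12\|x\|^2$, whose first variation is $V$ itself; `grad_V`, the step size, and the particle count are all illustrative choices, not quantities from the paper. In the general algorithm, the gradient field would instead be estimated by solving the variational (dual) problem from samples.

```python
import numpy as np

# Toy distributional objective: F(p) = E_p[V(x)] with V(x) = 0.5 * ||x||^2.
# The first variation of F at any p is V, so Wasserstein gradient descent
# transports each particle along -grad V. (Variational transport would
# estimate this gradient field via the variational form; here we use the
# closed form purely for illustration.)

def grad_V(x):
    return x  # gradient of 0.5 * ||x||^2

rng = np.random.default_rng(0)
particles = rng.normal(size=(500, 2))  # initial particle approximation of p
eta = 0.1                              # step size for the transport map

for _ in range(200):
    # Push every particle forward by the map x -> x - eta * grad_V(x).
    particles = particles - eta * grad_V(particles)

# The particle cloud concentrates at the minimizer of V (the origin),
# i.e., the iterates approach the global minimum of F.
print(np.abs(particles).max())
```

Because $V$ is strongly convex here, the functional satisfies a Polyak-Lojasiewicz-type condition and the particles contract toward the origin at a linear rate, mirroring the linear convergence guarantee stated above.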