Core Concepts
This paper provides a non-asymptotic, instance-dependent analysis of a variance-reduced proximal gradient (VRPG) algorithm for stochastic convex optimization under convex constraints. The algorithm's performance is shown to be governed by the scaled distance between the solution of the given problem and the solution of a certain small perturbation of it, both solved under the same convex constraints.
Abstract
The paper considers the problem of stochastic convex optimization under convex constraints, where the goal is to minimize the expected value of a strongly convex and smooth function f(x, z) over a convex set X, given access to N i.i.d. samples from the underlying distribution P0.
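In symbols, the problem described above can be written as follows. The objective name F and the sample labels z_1, ..., z_N are notation introduced here for illustration only; the summary itself names only f, X, N, and P0.

```latex
\min_{x \in \mathcal{X}} \; F(x),
\qquad
F(x) := \mathbb{E}_{z \sim P_0}\bigl[ f(x, z) \bigr],
\qquad
z_1, \dots, z_N \overset{\text{i.i.d.}}{\sim} P_0 .
```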
The key insights and highlights are:
The authors motivate a non-asymptotic, instance-dependent benchmark for the problem, which is based on the scaled distance between the solutions of the original problem and a perturbed version of the problem, both solved under the given constraints.
The authors propose a variance-reduced proximal gradient (VRPG) algorithm to solve the constrained stochastic optimization problem. The algorithm proceeds in epochs, where in each epoch, it constructs an approximate problem by recentering the gradient, and then runs a proximal stochastic gradient descent on this approximate problem.
The authors provide a non-asymptotic, instance-dependent upper bound on the performance of the VRPG algorithm. Specifically, they show that the algorithm's error is bounded by the square root of the instance-dependent benchmark, up to logarithmic factors in the sample size.
The authors further show that as the number of samples N goes to infinity, the VRPG algorithm achieves the local minimax lower bound of Hájek and Le Cam, up to universal constants and a logarithmic factor in the sample size.
Overall, the paper provides a refined, instance-dependent analysis of a variance-reduced algorithm for constrained stochastic optimization, which improves upon the existing asymptotic results.
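The epoch structure described above (recenter the gradient, then run proximal stochastic gradient steps on the recentered problem) can be illustrated with a minimal sketch. This is not the authors' exact procedure: the least-squares loss, the Euclidean-ball constraint (for which the proximal step reduces to a projection), the function names, and all batch sizes, epoch counts, and step sizes are assumptions chosen only to make the example runnable.

```python
import numpy as np


def project_ball(x, radius):
    """Euclidean projection onto {x : ||x||_2 <= radius}.

    Assumption: the constraint set X is a ball, so the proximal
    operator of its indicator function is this projection.
    """
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def grad_sample(x, a, b):
    """Gradient of the assumed per-sample loss 0.5 * (a @ x - b) ** 2."""
    return (a @ x - b) * a


def vrpg_sketch(A, b, radius, n_epochs=5, batch=200,
                inner_steps=200, step=0.05, seed=0):
    """Illustrative variance-reduced proximal gradient loop.

    All hyperparameters are hypothetical defaults, not values
    taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_bar = np.zeros(d)  # current recentering (anchor) point
    for _ in range(n_epochs):
        # Recentering step: batch estimate of the gradient at x_bar.
        idx = rng.integers(0, n, size=batch)
        g_bar = A[idx].T @ (A[idx] @ x_bar - b[idx]) / batch

        # Proximal stochastic gradient steps on the recentered problem.
        x = x_bar.copy()
        for _ in range(inner_steps):
            i = rng.integers(0, n)
            v = (grad_sample(x, A[i], b[i])
                 - grad_sample(x_bar, A[i], b[i])
                 + g_bar)
            x = project_ball(x - step * v, radius)
        x_bar = x  # the last iterate becomes the next anchor
    return x_bar


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_true = np.array([1.0, -0.5, 0.25, 0.0, 0.0])
    A = rng.standard_normal((2000, 5))
    b = A @ x_true + 0.1 * rng.standard_normal(2000)
    x_hat = vrpg_sketch(A, b, radius=2.0)
    print("estimation error:", np.linalg.norm(x_hat - x_true))
```

The correction term `grad_sample(x, ...) - grad_sample(x_bar, ...)` shrinks as the iterate approaches the anchor, which is the source of the variance reduction in this sketch. Shrinking `radius` below the norm of the unconstrained solution makes the constraint active, and the returned point then lies on the boundary of the ball.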
Stats
The paper does not contain any explicit numerical data or statistics. The analysis is theoretical in nature.