Kassing, S., & Weissmann, S. (2024). Polyak’s Heavy Ball Method Achieves Accelerated Local Rate of Convergence under Polyak-Łojasiewicz Inequality [Preprint]. arXiv:2410.16849v1.
This paper investigates the convergence properties of Polyak's heavy ball method, a momentum-based optimization algorithm, when applied to non-convex objective functions that satisfy the Polyak-Łojasiewicz (PL) inequality. The authors aim to determine whether the method retains its accelerated convergence rate, typically observed under strong convexity assumptions, in this more general setting.
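For orientation, the two objects named above can be written out in their standard textbook form. The symbols here (step size \alpha, momentum parameter \beta, PL constant \mu, optimal value f^*) are notation assumed for this summary and may differ from the paper's own; the paper may also require the inequality only in a neighborhood of the minimizers.

```latex
\begin{align*}
x_{k+1} &= x_k - \alpha \nabla f(x_k) + \beta\,(x_k - x_{k-1})
  && \text{(heavy ball update)} \\
\tfrac{1}{2}\,\|\nabla f(x)\|^2 &\ge \mu\,\bigl(f(x) - f^*\bigr),
  \qquad f^* = \inf_x f(x)
  && \text{(PL inequality)}
\end{align*}
```

Setting \beta = 0 recovers plain gradient descent. Strong convexity with constant \mu implies the PL inequality with the same constant, but the converse fails: a PL function can be non-convex and can have many global minimizers.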
The authors utilize a novel differential geometric perspective on the PL-inequality, leveraging the fact that the set of global minima forms a manifold under this condition. They analyze the heavy ball dynamics under a coordinate chart that separates the optimization space into tangential and normal directions relative to this manifold.
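To make the geometric picture concrete, here is a minimal sketch of a non-convex function that satisfies the PL inequality and whose global minima form a curve rather than a single point. The function f(x, y) = 0.5 (y - sin x)^2 is a toy example chosen for this summary, not one taken from the paper; a short calculation gives ||grad f||^2 = (y - sin x)^2 (cos^2 x + 1), so the PL inequality holds with constant 1.

```python
import numpy as np

# Toy example (an assumption of this sketch, not from the paper):
# f(x, y) = 0.5 * (y - sin(x))^2. Its global minima are the curve
# y = sin(x), a one-dimensional manifold in the plane.

MU = 1.0  # PL constant for this particular toy function


def f(x, y):
    return 0.5 * (y - np.sin(x)) ** 2


def hessian_f(x, y):
    """Analytic Hessian of f at the point (x, y)."""
    g = y - np.sin(x)
    c = np.cos(x)
    return np.array([[c * c + g * np.sin(x), -c], [-c, 1.0]])


# Check the PL inequality 0.5 * ||grad f||^2 >= MU * (f - f*) on a grid,
# using f* = 0 and the closed-form squared gradient norm.
xs, ys = np.meshgrid(np.linspace(-3, 3, 61), np.linspace(-3, 3, 61))
residual = ys - np.sin(xs)
grad_sq = residual**2 * (np.cos(xs) ** 2 + 1.0)
pl_holds = bool(np.all(0.5 * grad_sq >= MU * f(xs, ys) - 1e-12))

# Non-convexity: the Hessian has a negative eigenvalue at (pi/2, 0).
min_eig = float(np.linalg.eigvalsh(hessian_f(np.pi / 2, 0.0)).min())

# Every point on the curve y = sin(x) is a global minimizer.
value_on_curve = float(f(0.3, np.sin(0.3)))

print(pl_holds, min_eig < 0, value_on_curve)
```

Along the curve y = sin x the function is flat (the tangential direction), while moving off the curve increases it (the normal direction). This is only meant to illustrate the kind of splitting described above, not to reproduce the authors' coordinate chart.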
The study demonstrates that Polyak's heavy ball method can accelerate convergence beyond the class of strongly convex functions: it achieves an accelerated local rate of convergence under the weaker assumption of the PL-inequality. This matters for the many machine learning tasks in which this widely used method is applied and strong convexity might not hold.
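The effect can be seen on a small example. The sketch below compares gradient descent with the heavy ball method on a degenerate quadratic that satisfies the PL inequality but is not strongly convex, because one eigenvalue is zero. The test function, the starting point, and the parameter choices (the classical heavy ball tuning for quadratics) are assumptions made for illustration. The example is convex, so it illustrates only the step beyond strong convexity, not the paper's non-convex analysis.

```python
import numpy as np

# Toy objective (an assumption of this sketch): f(x) = 0.5 * sum(EIGS * x^2).
# The zero eigenvalue makes f non-strongly convex; its minimizers are the
# whole x[2]-axis. PL holds with MU equal to the smallest nonzero eigenvalue.
EIGS = np.array([1.0, 0.01, 0.0])
L_SMOOTH, MU = 1.0, 0.01


def f(x):
    return 0.5 * float(np.sum(EIGS * x**2))


def grad_f(x):
    return EIGS * x


def gradient_descent(x0, step, n_steps):
    x = x0.copy()
    for _ in range(n_steps):
        x = x - step * grad_f(x)
    return x


def heavy_ball(x0, step, momentum, n_steps):
    """Heavy ball update: x_next = x - step * grad f(x) + momentum * (x - x_prev)."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_steps):
        x_next = x - step * grad_f(x) + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x


x0 = np.array([1.0, 1.0, 1.0])

# Classical heavy ball parameters for a quadratic with curvature in [MU, L].
sqrt_l, sqrt_mu = np.sqrt(L_SMOOTH), np.sqrt(MU)
hb_step = 4.0 / (sqrt_l + sqrt_mu) ** 2
hb_momentum = ((sqrt_l - sqrt_mu) / (sqrt_l + sqrt_mu)) ** 2

x_gd = gradient_descent(x0, 1.0 / L_SMOOTH, 200)
x_hb = heavy_ball(x0, hb_step, hb_momentum, 200)

print("gradient descent f:", f(x_gd))
print("heavy ball f:      ", f(x_hb))
```

With these settings, gradient descent contracts the slow coordinate by a factor of about 1 - MU/L per step, while the heavy ball iterates contract by roughly 1 - sqrt(MU/L) per step, so after the same number of iterations the heavy ball objective value is many orders of magnitude smaller. The coordinate along the flat direction does not move, which matches the picture of a manifold of minimizers.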
This research provides a theoretical justification for the empirical success of Polyak's heavy ball method in optimizing non-convex objective functions commonly encountered in machine learning. It expands the understanding of this algorithm's capabilities and its potential for broader application in complex optimization problems.
The discrete-time analysis focuses on local convergence, assuming the iterates reach a specific neighborhood of the global minima. Future research could explore conditions for global convergence or analyze the behavior of the method outside this local region. Additionally, investigating the impact of specific practical considerations, such as inexact gradient computations, on the convergence properties would be valuable.