Core Concepts
Analyzing SGD as a stochastic recurrence equation makes its heavy-tail behavior precise: the tail of the iterates, the existence of stationary solutions, and the moments of finitely many iterations can all be characterized.
Abstract
The paper studies the heavy-tail properties of Stochastic Gradient Descent (SGD) through stochastic recurrence equations. In a general machine-learning setup, and in detail for linear regression, it analyzes the tail behavior of the iterates, the existence and uniqueness of stationary solutions, and the moments of finitely many iterations. The study extends previous work by applying the theory of irreducible-proximal (i-p) matrices, which covers a wider range of scenarios.
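To make the recurrence-equation view concrete: for least-squares SGD with constant step size γ, one update on a sample (x_k, y_k) is affine in the iterate, X_k = A_k X_{k-1} + B_k with A_k = I - γ x_k x_kᵀ and B_k = γ y_k x_k. Below is a minimal sketch of this mapping, assuming i.i.d. Gaussian data and illustrative parameter values (not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
d, gamma, n_steps = 2, 0.2, 10_000  # illustrative choices, not from the paper

w = np.zeros(d)  # the SGD iterate plays the role of X_k
for _ in range(n_steps):
    x = rng.normal(size=d)                  # random feature vector x_k
    y = rng.normal()                        # random target y_k
    A = np.eye(d) - gamma * np.outer(x, x)  # random matrix A_k
    B = gamma * y * x                       # random vector B_k
    w = A @ w + B                           # one SGD step on the squared loss
print("final iterate:", w)
```

The randomness of A_k is what drives the heavy tails in such recurrences: the matrix usually contracts the iterate but occasionally expands it.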
Introduction
- Discusses Stochastic Gradient Descent (SGD) as a method for general learning problems.
- Defines the risk function and the empirical risk function that SGD aims to minimize (see the definitions below).
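For reference, with a loss ℓ, data distribution D, an i.i.d. sample Z_1, …, Z_n, and a parameter w, the two objects are (generic notation; the paper's symbols may differ):

```latex
R(w) = \mathbb{E}_{Z \sim \mathcal{D}}\bigl[\ell(w, Z)\bigr],
\qquad
\widehat{R}_n(w) = \frac{1}{n} \sum_{i=1}^{n} \ell(w, Z_i).
```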
Data Extraction
- "E|R_n|^α grows linearly with n."
- "There is a substantial probability that iterations of SGD go far away from the minimum."
Quotations
- "A random variable X is a stationary solution to (5) if X has the same law as A_1 X + B_1."
- "The main condition for the existence (and uniqueness) of such a stationary solution is that the top Lyapunov exponent is negative."