The authors investigate the behavior of gradient descent in high-dimensional landscapes, revealing a transition from informative to uninformative local curvature during optimization. In the large-dimension limit, successful recovery occurs before this algorithmic transition.
Modified gradient descent methods achieve global minimization of the L2 cost in both overparametrized and underparametrized settings.
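To make the setting concrete, the following is a minimal sketch (not the authors' method) of plain gradient descent on an L2 cost in an overparametrized linear regression, where more parameters than samples allow the training loss to be driven to near zero; all sizes, the step size rule, and the data model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparametrized least-squares: more parameters (d) than samples (n),
# so gradient descent can drive the L2 cost ||X w - y||^2 to near zero.
# Sizes are hypothetical, chosen only for illustration.
n, d = 20, 50
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true

w = np.zeros(d)
# Step size from the largest eigenvalue of the Hessian 2 X^T X,
# guaranteeing a stable contraction on every mode.
lr = 1.0 / (2 * np.linalg.norm(X, 2) ** 2)
for _ in range(2000):
    grad = 2 * X.T @ (X @ w - y)  # gradient of the L2 cost
    w -= lr * grad

print(np.mean((X @ w - y) ** 2))  # near-zero training loss
```

In the underparametrized case (n > d) the same loop converges to the unique least-squares minimizer instead of an interpolating solution.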
A study of gradient descent (GD) that develops new bounds on local optimality.