The paper introduces DSM, a unified framework for analyzing the global convergence of decentralized stochastic subgradient methods. The framework encompasses widely used decentralized methods such as DSGD, DSGDm, and DSGD-T, and also covers the SignSGD-type variant proposed in the paper. The convergence results establish global convergence of these methods when applied to nonsmooth nonconvex objectives, and preliminary numerical experiments demonstrate the efficiency of the framework in training nonsmooth neural networks.
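To make the kind of update rule covered by such a framework concrete, below is a minimal NumPy sketch of DSGD on a toy nonsmooth problem. The ring topology, the per-agent objective `|a_i @ x - b_i|`, the step-size schedule, the noise model, and the helper name `stochastic_subgrad` are all illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy decentralized problem: n agents, agent i holds the nonsmooth loss f_i(x) = |a_i @ x - b_i|.
n, d = 8, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true

def stochastic_subgrad(i, x):
    """One stochastic subgradient of f_i at x (sign(0) := 0 picks a valid subdifferential element)."""
    r = A[i] @ x - b[i]
    return np.sign(r) * A[i] + 0.01 * rng.normal(size=d)  # small noise mimics a stochastic oracle

# Doubly stochastic mixing matrix for a ring graph: each agent averages with its two neighbours.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

# DSGD: neighbourhood averaging of the iterates followed by a local subgradient step.
X = rng.normal(size=(n, d))   # row i is agent i's local iterate
alpha = 0.05                  # base step size; a diminishing schedule is used below
for k in range(2000):
    G = np.stack([stochastic_subgrad(i, X[i]) for i in range(n)])
    X = W @ X - alpha / np.sqrt(k + 1) * G

print("consensus error:", np.linalg.norm(X - X.mean(axis=0)))
print("objective at averaged iterate:", np.abs(A @ X.mean(axis=0) - b).sum())
```

The methods unified by the framework differ mainly in how the local update direction `G` is formed (plain subgradients, momentum buffers, tracking variables, or sign compression), while the mixing step with `W` is shared.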
The paper reviews existing work on decentralized optimization, with applications in data science and machine learning, and surveys decentralized subgradient methods that have so far been analyzed under differentiability assumptions on the objective functions. It then turns to a key difficulty: for loss functions built from nonsmooth activation functions such as ReLU, the quantities returned by the automatic differentiation routines in deep learning packages need not be valid subgradients.
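The following sketch illustrates that difficulty. At a kink, automatic differentiation propagates a single chosen value through the chain rule, and a simple composition can yield an output that lies outside the Clarke subdifferential. The convention `relu'(0) = 0` and the helpers `relu` and `relu_grad` are assumptions made here for illustration; the exact convention is implementation-dependent.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(x):
    # AD-style rule: a single value is chosen at the kink; 0 at x = 0 is a common convention.
    return (x > 0).astype(float)

# The Clarke subdifferential of relu at 0 is the whole interval [0, 1],
# but the chain rule propagates only the single chosen value.

# Composition: h(x) = relu(x) - relu(-x) equals x everywhere, so the true derivative at 0 is 1.
x = 0.0
h = relu(x) - relu(-x)
# Chain rule with the convention relu'(0) = 0:
h_grad = relu_grad(x) * 1.0 - relu_grad(-x) * (-1.0)
print(h, h_grad)   # 0.0 0.0 -> the propagated "gradient" is 0, not the true derivative 1
```

Outputs of this kind are exactly what the conservative-field machinery mentioned next is designed to accommodate.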
The analysis draws on mixing matrices, conservative fields, differential inclusions, and stochastic approximation techniques, and supporting lemmas and propositions establish the convergence properties of the decentralized subgradient methods.
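As a small illustration of one of these ingredients, the sketch below builds a Metropolis-weight mixing matrix for a connected graph and checks the two properties such analyses typically rely on: double stochasticity and a spectral gap. The specific graph and the helper name `metropolis_mixing_matrix` are assumptions made for demonstration.

```python
import numpy as np

def metropolis_mixing_matrix(adj):
    """Metropolis-Hastings weights: symmetric and doubly stochastic for any undirected graph."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()   # self-weight absorbs the remaining mass
    return W

# Example topology: a 6-node ring with one chord (assumed, for demonstration only).
n = 6
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1
adj[0, 3] = adj[3, 0] = 1

W = metropolis_mixing_matrix(adj)
print("rows sum to 1:   ", np.allclose(W.sum(axis=1), 1))
print("columns sum to 1:", np.allclose(W.sum(axis=0), 1))
eigs = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
print("second-largest eigenvalue modulus (spectral gap = 1 - this):", eigs[1])
```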