A novel first-order method based on the Gauss-Newton approach is proposed to efficiently solve the min-max optimization problem in training generative adversarial networks (GANs). The method uses a fixed-point iteration with a Gauss-Newton preconditioner and achieves state-of-the-art performance on image generation tasks while maintaining computational efficiency.
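To make the update concrete, here is a minimal NumPy sketch of a Gauss-Newton-preconditioned fixed-point iteration on a toy bilinear min-max game. The damped preconditioner (J^T J + lam*I)^{-1} J^T, the toy objective, and all hyperparameters are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

# Toy bilinear min-max game: f(x, y) = x^T A y  (min over x, max over y).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
x, y = rng.standard_normal(3), rng.standard_normal(3)

def grad_field(x, y):
    # Simultaneous-gradient vector field v = (grad_x f, -grad_y f).
    return np.concatenate([A @ y, -(A.T @ x)])

def jacobian(x, y):
    # Jacobian of the vector field for this bilinear game.
    top = np.hstack([np.zeros((3, 3)), A])
    bot = np.hstack([-A.T, np.zeros((3, 3))])
    return np.vstack([top, bot])

lam, lr = 1e-1, 1.0
for _ in range(200):
    v = grad_field(x, y)
    J = jacobian(x, y)
    # Damped Gauss-Newton preconditioner applied to the gradient field,
    # followed by a fixed-point step z <- z - lr * update.
    update = np.linalg.solve(J.T @ J + lam * np.eye(6), J.T @ v)
    z = np.concatenate([x, y]) - lr * update
    x, y = z[:3], z[3:]

print("distance to the (0, 0) saddle:", np.linalg.norm(np.concatenate([x, y])))
```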
The authors propose a new dual-joint search space for neural optimizer search (NOS) that simultaneously optimizes the weight update equation, internal decay functions, and learning rate schedules. They discover multiple optimizers, learning rate schedules, and Adam variants that outperform standard deep learning optimizers across image classification tasks.
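As an illustration of what a jointly searched candidate might look like, the sketch below encodes one candidate as a weight-update expression plus an internal decay function plus a learning-rate schedule and evaluates it on a toy quadratic. The operand set, encoding, and fitness proxy are assumptions for illustration, not the paper's actual search-space grammar or evaluation pipeline.

```python
import numpy as np

def make_candidate(update_op, decay_fn, schedule_fn):
    # One "dual-joint" candidate: update expression + decay + schedule.
    def step(w, g, state, t, base_lr):
        m = state.get("m", np.zeros_like(w))
        m = 0.9 * m + g                          # momentum accumulator operand
        direction = update_op(g, m)              # searched weight-update expression
        lr = base_lr * schedule_fn(t) * decay_fn(t)
        state["m"] = m
        return w - lr * direction, state
    return step

# One sampled candidate from the joint space (illustrative choices).
candidate = make_candidate(
    update_op=lambda g, m: np.sign(m) * np.abs(g),                        # update expression
    decay_fn=lambda t: 1.0 / (1.0 + 1e-3 * t),                            # internal decay
    schedule_fn=lambda t: 0.5 * (1 + np.cos(np.pi * min(t / 500, 1.0))),  # cosine schedule
)

# Evaluate the candidate on a toy quadratic loss as a proxy fitness.
w, state = np.ones(10) * 5.0, {}
for t in range(500):
    g = 2.0 * w                                  # gradient of ||w||^2
    w, state = candidate(w, g, state, t, base_lr=0.05)
print("final loss:", float(w @ w))
```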
This paper proposes a novel optimizer, Variational Stochastic Gradient Descent (VSGD), that combines gradient descent with probabilistic modeling of the true gradients as latent random variables. This approach enables more principled treatment of gradient noise and uncertainty, leading to improved optimization performance compared to existing methods such as Adam and SGD.
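The sketch below illustrates the general idea of treating the true gradient as a latent Gaussian variable and descending along its posterior mean. The fixed noise precision, forgetting factor, and update order are illustrative assumptions rather than the paper's actual variational updates.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.ones(5) * 3.0
mu = np.zeros_like(w)        # posterior mean of the latent true gradient
prec = np.ones_like(w)       # posterior precision of the latent true gradient
obs_prec = 4.0               # assumed precision of the gradient noise
forget = 0.9                 # discounts old evidence as the weights move
lr = 0.1

for _ in range(300):
    g = 2.0 * w + 0.5 * rng.standard_normal(5)       # noisy gradient of ||w||^2
    prior_prec = forget * prec                       # inflate uncertainty slightly
    mu = (prior_prec * mu + obs_prec * g) / (prior_prec + obs_prec)
    prec = prior_prec + obs_prec                     # Gaussian posterior update
    w = w - lr * mu                                  # descend along the posterior mean

print("final loss:", float(w @ w))
```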
Sharpness-Aware Minimization (SAM) is a gradient-based neural network training algorithm that explicitly seeks to find solutions that avoid "sharp" minima. The authors derive an "edge of stability" for SAM, which depends on the norm of the gradient, and show empirically that SAM operates at this edge of stability across multiple deep learning tasks.
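For reference, the standard SAM update perturbs the weights by rho along the normalized gradient and then descends using the gradient taken at the perturbed point. Below is a minimal NumPy sketch on a toy ill-conditioned quadratic with illustrative hyperparameters.

```python
import numpy as np

def grad(w, H):
    return H @ w                       # gradient of 0.5 * w^T H w

rng = np.random.default_rng(0)
H = np.diag([10.0, 1.0, 0.1])          # ill-conditioned quadratic
w = rng.standard_normal(3)
rho, lr = 0.05, 0.05

for _ in range(500):
    g = grad(w, H)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent step toward the sharp direction
    g_sam = grad(w + eps, H)                     # gradient at the perturbed weights
    w = w - lr * g_sam                           # SAM descent step

print("final loss:", float(0.5 * w @ H @ w))
```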
XGrad introduces weight prediction into popular gradient-based optimizers like SGD with momentum, Adam, AdamW, AdaBelief, and AdaM3 to boost their convergence and generalization when training deep neural network models.
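A minimal sketch of the weight-prediction idea with SGD-momentum is shown below: predict where the momentum buffer will carry the weights, evaluate the gradient at that predicted point, then apply the usual update to the current weights. The one-step Nesterov-style lookahead and the toy objective are simplifying assumptions, not XGrad's exact prediction scheme.

```python
import numpy as np

def grad(w):
    return 2.0 * w                     # gradient of ||w||^2

w = np.ones(4) * 5.0
v = np.zeros_like(w)
lr, momentum = 0.05, 0.9

for _ in range(300):
    w_pred = w - lr * momentum * v     # predicted future weights (lookahead)
    g = grad(w_pred)                   # gradient evaluated at the prediction
    v = momentum * v + g               # momentum buffer update
    w = w - lr * v                     # apply the step to the current weights

print("final loss:", float(w @ w))
```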
This paper establishes quantitative convergence results for the value functions and optimal parameters of neural SDEs as the sample size grows to infinity. The authors analyze the Hamilton-Jacobi-Bellman equation corresponding to the N-particle system and obtain uniform regularity estimates, which are then used to show the convergence of the minima of objective functionals and optimal parameters.
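For context, a schematic finite-horizon HJB equation of the kind referenced, written with a generic drift b, diffusion sigma, running cost \ell, terminal cost g, and control/parameter theta, is shown below; the notation is a standard illustration and not taken from the paper.

```latex
\partial_t V(t,x)
  + \inf_{\theta \in \Theta} \Big\{ b(x,\theta) \cdot \nabla_x V(t,x)
  + \tfrac{1}{2}\,\mathrm{Tr}\!\big(\sigma(x,\theta)\sigma(x,\theta)^{\top} \nabla_x^2 V(t,x)\big)
  + \ell(x,\theta) \Big\} = 0,
\qquad V(T,x) = g(x).
```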
Injecting randomness at various stages of the deep learning training process, including data, model, optimization, and learning, can significantly improve performance across computer vision benchmarks.
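The toy sketch below shows three of the injection points named above, data (input noise), model (a dropout-style mask), and optimization (gradient noise), on an assumed linear regression problem; the noise scales and the model are for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 20))
true_w = rng.standard_normal(20)
y = X @ true_w

w = np.zeros(20)
lr = 0.05
for _ in range(500):
    idx = rng.integers(0, 256, size=32)                 # mini-batch sampling
    xb, yb = X[idx], y[idx]
    xb = xb + 0.01 * rng.standard_normal(xb.shape)      # data: small input noise
    mask = (rng.random(20) > 0.1) / 0.9                 # model: inverted dropout mask
    xm = xb * mask
    g = 2.0 * xm.T @ (xm @ w - yb) / len(yb)            # gradient of the batch MSE
    g = g + 0.01 * rng.standard_normal(20)              # optimization: gradient noise
    w = w - lr * g

# Dropout acts as a regularizer, so the clean-data fit is close but not exact.
print("clean-data MSE:", float(np.mean((X @ w - y) ** 2)))
```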
This paper argues that neural collapse and plasticity loss in deep learning models are linked in a nontrivial way, and that this relationship can be leveraged to mitigate plasticity loss.
The authors propose a new optimization algorithm named CG-like-Adam that combines the advantages of conjugate gradient and adaptive moment estimation to speed up training and enhance the performance of deep neural networks.
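The sketch below shows one plausible way to combine a conjugate-gradient-style direction (here with a clipped Fletcher-Reeves coefficient) with Adam's second-moment scaling on a toy quadratic; the specific coefficient, clipping, and moment handling are illustrative and need not match CG-like-Adam's exact formulation.

```python
import numpy as np

def grad(w, H):
    return H @ w                       # gradient of 0.5 * w^T H w

H = np.diag([50.0, 5.0, 0.5])
w = np.array([1.0, 1.0, 1.0])
d = np.zeros(3)                        # CG-style direction (plays the role of Adam's m)
v = np.zeros(3)                        # Adam-style second moment
g_prev_sq = 1.0                        # placeholder before the first step
lr, beta2, eps = 0.05, 0.999, 1e-8

for t in range(1, 1001):
    g = grad(w, H)
    beta_fr = float(g @ g) / (g_prev_sq + eps)     # Fletcher-Reeves coefficient
    beta_fr = min(beta_fr, 0.9)                    # clip for stability (illustrative)
    d = -g + beta_fr * d                           # conjugate-gradient-like direction
    v = beta2 * v + (1 - beta2) * g * g            # Adam second moment
    v_hat = v / (1 - beta2 ** t)                   # bias correction
    w = w + lr * d / (np.sqrt(v_hat) + eps)        # adaptive step along the CG direction
    g_prev_sq = float(g @ g)

print("final loss:", float(0.5 * w @ H @ w))
```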
LUT-GEMM, an efficient kernel for quantized matrix multiplication, eliminates the resource-intensive dequantization process and reduces computational costs compared to previous kernels for weight-only quantization, enabling substantial acceleration of token generation latency in large-scale generative language models.
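The sketch below illustrates the lookup-table idea for 1-bit binary-coding-quantized weights: precompute, for each small group of input elements, the partial dot products with every sign pattern, then compute the matrix-vector product by table lookups without materializing dequantized weights. Group size, bit width, and packing are illustrative choices rather than the kernel's actual layout.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n, m = 4, 16, 8                        # group size, input dim, output dim
x = rng.standard_normal(n)
B = rng.choice([-1.0, 1.0], size=(m, n))   # 1-bit binary weight matrix
alpha = rng.random(m)                      # per-row scale factors

# One table per group of x: entry p holds the dot product of that group with the
# sign pattern encoded by the bits of p.
patterns = np.array([[1.0 if (p >> i) & 1 else -1.0 for i in range(mu)]
                     for p in range(2 ** mu)])          # shape (2^mu, mu)
tables = [patterns @ x[g:g + mu] for g in range(0, n, mu)]

def row_index(row, g):
    # Encode the signs of one weight group as a table index.
    bits = (row[g:g + mu] > 0).astype(int)
    return int(sum(b << i for i, b in enumerate(bits)))

# GEMV via table lookups: no dequantized weight matrix is ever materialized.
y_lut = np.array([alpha[r] * sum(tables[g // mu][row_index(B[r], g)]
                                 for g in range(0, n, mu))
                  for r in range(m)])

y_ref = (alpha[:, None] * B) @ x           # reference dense computation
print("max abs difference:", float(np.max(np.abs(y_lut - y_ref))))
```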