Banerjee, S., Carbonetto, P., & Stephens, M. (2024). Gradient-based optimization for variational empirical Bayes multiple regression. arXiv preprint arXiv:2411.14570.
This paper addresses the computational limitations of coordinate ascent variational inference (CAVI) for fitting large, sparse multiple regression models by proposing GradVI, a gradient-based optimization approach. The authors evaluate GradVI against CAVI, with particular attention to settings with correlated predictors and to trend-filtering applications.
The authors leverage a recent result that reframes the variational empirical Bayes (VEB) regression objective as a penalized regression problem. Because the induced penalty function has no analytic form, they propose two strategies within GradVI: numerical inversion of the posterior mean (shrinkage) operator, and a reparametrization that replaces the implicit penalty with a compound penalty. GradVI is compared against CAVI on simulated data for high-dimensional multiple linear regression with independent and correlated predictors, and for Bayesian trend filtering. Evaluation criteria include ELBO convergence, root mean squared error (RMSE) in predicting responses, number of iterations to convergence, and runtime.
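To make the penalized-regression view concrete, the following is a minimal sketch of the numerical-inversion strategy, not the authors' implementation. It uses a toy zero-mean normal prior (standing in for the ash prior), whose posterior-mean operator happens to be analytic, so the result can be checked against the closed-form ridge solution. The penalty gradient is taken from the proximal-operator identity ρ'(b) = M⁻¹(b) − b, and the operator is inverted numerically per coordinate as the paper's first strategy does; the variable names and bracketing heuristic are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
n, p, lam = 50, 10, 2.0
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

def M(z):
    # Posterior-mean (shrinkage) operator for a zero-mean normal prior.
    # Analytic in this toy case; stands in for the non-analytic operator
    # induced by the ash prior in the paper.
    return z / (1.0 + lam)

def M_inv(b):
    # Numerically invert the posterior-mean operator coordinate-wise,
    # mimicking GradVI's numerical-inversion strategy.
    def solve_one(bj):
        hi = (1.0 + lam) * (abs(bj) + 1.0)  # bracket wide enough for the root
        return brentq(lambda z: M(z) - bj, -hi, hi)
    return np.array([solve_one(bj) for bj in b])

# Gradient of the penalized objective 0.5*||y - Xb||^2 + sum_j rho(b_j),
# using rho'(b) = M^{-1}(b) - b; each step needs only matrix-vector products.
b = np.zeros(p)
step = 1.0 / (np.linalg.eigvalsh(X.T @ X).max() + lam)
for _ in range(500):
    grad = X.T @ (X @ b - y) + (M_inv(b) - b)
    b = b - step * grad

# The toy prior has a closed-form answer (ridge), so we can verify.
b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
print(np.allclose(b, b_ridge, atol=1e-4))
```

With the ash prior the operator M must itself be evaluated from the mixture posterior and has no closed-form inverse, which is precisely why the paper's second strategy reparametrizes the problem instead.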
GradVI presents a computationally efficient and accurate alternative to CAVI for VEB in multiple linear regression. Its ability to leverage fast matrix-vector computations and handle correlated predictors makes it particularly suitable for large-scale problems and applications like trend filtering.
This research contributes to the advancement of variational inference methods by introducing a gradient-based approach that addresses limitations of traditional coordinate ascent techniques. The proposed GradVI method holds promise for improving efficiency and scalability of Bayesian inference in various domains involving large-scale regression problems.
The study focuses primarily on the ash prior, chosen for its flexibility and accuracy. Evaluating GradVI under other prior families could provide further insight, and applying it to high-dimensional settings beyond trend filtering would also be worthwhile.