Bayesian Variable Selection in Generalized Linear Models
L. Filippozzi, I. Urteaga and C. Agostinelli
University of Trento, Fondazione Bruno Kessler (FBK), Basque Center for Applied Mathematics (BCAM), Ikerbasque — Basque Foundation for Science
Variable or covariate selection is a crucial step aimed at identifying the most relevant predictors for explaining the response variable.
In this work, we propose BayesVS-GLM, a novel Bayesian covariate selection method for Generalized Linear Models (GLMs). BayesVS-GLM is based on a fully conjugate Bayesian hierarchical model that comes with theoretical posterior consistency guarantees. Specifically, we extend the standard GLM framework by introducing a binary indicator vector that marks which covariates are included in the generalized linear predictor. The regression coefficients are modeled conditionally on this indicator vector, using conjugate priors for GLMs [4].
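As a minimal sketch of how a binary indicator vector can enter a generalized linear predictor, the snippet below masks each coefficient with its indicator, in the spirit of the formulation in [1]. The exact form used by BayesVS-GLM is not stated here, so the elementwise masking, the Poisson log link, and the names `linear_predictor`, `poisson_mean`, `beta` and `gamma` are illustrative assumptions.

```python
import math


def linear_predictor(x_row, beta, gamma):
    """Masked linear predictor: eta = sum_j gamma_j * beta_j * x_j.

    Assumption: an indicator of 0 removes the covariate from the predictor,
    an indicator of 1 keeps it (elementwise masking, as in reference [1]).
    """
    return sum(g * b * x for x, b, g in zip(x_row, beta, gamma))


def poisson_mean(x_row, beta, gamma):
    """Mean of a Poisson GLM with log link, chosen only as an example family."""
    return math.exp(linear_predictor(x_row, beta, gamma))


# Hypothetical values: the third covariate is excluded by its indicator,
# so its large coefficient has no effect on the predictor.
x_row = [1.0, 2.0, -1.0]
beta = [0.5, 0.25, 3.0]
gamma = [1, 1, 0]

eta = linear_predictor(x_row, beta, gamma)
mu = poisson_mean(x_row, beta, gamma)
```

With these values only the first two covariates contribute, so the predictor is the same as for a model fitted on those two covariates alone.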
Although related methods exist ([1], [2], [3]), our method is, to the best of our knowledge, the first to provide a unified framework in which: (i) the formulation of the hierarchical GLM is fully conjugate; (ii) the GLM likelihood depends explicitly on the indicator variables, so that regressor selection is driven by the observed data; and (iii) the asymptotic posterior accuracy of the indicator vector and the posterior consistency of the regression coefficients are guaranteed. For posterior inference, we present an efficient Gibbs sampling algorithm that exploits the full conjugacy of the hierarchical model.
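To illustrate what a Gibbs update of the indicator variables can look like when the likelihood depends on them explicitly, here is a hedged sketch for a logistic GLM. It assumes the simpler setting of [1], where coefficients and indicators are independent a priori, so the full conditional of each indicator is a Bernoulli whose odds are the prior odds times a likelihood ratio. This is not the BayesVS-GLM sampler, whose conditionals would also involve the conjugate prior on the coefficients; all names (`log_lik_logistic`, `inclusion_prob`, `gibbs_sweep_gamma`, `prior_incl`) are hypothetical.

```python
import math
import random


def log_lik_logistic(X, y, beta, gamma):
    """Bernoulli log-likelihood with logit link and masked predictor."""
    total = 0.0
    for x_row, y_i in zip(X, y):
        eta = sum(g * b * x for x, b, g in zip(x_row, beta, gamma))
        # log p(y | eta) = y * eta - log(1 + exp(eta)), computed stably
        total += y_i * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total


def inclusion_prob(X, y, beta, gamma, j, prior_incl=0.5):
    """P(gamma_j = 1 | beta, other indicators, y) under independent priors."""
    g1 = list(gamma)
    g1[j] = 1
    g0 = list(gamma)
    g0[j] = 0
    log_odds = (
        math.log(prior_incl) - math.log(1.0 - prior_incl)
        + log_lik_logistic(X, y, beta, g1)
        - log_lik_logistic(X, y, beta, g0)
    )
    # Numerically stable logistic function of the log-odds
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def gibbs_sweep_gamma(X, y, beta, gamma, rng, prior_incl=0.5):
    """One pass over all indicators, each drawn from its full conditional."""
    gamma = list(gamma)
    for j in range(len(gamma)):
        p = inclusion_prob(X, y, beta, gamma, j, prior_incl)
        gamma[j] = 1 if rng.random() < p else 0
    return gamma


# Toy data (made up for illustration): one covariate that separates the classes.
X = [[1.0], [1.0], [-1.0], [-1.0]]
y = [1, 1, 0, 0]
new_gamma = gibbs_sweep_gamma(X, y, beta=[3.0], gamma=[0], rng=random.Random(0))
```

A full sampler would alternate this sweep with an update of the coefficients given the indicators; that step is omitted because it depends on the specific prior.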
The BayesVS-GLM formulation is applicable to any distribution within the exponential family, and unifies a range of existing Bayesian variable selection perspectives within a single coherent hierarchical framework.
Keywords: Bayesian Variable Selection, Generalized Linear Models, Posterior model selection.
References
- [1] L. Kuo and B. Mallick (1998). Variable selection for regression models. Sankhyā: The Indian Journal of Statistics, Series B, 60(1), 65–81.
- [2] P. Dellaportas, J. Forster, and I. Ntzoufras (2002). On Bayesian model and variable selection using MCMC. Statistics and Computing, 12(1), 27–36.
- [3] N. N. Narisetty, J. Shen, and X. He (2019). Skinny Gibbs: A Consistent and Scalable Gibbs Sampler for Model Selection. Journal of the American Statistical Association, 114(527), 1205–1217.
- [4] M. Chen and J. G. Ibrahim (2003). Conjugate priors for generalized linear models. Statistica Sinica, 13(2), 461–476.