
Bayesian Variable Selection in Generalized Linear Models

L. Filippozzi${}^{a,b}$, I. Urteaga${}^{c,d}$ and C. Agostinelli${}^{a}$

${}^{a}$ University of Trento, ${}^{b}$ Fondazione Bruno Kessler (FBK), ${}^{c}$ Basque Center for Applied Mathematics (BCAM), ${}^{d}$ Ikerbasque — Basque Foundation for Science

Variable (or covariate) selection is a crucial step in regression modeling, aimed at identifying the predictors most relevant for explaining the response variable.

In this work, we propose BayesVS-GLM, a novel Bayesian covariate selection method for Generalized Linear Models (GLMs). BayesVS-GLM is based on a fully conjugate Bayesian hierarchical model that comes with theoretical posterior consistency guarantees. Specifically, we extend the standard GLM framework by introducing a binary vector $z$ to indicate which covariates are included in the generalized linear predictor. The regression coefficients $\beta$ are modeled conditionally on $z$, using conjugate priors for GLMs [4].
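
As a hedged illustration of the indicator construction (the notation below, including $x_{ij}$, the link function $g$, and the dimensions $n$ and $p$, is assumed here rather than taken from the abstract), one common way to let $z$ enter the linear predictor, in the spirit of [1], is

$$
g(\mu_i) = \eta_i = \sum_{j=1}^{p} z_j \beta_j x_{ij}, \qquad z_j \in \{0,1\}, \quad i = 1, \dots, n,
$$

so that covariate $j$ contributes to the likelihood only when $z_j = 1$. The exact parameterization used by BayesVS-GLM may differ.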

Although related methods exist ([1], [2], [3]), our method is, to the best of our knowledge, the first to provide a unified framework in which: $(i)$ the hierarchical GLM formulation is fully conjugate; $(ii)$ the GLM likelihood depends explicitly on the indicator variables $z$, so that covariate selection is driven solely by the observed data; and $(iii)$ asymptotic posterior accuracy for $z$ and posterior consistency for the regression coefficients $\beta$ are guaranteed. For posterior inference, we present an efficient Gibbs sampling algorithm built on the fully conjugate hierarchical model.
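
To make the idea of Gibbs sampling over inclusion indicators concrete, the following is a minimal sketch for the simplest conjugate special case only: a Gaussian linear model with identity link, known noise variance, independent normal priors on the coefficients, and independent Bernoulli priors on each $z_j$. It is not the BayesVS-GLM sampler. The function name `gibbs_indicator_selection` and all of its parameters (`sigma2`, `tau2`, `prior_incl`, and so on) are hypothetical choices made for this illustration.

```python
import numpy as np


def gibbs_indicator_selection(X, y, n_iter=2000, burn_in=500,
                              sigma2=1.0, tau2=10.0, prior_incl=0.5, seed=0):
    """Toy Gibbs sampler for y = X (z * beta) + noise (Gaussian case only).

    Assumptions of this sketch (not taken from the abstract):
    noise variance sigma2 is known, beta_j ~ N(0, tau2), z_j ~ Bernoulli(prior_incl).
    Returns the estimated posterior inclusion probability of each covariate.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    z = np.zeros(p, dtype=int)
    beta = np.zeros(p)
    col_sq = np.sum(X ** 2, axis=0)      # x_j' x_j for each column
    resid = y.copy()                     # y - X (z * beta); equals y while z = 0
    incl_counts = np.zeros(p)
    log_prior_odds = np.log(prior_incl / (1.0 - prior_incl))

    for it in range(n_iter):
        for j in range(p):
            # Residual with covariate j's current contribution removed.
            r = resid + z[j] * beta[j] * X[:, j]
            # Conditional posterior of beta_j given z_j = 1 is N(m, v).
            v = 1.0 / (col_sq[j] / sigma2 + 1.0 / tau2)
            m = v * (X[:, j] @ r) / sigma2
            # Log posterior odds of z_j = 1 with beta_j integrated out.
            log_odds = log_prior_odds + 0.5 * np.log(v / tau2) + m ** 2 / (2.0 * v)
            p1 = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -700.0, 700.0)))
            z[j] = int(rng.random() < p1)
            # Draw beta_j from its conditional if included; it is unused otherwise.
            beta[j] = m + np.sqrt(v) * rng.standard_normal() if z[j] else 0.0
            resid = r - z[j] * beta[j] * X[:, j]
        if it >= burn_in:
            incl_counts += z

    return incl_counts / (n_iter - burn_in)
```

Calling `gibbs_indicator_selection(X, y)` on a design matrix `X` and response `y` returns one estimated inclusion probability per column of `X`; covariates with values near 1 would be the ones this toy sampler selects. Each $z_j$ is updated with $\beta_j$ integrated out, a design choice made here because it tends to mix better than sampling $z_j$ conditionally on $\beta_j$.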

The BayesVS-GLM formulation is applicable to any distribution within the exponential family, and unifies a range of existing Bayesian variable selection perspectives within a single coherent hierarchical framework.

Keywords: Bayesian Variable Selection, Generalized Linear Models, Posterior Model Selection.

References

[1] L. Kuo and B. Mallick (1998). Variable selection for regression models. Sankhyā: The Indian Journal of Statistics, Series B, 60(1), 65–81.
[2] P. Dellaportas, J. Forster, and I. Ntzoufras (2002). On Bayesian model and variable selection using MCMC. Statistics and Computing, 12(1), 27–36.
[3] N. N. Narisetty, J. Shen, and X. He (2019). Skinny Gibbs: A Consistent and Scalable Gibbs Sampler for Model Selection. Journal of the American Statistical Association, 114(527), 1205–1217.
[4] M.-H. Chen and J. G. Ibrahim (2003). Conjugate priors for generalized linear models. Statistica Sinica, 13(2), 461–476.