We propose a novel Bayesian approach to the problem of variable selection in multiple linear regression models. In particular, we present a hierarchical setting which allows for direct specification of a priori beliefs about the number of nonzero regression coefficients as well as a specification of beliefs that given coefficients are nonzero. To guarantee numerical stability, we adopt a g-prior with an additional ridge parameter for the unknown regression coefficients. To simulate from the joint posterior distribution, we propose an intelligent random walk Metropolis-Hastings algorithm that is able to switch between different models. Testing our algorithm on real and simulated data shows that it performs at least on par with, and often better than, other well-established methods. Finally, we prove that under mild assumptions the presented approach is consistent in terms of model selection.
However, the g-prior depends on the inverse of the empirical covariance matrix of the selected predictors. This matrix is singular if the number of selected covariates is greater than the number of observations n and, moreover, may be nearly rank deficient if the predictors are highly correlated. To overcome this problem, Wang et al. (2015) replaced the classical inverse with the Moore-Penrose generalized inverse, arriving at the so-called gsg-prior (see West (2003)). In contrast, we adopt a g-prior with an additional ridge parameter for the unknown regression coefficients to guarantee nonsingularity of the empirical covariance matrix. This modification of the classical g-prior was first proposed by Gupta and Ibrahim (2007) and further investigated by Baragatti and Pommeret (2012). Finally, in Section 2.2 we show that our approach is consistent in terms of model selection according to the consistency definition given by Fernández et al. (2001). The proof of this result is deferred to the appendix. Moreover, in Section 3, we evaluate our approach on the basis of real and simulated data and compare the results with the already described shrinkage methods. We show that our approach performs at least on par with, and often better than, the comparative methods.
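The singularity issue motivating the ridge modification can be illustrated numerically. The sketch below (not the authors' implementation; the ridge parameter value is chosen purely for illustration) shows that with more selected covariates than observations the matrix X'X has rank at most n and so cannot be inverted, whereas adding a ridge term makes it positive definite and hence invertible:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 15                       # more predictors than observations
X = rng.standard_normal((n, p))

XtX = X.T @ X                       # p x p matrix appearing in the g-prior covariance
# With p > n, rank(X'X) <= n < p, so X'X is singular and has no inverse
print(np.linalg.matrix_rank(XtX))

lam = 1.0                           # illustrative ridge parameter
ridged = XtX + lam * np.eye(p)
# The ridge term shifts all eigenvalues by lam > 0, guaranteeing
# positive definiteness, so the inverse now exists
cov = np.linalg.inv(ridged)
print(cov.shape)
```

The same mechanism carries over to highly correlated predictors: even when p <= n, near-collinearity makes X'X ill-conditioned, and the ridge term bounds the condition number away from infinity.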