This paper surveys various shrinkage, smoothing and selection priors from a unifying perspective and shows how to combine them for Bayesian regularisation in the general class of structured additive regression models. As a common feature, all regularisation priors are conditionally Gaussian, given further parameters regularising model complexity. Hyperpriors for these parameters encourage shrinkage, smoothness or selection. It is shown that these regularisation (log-) priors can be interpreted as Bayesian analogues of several well-known frequentist penalty terms. Inference can be carried out with unified and computationally efficient MCMC schemes, estimating regularised regression coefficients and basis function coefficients simultaneously with complexity parameters and measuring uncertainty via corresponding marginal posteriors. For variable and function selection we discuss several variants of spike and slab priors which can also be cast into the framework of conditionally Gaussian priors. The performance of the Bayesian regularisation approaches is demonstrated in a hazard regression model and a high-dimensional geoadditive regression model.
Keywords: Conditionally Gaussian priors · lasso · MCMC · P-splines · Spike and slab prior · Structured additive regression
Basic concepts of Bayesian regularisation

In quite general terms, the notion of regularisation summarises approaches that make it possible to solve systems of equations Aβ ≈ a with respect to β when A is close to singular or even exactly singular. The purpose of regularisation is therefore to introduce additional assumptions that characterise useful solutions β. In statistical terms, a typical example is the linear regression model y = Xβ + ε, ε ∼ N(0, σ²I), where y denotes the vector of responses, β is a q-dimensional vector of regression coefficients associated with the covariates collected in the design matrix X, and ε is a vector of error terms. The classical least squares estimate minimises the squared L₂ norm ‖y − Xβ‖₂² by solving the normal equations XᵀXβ = Xᵀy with respect to β; hence A = XᵀX and a = Xᵀy in the general regularisation notation. If the dimension of β is large (possibly larger than the sample size) or if some columns of X are close to collinear, solving the normal equations becomes numerically unstable. A classical regularisation approach to overcome this difficulty is to add an L₂ (Tikhonov) regularisation penalty to the optimisation problem, yielding

min_β { ‖y − Xβ‖₂² + λ‖β‖₂² },

where λ > 0 is a regularisation parameter that determines the impact of the penalty term, leading to the penalised least squares estimate β̂ = (XᵀX + λI)⁻¹Xᵀy with covariance matrix Cov(β̂) = σ²(XᵀX + λI)⁻¹XᵀX(XᵀX + λI)⁻¹. In a statistical interpretation, Tikhonov regularisation corresponds to ridge estimation. For more general types
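To make the penalised least squares formulas above concrete, the following minimal sketch (in Python/NumPy, with a small simulated design matrix; both the language and the data are illustrative assumptions, not taken from the paper) computes the ridge estimate β̂ = (XᵀX + λI)⁻¹Xᵀy and its sandwich covariance for a fixed regularisation parameter λ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: n observations, q covariates, with one near-collinear column
# so that the unpenalised normal equations are poorly conditioned.
n, q = 50, 10
X = rng.normal(size=(n, q))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)
beta_true = np.zeros(q)
beta_true[:3] = [2.0, -1.0, 0.5]
sigma = 1.0
y = X @ beta_true + sigma * rng.normal(size=n)

lam = 1.0  # regularisation parameter lambda > 0 (fixed here for illustration)

# Penalised least squares (ridge / Tikhonov) estimate:
# beta_hat = (X'X + lambda*I)^{-1} X'y
A = X.T @ X + lam * np.eye(q)
beta_hat = np.linalg.solve(A, X.T @ y)

# Sandwich covariance: sigma^2 (X'X + lambda*I)^{-1} X'X (X'X + lambda*I)^{-1}
A_inv = np.linalg.inv(A)
cov_beta_hat = sigma**2 * A_inv @ (X.T @ X) @ A_inv
```

In this sketch λ is simply fixed; in the Bayesian formulation developed in this paper, the corresponding complexity parameter is instead assigned a hyperprior and estimated jointly with the regression coefficients.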