2019
DOI: 10.48550/arxiv.1905.08737
Preprint

On the marginal likelihood and cross-validation

Abstract: In Bayesian statistics, the marginal likelihood, also known as the evidence, is used to evaluate model fit as it quantifies the joint probability of the data under the prior. In contrast, non-Bayesian models are typically compared using cross-validation on held-out data, either through k-fold partitioning or leave-p-out subsampling. We show that the marginal likelihood is formally equivalent to exhaustive leave-p-out cross-validation averaged over all values of p and all held-out test sets when using the log p…
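As a sketch of the identity summarized in the abstract (my notation, which may differ from the paper's), with the scoring rule taken to be the log posterior predictive:

\[
\log p(y_{1:n}) \;=\; \sum_{p=1}^{n} S_{\mathrm{CV}}(y_{1:n};\, p),
\qquad
S_{\mathrm{CV}}(y_{1:n};\, p) \;=\; \binom{n}{p}^{-1} \sum_{|V|=p} \frac{1}{p} \sum_{i \in V} \log p\!\left(y_i \mid y_{-V}\right),
\]

where V ranges over all \(\binom{n}{p}\) held-out test sets of size p and \(y_{-V}\) denotes the remaining training data; equivalently, the log marginal likelihood is n times the leave-p-out cross-validation score averaged over all values of p.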

Cited by 4 publications (5 citation statements)
References 27 publications (38 reference statements)

Citation statements:
“…As a consequence, the asymptotic results of this article apply equally to the case where σ is marginalised. Cross-validation offers more possibilities, both rooted (Fong and Holmes, 2019) and not rooted (e.g., Rathinavel and Hickernell, 2019, Section 2.2.3) in the GP model, to some of which our results on maximum likelihood may be relevant. An empirical investigation has been performed in Bachoc (2013).…”
Section: Discussion (mentioning, confidence 99%)
“…(2), posterior belief distributions generated under the RoT are allowed to break a property referred to as coherence or Bayesian additivity (e.g., Bissiri et al., 2016; Fong and Holmes, 2019). In a nutshell, coherence says that posterior beliefs have to be generated according to some function ψ : ℝ² → ℝ which for the prior π(θ) and loss terms ℓ(θ, x₁), ℓ(θ, x₂) behaves as…”
Section: Coherence and the RoT (mentioning, confidence 99%)
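For context, a minimal sketch of the coherence (Bayesian additivity) property mentioned in the excerpt above, in the general-Bayes notation of Bissiri et al. (2016) as I understand it (my paraphrase, not a continuation of the quoted text): updating beliefs on x₁ and then on x₂ must yield the same distribution as updating on both observations at once,

\[
\pi(\theta \mid x_1, x_2) \;\propto\; \exp\{-\ell(\theta, x_2)\}\, \pi(\theta \mid x_1)
\;\propto\; \exp\{-\ell(\theta, x_1) - \ell(\theta, x_2)\}\, \pi(\theta).
\]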
“…To increase stability, some have argued for repeated CV to optimize the penalty parameters (Boulesteix et al., 2017). A theoretical argument for repetition of subsampling is found in Fong and Holmes (2019), who establish an equivalence with marginal likelihood optimization.…”
Section: Extension 1: Repeated CV (mentioning, confidence 99%)
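A minimal numerical sketch of the cited equivalence (not code from any of these papers): for a conjugate Beta-Bernoulli model with hypothetical data and hyperparameters chosen purely for illustration, the log marginal likelihood should match the sum over p of the exhaustive leave-p-out cross-validation scores under the log posterior predictive.

```python
# Numerical check of: log marginal likelihood == sum over p of the exhaustive
# leave-p-out CV score with log posterior predictive scoring (Fong and Holmes, 2019).
# Beta-Bernoulli model; prior hyperparameters and data are illustrative assumptions.
from itertools import combinations
from math import lgamma, log
import numpy as np

a, b = 1.0, 1.0                      # Beta(a, b) prior (assumed)
y = np.array([1, 0, 1, 1, 0, 1])     # hypothetical binary observations
n = len(y)

def log_beta(x, z):
    return lgamma(x) + lgamma(z) - lgamma(x + z)

def log_marginal(data):
    # closed-form Beta-Bernoulli marginal likelihood
    s = data.sum()
    return log_beta(a + s, b + len(data) - s) - log_beta(a, b)

def log_post_pred(y_new, train):
    # log posterior predictive of one Bernoulli observation given training data
    s = train.sum() if len(train) else 0.0
    p1 = (a + s) / (a + b + len(train))
    return log(p1) if y_new == 1 else log(1.0 - p1)

def s_cv(p):
    # exhaustive leave-p-out CV: average over all held-out sets of size p,
    # and within each set over the held-out points' log posterior predictives
    scores = []
    for val in combinations(range(n), p):
        train = y[[i for i in range(n) if i not in val]]
        scores.append(np.mean([log_post_pred(y[i], train) for i in val]))
    return float(np.mean(scores))

lhs = log_marginal(y)
rhs = sum(s_cv(p) for p in range(1, n + 1))
print(lhs, rhs)   # should agree up to floating-point error
```

Exhaustive enumeration of all held-out sets is only feasible for small n, which is one reason the citing work above considers repeated subsampling rather than the full average.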