2005
DOI: 10.1007/11503415_44

Asymptotic Log-Loss of Prequential Maximum Likelihood Codes

Abstract: We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models M. If data are i.i.d. according to an (essentially) arbitrary P, then the redundancy grows at rate (1/2) c ln n. We show that c = σ₁²/σ₂², where σ₁² is the variance of P, and σ₂² is the variance of the distribution M* ∈ M that is closest to P in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the 2-part MDL, Shtar…
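The result quoted in the abstract lends itself to a quick numerical check. The sketch below is not from the paper: it simulates a prequential (plug-in) ML code for the Poisson family on data drawn from a geometric distribution, so the model is misspecified, and compares the empirically estimated growth coefficient c with σ₁²/σ₂². The smoothed start-up for the estimator and the particular distributions are illustrative assumptions.

```python
# Minimal simulation of the prequential (plug-in) maximum likelihood code for
# the one-parameter Poisson family when the data actually come from a geometric
# distribution, i.e. the model is misspecified.  Hypothetical sketch: the
# "+1" start-up for the estimator and all constants are my own choices.
import numpy as np
from math import lgamma, log

def poisson_logpmf(k, lam):
    """log P(X = k) under Poisson(lam)."""
    return k * log(lam) - lam - lgamma(k + 1.0)

def preq_redundancy(x, lam_star):
    """Prequential ML codelength minus the codelength of the single best
    Poisson (rate lam_star), i.e. the redundancy on the sequence x."""
    preq, best, running_sum = 0.0, 0.0, 0.0
    for t, xt in enumerate(x):
        lam_hat = (running_sum + 1.0) / (t + 1.0)  # smoothed ML estimate from the outcomes before x_t
        preq += -poisson_logpmf(xt, lam_hat)
        best += -poisson_logpmf(xt, lam_star)
        running_sum += xt
    return preq - best

rng = np.random.default_rng(0)
p = 0.25                      # geometric parameter, support {1, 2, ...}
var_P = (1.0 - p) / p**2      # sigma_1^2: variance of the generating distribution
lam_star = 1.0 / p            # KL-closest Poisson matches the mean of P
var_M = lam_star              # sigma_2^2: a Poisson's variance equals its mean

# Estimate the coefficient c in redundancy ~ (1/2) c ln n by differencing the
# redundancy at two sample sizes (this cancels the O(1) term), averaged over runs.
n0, n1, runs = 1_000, 20_000, 100
diffs = []
for _ in range(runs):
    x = rng.geometric(p, size=n1).astype(float)
    diffs.append(preq_redundancy(x, lam_star) - preq_redundancy(x[:n0], lam_star))
c_hat = np.mean(diffs) / (0.5 * (log(n1) - log(n0)))
print(f"empirical c ~ {c_hat:.2f}   theory sigma_1^2/sigma_2^2 = {var_P / var_M:.2f}")
```

With p = 0.25 the predicted ratio is (1−p)/p = 3; differencing over two sample sizes removes the O(1) part of the redundancy, so the printed empirical value should land near 3 up to sampling noise.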

Cited by 10 publications (21 citation statements). References 24 publications.

“…As a consequence its error rate decreases more slowly in the sample size if we put a prior on the generating distribution that assigns nonzero probability to both models. This result was surprising to us and has led to a theoretical analysis of the codelength of the plug-in code in [Grünwald and de Rooij 2005]. It turns out that the regret of the plug-in code does not necessarily grow with (k/2) ln n like the NML and Bayesian codes do, if the sample is not distributed according to any element of the model.…”
Section: Discussion

“…We prove in [Grünwald and de Rooij 2005] that for single parameter exponential families, the regret for the plug-in code grows with (1/2) ln(n) · Var_P(X)/Var_M(X), where n is the sample size, P is the generating distribution and M is the best element of the model (the element of M for which the Kullback-Leibler divergence D(P‖M) is minimised). The plug-in model has the same regret (to O(1)) as the NML model if and only if the variance of the generating distribution is the same as the variance of the best element of the model.…”
Section: Poor Performance of the Plug-in Criterion

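Stated compactly, and instantiated on one concrete misspecified case (a worked instance of my own, not taken from the paper or the citing text), the regret result quoted above reads:

```latex
% Regret of the plug-in (prequential ML) code relative to the best element M*
% of a one-parameter exponential family, for i.i.d. data from P:
\[
  \operatorname{regret}(x^n)
    \;=\; -\ln P_{\text{plug-in}}(x^n) \;-\; \bigl(-\ln P_{M^*}(x^n)\bigr)
    \;=\; \frac{1}{2}\,\frac{\operatorname{Var}_P(X)}{\operatorname{Var}_{M^*}(X)}\,\ln n \;+\; O(1).
\]
% Worked instance (hypothetical): let P be geometric with success probability p
% on {1,2,...} and let the model be Poisson.  Then Var_P(X) = (1-p)/p^2, the
% KL-closest Poisson has mean 1/p, so Var_{M*}(X) = 1/p and the coefficient is
% (1-p)/p; it matches the correctly specified rate (1/2) ln n only when p = 1/2.
```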
“…It has been shown that the expression in (15) essentially reduces to SC₃ in (11) as n → ∞ under regularity conditions (Rissanen, 1986, 1987; Dawid, 1992; Grünwald & de Rooij, 2005).³ An implication of this observation is that the model that permits the greatest compression of the data is also the one that minimizes the accumulated prediction error, thereby providing justification for stochastic complexity as a predictive inference method, at least asymptotically.…”
Section: Predictive Inference and the MDL Principle

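The compression-equals-prediction reading in this snippet rests on the chain-rule decomposition of the prequential codelength into one-step-ahead log losses. A minimal statement of that identity (standard, and not tied to the cited equations (11) and (15)) is:

```latex
% Chain-rule decomposition: the prequential (plug-in) codelength equals the
% accumulated one-step-ahead prediction error under log loss.
\[
  -\ln P_{\mathrm{preq}}(x^n)
    \;=\; \sum_{t=1}^{n} -\ln P_{\hat\theta(x^{t-1})}\!\bigl(x_t \mid x^{t-1}\bigr),
\]
% where \hat\theta(x^{t-1}) is the (possibly smoothed) ML estimate computed from
% the first t-1 outcomes; minimizing total codelength over models is therefore
% the same as minimizing accumulated predictive log loss.
```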
“…The stimuli were generated…³ The primary regularity condition required for the equivalence proof is that the maximum likelihood estimate θ̂(x^t) satisfies the central limit theorem such that the tail probabilities are uniformly summable in the following sense: P(√n ‖θ̂(x^t) − θ‖ ≥ εₙ) ≤ δ(n) for all θ, with Σₙ δ(n) < ∞, where ‖·‖ denotes a norm (Rissanen, 1986, Theorem 1). Recently, Grünwald and de Rooij (2005) identified another important condition for the asymptotic approximation, i.e., that the model is correctly specified. According to their investigation, under model mis-specification, one can get quite different asymptotic results.…”
Section: Using NML in Cognitive Modeling

“…NML, however, requires knowledge of the time horizon and is impractical to calculate in many situations. A particularly simple and popular prediction strategy is the maximum likelihood (ML) strategy [1], [9], which predicts the next outcome xₙ by using the distribution P_θ̂ₙ₋₁, with θ̂ₙ₋₁ being the ML estimator based on the n − 1 past outcomes. The ML strategy, contrary to NML, belongs to the family of plug-in strategies which in each iteration predict with one of the strategies from the model.…”

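To make the plug-in idea in this last excerpt concrete, here is a tiny Bernoulli illustration (my own sketch, not code from the cited works). It also shows why start-up rules matter: the pure ML plug-in assigns probability zero to any outcome it has not yet seen.

```python
# Plug-in (maximum likelihood) prediction strategy for a Bernoulli model:
# predict outcome x_n with the parameter estimated from the n-1 past outcomes.
# Illustrative sketch; smoothing > 0 gives a Laplace-style start-up rule.
def plugin_prob_of_one(past, smoothing=0.0):
    """Probability assigned to x_n = 1 given the list of past binary outcomes."""
    total = len(past) + 2.0 * smoothing
    if total == 0:
        return 0.5                      # no data and no smoothing: fall back to uniform
    return (sum(past) + smoothing) / total

print(plugin_prob_of_one([1, 0, 1, 1]))            # pure ML estimate: 0.75
print(plugin_prob_of_one([1, 1, 1]))               # pure ML: probability 1, so a future 0 gets probability 0
print(plugin_prob_of_one([1, 1, 1], smoothing=1))  # Laplace start-up: 0.8
```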