2005
DOI: 10.1109/tit.2005.856956

Asymptotics of Discrete MDL for Online Prediction

Abstract: Minimum description length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning processes which are independent and identically distributed (i.i.d.) by means of two-part MDL, where the underlying model class is countable. We consider the online learning framework, i.e., observations come in one by one, and the predictor is allowed to update its state of mind after each time step. We identify two ways of predict…

Cited by 20 publications (14 citation statements)
References 33 publications (86 reference statements)
“…Now, if the AIC–BIC dilemma is interpreted as a conflict between consistency and optimal sequential prediction, then cumulative risk is a natural and often‐considered performance criterion (Haussler and Opper, 1997; Rissanen et al., 1992; Barron, 1998a; Yang and Barron, 1999; Poland and Hutter, 2005), and we can reasonably claim that our results solve the dilemma. However, it can also be interpreted as a dichotomy between model selection for truth finding and model selection‐based (non‐sequential) estimation.…”
Section: Discussion
confidence: 63%
“…Minimax cumulative risk has previously been studied by, among others, Haussler and Opper (1997), Rissanen et al. (1992), Barron (1998a), Yang and Barron (1999) and Poland and Hutter (2005).…”
Section: Risk Bounds: Preliminaries and Parametric Case
confidence: 99%
“…Several estimation procedures do not only provide $q_n$ on $\mathcal{X}^n$, but measures on $\mathcal{X}^\infty$, or equivalently for each $n$ separately a TC $q_n: \mathcal{X}^* \to [0,1]$ (see Bayes and crude MDL below). While this opens further options for $q$, e.g. $q(x_{n+1}|x_{1:n}) := q_n(x_{1:n+1})/q_n(x_{1:n})$ with some (weak) results for MDL [PH05], it does not solve our main problem.…”
Section: Conversion Methods
confidence: 96%
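The conversion in the quotation above, $q(x_{n+1}|x_{1:n}) := q_n(x_{1:n+1})/q_n(x_{1:n})$, can be illustrated with a minimal Python sketch. The Bernoulli measure and the function names are illustrative assumptions, not taken from the cited papers.

def bernoulli_measure(theta):
    """Measure q_n on sequences: i.i.d. Bernoulli(theta) over the alphabet {0, 1}."""
    def q(seq):
        ones = sum(seq)
        return theta ** ones * (1.0 - theta) ** (len(seq) - ones)
    return q

def predictive(q_n, history, symbol):
    """One-step predictor q(symbol | history) = q_n(history + symbol) / q_n(history)."""
    denom = q_n(history)
    return q_n(list(history) + [symbol]) / denom if denom > 0 else 0.0

q_n = bernoulli_measure(0.7)            # the estimator's current measure at time n
print(predictive(q_n, [1, 1, 0], 1))    # 0.7, as expected for an i.i.d. model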
“…Crude MDL simply selects $q_n := \arg\max_{\nu \in \mathcal{M}} \{\nu(x_{1:n})\, w(\nu)\}$ at time $n$, which is a probability measure on $\mathcal{X}^\infty$. While this opens additional options for defining $q$, they can also perform poorly in the worst case [PH05]. Note that most versions of MDL often perform very well in practice, comparable to Bayes; robustness and proving guarantees are the open problems.…”
Section: Examples
confidence: 99%
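A minimal Python sketch of the crude MDL selection rule quoted above, $q_n := \arg\max_{\nu\in\mathcal{M}} \{\nu(x_{1:n})\, w(\nu)\}$. The countable class is truncated to a finite Bernoulli grid with a uniform prior $w$; both choices are illustrative assumptions, not the setup of the cited papers.

import math

def bernoulli_loglik(theta, seq):
    """log nu(x_{1:n}) for an i.i.d. Bernoulli(theta) model, theta strictly inside (0, 1)."""
    ones = sum(seq)
    return ones * math.log(theta) + (len(seq) - ones) * math.log(1.0 - theta)

models = [i / 10 for i in range(1, 10)]                       # stand-in for a countable class M
log_w = {theta: -math.log(len(models)) for theta in models}   # prior weights w(nu), uniform here

def crude_mdl_select(seq):
    """Return the argmax over nu in M of nu(x_{1:n}) * w(nu), i.e. the two-part code minimizer."""
    return max(models, key=lambda theta: bernoulli_loglik(theta, seq) + log_w[theta])

history = [1, 1, 0, 1, 1, 1, 0, 1]
theta_hat = crude_mdl_select(history)
print(theta_hat, "-> P(next symbol = 1) =", theta_hat)        # predict i.i.d. with the selected model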
“…where (13) follows from Equations (7) and (8). Our $\rho_i$, $\rho_i^{\mathrm{norm}}$, and $\rho_i^{\mathrm{stat}}$ are closely inspired by Poland and Hutter (2005), who constructed (in our notation) $\rho_1$, $\rho_1^{\mathrm{norm}}$, and $\rho_1^{\mathrm{stat}}$. Our first lemma bounds the deviation of $\rho_i$ from being a measure.…”
Section: General Sequence Prediction
confidence: 99%