When applying aggregating strategies to Prediction with Expert Advice, the learning rate must be adaptively tuned. The natural choice of complexity/current loss renders the analysis of Weighted Majority derivatives quite complicated; in particular, no results have been proven so far for arbitrary weights. The analysis of the alternative "Follow the Perturbed Leader" (FPL) algorithm from [KV03], which is based on Hannan's algorithm, is easier. We derive loss bounds for an adaptive learning rate, both for finite expert classes with uniform weights and for countable expert classes with arbitrary weights. For the former setup our loss bounds match the best known results so far, while for the latter our results are, to our knowledge, new.
Keywords: Prediction with Expert Advice, Follow the Perturbed Leader, general weights, adaptive learning rate, hierarchy of experts, expected and high probability bounds, general alphabet and loss, online sequential prediction.
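As a rough illustration of the algorithm discussed in this abstract, the following is a minimal sketch of FPL with an adaptive learning rate, here for n experts with uniform weights and the illustrative schedule eta_t = sqrt(ln n / t); the loss matrix and all names below are hypothetical, and the sketch is not the paper's exact algorithm or tuning.

```python
import numpy as np

def fpl_predict(cumulative_loss, eta, rng):
    """Follow the Perturbed Leader: pick the expert whose cumulative loss,
    perturbed by independent exponential noise scaled by 1/eta, is smallest."""
    n = len(cumulative_loss)
    perturbation = rng.exponential(scale=1.0, size=n) / eta
    return int(np.argmin(cumulative_loss - perturbation))

def run_fpl(loss_matrix, seed=0):
    """Online loop over a T x n matrix of per-step expert losses in [0, 1],
    using the adaptive schedule eta_t = sqrt(ln n / t) (an illustrative choice)."""
    rng = np.random.default_rng(seed)
    T, n = loss_matrix.shape
    cumulative_loss = np.zeros(n)
    total = 0.0
    for t in range(1, T + 1):
        eta_t = np.sqrt(np.log(n) / t)         # adaptive learning rate
        i = fpl_predict(cumulative_loss, eta_t, rng)
        total += loss_matrix[t - 1, i]          # suffer the chosen expert's loss
        cumulative_loss += loss_matrix[t - 1]   # then observe all experts' losses
    return total

# Hypothetical example: 5 experts, 1000 steps of random losses.
losses = np.random.default_rng(1).uniform(size=(1000, 5))
print(run_fpl(losses), losses.sum(axis=0).min())  # learner's loss vs best expert
```

Subtracting exponentially distributed perturbations from the cumulative losses before taking the minimum is what distinguishes FPL from plain Follow the Leader.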
For a strongly continuous semigroup $(T(t))_{t \ge 0}$ with generator $A$ we introduce its critical spectrum $\sigma_{\mathrm{crit}}(T(t))$. This yields in an optimal way the spectral mapping theorem $\sigma(T(t)) = e^{t\sigma(A)} \cup \sigma_{\mathrm{crit}}(T(t))$ and improves classical stability results.
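For context (our gloss, not part of the abstract): for a general $C_0$-semigroup only the spectral inclusion below is guaranteed, which is why the spectral bound of $A$ need not control the growth of $T(t)$; the critical spectrum is introduced to turn the inclusion into an identity.

```latex
% Classical spectral inclusion, valid for every C_0-semigroup:
%   e^{t\sigma(A)} \subseteq \sigma(T(t)),  t \ge 0,
% but equality can fail in general. Adding the critical spectrum gives the identity
\[
  \sigma(T(t)) \;=\; e^{t\sigma(A)} \,\cup\, \sigma_{\mathrm{crit}}(T(t)),
  \qquad t \ge 0 .
\]
```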
We study the properties of the Minimum Description Length (MDL) principle for sequence prediction, considering a two-part MDL estimator which is chosen from a countable class of models. This applies in particular to the important case of universal sequence prediction, where the model class corresponds to all algorithms for some fixed universal Turing machine (the correspondence is via enumerable semimeasures, hence the resulting models are stochastic). We prove convergence theorems similar to Solomonoff's theorem of universal induction, which also holds for general Bayes mixtures. The bound characterizing the convergence speed for MDL predictions is exponentially larger than the corresponding bound for Bayes mixtures. We observe that there are at least three different ways of using MDL for prediction. One of these has worse prediction properties: its predictions converge only if the MDL estimator stabilizes, and we establish sufficient conditions for this to occur. Finally, we prove some immediate consequences for complexity relations and randomness criteria.
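For reference, a schematic statement of the two-part MDL estimator and the induced predictor, in our own notation (with $w_\nu$ the prior weight of model $\nu$ and $x_{1:t}$ the observed sequence); this is the standard formulation of two-part MDL, not a quotation from the paper.

```latex
% Two-part MDL: minimize the code length of the model plus the code length of
% the data given the model, then predict with the selected model's conditional.
\[
  \hat\nu_t \;=\; \arg\min_{\nu \in \mathcal{M}}
    \bigl[\, -\log w_\nu \;-\; \log \nu(x_{1:t}) \,\bigr],
  \qquad
  \hat\nu_t(x_{t+1} \mid x_{1:t}) \;=\; \frac{\hat\nu_t(x_{1:t} x_{t+1})}{\hat\nu_t(x_{1:t})} .
\]
```

Roughly, "exponentially larger" refers to bounds of order $w_\mu^{-1}$ for MDL in place of $\ln w_\mu^{-1}$ for the Bayes mixture, where $\mu$ is the true model.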
Clustering algorithms based on a matrix of pairwise similarities (a kernel matrix) are widely known and used, a particularly popular class being spectral clustering algorithms. In contrast, algorithms working directly with the pairwise distance matrix have rarely been studied for clustering. This is surprising, since in many applications distances are directly given, and computing similarities from them involves an additional step that, although computationally cheap, is error-prone because the kernel has to be chosen appropriately. This paper proposes a clustering algorithm based on the SDP relaxation of the max-k-cut of the graph of pairwise distances, following the work of Frieze and Jerrum. We compare the algorithm with Yu and Shi's algorithm, which is based on a spectral relaxation of the normalized k-cut. Moreover, we propose a simple heuristic for dealing with missing data, i.e., the case where some of the pairwise distances or similarities are not known. We evaluate the algorithms on the task of clustering natural language terms with the Google distance, a semantic distance recently introduced by Cilibrasi and Vitányi that uses relative frequency counts from WWW queries and is based on the theory of Kolmogorov complexity.
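Since the evaluation uses the Google distance, here is a minimal sketch of the normalized Google distance (NGD) of Cilibrasi and Vitányi computed from raw page counts; the counts and the index size below are made-up inputs, and the resulting pairwise distance matrix is what the clustering algorithms above would consume.

```python
from math import log

def ngd(f_x, f_y, f_xy, n_pages):
    """Normalized Google Distance from raw page counts:
    f_x, f_y  -- number of pages containing term x resp. y
    f_xy      -- number of pages containing both terms
    n_pages   -- (rough) total number of indexed pages."""
    lx, ly, lxy = log(f_x), log(f_y), log(f_xy)
    return (max(lx, ly) - lxy) / (log(n_pages) - min(lx, ly))

# Hypothetical counts for a related and a less related pair of terms.
print(ngd(f_x=120_000, f_y=90_000, f_xy=40_000, n_pages=10**10))  # smaller: more related
print(ngd(f_x=120_000, f_y=90_000, f_xy=500,    n_pages=10**10))  # larger: less related
```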
Minimum description length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning independent and identically distributed (i.i.d.) processes by means of two-part MDL, where the underlying model class is countable. We consider the online learning framework, i.e., observations come in one by one, and the predictor is allowed to update its state of mind after each time step. We identify two ways of predicting by MDL for this setup, namely a static and a dynamic one. (A third variant, hybrid MDL, will turn out to be inferior.) We prove that, under the sole assumption that the data is generated by a distribution contained in the model class, the MDL predictions converge to the true values almost surely. This is accomplished by proving finite bounds on the quadratic, the Hellinger, and the Kullback-Leibler loss of the MDL learner, which are, however, exponentially worse than for Bayesian prediction. We demonstrate that these bounds are sharp, even for model classes containing only Bernoulli distributions. We show how these bounds imply regret bounds for arbitrary loss functions. Our results apply to a wide range of setups, namely sequence prediction, pattern classification, regression, and universal induction in the sense of algorithmic information theory, among others.
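To make the two-part selection concrete, the following is a minimal sketch of online prediction by two-part MDL over a (truncated) countable class of Bernoulli models, in the spirit of the static variant; the parameter grid, the prior weights, and the truncation are illustrative choices of ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Truncation of a countable Bernoulli model class: parameters k/denom with prior
# weights w_k proportional to 1/k^2 (an arbitrary summable weighting).
denom = 100
params = np.array([k / denom for k in range(1, denom)])
weights = 1.0 / (np.arange(1, denom) ** 2)
weights = weights / weights.sum()

true_p = 0.3
data = rng.random(2000) < true_p          # i.i.d. Bernoulli(true_p) observations

log_w = np.log(weights)
log_lik = np.zeros_like(params)           # log nu(x_{1:t}) for each model nu
selected = []
for x in data:
    log_lik += np.log(params) if x else np.log(1.0 - params)
    # Two-part MDL: minimize codelength  -log w_nu - log nu(x_{1:t})
    k = int(np.argmax(log_w + log_lik))
    selected.append(params[k])            # predict x_{t+1}=1 with prob params[k]

print("selected parameter after 10 / 100 / 2000 steps:",
      selected[9], selected[99], selected[-1])
```

At each step the selected parameter is the model minimizing the two-part code length on the data seen so far; on this class it typically settles near the true parameter as the sample grows.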