2012
DOI: 10.1007/978-3-642-35506-6_35
On Ensemble Techniques for AIXI Approximation

Abstract: One of the key challenges in AIXI approximation is model class approximation, i.e., how to meaningfully approximate Solomonoff Induction without requiring an infeasible amount of computation. This paper advocates a bottom-up approach to this problem, describing a number of principled ensemble techniques for approximate AIXI agents. Each technique works by efficiently combining a set of existing environment models into a single, more powerful model. These techniques have the potential to play an important rol…
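The model-combination idea in the abstract can be illustrated with a minimal Bayesian mixture: each candidate environment model is reweighted by how well it predicts the observed data, so the mixture behaves like its best members. This is an illustrative sketch under assumed toy models, not the paper's actual algorithms.

```python
def mixture_predict(models, weights, history, x):
    """Predict P(x | history) under a weighted mixture of environment models."""
    return sum(w * m(history, x) for m, w in zip(models, weights))

def update_weights(models, weights, history, x):
    """Bayesian posterior update: reweight each model by how well it predicted x."""
    new = [w * m(history, x) for m, w in zip(models, weights)]
    z = sum(new)
    return [w / z for w in new]

# Two toy environment models predicting a binary symbol (hypothetical examples):
biased = lambda h, x: 0.9 if x == 1 else 0.1   # believes 1s are common
uniform = lambda h, x: 0.5                     # indifferent

weights = [0.5, 0.5]
history = []
for sym in [1, 1, 1, 0, 1]:
    weights = update_weights([biased, uniform], weights, history, sym)
    history.append(sym)
# After mostly-1 data, the biased model dominates the mixture.
```

The mixture's prediction is then a weighted average of the member models' predictions, which is the sense in which several weaker models combine into a single, more powerful one.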


Cited by 3 publications (3 citation statements)
References 12 publications
“…The creation of such an algorithm as a single piece of code is theoretically sound but at best intractable [2]. Various methods have been attempted to deal with the incomputability/intractability question, to list a few: [3], [4]. AGINAO builds its cognitive engine as a hierarchy of interconnected data structures, named concepts, each with a built-in piece of executable code, named a codelet.…”
Section: A. The AGINAO Self-Programming Engine
confidence: 99%
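The concept/codelet structure described in the quote might be sketched roughly as follows. All class, field, and method names here are hypothetical, since the quote gives only the general shape: a hierarchy of linked concepts, each carrying its own executable codelet.

```python
# Hypothetical sketch of a concept/codelet hierarchy; names and fields
# are illustrative assumptions, not AGINAO's actual API.
class Concept:
    def __init__(self, name, codelet, children=()):
        self.name = name
        self.codelet = codelet          # executable code attached to this concept
        self.children = list(children)  # links forming the hierarchy

    def activate(self, signal):
        """Run this concept's codelet, then propagate the result to child concepts."""
        out = self.codelet(signal)
        return [out] + [r for c in self.children for r in c.activate(out)]

# Toy two-level hierarchy: a root concept feeding one child concept.
leaf = Concept("edge", lambda s: s + 1)
root = Concept("shape", lambda s: s * 2, [leaf])
# root.activate(1) runs the root codelet, then the child's on its output.
```

The point of the sketch is only the structural claim in the quote: computation lives in small per-concept codelets whose outputs flow along the concept hierarchy, rather than in one monolithic program.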
“…Even worse, the dependence on C is not an artifact of the analysis, but rather a failing of the EG algorithm, which becomes unstable when experts transition from predicting badly to predicting well. Another algorithm with near-linear running time is online gradient descent (OGD) by Zinkevich [2003] (applied to this setting by Veness et al. [2012a]), which runs in O(N log(N)) time using the fast simplex projection of Duchi et al. [2008]. The regret of this algorithm also depends on the size of the maximum gradient of the loss, however, which leads to a bound of the same order as Eq.…”
Section: Introduction
confidence: 99%
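The OGD-with-projection scheme mentioned in the quote can be sketched as follows: a plain gradient step on the expert weights, followed by the sort-based Euclidean projection onto the probability simplex from Duchi et al. [2008]. The step size and toy gradient below are illustrative assumptions, not values from the cited work.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex,
    via the O(n log n) sort-based method of Duchi et al. (2008)."""
    u = sorted(v, reverse=True)
    css, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        css += ui
        t = (css - 1.0) / i
        if ui - t > 0:   # ui is still above the shift threshold
            theta = t
    return [max(x - theta, 0.0) for x in v]

def ogd_step(w, grad, eta=0.1):
    """One online-gradient-descent step followed by projection,
    keeping the expert weights on the simplex."""
    return project_to_simplex([wi - eta * gi for wi, gi in zip(w, grad)])

# Toy usage: two experts, gradient favouring expert 0.
w = [0.5, 0.5]
w = ogd_step(w, [-1.0, 1.0], eta=0.2)
```

The projection dominates the cost of each step, which is where the O(N log(N)) running time mentioned in the quote comes from.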
“…methods which: (p) make probabilistic predictions; (o) are strongly online; (w) work well in practice; (e) are efficient; (r) and have well-understood regret/loss/redundancy properties. Methods satisfying these properties can be combined in a principled fashion using techniques such as those discussed by [VSH12, Mat13], giving rise to ensembles with clearly interpretable predictive capabilities.…”
Section: Introduction
confidence: 99%