William Uther scite author profile

A Monte-Carlo AIXI Approximation

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.

Mean Absolute Error

Fürnkranz, Chan, Craw, et al. 2011

Tf–idf

Uther, Ciaramita, Berendt, et al. 2011

CM-Pack’01: Fast Legged Robot Walking, Robust Localization, and Team Behaviors

Uther, Lenser, Bruce, et al. 2002

Probabilities on Sentences in an Expressive Logic

Hutter, Lloyd, et al. 2013

Journal of Applied Logic

Automated reasoning about uncertain knowledge has many applications. One difficulty when developing such systems is the lack of a completely satisfactory integration of logic and probability. We address this problem directly. Expressive languages like higher-order logic are ideally suited for representing and reasoning about structured knowledge. Uncertain knowledge can be modeled by using graded probabilities rather than binary truth-values. The main technical problem studied in this paper is the following: Given a set of sentences, each having some probability of being true, what probability should be ascribed to other (query) sentences? A natural wish-list, among others, is that the probability distribution (i) is consistent with the knowledge base, (ii) allows for a consistent inference procedure and in particular (iii) reduces to deductive logic in the limit of probabilities being 0 and 1, (iv) allows (Bayesian) inductive reasoning and (v) learning in the limit and in particular (vi) allows confirmation of universally quantified hypotheses/sentences. We translate this wish-list into technical requirements for a prior probability and show that probabilities satisfying all our criteria exist. We also give explicit constructions and several general characterizations of probabilities that satisfy some or all of the criteria and various (counter) examples. We also derive necessary and sufficient conditions for extending beliefs about finitely many sentences to suitable probabilities over all sentences, and in particular least dogmatic or least biased ones. We conclude with a brief outlook on how the developed theory might be used and approximated in autonomous reasoning agents. Our theory is a step towards a globally consistent and empirically satisfactory unification of probability and logic.

Probabilistic modelling, inference and learning using logical theories

Lloyd, Uther 2008

Ann Math Artif Intell

This paper provides a study of probabilistic modelling, inference and learning in a logic-based setting. We show how probability densities, being functions, can be represented and reasoned with naturally and directly in higher-order logic, an expressive formalism not unlike the (informal) everyday language of mathematics. We give efficient inference algorithms and illustrate the general approach with a diverse collection of applications. Some learning issues are also considered.

Playing soccer with legged robots

Veloso, Uther, Fujita, et al.

Sony has provided a remarkable platform for research and development in robotic agents, namely fully autonomous legged robots. In this paper, we describe our work using Sony's legged robots to participate at the RoboCup'98 legged robot demonstration and competition. Robotic soccer represents a very challenging environment for research into systems with multiple robots that need to achieve concrete objectives, particularly in the presence of an adversary. Furthermore, RoboCup'98 offers an excellent opportunity for robot entertainment. We introduce the RoboCup context and briefly present Sony's legged robot. We developed a vision-based navigation and a Bayesian localization algorithm. Team strategy is achieved through pre-defined behaviors and learning by instruction.

Model-based design: a report from the trenches of the DARPA Urban Challenge

et al. 2009

The impact of model-based design on the software engineering community is impressive, and recent research in model transformations, and elegant behavioral specifications of systems has the potential to revolutionize the way in which systems are designed. Such techniques aim to raise the level of abstraction at which systems are specified, to remove the burden of producing application-specific programs with general-purpose programming. For complex real-time systems, however, the impact of model-driven approaches is not nearly so widespread. In this paper, we present a perspective of model-based design researchers who joined with software experts in robotics to enter the DARPA Urban Challenge, and to what extent model-based design techniques were used. Further, we speculate on why, according to our experience and the testimonies of many teams, the full promises of model-based design were not widely realized for the competition. Finally, we
