This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.
Automated reasoning about uncertain knowledge has many applications. One difficulty when developing such systems is the lack of a completely satisfactory integration of logic and probability. We address this problem directly. Expressive languages like higher-order logic are ideally suited for representing and reasoning about structured knowledge. Uncertain knowledge can be modeled by using graded probabilities rather than binary truth-values. The main technical problem studied in this paper is the following: Given a set of sentences, each having some probability of being true, what probability should be ascribed to other (query) sentences? A natural wish-list, among others, is that the probability distribution (i) is consistent with the knowledge base, (ii) allows for a consistent inference procedure and in particular (iii) reduces to deductive logic in the limit of probabilities being 0 and 1, (iv) allows (Bayesian) inductive reasoning and (v) learning in the limit and in particular (vi) allows confirmation of universally quantified hypotheses/sentences. We translate this wish-list into technical requirements for a prior probability and show that probabilities satisfying all our criteria exist. We also give explicit constructions and several general characterizations of probabilities that satisfy some or all of the criteria and various (counter) examples. We also derive necessary and sufficient conditions for extending beliefs about finitely many sentences to suitable probabilities over all sentences, and in particular least dogmatic or least biased ones. We conclude with a brief outlook on how the developed theory might be used and approximated in autonomous reasoning agents. Our theory is a step towards a globally consistent and empirically satisfactory unification of probability and logic.
This paper provides a study of probabilistic modelling, inference and learning in a logic-based setting. We show how probability densities, being functions, can be represented and reasoned with naturally and directly in higher-order logic, an expressive formalism not unlike the (informal) everyday language of mathematics. We give efficient inference algorithms and illustrate the general approach with a diverse collection of applications. Some learning issues are also considered.
Sony has provided a remarkable platform for research and development in robotic agents, namely fully autonomous legged robots. In this paper, we describe our work using Sony's legged robots to participate at the RoboCup'98 legged robot demonstration and competition. Robotic soccer represents a very challenging environment for research into systems with multiple robots that need to achieve concrete objectives, particularly in the presence of an adversary. Furthermore, RoboCup'98 offers an excellent opportunity for robot entertainment. We introduce the RoboCup context and briefly present Sony's legged robot. We developed a vision-based navigation and a Bayesian localization algorithm. Team strategy is achieved through pre-defined behaviors and learning by instruction.
The impact of model-based design on the software engineering community is impressive, and recent research in model transformations and elegant behavioral specifications of systems has the potential to revolutionize the way in which systems are designed. Such techniques aim to raise the level of abstraction at which systems are specified, to remove the burden of producing application-specific programs with general-purpose programming. For complex real-time systems, however, the impact of model-driven approaches is not nearly so widespread. In this paper, we present the perspective of model-based design researchers who joined with software experts in robotics to enter the DARPA Urban Challenge, and describe to what extent model-based design techniques were used. Further, we speculate on why, according to our experience and the testimonies of many teams, the full promises of model-based design were not widely realized for the competition. Finally, we