Designing reward functions is a challenging problem in AI and robotics. Humans usually have a difficult time directly specifying all the desirable behaviors that a robot needs to optimize. One common approach is to learn reward functions from collected expert demonstrations. However, learning reward functions from demonstrations introduces many challenges: some methods require highly structured models, e.g., reward functions that are linear in some predefined set of features, while others adopt less structured reward functions that in turn require tremendous amounts of data. In addition, humans tend to have a difficult time providing demonstrations on robots with high degrees of freedom, or even quantifying reward values for given demonstrations. To address these challenges, we present a preference-based learning approach in which the human feedback consists only of comparisons between trajectories. Furthermore, we do not assume highly constrained structures on the reward function. Instead, we model the reward function using a Gaussian Process (GP) and propose a mathematical formulation to actively find a GP using only human preferences. Our approach enables us to tackle both inflexibility and data-inefficiency problems within a preference-based learning framework. Our results in simulations and a user study suggest that our approach can efficiently learn expressive reward functions for robotics tasks.
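To make the idea of a GP reward model queried through pairwise comparisons concrete, the following is a minimal sketch and not the authors' formulation. It assumes a squared-exponential kernel over hypothetical trajectory features and a probit comparison likelihood, a common choice in GP preference learning; the names `rbf_kernel`, `preference_probability`, `noise_std`, and the random feature matrix are all illustrative assumptions that do not appear in the abstract.

```python
import math
import numpy as np


def rbf_kernel(X, Y, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def preference_probability(reward_a, reward_b, noise_std=1.0):
    """Assumed probit model: probability that trajectory A is preferred to B."""
    return std_normal_cdf((reward_a - reward_b) / (math.sqrt(2.0) * noise_std))


# Hypothetical data: 5 trajectories, each summarized by 3 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))

# GP prior over rewards: zero mean, RBF covariance (jitter for stability).
K = rbf_kernel(features, features) + 1e-8 * np.eye(len(features))
rewards = rng.multivariate_normal(np.zeros(len(features)), K)

# Likelihood of each possible answer to the query "trajectory 0 or 1?".
p_01 = preference_probability(rewards[0], rewards[1])
p_10 = preference_probability(rewards[1], rewards[0])
```

Under these assumptions, a learner would update its GP posterior from the observed answers and pick the next pair of trajectories to query; the posterior update and the active query selection are omitted here because the abstract does not specify them.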
Air transport systems evaluation infrastructure (IESTA) is an evaluation facility for air transport systems that is currently being developed by Onera in Toulouse, France. The project aims to build a generic simulation platform, designed to ease the integration of new or existing models in order to assess air transport concepts. The first IESTA application, Clean Airport, allows the effects of innovative concepts on air traffic noise and chemical pollution in the airports' surroundings to be assessed. An effective simulation capability has been built by integrating Onera's expertise in physical modelling. This article gives an overview of the resulting model toolbox architecture, the first outputs and the validation walkthrough.