Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or assumes that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods and wish to adopt a multi-objective perspective on their research, as well as at practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems.
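The claim that a simple linear combination of objectives can be suboptimal can be made concrete with a small sketch. The policy names and return values below are hypothetical, chosen only to illustrate a well-known property: when the Pareto front is concave, a Pareto-optimal policy may be unreachable under any linear weighting of the objectives.

```python
import numpy as np

# Hypothetical vector-valued returns of three policies on two objectives.
# Policy C is Pareto-optimal (neither A nor B dominates it in both
# objectives), yet no linear scalarisation ever prefers it, because
# w*0.45 + (1-w)*0.45 = 0.45 < max(w, 1-w) for every weight w in [0, 1].
returns = {
    "A": np.array([1.0, 0.0]),
    "B": np.array([0.0, 1.0]),
    "C": np.array([0.45, 0.45]),
}

def best_under_weights(w):
    """Return the policy maximising the linearly scalarised return w . R."""
    return max(returns, key=lambda p: float(w @ returns[p]))

# Sweep weight vectors (w, 1-w) over a grid: C is never selected.
chosen = {best_under_weights(np.array([w, 1.0 - w]))
          for w in np.linspace(0.0, 1.0, 101)}
print(chosen)
```

Under this (assumed) set of returns, only A and B are ever selected, no matter how the weights are chosen, which is the kind of oversimplification the paper cautions against.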
The recent paper “Reward is Enough” by Silver, Singh, Precup and Sutton posits that the concept of reward maximisation is sufficient to underpin all intelligence, both natural and artificial, and provides a suitable basis for the creation of artificial general intelligence. We contest the underlying assumption of Silver et al. that such reward can be scalar-valued. In this paper we explain why scalar rewards are insufficient to account for some aspects of both biological and computational intelligence, and argue in favour of explicitly multi-objective models of reward maximisation. Furthermore, we contend that even if scalar reward functions can trigger intelligent behaviour in specific cases, this type of reward is insufficient for the development of human-aligned artificial general intelligence due to unacceptable risks of unsafe or unethical behaviour.
The tactical systems and operational environment of modern fighter aircraft are becoming increasingly complex. Creating a realistic and relevant environment for pilot training using only live aircraft is difficult, impractical and highly expensive. The Live, Virtual and Constructive (LVC) simulation paradigm aims to address this challenge. LVC simulation means linking real aircraft, ground-based systems and soldiers (Live), manned simulators (Virtual) and computer-controlled synthetic entities (Constructive). Constructive simulation enables the realization of complex scenarios with a large number of autonomous friendly, hostile and neutral entities, which interact with each other as well as with manned simulators and real systems. This reduces the need for personnel to act as role-players through the operation of, e.g., live or virtual aircraft, thus lowering the cost of training. Constructive simulation also makes it possible to improve the availability of training by embedding simulation capabilities in live aircraft, making it possible to train anywhere, anytime. In this paper we discuss how machine learning techniques can be used to automate the process of constructing advanced, adaptive behavior models for constructive simulations, to improve the autonomy of future training systems. We conduct a number of initial experiments, and show that reinforcement learning, in particular multi-agent and multi-objective deep reinforcement learning, allows synthetic pilots to learn to cooperate and prioritize among conflicting objectives in air combat scenarios. Though the results are promising, we conclude that further algorithm development is necessary to fully master the complex domain of air combat simulation.
Novel autonomous search and rescue (SAR) systems, although powerful, still require the involvement of a human decision-maker. In this project, we focus on the human aspect of one such novel autonomous SAR system. Drawing on knowledge gained in a field study, as well as from the literature, we introduced several extensions to the system that allowed us to achieve a more user-centered interface. In an evaluation session with a rescue service specialist, we received positive feedback and identified potential directions for future work.