In Reinforcement Learning (RL), an agent is guided by the rewards it receives from the reward function. Unfortunately, it may take many interactions with the environment to learn from sparse rewards, and it can be challenging to specify reward functions that reflect complex reward-worthy behavior. We propose using reward machines (RMs), which are automata-based representations that expose reward function structure, as a normal form representation for reward functions. We show how specifications of reward in various formal languages, including LTL and other regular languages, can be automatically translated into RMs, easing the burden of complex reward function specification. We then show how the exposed structure of the reward function can be exploited by tailored Q-learning algorithms and automated reward shaping techniques to improve the sample efficiency of reinforcement learning methods. Experiments show that these RM-tailored techniques significantly outperform state-of-the-art (deep) RL algorithms, solving problems that otherwise cannot reasonably be solved by existing approaches.
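To make the idea of a reward machine concrete, the following is a minimal sketch in Python of an automaton that tracks task progress and emits rewards on transitions. It is an illustration under stated assumptions, not the authors' implementation: the class name `RewardMachine`, the state names `u0`, `u1`, `u_done`, the event labels `coffee` and `office`, and the convention that unlisted events leave the state unchanged with zero reward are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class RewardMachine:
    """Hypothetical finite-state reward machine over high-level event labels."""

    initial: str
    # Maps (rm_state, event) -> (next_rm_state, reward).
    transitions: dict
    terminal: frozenset = frozenset()

    def step(self, state, event):
        # Assumption: an event with no listed transition keeps the
        # current RM state and yields zero reward.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events):
        # Feed a sequence of events through the machine and
        # accumulate the rewards emitted along the way.
        state, total = self.initial, 0.0
        for event in events:
            if state in self.terminal:
                break
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical task: "pick up coffee, then reach the office".
rm = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_done", 1.0),
    },
    terminal=frozenset({"u_done"}),
)

final_state, total_reward = rm.run(["office", "coffee", "office"])
```

In this sketch, reaching the office before picking up coffee earns nothing, because the machine is still in `u0`. The structure a learner could exploit is the explicit RM state, which records how far along the task the agent is.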
Multirotor drones have been one of the most important technological advances of the last decade. Their mechanics are simple compared to other types of drones, and they offer greater flexibility in flight. For example, they can take off vertically. Their capabilities have therefore brought progress to many professional activities. Moreover, advances in computing and telecommunications have also broadened the range of activities in which drones may be used. Currently, artificial intelligence and information analysis are the main areas of research in the field of computing. The case study presented in this article employed artificial intelligence techniques in the analysis of information captured by drones. More specifically, the camera installed in the drone took images, which were later analyzed using Convolutional Neural Networks (CNNs) to identify the objects captured in the images. In this research, a CNN was trained to detect cattle; however, the same training process could be followed to develop a CNN for the detection of any other object. This article describes the design of the platform for real-time analysis of information and its performance in the detection of cattle.
Temporally extended goals are critical to the specification of a diversity of real-world planning problems. Here we examine the problem of non-deterministic planning with temporally extended goals specified in linear temporal logic (LTL), interpreted over either finite or infinite traces. Unlike existing LTL planners, we place no restrictions on our LTL formulae beyond those necessary to distinguish finite from infinite interpretations. We generate plans by compiling LTL temporally extended goals into problem instances described in the Planning Domain Definition Language that are solved by a state-of-the-art fully observable non-deterministic planner. We propose several different compilations based on translations of LTL to (Büchi) alternating or (Büchi) non-deterministic finite state automata, and evaluate various properties of the competing approaches. We address a diverse spectrum of LTL planning problems that, to this point, had not been solvable using AI planning techniques, and do so in a manner that demonstrates highly competitive performance.
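As an illustration of what it means to interpret an LTL goal over a finite trace, the sketch below checks a formula directly against a finite sequence of states. It does not reproduce the automata-based compilation to PDDL described in the abstract; it is a plain recursive evaluator, and the tuple encoding of formulas, the function name `holds`, and the propositions `a` and `b` are assumptions introduced for this example. The `next` operator is taken to be the strong version, which is false at the last position of the trace.

```python
def holds(formula, trace, i=0):
    """Evaluate a formula at position i of a finite, non-empty trace.

    A trace is a list of sets of atomic propositions; a formula is a
    nested tuple such as ("eventually", ("ap", "a")). Both encodings
    are hypothetical conventions chosen for this sketch.
    """
    op = formula[0]
    if op == "ap":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "next":
        # Strong next: a successor position must exist.
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":
        # formula[2] must hold at some j >= i, with formula[1]
        # holding at every position from i up to j - 1.
        return any(
            holds(formula[2], trace, j)
            and all(holds(formula[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op!r}")


# Temporally extended goal: eventually a, and after that eventually b.
goal = ("eventually", ("and", ("ap", "a"), ("eventually", ("ap", "b"))))

satisfied = holds(goal, [set(), {"a"}, set(), {"b"}])
violated = holds(goal, [{"b"}, {"a"}])
```

The second trace fails the goal because `b` occurs only before `a`, which shows how such a goal constrains the order of events rather than just the final state.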