We investigate terminating grid exploration by autonomous myopic luminous robots. Myopic robots can observe nodes only within a fixed visible distance, and luminous robots carry light devices that can emit colors. First, we prove that, in the semi-synchronous and asynchronous models, three myopic robots are necessary to achieve terminating grid exploration if the visible distance is one. Next, we give fourteen algorithms for terminating grid exploration under various assumptions on synchrony (fully synchronous, semi-synchronous, and asynchronous models), visible distance, number of colors, and chirality. Six of them are optimal in terms of the number of robots.
This chapter addresses multi-objective reinforcement learning (MORL) problems, in which there are multiple conflicting objectives with unknown weights. Previous model-free MORL methods require a large number of calculations to collect a Pareto optimal set for each V/Q-value vector. In contrast, model-based MORL can reduce this calculation cost compared with model-free MORL. However, the previous model-based MORL method applies only to deterministic environments. To address these problems, this chapter proposes a novel model-based MORL method that uses a reward occurrence probability (ROP) vector with unknown weights. Experimental results are reported for stochastic learning environments with up to 10 states, 3 actions, and 3 reward rules. The results show that the proposed method collects all Pareto optimal policies, with a total learning time of about 214 seconds (10 states, 3 actions, 3 rewards). As future research directions, ways to speed up the method and how to use non-optimal policies are discussed.
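As a rough illustration of what collecting Pareto optimal policies over ROP vectors can look like, the following minimal sketch filters a set of candidate policies by Pareto dominance. It is not the chapter's algorithm. It assumes that each policy is summarized by one ROP vector (one occurrence probability per reward rule) and that, once the weights are known, a policy's average reward is the dot product of the weights and its ROP vector. The policy names and all numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as large as b in every component and strictly larger in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rop: Dict[str, Sequence[float]]) -> List[str]:
    """Names of policies whose ROP vector is not dominated by any other policy's vector."""
    return sorted(
        name
        for name, vec in rop.items()
        if not any(dominates(other_vec, vec)
                   for other, other_vec in rop.items() if other != name)
    )


def average_reward(weights: Sequence[float], rop_vector: Sequence[float]) -> float:
    """Assumed scalarization: weighted sum of reward occurrence probabilities."""
    return sum(w * p for w, p in zip(weights, rop_vector))


# Hypothetical ROP vectors for four policies and three reward rules.
rop_vectors = {
    "pi_a": (0.50, 0.10, 0.00),
    "pi_b": (0.20, 0.30, 0.10),
    "pi_c": (0.10, 0.10, 0.05),  # dominated by pi_b in every component
    "pi_d": (0.00, 0.00, 0.40),
}

front = pareto_front(rop_vectors)
print(front)
```

Because the front is computed without any weights, it only has to be collected once; when weights become available, the best policy can be picked by evaluating `average_reward` over the policies in `front`.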
In artificial intelligence and robotics, one important issue is the design of human interfaces. There are two approaches: machine-centered interaction design and human-centered interaction design. This research addresses the latter. This chapter presents an interactive learning system that assists a positive change in a human's preference toward his/her true preference, and then discusses an evaluation of the awareness effect. The system behaves passively, reflecting the human's intelligence by visualizing traces of his/her behavior. Experimental results showed that subjects fall into two groups, heavy users and light users, and that the same visualizing condition has different effects on each group. They also showed that the authors' system improves the efficiency of deciding on the most preferred plan for both heavy users and light users. As future research directions, probabilistic events and a basic recommendation method for them are discussed.