Abstract: In cooperation, workers must know how their co-workers behave. However, an agent's policy, which is embedded in a statistical machine learning model, is hard to understand and requires considerable time and knowledge to comprehend. It is therefore difficult for people to predict the behavior of machine learning robots, which makes human-robot cooperation challenging. In this paper, we propose Instruction-based Behavior Explanation (IBE), a method to explain an autonomous agent's future behavior. In IBE, an agent can a…
“…Policy explanations in human-agent interaction settings have been used to achieve transparency (Hayes and Shah 2017) and provide summaries of the policies (Amir and Amir 2018). Explanation in reinforcement learning has been explored, using interactive RL to generate explanations from instructions of a human (Fukuchi et al 2017) and to provide contrastive explanations (van der Waa et al 2018). Soft decision trees have been used to generate more interpretable policies (Coppens et al 2019), and reward decomposition has been utilized to provide minimum sufficient explanations in RL (Juozapaitis et al 2019).…”
Prominent theories in cognitive science propose that humans understand and represent knowledge of the world through causal relationships. In making sense of the world, we build causal models in our minds to encode cause-effect relations of events and use these to explain why new events happen by referring to counterfactuals — things that did not happen. In this paper, we use causal models to derive causal explanations of the behaviour of model-free reinforcement learning agents. We present an approach that learns a structural causal model during reinforcement learning and encodes causal relationships between variables of interest. This model is then used to generate explanations of behaviour based on counterfactual analysis of the causal model. We computationally evaluate the model in 6 domains and measure performance and task prediction accuracy. We report on a study with 120 participants who observe agents playing a real-time strategy game (StarCraft II) and then receive explanations of the agents' behaviour. We investigate: 1) participants' understanding gained by explanations through task prediction; 2) explanation satisfaction; and 3) trust. Our results show that causal model explanations perform better on these measures than two other baseline explanation models.
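The abstract above describes learning a structural causal model during RL and generating explanations from counterfactual analysis of that model. As a rough, non-authoritative sketch of the idea (not the authors' implementation), the following Python example assumes a hand-specified causal graph over a few invented StarCraft-style variables, fits one linear structural equation per variable from logged transitions, and contrasts the predicted effect of the chosen action with that of a counterfactual action:

# Illustrative sketch only: hand-specified causal graph with linear
# structural equations fit from logged transitions. Variable names
# ("workers", "supply", "army") are invented, not taken from the paper.
import numpy as np

# Parents of each variable in the assumed causal graph; "action" is exogenous.
GRAPH = {
    "workers": ["action"],
    "supply":  ["workers", "action"],
    "army":    ["supply", "action"],
}

def fit_scm(transitions):
    """Fit one least-squares structural equation per variable.
    transitions: dicts with numeric values for "action" and each variable."""
    scm = {}
    for child, parents in GRAPH.items():
        X = np.array([[t[p] for p in parents] + [1.0] for t in transitions])
        y = np.array([t[child] for t in transitions])
        scm[child], *_ = np.linalg.lstsq(X, y, rcond=None)
    return scm

def simulate(scm, action):
    """Propagate an action through the graph in topological order."""
    values = {"action": float(action)}
    for child, parents in GRAPH.items():
        x = np.array([values[p] for p in parents] + [1.0])
        values[child] = float(x @ scm[child])
    return values

def why(scm, action, counterfactual_action, outcome="army"):
    """Contrast the predicted outcomes of the chosen and counterfactual actions."""
    factual = simulate(scm, action)[outcome]
    counter = simulate(scm, counterfactual_action)[outcome]
    return (f"Action {action} is predicted to yield {outcome} of about {factual:.1f}, "
            f"whereas action {counterfactual_action} would yield about {counter:.1f}.")

In the paper the structural equations are learned during training and the explanation is phrased over the variables on the causal chain to the agent's goal; the linear models and variable names here are assumptions made purely for illustration.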
“…The instructions are then re-used by the system to generate natural-language explanations. Further work by Fukuchi et al (2017b) then expanded on this to a situation where an agent dynamically changed policy. Hayes and Shah (2017) used code annotations to give human-readable labels to functions representing actions and variables representing state space, and then used a separate Markov Decision Process (MDP) to construct a model of the domain and policy of the control software itself.…”
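The Hayes and Shah approach summarized above attaches human-readable labels to actions and state attributes and then queries a model of the control policy to answer questions such as "when do you perform this action?". A toy sketch of that query pattern follows; the predicates, state space, and stand-in policy are invented here for illustration and are not the authors' code:

# Toy illustration: answer "when do you <action>?" by enumerating states,
# querying the policy, and summarising the matching states with
# human-readable predicate labels. Everything here is invented for illustration.
from itertools import product

# Human-readable labels for state attributes (cf. code annotations).
PREDICATES = {
    "holding_block": lambda s: s["holding"],
    "at_table":      lambda s: s["location"] == "table",
}

def policy(state):
    """Stand-in for the learned control software; returns an action name."""
    if state["holding"] and state["location"] == "table":
        return "place_block"
    return "move_to_table" if state["holding"] else "pick_up_block"

def when_do_you(action):
    """Describe, via predicate labels, the states in which `action` is chosen."""
    matching = []
    for holding, location in product([True, False], ["table", "shelf"]):
        state = {"holding": holding, "location": location}
        if policy(state) == action:
            matching.append({name for name, test in PREDICATES.items() if test(state)})
    # Keep only the predicates that hold in every matching state (a crude summary).
    common = set.intersection(*matching) if matching else set()
    return f"I {action} when " + (" and ".join(sorted(common)) or "(conditions vary)")

print(when_do_you("place_block"))  # -> I place_block when at_table and holding_block

Hayes and Shah additionally model the domain with a separate MDP rather than enumerating states directly; this sketch only illustrates the question-answering pattern.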
Research into Explainable Artificial Intelligence (XAI) has been increasing in recent years as a response to the need for increased transparency and trust in AI. This is particularly important as AI is used in sensitive domains with societal, ethical, and safety implications. Work in XAI has primarily focused on Machine Learning (ML) for classification, decision, or action, with detailed systematic reviews already undertaken. This review explores current approaches and limitations for XAI in the area of Reinforcement Learning (RL). From 520 search results, 25 studies (including 5 identified by snowball sampling) are reviewed, highlighting visualization, query-based explanations, policy summarization, human-in-the-loop collaboration, and verification as trends in this area. Limitations in the studies are presented, particularly a lack of user studies, the prevalence of toy examples, and difficulties in providing understandable explanations. Areas for future study are identified, including immersive visualization and symbolic representation.
“…One of the targets in the BToM research field is modeling a human observer who attributes mental states to an actor while watching the actor's behavior. In a typical problem setting, an observer can observe the whole environment, including the actor in it, and attributes mental states such as the actor's belief b_t (Eq. 9), where o_t is an observation that the observer infers the actor receives at time t. The probability of each variable can be calculated using a forward algorithm [16]. The PublicSelf model is based on the BToM concept.…”
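The snippet refers to computing the probability of each latent variable with a forward algorithm. A minimal, assumption-laden sketch of such a recursive belief update over an actor's hidden state is given below; the transition and observation matrices are invented for illustration and are not taken from the paper:

# Minimal forward-algorithm belief update: maintain a distribution over the
# actor's hidden state and fold in each inferred observation. All numbers
# here are invented for illustration.
import numpy as np

transition = np.array([[0.8, 0.1, 0.1],    # transition[s, s'] = P(s' | s)
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
observation = np.array([[0.7, 0.2, 0.1],   # observation[s, o] = P(o | s)
                        [0.2, 0.7, 0.1],
                        [0.1, 0.2, 0.7]])

def forward_update(belief, obs):
    """One forward step: predict with the transition model, then reweight by
    the likelihood of the observation and renormalise."""
    predicted = belief @ transition           # sum_s b(s) P(s' | s)
    weighted = predicted * observation[:, obs]
    return weighted / weighted.sum()

belief = np.full(3, 1.0 / 3.0)               # uniform prior b_0
for obs in (0, 0, 1):                        # a hypothetical observation sequence
    belief = forward_update(belief, obs)
print(belief)                                 # posterior belief b_t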
Most agents that learn policies for tasks with reinforcement learning (RL) lack the ability to communicate with people, which makes human-agent collaboration challenging. We believe that, for RL agents to comprehend utterances from human colleagues, they must infer the mental states that people attribute to them, because people sometimes infer an interlocutor's mental states and communicate on the basis of this inference. This paper proposes the PublicSelf model, a model of a person who infers how their own behavior appears to their colleagues. We implemented the PublicSelf model for an RL agent in a simulated environment and examined the model's inferences by comparing them with people's judgments. The results showed that the model correctly inferred the intention that people attributed to the agent's movement in scenes where people perceived clear intentionality in the agent's behavior.
CCS Concepts: • Computing methodologies → Theory of mind.
Keywords: Reinforcement learning, Bayesian inference, Public self-awareness, Theory of mind, PublicSelf model, Human-agent interaction