Abstract: In cooperation, workers must know how their co-workers behave. However, an agent's policy, which is embedded in a statistical machine learning model, is hard to understand and requires considerable time and knowledge to comprehend. It is therefore difficult for people to predict the behavior of machine learning robots, which makes human-robot cooperation challenging. In this paper, we propose Instruction-based Behavior Explanation (IBE), a method to explain an autonomous agent's future behavior. In IBE, an agent can a…
“…Policy explanations in human-agent interaction settings have been used to achieve transparency (Hayes and Shah 2017) and provide summaries of the policies (Amir and Amir 2018). Explanation in reinforcement learning has been explored, using interactive RL to generate explanations from instructions of a human (Fukuchi et al 2017) and to provide contrastive explanations (van der Waa et al 2018). Soft decision trees have been used to generate more interpretable policies (Coppens et al 2019), and reward decomposition has been utilized to provide minimum sufficient explanations in RL (Juozapaitis et al 2019).…”
Prominent theories in cognitive science propose that humans understand and represent knowledge of the world through causal relationships. In making sense of the world, we build causal models in our minds to encode cause-effect relations of events and use these to explain why new events happen by referring to counterfactuals — things that did not happen. In this paper, we use causal models to derive causal explanations of the behaviour of model-free reinforcement learning agents. We present an approach that learns a structural causal model during reinforcement learning and encodes causal relationships between variables of interest. This model is then used to generate explanations of behaviour based on counterfactual analysis of the causal model. We computationally evaluate the model in 6 domains and measure performance and task prediction accuracy. We report on a study with 120 participants who observe agents playing a real-time strategy game (StarCraft II) and then receive explanations of the agents' behaviour. We investigate: 1) participants' understanding gained by explanations through task prediction; 2) explanation satisfaction; and 3) trust. Our results show that causal model explanations perform better on these measures than two other baseline explanation models.
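The abstract above describes learning a structural causal model during RL and generating explanations from counterfactual analysis of that model. As a rough, non-authoritative sketch of the idea (not the authors' implementation), the following Python example assumes a hand-specified causal graph over a few invented StarCraft-style variables, fits one linear structural equation per variable from logged transitions, and contrasts the predicted effect of the chosen action with that of a counterfactual action:

# Illustrative sketch only: hand-specified causal graph with linear
# structural equations fit from logged transitions. Variable names
# ("workers", "supply", "army") are invented, not taken from the paper.
import numpy as np

# Parents of each variable in the assumed causal graph; "action" is exogenous.
GRAPH = {
    "workers": ["action"],
    "supply":  ["workers", "action"],
    "army":    ["supply", "action"],
}

def fit_scm(transitions):
    """Fit one least-squares structural equation per variable.
    transitions: dicts with numeric values for "action" and each variable."""
    scm = {}
    for child, parents in GRAPH.items():
        X = np.array([[t[p] for p in parents] + [1.0] for t in transitions])
        y = np.array([t[child] for t in transitions])
        scm[child], *_ = np.linalg.lstsq(X, y, rcond=None)
    return scm

def simulate(scm, action):
    """Propagate an action through the graph in topological order."""
    values = {"action": float(action)}
    for child, parents in GRAPH.items():
        x = np.array([values[p] for p in parents] + [1.0])
        values[child] = float(x @ scm[child])
    return values

def why(scm, action, counterfactual_action, outcome="army"):
    """Contrast the predicted outcomes of the chosen and counterfactual actions."""
    factual = simulate(scm, action)[outcome]
    counter = simulate(scm, counterfactual_action)[outcome]
    return (f"Action {action} is predicted to yield {outcome} of about {factual:.1f}, "
            f"whereas action {counterfactual_action} would yield about {counter:.1f}.")

In the paper the structural equations are learned during training and the explanation is phrased over the variables on the causal chain to the agent's goal; the linear models and variable names here are assumptions made purely for illustration.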
“…The instructions are then re-used by the system to generate natural-language explanations. Further work by Fukuchi et al (2017b) then expanded on this to a situation where an agent dynamically changed policy. Hayes and Shah (2017) used code annotations to give human-readable labels to functions representing actions and variables representing state space, and then used a separate Markov Decision Process (MDP) to construct a model of the domain and policy of the control software itself.…”
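The Hayes and Shah approach summarized above attaches human-readable labels to actions and state attributes and then queries a model of the control policy to answer questions such as "when do you perform this action?". A toy sketch of that query pattern follows; the predicates, state space, and stand-in policy are invented here for illustration and are not the authors' code:

# Toy illustration: answer "when do you <action>?" by enumerating states,
# querying the policy, and summarising the matching states with
# human-readable predicate labels. Everything here is invented for illustration.
from itertools import product

# Human-readable labels for state attributes (cf. code annotations).
PREDICATES = {
    "holding_block": lambda s: s["holding"],
    "at_table":      lambda s: s["location"] == "table",
}

def policy(state):
    """Stand-in for the learned control software; returns an action name."""
    if state["holding"] and state["location"] == "table":
        return "place_block"
    return "move_to_table" if state["holding"] else "pick_up_block"

def when_do_you(action):
    """Describe, via predicate labels, the states in which `action` is chosen."""
    matching = []
    for holding, location in product([True, False], ["table", "shelf"]):
        state = {"holding": holding, "location": location}
        if policy(state) == action:
            matching.append({name for name, test in PREDICATES.items() if test(state)})
    # Keep only the predicates that hold in every matching state (a crude summary).
    common = set.intersection(*matching) if matching else set()
    return f"I {action} when " + (" and ".join(sorted(common)) or "(conditions vary)")

print(when_do_you("place_block"))  # -> I place_block when at_table and holding_block

Hayes and Shah additionally model the domain with a separate MDP rather than enumerating states directly; this sketch only illustrates the question-answering pattern.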
Research into Explainable Artificial Intelligence (XAI) has been increasing in recent years as a response to the need for increased transparency and trust in AI. This is particularly important as AI is used in sensitive domains with societal, ethical, and safety implications. Work in XAI has primarily focused on Machine Learning (ML) for classification, decision, or action, with detailed systematic reviews already undertaken. This review explores current approaches and limitations for XAI in the area of Reinforcement Learning (RL). From 520 search results, 25 studies (including 5 identified by snowball sampling) are reviewed, highlighting visualization, query-based explanations, policy summarization, human-in-the-loop collaboration, and verification as trends in this area. Limitations in the studies are presented, particularly a lack of user studies, the prevalence of toy examples, and difficulties in providing understandable explanations. Areas for future study are identified, including immersive visualization and symbolic representation.
“…One of the targets in the BToM research field is modeling a human observer who attributes mental states to an actor while watching the actor's behavior. In a typical problem setting, an observer can observe the whole environment, including the actor in it, and attributes mental states such as the actor's belief b_t (Eq. 9), where o_t is an observation that the observer infers the actor receives at time t. The probability of each variable can be calculated using a forward algorithm [16]. The PublicSelf model is based on the BToM concept.…”
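The snippet refers to computing the probability of each latent variable with a forward algorithm. A minimal, assumption-laden sketch of such a recursive belief update over an actor's hidden state is given below; the transition and observation matrices are invented for illustration and are not taken from the paper:

# Minimal forward-algorithm belief update: maintain a distribution over the
# actor's hidden state and fold in each inferred observation. All numbers
# here are invented for illustration.
import numpy as np

transition = np.array([[0.8, 0.1, 0.1],    # transition[s, s'] = P(s' | s)
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
observation = np.array([[0.7, 0.2, 0.1],   # observation[s, o] = P(o | s)
                        [0.2, 0.7, 0.1],
                        [0.1, 0.2, 0.7]])

def forward_update(belief, obs):
    """One forward step: predict with the transition model, then reweight by
    the likelihood of the observation and renormalise."""
    predicted = belief @ transition           # sum_s b(s) P(s' | s)
    weighted = predicted * observation[:, obs]
    return weighted / weighted.sum()

belief = np.full(3, 1.0 / 3.0)               # uniform prior b_0
for obs in (0, 0, 1):                        # a hypothetical observation sequence
    belief = forward_update(belief, obs)
print(belief)                                 # posterior belief b_t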
Most agents that learn policies for tasks with reinforcement learning (RL) lack the ability to communicate with people, which makes human-agent collaboration challenging. We believe that, for RL agents to comprehend utterances from human colleagues, they must infer the mental states that people attribute to them, because people sometimes infer an interlocutor's mental states and communicate on the basis of this inference. This paper proposes the PublicSelf model, a model of a person who infers how their own behavior appears to their colleagues. We implemented the PublicSelf model for an RL agent in a simulated environment and examined the model's inferences by comparing them with people's judgments. The results showed that the model correctly inferred the intention that people attributed to the agent's movement in scenes where people perceived clear intentionality in the agent's behavior.
CCS Concepts: • Computing methodologies → Theory of mind.
Keywords: Reinforcement learning, Bayesian inference, Public self-awareness, Theory of mind, PublicSelf model, Human-agent interaction