On the one hand, there has been growing interest in the application of AI-based learning and evolutionary programming for self-adaptation under uncertainty. On the other hand, self-explanation is one of the self-* properties that has been neglected. This is paradoxical, as self-explanation is inevitably needed when using such techniques. In this paper, we argue that a self-adaptive autonomous system (SAS) needs an infrastructure and capabilities to look at its own history in order to explain and reason about why the system has reached its current state. The infrastructure and capabilities need to be built on the right conceptual models, in such a way that the system's history can be stored and queried for use in the context of the decision-making algorithms. The explanation capabilities are framed in four incremental levels, from forensic self-explanation to automated history-aware (HA) systems. Incremental capabilities imply that the capabilities at Level n must be available to support the capabilities at Level n+1. We present our encouraging results for Level 1 and Level 2, using temporal graph-based models. Specifically, we explain how Level 1 supports forensic accounting after the system's execution. We also show how Level 2 enables online historical analyses while the self-adaptive system is running. We present an architecture that records temporal data which can be queried to explain behaviour, and we discuss the overheads that live analysis would impose. Finally, we outline future research opportunities. CCS CONCEPTS • Software and its engineering → System modeling languages; Integration frameworks; Model-driven software engineering.
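As a minimal illustration of the idea of querying a system's own history, the following sketch shows one way a temporal model could version a node's state over time. All names (`TemporalNode`, `adaptation_mode`) and the single-node design are our own simplification, not the paper's actual temporal graph infrastructure:

```python
from bisect import bisect_right
from dataclasses import dataclass, field

# Hypothetical sketch: a temporal graph node where every attribute change
# is stored as a (timestamp, value) version, so past states stay queryable.
@dataclass
class TemporalNode:
    name: str
    _times: list = field(default_factory=list)
    _values: list = field(default_factory=list)

    def set(self, t: int, value):
        """Record a new version of this node's state at time t."""
        self._times.append(t)
        self._values.append(value)

    def at(self, t: int):
        """Return the value that was current at time t (None before any version)."""
        i = bisect_right(self._times, t) - 1
        return self._values[i] if i >= 0 else None

# Usage: replay the history of a self-adaptive system's (hypothetical) mode.
mode = TemporalNode("adaptation_mode")
mode.set(0, "normal")
mode.set(10, "degraded")
mode.set(25, "recovered")
print(mode.at(12))  # -> degraded
```

A forensic (Level 1) query asks `at()` after execution has finished, while an online (Level 2) analysis would issue the same kind of query against versions still being appended.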
Modern software systems are increasingly expected to show higher degrees of autonomy and self-management to cope with uncertain and diverse situations. As a consequence, autonomous systems can exhibit unexpected and surprising behaviours. This is exacerbated by the ubiquity and complexity of Artificial Intelligence (AI)-based systems. This is the case for Reinforcement Learning (RL), where autonomous agents learn through trial and error how to find good solutions to a problem. Thus, the underlying decision-making criteria may become opaque to users who interact with the system and may require explanations about the system's reasoning. Available work on eXplainable Reinforcement Learning (XRL) offers different trade-offs: e.g. for runtime explanations, the approaches are model-specific or can only analyse results after the fact. In contrast to these approaches, this paper aims to provide an online, model-agnostic approach for XRL towards trustworthy and understandable AI. We present ETeMoX, an architecture based on temporal models to keep track of the decision-making processes of RL systems. In cases where resources are limited (e.g. storage capacity or response time), the architecture also integrates complex event processing, an event-driven approach, to detect and store only matches to event patterns of interest instead of keeping the entire history. The approach is applied to a mobile communications case study that uses RL for its decision-making. In order to test the generalisability of our approach, three variants of the underlying RL algorithms are used: Q-Learning, SARSA and DQN. The encouraging results show that, using the proposed configurable architecture, RL developers can obtain explanations about the evolution of a metric and about relationships between metrics, and can track situations of interest happening over time windows.
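To make the "tracking of decision-making" concrete, here is an illustrative sketch (our own, not ETeMoX's implementation) of a tabular Q-Learning agent whose choices are logged as timestamped events, the kind of stream a temporal-model or event-processing component could consume to produce explanations:

```python
import random
from collections import defaultdict

# Hypothetical sketch: tabular Q-Learning with an explanation-oriented event
# log. Hyperparameter names and the log schema are illustrative choices.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
q = defaultdict(float)   # (state, action) -> estimated value
history = []             # one event record per decision taken

def choose(state, actions, t):
    """Epsilon-greedy action selection, logging why each action was taken."""
    if random.random() < EPSILON:
        action, why = random.choice(actions), "explore"
    else:
        action, why = max(actions, key=lambda a: q[(state, a)]), "exploit"
    history.append({"t": t, "state": state, "action": action, "reason": why})
    return action

def update(state, action, reward, next_state, actions):
    """Standard Q-Learning update towards the bootstrapped target."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

# One simulated step: act in state "s0", then learn from the outcome.
a = choose("s0", ["up", "down"], t=0)
update("s0", a, reward=1.0, next_state="s1", actions=["up", "down"])
```

An explanation query over `history` (e.g. "how often did the agent explore in the last time window?") is model-agnostic in the sense that the same log schema works for Q-Learning, SARSA or DQN decisions.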
Multi-agent systems deliver highly resilient and adaptable solutions for common problems in telecommunications, aerospace, and industrial robotics. However, achieving an optimal global goal remains a persistent obstacle for collaborative multi-agent systems, where learning affects the behaviour of more than one agent. A number of nonlinear function approximation methods have been proposed for solving the Bellman equation, which describes an optimal policy in a recursive form. However, how to leverage the value distribution in reinforcement learning, and how to improve the efficiency and efficacy of such systems, remain a challenge. In this work, we developed a reward-reinforced generative adversarial network to represent the distribution of the value function, replacing the approximation of Bellman updates. We demonstrate that our method is resilient and outperforms other conventional reinforcement learning methods. The method is also applied to a practical case study: maximising the number of user connections to autonomous airborne base stations in a mobile communication network. Our method maximises the data likelihood using a cost function under which agents have optimal learned behaviours. This reward-reinforced generative adversarial network can be used as a generic framework for multi-agent learning at the system level.
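For reference, the Bellman equation mentioned above is, in its standard optimality form for the action-value function (symbols follow common RL notation rather than this paper's):

```latex
% Bellman optimality equation: the value of taking action a in state s
% equals the expected immediate reward plus the discounted value of
% acting optimally from the successor state.
Q^{*}(s,a) = \mathbb{E}\!\left[\, r_{t+1} + \gamma \max_{a'} Q^{*}(s_{t+1}, a') \;\middle|\; s_t = s,\, a_t = a \right]
```

It is this recursive fixed point that the authors replace: instead of approximating the expectation with iterative Bellman updates, the adversarially trained generator represents the distribution of the value function directly.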
Software is becoming more complex as it needs to deal with an increasing number of aspects in volatile environments. This complexity may cause behaviours that violate the imposed constraints. A goal of runtime service monitoring is to determine whether the service behaves as intended, potentially allowing the correction of the behaviour. The monitoring infrastructure may be set up in advance to allow the detection of suspicious situations. However, there may also be unexpected situations to look for, as they only become evident at runtime while monitoring the data stream produced by the system. Access to historical data may be key to detecting relevant situations in the monitoring infrastructure. Available technologies used for monitoring offer different trade-offs, e.g. in cost and flexibility to store historical information. For instance, Temporal Graphs (TGs) can store the long-term history of an evolving system for future querying, at the expense of disk space and processing time. In contrast, Complex Event Processing (CEP) can react quickly and efficiently to incoming situations, as long as the appropriate event patterns have been set up in advance. This paper presents an architecture that integrates CEP and TGs for service monitoring through the data stream produced at runtime by a system. The pros and cons of the proposed architecture for extracting and treating the monitored data are analysed. The approach is applied to the monitoring of Quality of Service (QoS) in a data-management network case study. We demonstrate how the architecture provides rapid detection of issues, as well as the ability to access historical data about the state of the system, to allow for a comprehensive monitoring solution. CCS CONCEPTS • Software and its engineering → Maintaining software.
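The CEP side of such an architecture can be illustrated with a small sketch: detecting a pre-defined pattern (here, a hypothetical "latency above a threshold for N consecutive samples" QoS violation) over a sliding window of the incoming stream. The pattern, parameter values and function names are our own assumptions, not the paper's actual event definitions:

```python
from collections import deque

# Illustrative CEP-style pattern detection: react to a sustained QoS
# violation as the stream arrives, instead of querying stored history.
WINDOW, THRESHOLD = 3, 100.0   # hypothetical pattern parameters

def detect_sustained_latency(stream):
    """Yield the timestamps at which the event pattern matches."""
    window = deque(maxlen=WINDOW)            # sliding window of recent samples
    for t, latency_ms in stream:
        window.append(latency_ms)
        if len(window) == WINDOW and all(v > THRESHOLD for v in window):
            yield t  # pattern matched: WINDOW consecutive samples over threshold

# Usage on a small synthetic stream of (timestamp, latency_ms) pairs.
samples = [(0, 50.0), (1, 120.0), (2, 130.0), (3, 140.0), (4, 90.0)]
matches = list(detect_sustained_latency(samples))
print(matches)  # -> [3]
```

The trade-off the abstract describes is visible here: this detector is fast but only finds patterns defined in advance, whereas a TG would additionally let an operator pose new historical queries after the fact.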