The goal of aggregating base classifiers is to obtain an aggregated classifier with higher accuracy than any individual classifier. Random forest is an ensemble learning method that has received more attention than similar methods because of its simple structure, ease of interpretation, and higher efficiency. The performance of classical methods is always influenced by the data. Independence from the data domain and the ability to adapt to the conditions of the problem space are among the most challenging requirements for classifiers. In this paper, a method based on learning automata is presented through which adaptability to the problem space, as well as independence from the data domain, is added to the random forest to increase its efficiency. Using the idea of reinforcement learning in the random forest makes it possible to handle data with dynamic behaviour, where dynamic behaviour refers to variability in the behaviour of a data sample across different domains. Therefore, to evaluate the proposed method and to create an environment with dynamic behaviour, different data domains have been considered. The idea is incorporated into the random forest using learning automata, chosen for their simple structure and their compatibility with the problem space. The evaluation results confirm the improvement in random forest efficiency.
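The abstract does not give implementation details, but the general idea of letting a learning-automaton-style probability vector adapt a random forest can be sketched as follows. This is a toy illustration under assumptions of my own (per-tree vote weights updated by a linear reward scheme); the function names and update rule are hypothetical, not the paper's method.

```python
def la_weighted_vote(tree_preds, weights):
    """Weighted majority vote over the per-tree predictions."""
    score = {}
    for pred, w in zip(tree_preds, weights):
        score[pred] = score.get(pred, 0.0) + w
    return max(score, key=score.get)

def update_weights(weights, tree_preds, true_label, a=0.05):
    """LA-style reinforcement (hypothetical): shift probability mass
    toward trees whose vote matched the true label."""
    updated = [w + a * (1 - w) if p == true_label else (1 - a) * w
               for p, w in zip(tree_preds, weights)]
    total = sum(updated)
    return [w / total for w in updated]  # keep it a probability vector

# Demo: tree 0 is reliable, trees 1 and 2 vote wrongly on this sample.
weights = [1 / 3, 1 / 3, 1 / 3]
print(la_weighted_vote([1, 0, 0], weights))  # plain majority loses: 0
for _ in range(20):
    weights = update_weights(weights, [1, 0, 0], true_label=1)
print(la_weighted_vote([1, 0, 0], weights))  # reliable tree now dominates: 1
```

The point of the sketch is only that reinforcement feedback can make the ensemble's vote adapt to the data domain without retraining the trees themselves.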
Human activity recognition (HAR) has been of interest in recent years due to growing demands in many areas. Applications of HAR include healthcare systems that monitor activities of daily living (ADL) (primarily due to the rapidly growing elderly population), security environments that automatically recognize abnormal activities and notify the relevant authorities, and improving human-computer interaction. HAR research can be classified according to the data acquisition tools (sensors or cameras), the methods (handcrafted or deep learning), and the complexity of the activity. In healthcare systems, HAR based on wearable sensors is a new technology that consists of three essential parts worth examining: the location of the wearable sensor, data preprocessing (feature calculation, extraction, and selection), and the recognition methods. This survey aims to examine all aspects of HAR based on wearable sensors, analyzing the applications, challenges, datasets, approaches, and components. It also provides coherent categorizations, purposeful comparisons, and a systematic architecture. The paper then qualitatively evaluates the approaches against the criteria considered for this system and provides a comprehensive review of HAR systems. In these respects, this survey is more extensive and coherent than recent surveys in the field.
Learning automata (LA) were recently shown to be valuable tools for designing multi-agent reinforcement learning algorithms and are able to control stochastic games. In this paper, the concepts of stigmergy and entropy are imported into learning-automata-based multi-agent systems with the purpose of providing a simple framework for interaction and coordination in multi-agent systems and of speeding up the learning process. The multi-agent system considered in this paper is designed to find optimal policies in Markov games. We consider several dummy agents that walk around the states of the environment, activate the local learning automata, and carry information so that the involved learning automata can update their local states. The entropy of the probability vector of the learning automaton of the next state is used to determine the reward or penalty for the actions of the learning automata. The experimental results show that, in terms of the speed of reaching the optimal policy, the proposed algorithm has better learning performance than other learning algorithms.
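The entropy signal mentioned above is just the Shannon entropy of a learning automaton's action-probability vector: a converged automaton has a low-entropy (near-deterministic) vector, while an undecided one is near the maximum of log2(r) for r actions. A minimal sketch of that quantity (the example vectors are illustrative, not from the paper):

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of an action-probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# An undecided automaton over 4 actions sits at the maximum log2(4) = 2 bits;
# a nearly converged one is close to 0 bits.
undecided = [0.25, 0.25, 0.25, 0.25]
converged = [0.97, 0.01, 0.01, 0.01]
print(entropy(undecided))  # 2.0
print(entropy(converged))  # close to 0
```

Low entropy in the next state's automaton can thus serve as evidence that the move led somewhere the system is already confident about, which is the intuition behind using it to grade rewards and penalties.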
Markov games, as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems. The Markov game view of a MAS is a sequence of games played by multiple players, where each game belongs to a different state of the environment. In this paper, several learning-automata-based multi-agent system algorithms for finding optimal policies in Markov games are proposed. In all of the proposed algorithms, each agent residing in every state of the environment is equipped with a learning automaton. Every joint action of the set of learning automata in each state corresponds to moving to one of the adjacent states. Each agent moves from one state to another and tries to reach the goal state. The actions taken by the learning automata along the path traversed by the agent are then rewarded or penalized based on a comparison of the average reward received by the agent per move along the path with a dynamic threshold. In the second group of proposed algorithms, the concept of entropy is imported into the learning-automata-based multi-agent systems to improve performance. To evaluate the performance of the proposed algorithms, computer experiments have been conducted. The results show that the proposed algorithms outperform existing algorithms in terms of the speed and accuracy of reaching the optimal policy.
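The reward/penalty step described above updates each automaton's action-probability vector. A standard linear reward-penalty (L_RP) update can be sketched as follows; the learning rates and the toy two-action environment are assumptions for illustration, not the paper's experimental setup.

```python
import random

def la_update(p, chosen, rewarded, a=0.1, b=0.05):
    """One linear reward-penalty step for an r-action learning automaton.
    On reward, probability mass moves toward the chosen action; on
    penalty, it is redistributed toward the other actions."""
    r = len(p)
    q = list(p)
    if rewarded:
        for j in range(r):
            q[j] = p[j] + a * (1 - p[j]) if j == chosen else (1 - a) * p[j]
    else:
        for j in range(r):
            q[j] = (1 - b) * p[j] if j == chosen else b / (r - 1) + (1 - b) * p[j]
    return q

# Toy demo (hypothetical environment): action 0 is rewarded with
# probability 0.9, action 1 with probability 0.2. The automaton should
# come to prefer action 0.
random.seed(0)
p = [0.5, 0.5]
reward_prob = [0.9, 0.2]
for _ in range(2000):
    act = random.choices(range(2), weights=p)[0]
    p = la_update(p, act, rewarded=random.random() < reward_prob[act])
print(p)  # p[0] should dominate
```

In the proposed algorithms, `rewarded` would be the outcome of comparing the agent's average reward per move along the traversed path with the dynamic threshold, rather than a fixed Bernoulli environment as in this sketch.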
Keywords: Markov games, Multi-Agent Systems, Learning Automata, Optimal Policy
1- INTRODUCTION
There are several models proposed in the literature for multi-agent systems (MAS) based on Markov models. One of these models is the Markov game (MG), which is the generalization of the Markov decision process (MDP) to the multi-agent case [1]. The Markov game view of a MAS is a sequence of games played by multiple players, where each game belongs to a different state of the environment [2]. In an MG, the executed action is the joint action selected by all agents, and rewards and state transitions depend on these joint actions [3]. As a special case, when only one state is assumed, the Markov game reduces to a repeated normal-form game in game theory [1]. Likewise, when only one agent is assumed, the Markov game reduces to an MDP. In multi-agent system research, two main perspectives are found in the literature: the cooperative and the non-cooperative perspective. In cooperative MASs, the agents pursue a common goal, and agents can be built to expect benevolent intentions from other agents [4]. In contrast, in a non-cooperative MAS the goals are not aligned, and each agent tries only to maximize its own profit. In multi-agent systems, the need for learning and adaptation arises essentially because the environment of an agent is dynamic and only empirically observable, while the environment itself (the reward functions and the state transitions) is unknown. Hence, reinforcement learning methods may be applied in a MAS to find an optimal policy in an MG...
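The single-agent special case mentioned above can be made concrete: with one agent, the Markov game tuple collapses to an MDP, which standard value iteration solves. A minimal sketch with hypothetical toy numbers (two states, two actions, deterministic transitions):

```python
# T[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
# With a single agent this is an ordinary MDP, the one-agent Markov game.
T = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}
gamma = 0.9  # discount factor

# Value iteration: repeatedly apply the Bellman optimality backup.
V = {s: 0.0 for s in T}
for _ in range(200):
    V = {s: max(R[s][a] + gamma * sum(pr * V[t] for t, pr in T[s][a])
                for a in T[s])
         for s in T}
print(V)  # V[1] -> 2 / (1 - 0.9) = 20, V[0] -> 1 + 0.9 * 20 = 19
```

In the general multi-agent case this backup is no longer available as-is, because the max is over joint actions controlled by several players, which is exactly what motivates the learning-automata approach of this paper.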