Due to insufficient generalization in the state space, common reinforcement learning (RL) methods suffer from slow learning, especially in the early learning trials. This paper introduces a model-based method for discrete state spaces that increases learning speed in terms of required experience (though not required computational time) by exploiting generalization across the experiences of subspaces. A subspace is formed by choosing a subset of features from the original state representation (the full-space). Generalization, and hence faster learning, in a subspace arises from the many-to-one mapping of experiences from the full-space onto each subspace state. However, due to the perceptual aliasing inherent in subspaces, the policy suggested by a subspace does not in general converge to the optimal policy. Our approach, called Model-Based Learning with Subspaces (MoBLeS), computes confidence intervals for the estimated Q-values in the full-space and in the subspaces. These intervals are used in decision making so that the agent benefits as much as possible from generalization while avoiding the detriment of perceptual aliasing in the subspaces. Convergence of MoBLeS to the optimal policy is investigated theoretically. Additionally, we show through several experiments that MoBLeS improves learning speed in the early trials.

Index Terms: reinforcement learning, generalization in subspaces, learning speed, curse of dimensionality, value interval estimation. arXiv:1710.08012v2 [stat.ML]
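The abstract's core idea, trusting a subspace Q-value estimate only when its confidence interval is tighter than the full-space one, can be sketched as follows. This is a hypothetical illustration, not the paper's actual MoBLeS algorithm: the Hoeffding-style interval width, the dictionary-based tables, and the per-action tie-breaking rule are all assumptions.

```python
import math

def interval_halfwidth(n, c=1.0):
    """Assumed Hoeffding-style half-width: shrinks as visit count n grows."""
    return float("inf") if n == 0 else c * math.sqrt(math.log(1 + n) / n)

def select_action(state, sub_state, Q_full, N_full, Q_sub, N_sub, actions):
    """For each action, trust whichever estimate (full-space or subspace)
    currently has the tighter confidence interval, then act greedily."""
    best_a, best_v = None, -float("inf")
    for a in actions:
        w_full = interval_halfwidth(N_full.get((state, a), 0))
        w_sub = interval_halfwidth(N_sub.get((sub_state, a), 0))
        if w_sub < w_full:   # subspace generalizes: many full-space experiences per abstract state
            v = Q_sub.get((sub_state, a), 0.0)
        else:                # otherwise fall back to the unaliased full-space estimate
            v = Q_full.get((state, a), 0.0)
        if v > best_v:
            best_a, best_v = a, v
    return best_a
```

Early in learning the full-space counts are near zero, so the well-visited subspace estimates dominate; as full-space visits accumulate, their intervals tighten and the aliased subspace estimates are gradually ignored.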
The human brain excels at constructing and using abstractions, such as rules or concepts. Here, in two fMRI experiments, we demonstrate a mechanism of abstraction built upon the valuation of sensory features. Human volunteers learned novel association rules based on simple visual features. Reinforcement-learning algorithms revealed that, with learning, high-value abstract representations increasingly guided participant behaviour, resulting in better choices and higher subjective confidence. We also found that the brain area computing value signals, the ventromedial prefrontal cortex, prioritized and selected latent task elements during abstraction, both locally and through its connection to the visual cortex. Such a coding scheme predicts a causal role for valuation. Hence, in a second experiment, we used multivoxel neural reinforcement to test the causality of feature valuation in the sensory cortex as a mechanism of abstraction. Tagging the neural representation of a task feature with rewards evoked abstraction-based decisions. Together, these findings provide a novel interpretation of value as a goal-dependent, key factor in forging abstract representations.
Deadlock detection is one of the important problems in distributed systems. In this paper we propose a distributed deadlock detection algorithm. In our algorithm, the chance of detecting phantom deadlocks is minimized through a new approach and several improvements to deadlock resolution. The algorithm can manage simultaneous executions initiated by nodes involved in deadlocks, prevents repeated detection of the same deadlock, and minimizes the number of useless messages during simultaneous executions by assigning priorities to the processes. In our algorithm, deadlocks are resolved as soon as they are detected, thanks to a unique characteristic that avoids creating and propagating a token to erase the memories of the processes.
In this paper we propose a distributed deadlock detection algorithm based on history-based edge chasing that resolves a deadlock as soon as it is detected, without waiting for the probe to return. This reduces the average persistence time of a deadlock compared with similar algorithms for distributed systems. The proposed algorithm detects and resolves deadlocks whether the initiator is directly or indirectly involved in them, and it avoids useless messages during simultaneous executions of the algorithm by assigning priorities to the processes. A unique characteristic also lets it manage simultaneous executions using the other nodes involved in the deadlock, and prevents repeated detection of the same deadlock. We further minimize the information carried in the probe by offering a method to encode it. Our algorithm is comparable with the best existing algorithms in terms of time complexity, number of messages, and efficiency.
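Edge chasing with a history-carrying probe, as the two abstracts above describe, can be sketched for the simplest single-resource model, where each process waits on at most one other process. The data structures and the cycle check here are illustrative assumptions, not the actual algorithm of either paper; the point is that carrying the visited path in the probe lets a cycle be reported the moment it closes, rather than when the probe returns to the initiator.

```python
def detect_deadlock(wait_for, initiator):
    """Edge-chasing sketch: propagate a probe along wait-for edges,
    carrying the visited path (its 'history'). wait_for maps each
    blocked process to the single process it is waiting on (assumed
    single-resource model)."""
    path = [initiator]
    current = initiator
    while current in wait_for:            # current is blocked on another process
        nxt = wait_for[current]
        if nxt in path:                   # probe revisits a process: cycle closed
            return path[path.index(nxt):] # the deadlock cycle, detected immediately
        path.append(nxt)
        current = nxt
    return None                           # probe reached an unblocked process
```

Because the probe carries its full history, the cycle is identified (and a victim can be chosen for resolution) at the node where the cycle closes, even when the initiator is only indirectly involved in the deadlock.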
Abstractions are critical for flexible behaviours and efficient learning. However, how the brain forgoes the sensory dimension to forge abstract entities remains elusive. Here, in two fMRI experiments, we demonstrate a mechanism of abstraction built upon the valuation of task-relevant sensory features. Human volunteers learned hidden association rules between visual features. Computational modelling of participants' choice data with mixture-of-experts reinforcement learning algorithms revealed that, with learning, emerging high-value abstract representations increasingly guided behaviour. Moreover, the brain area encoding value signals, the ventromedial prefrontal cortex, also prioritized and selected latent task elements, both locally and through its connection to the visual cortex. In a second experiment, we used multivoxel neural reinforcement to show how reward-tagging the neural sensory representation of a task feature evoked abstraction-based decisions. Our findings redefine the logic of valuation as a goal-dependent, key factor in constructing the abstract representations that govern intelligent behaviour.