In many problems of decision making under uncertainty the system has to acquire knowledge of its environment and learn the optimal decision through its experience. Such problems may also involve the system having to arrive at the globally optimal decision. when at each instant only a subset of the entire set of possible alternatives is available. These problems can be successfully modelled and analysed by learning automata. In this paper an estimator learning algorithm. which maintains estimates of the reward characteristics of the random environment, is presented for an automaton with changing number of actions. A learning automaton using the new scheme is shown to be c-optimal. The simulation results demonstrate the fast convergence properties of the new algorithm. The results of this study can be extended to the design of other types of estimator algorithms with good convergence properties.
INDEX TERMS:Decision making under uncertainty, learning automata, random environments, estimator learning algorithms, variable action set, convergence of learning algorithms. e-optimality.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.