V.V. Phansalkar scite author profile

A feedforward network composed of units of teams of parameterized learning automata is considered as a model of a reinforcement learning system. The internal state vector of each learning automaton is updated using an algorithm consisting of a gradient-following term and a random perturbation term. It is shown that the algorithm weakly converges to a solution of the Langevin equation, implying that the algorithm globally maximizes an appropriate function. The algorithm is decentralized, and the units do not have any information exchange during updating. Simulation results on common payoff games and pattern recognition problems show that reasonable rates of convergence can be obtained.
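As a rough illustration of an update that combines a gradient-following term with a random perturbation term, the sketch below applies one such step to the internal state vector of a single automaton. It is not the paper's exact rule: the softmax action-probability function, the step size `b`, the noise scale `sigma`, the toy environment, and the names `softmax` and `pla_step` are all assumptions made for illustration, and any term the paper may use to keep the state bounded is omitted.

```python
import math
import random


def softmax(u):
    """Map an internal state vector to action probabilities (assumed form)."""
    m = max(u)  # subtract the max for numerical stability
    exps = [math.exp(ui - m) for ui in u]
    total = sum(exps)
    return [e / total for e in exps]


def pla_step(u, action, beta, b, sigma, rng):
    """One hypothetical update of the internal state vector u.

    action : index of the action that was selected
    beta   : reinforcement received from the environment, in [0, 1]
    b      : step size (assumed name)
    sigma  : standard deviation of the perturbation (assumed name)
    rng    : a random.Random instance supplying the perturbation
    """
    probs = softmax(u)
    new_u = []
    for j, uj in enumerate(u):
        # Gradient-following term: beta * d ln(probs[action]) / d u_j.
        grad = beta * ((1.0 if j == action else 0.0) - probs[j])
        # Random perturbation term, scaled by sqrt(b) as in a
        # discretized Langevin equation.
        noise = math.sqrt(b) * sigma * rng.gauss(0.0, 1.0)
        new_u.append(uj + b * grad + noise)
    return new_u


if __name__ == "__main__":
    rng = random.Random(0)
    u = [0.0, 0.0]
    for _ in range(2000):
        probs = softmax(u)
        action = 0 if rng.random() < probs[0] else 1
        # Toy environment (assumption): action 0 is rewarded more often.
        p_reward = 0.8 if action == 0 else 0.3
        beta = 1.0 if rng.random() < p_reward else 0.0
        u = pla_step(u, action, beta, b=0.05, sigma=0.1, rng=rng)
    print(softmax(u))
```

The update uses only the automaton's own state, its chosen action, and the shared reinforcement, which matches the decentralized setting the abstract describes.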


Local and Global Optimization Algorithms for Generalized Learning Automata

Phansalkar

Thathachar

1995

Neural Computation


This paper analyzes the long-term behavior of the REINFORCE and related algorithms (Williams 1986, 1988, 1992) for generalized learning automata (Narendra and Thathachar 1989) for the associative reinforcement learning problem (Barto and Anandan 1985). The learning system considered here is a feedforward connectionist network of generalized learning automata units. We show that REINFORCE is a gradient ascent algorithm but can exhibit unbounded behavior. A modified version of this algorithm, based on constrained optimization techniques, is suggested to overcome this disadvantage. The modified algorithm is shown to exhibit local optimization properties. A global version of the algorithm, based on constant temperature heat bath techniques, is also described and shown to converge to the global maximum. All algorithms are analyzed using weak convergence techniques.
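For readers unfamiliar with REINFORCE, a minimal sketch of the basic update for a single stochastic unit follows. It assumes a Bernoulli-logistic unit with no reinforcement baseline, which is only one simple case and may differ from the units the paper analyzes. The names `reinforce_step`, `alpha`, and `reward_fn` are hypothetical, and neither the constrained modification nor the global version described in the abstract is shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_step(w, x, reward_fn, alpha, rng):
    """One REINFORCE update for a Bernoulli-logistic unit (assumed unit type).

    w         : weight vector of the unit
    x         : context (input) vector
    reward_fn : hypothetical environment, maps (x, y) to a scalar reward
    alpha     : learning rate
    rng       : a random.Random instance used to sample the output
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    y = 1 if rng.random() < p else 0  # stochastic output of the unit
    r = reward_fn(x, y)
    # (y - p) * x_i is the derivative of the log-probability of the
    # sampled output with respect to w_i.
    new_w = [wi + alpha * r * (y - p) * xi for wi, xi in zip(w, x)]
    return new_w, y, r


if __name__ == "__main__":
    rng = random.Random(0)
    w = [0.0, 0.0]
    x = [1.0, 1.0]
    # Toy reward (assumption): output 1 is always rewarded, output 0 never.
    for _ in range(500):
        w, y, r = reinforce_step(w, x, lambda x_, y_: float(y_), 0.1, rng)
    print(w)
```

Nothing in this update keeps the weights within a bounded region, which is consistent with the unbounded behavior the abstract attributes to REINFORCE.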


Convergence of teams and hierarchies of learning automata in connectionist systems

Thathachar

Phansalkar

1995

IEEE Trans. Syst., Man, Cybern.


Learning algorithms for feedforward connectionist systems in a reinforcement learning environment are developed and analyzed in this paper. The connectionist system is made up of units that are groups of learning automata. The learning algorithm used is linear reward-inaction (LR-I), and its asymptotic behavior is approximated by an ordinary differential equation (ODE) for low values of the learning parameter, using weak convergence techniques. The reinforcement learning model is used to pose the goal of the system as a constrained optimization problem. It is shown that the ODE, and hence the algorithm, exhibits local convergence properties, converging to local solutions of the related optimization problem. A three-layer pattern recognition network is used as an example to show that the system behaves as predicted and that reasonable rates of convergence are obtained. Simulations also show that the algorithm is robust to noise.
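The linear reward-inaction update itself is compact enough to sketch. The version below is the standard textbook form for a single automaton with reinforcement in [0, 1]. The function name `lri_update` and the parameter name `lam` are assumptions, and the network of teams analyzed in the paper is not reproduced here.

```python
def lri_update(p, action, beta, lam):
    """Linear reward-inaction update for one learning automaton.

    p      : current action-probability vector (sums to 1)
    action : index of the action just selected
    beta   : reinforcement in [0, 1]; 0 leaves p unchanged ("inaction")
    lam    : learning parameter in (0, 1)
    """
    return [
        pj + lam * beta * ((1.0 if j == action else 0.0) - pj)
        for j, pj in enumerate(p)
    ]


p = [0.5, 0.5]
p = lri_update(p, action=0, beta=1.0, lam=0.1)  # rewarded: mass shifts toward action 0
p = lri_update(p, action=1, beta=0.0, lam=0.1)  # not rewarded: p is left as it was
```

Because the change is proportional to `lam * beta`, a small learning parameter gives small steps, which is the regime in which the abstract says the ODE approximation applies.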




V.V. Phansalkar

Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information

Analysis of the back-propagation algorithm with momentum

Learning the global maximum with parameterized learning automata

Local and Global Optimization Algorithms for Generalized Learning Automata

Convergence of teams and hierarchies of learning automata in connectionist systems
