The smoothed functional (SF) algorithm estimates the gradient of a stochastic optimization objective by convolving it with a smoothing kernel. This smoothing helps the algorithm converge to a global minimum or a point close to it. We study a two-timescale SF-based gradient search algorithm with Nesterov's acceleration for stochastic optimization problems. The main contribution of our work is a proof of convergence of this algorithm using stochastic approximation theory. We propose a novel Lyapunov function to show the stability of the associated second-order ordinary differential equation (o.d.e.) for a non-autonomous system. We compare our algorithm with other smoothed functional algorithms, such as Quasi-Newton SF, Gradient SF, and the Jacobi variant of Newton SF, on two optimization problems: first, a simple stochastic function minimization problem, and second, optimal routing in a queueing network. Additionally, we compare the algorithms on real weather data in a weather prediction task. Experimental results show that our algorithm performs significantly better than these baseline algorithms.
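To make the idea of kernel-smoothed gradient estimation concrete, the following is a minimal sketch, not the two-timescale algorithm analysed in the paper. It assumes a Gaussian smoothing kernel, a two-sided finite-difference form of the SF estimate, and a textbook Nesterov look-ahead update with constant step size. The names `sf_gradient`, `sf_nesterov`, and `make_noisy_quadratic`, the test objective, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def sf_gradient(f, x, beta, rng, n_samples=20):
    """Two-sided Gaussian smoothed functional gradient estimate (illustrative).

    Averages eta * (f(x + beta*eta) - f(x - beta*eta)) / (2*beta)
    over standard normal perturbations eta.
    """
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        eta = rng.standard_normal(x.shape)
        grad += eta * (f(x + beta * eta) - f(x - beta * eta)) / (2.0 * beta)
    return grad / n_samples


def sf_nesterov(f, x0, steps=500, step_size=0.02, momentum=0.9, beta=0.1, seed=0):
    """Nesterov-style search driven by SF gradient estimates (assumed form)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    x_prev = x.copy()
    for _ in range(steps):
        # Look-ahead point: extrapolate along the previous displacement.
        y = x + momentum * (x - x_prev)
        g = sf_gradient(f, y, beta, rng)
        x_prev, x = x, y - step_size * g
    return x


def make_noisy_quadratic(noise=0.01, seed=1):
    """Hypothetical test objective: sum((x - 1)^2) plus Gaussian noise."""
    noise_rng = np.random.default_rng(seed)

    def noisy_quadratic(x):
        return float(np.sum((x - 1.0) ** 2) + noise * noise_rng.standard_normal())

    return noisy_quadratic


if __name__ == "__main__":
    print(sf_nesterov(make_noisy_quadratic(), np.zeros(3)))
```

The algorithm only ever queries noisy function values, never an analytic gradient. A larger `beta` smooths the objective more aggressively at the cost of bias in the gradient estimate.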
Reinforcement learning (RL) enables an agent to learn control policies for achieving its long-term goals. One key parameter of RL algorithms is the discount factor, which scales down future costs in the current value estimate of a state. This study introduces and analyses a transition-based discount factor in two model-free reinforcement learning algorithms, Q-learning and SARSA, and shows their convergence for finite state and action spaces using the theory of stochastic approximation. The transition-based factor produces asymmetric discounting, favouring some transitions over others, which (1) allows faster convergence than the constant-discount-factor variants of these algorithms, as demonstrated by experiments on the Taxi and MountainCar environments; and (2) provides better control over whether the RL agent learns a risk-averse or risk-taking policy, as demonstrated in a Cliff Walking experiment.
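As an illustration of what a transition-based discount factor changes in the update rule, the sketch below runs tabular Q-learning in a cost-minimization setting where the discount depends on the transition (s, a, s') rather than being a single constant. It is a minimal sketch under stated assumptions, not the authors' implementation: the five-state chain environment, the `discount` function with its values 0.9 and 0.8, and all hyperparameters are hypothetical and chosen only to keep the example self-contained.

```python
import numpy as np

N_STATES, GOAL = 5, 4  # hypothetical chain: states 0..4, state 4 is terminal
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic chain dynamics with unit cost per move (illustrative)."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    return next_state, 1.0, next_state == GOAL


def discount(state, action, next_state):
    """Assumed transition-based discount: differs by direction of the move."""
    return 0.9 if next_state > state else 0.8


def q_learning(episodes=2000, alpha=0.1, epsilon=0.2, max_steps=100, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, 2))  # estimated cost-to-go; terminal row stays 0
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection; greedy means lowest cost.
            if rng.random() < epsilon:
                action = int(rng.integers(2))
            else:
                action = int(np.argmin(q[state]))
            next_state, cost, done = step(state, action)
            # The only change from standard Q-learning: gamma depends on
            # the observed transition instead of being a constant.
            gamma = discount(state, action, next_state)
            target = cost + gamma * np.min(q[next_state])
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    print(q_learning())
```

A SARSA variant would replace `np.min(q[next_state])` with the value of the action actually selected in `next_state`. Choosing which transitions receive the smaller discount is what makes the discounting asymmetric.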