Abhinav Sharma scite author profile

Abhinav Sharma

3 Publications

0 Citation Statements Received

98 Citation Statements Given

Affiliations

Indian Institute of Information Technology Design and Manufacturing Jabalpur

Publications

Multi-Time Scale Smoothed Functional With Nesterov’s Acceleration

et al. 2021

The smoothed functional (SF) algorithm estimates the gradient of a stochastic optimization objective by convolution with a smoothing kernel. This smoothing helps the algorithm converge to a global minimum or a point close to it. We study a two-time-scale SF-based gradient search algorithm with Nesterov's acceleration for stochastic optimization problems. The main contribution of our work is a proof of convergence of this algorithm using stochastic approximation theory. We propose a novel Lyapunov function to show the stability of the associated second-order ordinary differential equations (o.d.e.) for a non-autonomous system. We compare our algorithm with other smoothed functional algorithms, such as Quasi-Newton SF, Gradient SF, and the Jacobi variant of Newton SF, on two optimization problems: a simple stochastic function minimization problem and optimal routing in a queueing network. Additionally, we compare the algorithms on real weather data in a weather prediction task. Experimental results show that our algorithm performs significantly better than these baseline algorithms.
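As a rough illustration of the idea in this abstract, the sketch below estimates a gradient from noisy function values using Gaussian perturbations (a standard two-sided SF estimator) and feeds it into a generic Nesterov-style look-ahead step. It is not the paper's two-time-scale recursion: the function names `sf_gradient` and `sf_nesterov`, the test objective, and all parameter values (`beta`, `step`, `momentum`, sample counts) are assumptions chosen for the example.

```python
import numpy as np


def sf_gradient(f, x, beta=0.1, n_samples=50, rng=None):
    """Two-sided Gaussian smoothed functional gradient estimate of f at x."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # One Gaussian perturbation direction per sample.
    eta = rng.standard_normal((n_samples, x.size))
    # Differences of (possibly noisy) function values at x +/- beta * eta.
    diffs = np.array([f(x + beta * e) - f(x - beta * e) for e in eta])
    # Sample mean of eta * (f(x + beta*eta) - f(x - beta*eta)) / (2 * beta).
    return (eta * diffs[:, None]).mean(axis=0) / (2.0 * beta)


def sf_nesterov(f, x0, step=0.05, momentum=0.5, n_iters=300, rng=None, **sf_kwargs):
    """Generic Nesterov-style search driven by SF gradient estimates.

    Single time scale with constant step sizes (an illustrative assumption).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    x_prev = x.copy()
    for _ in range(n_iters):
        y = x + momentum * (x - x_prev)          # look-ahead point
        grad = sf_gradient(f, y, rng=rng, **sf_kwargs)
        x_prev, x = x, y - step * grad           # gradient step from the look-ahead
    return x


# Hypothetical noisy objective: ||x||^2 observed with small additive noise.
rng = np.random.default_rng(0)


def noisy_quadratic(x):
    return float(x @ x) + 0.01 * rng.standard_normal()


x_star = sf_nesterov(noisy_quadratic, [1.0, -2.0], rng=rng)
print("approximate minimizer:", x_star)
```

On this toy objective the iterate should end up close to the origin, the true minimizer. The smoothing width `beta` trades bias against variance: a larger value smooths the objective more but moves the estimate further from the true gradient.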

Transition Based Discount Factor for Model Free Algorithms in Reinforcement Learning

Sharma, Gupta, Lakshmanan et al. 2021

Symmetry

Reinforcement Learning (RL) enables an agent to learn control policies for achieving its long-term goals. One key parameter of RL algorithms is the discount factor, which scales down future cost in the state’s current value estimate. This study introduces and analyses a transition-based discount factor in two model-free reinforcement learning algorithms, Q-learning and SARSA, and shows their convergence using the theory of stochastic approximation for finite state and action spaces. The transition-based factor causes asymmetric discounting, favouring some transitions over others, which (1) allows faster convergence than the constant-discount-factor variants of these algorithms, as demonstrated by experiments on the Taxi and MountainCar environments; and (2) provides better control over whether RL agents learn a risk-averse or risk-taking policy, as demonstrated in a Cliff Walking experiment.
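To make the idea of a transition-based discount factor concrete, here is a minimal tabular Q-learning sketch in which the discount depends on the transition (s, a, s') instead of being a single constant. It uses a cost-minimizing update to match the abstract's wording. The three-state chain, the `gamma_fn` values (0.9 for moving forward, 0.5 for staying), and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

N_STATES, N_ACTIONS, TERMINAL = 3, 2, 2   # hypothetical chain: 0 -> 1 -> 2
STAY, RIGHT = 0, 1


def step(s, a):
    """Deterministic toy dynamics: every step costs 1; RIGHT advances one state."""
    s_next = s + 1 if a == RIGHT else s
    return 1.0, s_next, s_next == TERMINAL


def gamma_fn(s, a, s_next):
    """Assumed transition-based discount: staying put is discounted more heavily."""
    return 0.9 if s_next != s else 0.5


def q_learning_update(Q, s, a, cost, s_next, gamma_fn, alpha=0.1, done=False):
    """One tabular Q-learning step (cost minimization) with discount gamma(s, a, s')."""
    gamma = gamma_fn(s, a, s_next)
    future = 0.0 if done else gamma * Q[s_next].min()
    Q[s, a] += alpha * (cost + future - Q[s, a])
    return Q


rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(500):                      # episodes with uniformly random exploration
    s, done = 0, False
    while not done:
        a = int(rng.integers(N_ACTIONS))
        cost, s_next, done = step(s, a)
        q_learning_update(Q, s, a, cost, s_next, gamma_fn, alpha=0.5, done=done)
        s = s_next

print(Q[:TERMINAL])                       # learned cost-to-go for non-terminal states
```

Because the toy dynamics and costs are deterministic, the table settles at the fixed point of the update: the learned values weight the future cost after a `STAY` transition by 0.5 and after a `RIGHT` transition by 0.9, which is the kind of asymmetric discounting the abstract describes.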

Stochastic Arrow-Hurwicz Algorithm for Path Selection and Rate Allocation in Self-Backhauled mmWave Networks

et al. 2022
