Thinh T. Doan scite author profile

We consider the problem of allocating a fixed amount of resource among nodes in a network when each node suffers a cost which is a convex function of the amount of resource allocated to it. We propose a new deterministic and distributed protocol for this problem. Our main result is that the associated convergence time for the global objective scales quadratically in the number of nodes on any sequence of time-varying undirected graphs satisfying a long-term connectivity condition.


Convergence of the Iterates in Mirror Descent Methods

Doan

Bose

Nguyễn

et al. 2019

IEEE Control Syst. Lett.


We consider centralized and distributed mirror descent algorithms over a finite-dimensional Hilbert space, and prove that the problem variables converge to an optimizer of a possibly nonsmooth function when the step sizes are square summable but not summable. Prior literature has focused on the convergence of the function value to its optimum. However, applications from distributed optimization and learning in games require the convergence of the variables to an optimizer, which is generally not guaranteed without assuming strong convexity of the objective function. We provide numerical simulations comparing entropic mirror descent and standard subgradient methods for the robust regression problem.
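As a rough illustration of the entropic mirror descent method named in this abstract, the sketch below runs the multiplicative-weights form of the update on the probability simplex with step sizes 1/(k+1), which are square summable but not summable. The function name `entropic_mirror_descent`, the toy matrix `A`, the vector `b`, and the choice of an l1 objective are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np


def entropic_mirror_descent(subgrad, x0, num_iters, step=lambda k: 1.0 / (k + 1)):
    """Entropic mirror descent on the probability simplex (illustrative sketch).

    The default step(k) = 1/(k+1) is square summable but not summable.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(num_iters):
        g = subgrad(x)
        # Mirror step under the negative-entropy map: a multiplicative
        # update followed by renormalization onto the simplex.
        w = x * np.exp(-step(k) * g)
        x = w / w.sum()
    return x


# Hypothetical toy problem: minimize f(x) = ||A x - b||_1 over the simplex.
A = np.eye(3)
b = np.array([0.2, 0.3, 0.5])


def l1_subgrad(x):
    # A subgradient of the nonsmooth l1 objective at x.
    return A.T @ np.sign(A @ x - b)


x_final = entropic_mirror_descent(l1_subgrad, np.ones(3) / 3, 5000)
print(x_final, np.abs(A @ x_final - b).sum())
```

Replacing the multiplicative update with a Euclidean step followed by a projection onto the simplex would give the standard projected subgradient method that the abstract compares against.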


Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm

Khodadadian

Doan

Romberg

et al. 2021

Preprint


Actor-critic style two-time-scale algorithms are very popular in reinforcement learning, and have seen great empirical success. However, their performance is not completely understood theoretically. In this paper, we characterize the global convergence of an online natural actor-critic algorithm in the tabular setting using a single trajectory. Our analysis applies to very general settings, as we only assume that the underlying Markov chain is ergodic under all policies (the so-called Recurrence assumption). We employ $\epsilon$-greedy sampling in order to ensure enough exploration. For a fixed exploration parameter $\epsilon$, we show that the natural actor-critic algorithm is $O\left(\frac{1}{\epsilon T^{1/4}} + \epsilon\right)$ close to the global optimum after $T$ iterations of the algorithm. By carefully diminishing the exploration parameter $\epsilon$ as the iterations proceed, we also show convergence to the global optimum at a rate of $O(1/T^{1/6})$.
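The exploration scheme mentioned in this abstract can be pictured with a minimal sketch of ε-greedy sampling in a tabular setting. This is one common reading of the term, in which a uniformly random action is taken with probability `epsilon` and the current policy is sampled otherwise. The function name `epsilon_greedy_action` and its parameters are hypothetical, and the paper's exact sampling rule may differ.

```python
import random


def epsilon_greedy_action(policy_probs, epsilon, rng=random):
    """Sample an action index (illustrative sketch, not the paper's code).

    With probability epsilon, pick an action uniformly at random;
    otherwise sample from the current policy's action probabilities.
    """
    n = len(policy_probs)
    if rng.random() < epsilon:
        return rng.randrange(n)
    return rng.choices(range(n), weights=policy_probs, k=1)[0]


# Usage: a hypothetical three-action policy that always prefers action 1.
rng = random.Random(0)
policy = [0.0, 1.0, 0.0]
greedy_only = [epsilon_greedy_action(policy, 0.0, rng) for _ in range(100)]
explored = [epsilon_greedy_action(policy, 1.0, rng) for _ in range(100)]
```

With `epsilon = 0` every draw follows the policy, while a positive `epsilon` gives every action a nonzero chance of being tried, which is the role exploration plays in the analysis described above.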


Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning

et al. 2022



Thinh T. Doan

Convergence Rates of Distributed Gradient Methods Under Random Quantization: A Stochastic Approximation Approach

Distributed resource allocation on dynamic networks in quadratic time
