We consider the problem of allocating a fixed amount of resource among nodes in a network when each node suffers a cost which is a convex function of the amount of resource allocated to it. We propose a new deterministic and distributed protocol for this problem. Our main result is that the associated convergence time for the global objective scales quadratically in the number of nodes on any sequence of time-varying undirected graphs satisfying a long-term connectivity condition.
We consider centralized and distributed mirror descent algorithms over a finite-dimensional Hilbert space, and prove that the problem variables converge to an optimizer of a possibly nonsmooth function when the step sizes are square summable but not summable. Prior literature has focused on the convergence of the function value to its optimum. However, applications from distributed optimization and learning in games require the convergence of the variables to an optimizer, which is generally not guaranteed without assuming strong convexity of the objective function. We provide numerical simulations comparing entropic mirror descent and standard subgradient methods for the robust regression problem.
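As a rough illustration of the entropic mirror descent update mentioned in the abstract above, the following minimal sketch minimizes a nonsmooth least-absolute-deviation objective over the probability simplex. It is not the authors' implementation: the objective `f(x) = sum_i |a_i . x - b_i|`, the simplex constraint, the step size `1/k` (one choice that is square summable but not summable), and all function names are assumptions made for this example.

```python
import math

def subgradient_l1(A, b, x):
    """Return one subgradient of f(x) = sum_i |a_i . x - b_i| at x."""
    g = [0.0] * len(x)
    for row, b_i in zip(A, b):
        r = sum(a * xj for a, xj in zip(row, x)) - b_i
        s = (r > 0) - (r < 0)  # sign of the residual; 0 is a valid choice when r == 0
        for j, a in enumerate(row):
            g[j] += s * a
    return g

def entropic_mirror_descent(A, b, n_iters=2000):
    """Entropic mirror descent (exponentiated-gradient update) on the simplex."""
    n = len(A[0])
    x = [1.0 / n] * n  # start at the uniform distribution
    for k in range(1, n_iters + 1):
        step = 1.0 / k  # assumed schedule: sum of steps diverges, sum of squares converges
        g = subgradient_l1(A, b, x)
        shift = min(g)  # shifting g leaves the normalized update unchanged and avoids overflow
        w = [xj * math.exp(-step * (gj - shift)) for xj, gj in zip(x, g)]
        total = sum(w)
        x = [wj / total for wj in w]  # renormalize back onto the simplex
    return x
```

For example, `entropic_mirror_descent([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])` moves the iterate from the uniform point toward the vertex `(1, 0)`, which minimizes this toy objective over the simplex.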
√k) for strongly convex or convex objective functions, respectively, [15], [17].
Actor-critic style two-time-scale algorithms are very popular in reinforcement learning, and have seen great empirical success. However, their performance is not completely understood theoretically. In this paper, we characterize the global convergence of an online natural actor-critic algorithm in the tabular setting using a single trajectory. Our analysis applies to very general settings, as we only assume that the underlying Markov chain is ergodic under all policies (the so-called Recurrence assumption). We employ ε-greedy sampling in order to ensure enough exploration. For a fixed exploration parameter ε, we show that the natural actor-critic algorithm is O(1/(ε T^{1/4}) + ε) close to the global optimum after T iterations of the algorithm. By carefully diminishing the exploration parameter as the iterations proceed, we also show convergence to the global optimum at a rate of O(1/T^{1/6}).
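The ε-greedy sampling step mentioned in the abstract above can be sketched as follows. This is one common reading, assumed here rather than taken from the paper: with probability ε the action is drawn uniformly at random, and otherwise it is drawn from the current policy. The function name and the list-of-probabilities policy representation are hypothetical.

```python
import random

def epsilon_greedy_sample(policy_probs, epsilon, rng=random):
    """Sample an action index: uniform with probability epsilon, else from the policy."""
    n_actions = len(policy_probs)
    if rng.random() < epsilon:
        # Exploration branch: every action has probability at least epsilon / n_actions.
        return rng.randrange(n_actions)
    # Exploitation branch: follow the current (possibly stochastic) policy.
    return rng.choices(range(n_actions), weights=policy_probs, k=1)[0]
```

Under this reading, every action keeps a sampling probability of at least ε divided by the number of actions, which is what guarantees continued exploration along a single trajectory.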