We consider the problem of distributed learning, where a network of agents collectively aim to agree on a hypothesis that best explains a set of distributed observations of conditionally independent random processes. We propose a distributed algorithm and establish consistency, as well as a non-asymptotic, explicit and geometric convergence rate for the concentration of the beliefs around the set of optimal hypotheses. Additionally, if the agents interact over static networks, we provide an improved learning protocol with better scalability with respect to the number of nodes in the network.
A recent algorithmic family for distributed optimization, DIGing, has been shown to converge geometrically over time-varying undirected/directed graphs [1]. Nevertheless, an identical step-size for all agents is needed. In this paper, we study the convergence rates of the Adapt-Then-Combine (ATC) variation of the DIGing algorithm under uncoordinated step-sizes. We show that the ATC variation of the DIGing algorithm converges geometrically fast even if the step-sizes differ among the agents. In addition, our analysis implies that the ATC structure can accelerate convergence compared to the distributed gradient descent (DGD) structure used in the original DIGing algorithm.

A. Nedić (angelianedich@gmail.com) is with the ECEE Department, Arizona State University. A. Olshevsky and W. Shi ({alexols,wilburs}@bu.edu) are with the ECE Department, Boston University. C.A. Uribe (cauribe2@illinois.edu) is with the

where each function f_i : R^p → R is held privately by agent i to encode the agent's objective, e.g., private data. Moreover, the complete system seeks to solve the joint problem by exchanging information over a network. Such a network might reflect privacy settings or communication constraints. Several algorithms have been proposed for the solution of problems of the form (1) since the 1980s [9], [10]. Initial approaches for general and possibly time-varying graphs were based on distributed sub-gradients, with extensions to handle stochasticity and asynchronous updates [11]-[13]. Such algorithms are flexible in the classes of functions and graphs they can handle, but they are considerably slow: even for strongly convex functions, a diminishing step-size is required, which precludes linear rates [14]-[16]. Recent studies have achieved linear convergence rates for strongly convex functions [1], [17]-[22]. Nonetheless, these methods require a careful selection of the step-sizes.
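The DIGing iteration referenced above pairs a consensus step with gradient tracking: each agent mixes its estimate with its neighbors' and descends along a local tracker of the network-average gradient. A minimal sketch on a toy quadratic instance of a problem of the form (1) follows; the mixing matrix `W`, step-size `alpha`, and data `b` are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

# Toy instance: agent i privately holds f_i(x) = 0.5 * (x - b_i)^2,
# so the minimizer of sum_i f_i is mean(b).  All names here are
# illustrative choices, not values taken from the paper.
n = 4                                  # number of agents
b = np.array([1.0, 2.0, 3.0, 4.0])     # private data; optimum is mean(b) = 2.5
grad = lambda x: x - b                 # stacked local gradients

# Doubly stochastic mixing matrix for a 4-node ring (Metropolis weights).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

alpha = 0.1            # common step-size, as DIGing requires
x = np.zeros(n)        # local estimates
y = grad(x)            # gradient tracker, initialized at the local gradients

for _ in range(200):
    x_next = W @ x - alpha * y             # consensus step + tracked-gradient descent
    y = W @ y + grad(x_next) - grad(x)     # track the average gradient
    x = x_next

print(x)   # every agent's estimate is close to 2.5, the global minimizer
```

Because `W` is doubly stochastic and `y` is initialized at the local gradients, the average of `y` equals the average gradient at every iteration, which is what makes a constant step-size (and hence a geometric rate) possible.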
Recently in [23], [24], the authors utilized the Adapt-Then-Combine strategy to develop an augmented version of the distributed gradient method for distributed optimization over time-invariant graphs. This algorithm is shown to converge for convex smooth objective functions with a sufficiently small constant step-size. Moreover, no coordination on the step-sizes is needed. Additionally, similar structures based on dynamic average consensus have been explored for more general classes of non-convex functions [26]. For non-convex problems, the work in [27]-[29] develops a large class of distributed algorithms by utilizing various "function-surrogate modules", thus providing great flexibility in their use and yielding a new class of algorithms that subsumes many existing distributed algorithms. The authors in [23], [28] simultaneously proposed methods that track the gradient averages.

In this paper we study the Adapt-Then-Combine Distributed Inexact Gradient Tracking (ATC-DIGing) algorithm for the solution of the optimization problem (1). Specifically, we show that geometric convergence rates can sti...
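The ATC variation reorders each update so that an agent first adapts (takes a local step along its gradient tracker) and then combines (mixes with its neighbors), which is what permits uncoordinated step-sizes. A hedged sketch on the same kind of toy quadratic instance, with per-agent step-sizes chosen arbitrarily for illustration:

```python
import numpy as np

# Toy instance as before: f_i(x) = 0.5 * (x - b_i)^2; minimizer is mean(b).
# W, alpha, and b are illustrative assumptions, not values from the paper.
n = 4
b = np.array([1.0, 2.0, 3.0, 4.0])
grad = lambda x: x - b

W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

# Uncoordinated step-sizes: each agent picks its own, no agreement needed.
alpha = np.array([0.05, 0.10, 0.08, 0.12])

x = np.zeros(n)
y = grad(x)            # gradient tracker, initialized at the local gradients

for _ in range(400):
    x_next = W @ (x - alpha * y)               # adapt (local step), then combine (mix)
    y = W @ (y + grad(x_next) - grad(x))       # ATC form of the tracking update
    x = x_next

print(x)   # every agent's estimate is close to 2.5 despite different step-sizes
```

Note the contrast with the sketch of the original DIGing update: there the descent term `alpha * y` sits outside the mixing step, while the ATC form multiplies the whole adapted iterate by `W`, smoothing the heterogeneity introduced by the per-agent step-sizes.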
We study the problem of distributed hypothesis testing with a network of agents, where some agents repeatedly gain access to information about the correct hypothesis. The group objective is to globally agree on a joint hypothesis that best describes the data observed at all the nodes. We assume that the agents can interact with their neighbors over an unknown sequence of time-varying directed graphs. Following the pioneering work of Jadbabaie, Molavi, Sandroni, and Tahbaz-Salehi, we propose local learning dynamics which combine Bayesian updates at each node with a local aggregation rule for private agent signals. We show that these learning dynamics drive all agents to the set of hypotheses which best explain the data collected at all nodes, as long as the sequence of interconnection graphs is uniformly strongly connected. Our main result establishes a non-asymptotic, explicit, geometric convergence rate for the learning dynamics.
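Learning dynamics of this kind can be sketched as mixing neighbors' log-beliefs and then applying a local Bayesian update with each agent's private signal. The sketch below is a simplified illustration: it assumes a static, doubly stochastic mixing matrix and Bernoulli likelihood models (both hypothetical choices; the setting above allows time-varying directed graphs), with only one agent holding an informative model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2                      # agents, hypotheses
theta_true = 0                   # index of the correct hypothesis

# Hypothetical likelihood models: agent i models its signal as
# Bernoulli(p[i, theta]).  Only agent 0 is informative; for the others
# the two hypotheses are observationally equivalent.
p = np.array([[0.8, 0.2],
              [0.5, 0.5],
              [0.5, 0.5],
              [0.5, 0.5]])

A = np.full((n, n), 0.25)                 # static, doubly stochastic mixing weights
log_mu = np.log(np.full((n, m), 0.5))     # uniform initial beliefs, in log space

for _ in range(300):
    s = (rng.random(n) < p[:, theta_true]).astype(float)     # private Bernoulli signals
    loglik = s[:, None] * np.log(p) + (1 - s[:, None]) * np.log(1 - p)
    log_mu = A @ log_mu + loglik                  # mix neighbors' log-beliefs, then Bayes
    log_mu -= log_mu.max(axis=1, keepdims=True)   # rescale for numerical stability

mu = np.exp(log_mu)
mu /= mu.sum(axis=1, keepdims=True)
print(mu[:, theta_true])   # every agent's belief concentrates on the true hypothesis
```

Even the uninformative agents learn the correct hypothesis: aggregation propagates agent 0's evidence through the network, and the beliefs concentrate at a geometric rate governed by the informative agent's likelihood separation and the mixing weights.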