2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)
DOI: 10.1109/hipc.2019.00032

On Linear Learning with Manycore Processors

Abstract: A new generation of manycore processors is on the rise that offers dozens and more cores on a chip and, in a sense, fuses host processor and accelerator. In this paper we target the efficient training of generalized linear models on these machines. We propose a novel approach for achieving parallelism which we call Heterogeneous Tasks on Homogeneous Cores (HTHC). It divides the problem into multiple fundamentally different tasks, which themselves are parallelized. For evaluation, we design a detailed, architec…
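The abstract only sketches how HTHC divides the work, so here is a minimal, hypothetical C++ sketch of the general idea: two fundamentally different tasks, a coordinate-selection task and a model-update task, run concurrently on disjoint groups of cores of the same chip. The toy importance score, the step size, and the relaxed-atomic synchronization are illustrative assumptions, not the paper's actual scheme.

```cpp
// Hypothetical sketch of the HTHC idea: two heterogeneous tasks run
// concurrently on homogeneous cores. Task A continuously scores
// coordinates; task B picks the best-scored coordinate and updates it.
#include <atomic>
#include <cmath>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    const int n = 1024;                         // model dimension (toy size)
    std::vector<std::atomic<double>> model(n);  // shared model, zero-initialized
    std::vector<std::atomic<double>> score(n);  // per-coordinate importance
    std::atomic<bool> done{false};

    // Task A ("selection"): would run on one group of cores. The score
    // here is a toy stand-in for an importance measure such as the
    // duality-gap-based criteria used for coordinate selection in GLMs.
    std::thread task_a([&] {
        while (!done.load(std::memory_order_acquire)) {
            for (int j = 0; j < n; ++j) {
                double m = model[j].load(std::memory_order_relaxed);
                score[j].store(std::abs(1.0 - m), std::memory_order_relaxed);
            }
        }
    });

    // Task B ("update"): would run on the remaining cores. It repeatedly
    // picks the currently most important coordinate and applies a toy
    // coordinate-descent step toward the (known) optimum 1.0.
    std::thread task_b([&] {
        for (int iter = 0; iter < 200000; ++iter) {
            int best = 0;
            for (int j = 1; j < n; ++j)
                if (score[j].load(std::memory_order_relaxed) >
                    score[best].load(std::memory_order_relaxed))
                    best = j;
            double m = model[best].load(std::memory_order_relaxed);
            model[best].store(m + 0.5 * (1.0 - m), std::memory_order_relaxed);
        }
        done.store(true, std::memory_order_release);
    });

    task_a.join();
    task_b.join();
    std::printf("model[0] after updates: %f\n",
                model[0].load(std::memory_order_relaxed));
    return 0;
}
```

Per the abstract, in HTHC the two tasks are themselves parallelized across their core groups; the sketch uses one thread per task only to keep the control flow visible.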

Cited by 2 publications (1 citation statement)
References 20 publications (20 reference statements)
“…Shared-memory architectures span compute resources such as many-core CPUs and hardware accelerators, which have many compute nodes that share the same physical memory space (Figure 1, left). Even though the nodes share the same physical memory, non-uniform memory access (NUMA) designs and deep cache hierarchies in today's architectures invalidate the assumption that nodes have immediate access to a memory region (see, e.g., [20][21][22][23][24][25] that revisit algorithms and take this issue into account). In shared-memory architectures, all the serialization primitives are in the same physical space.…”
Section: Architectures (mentioning)
confidence: 99%
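The NUMA point in the quoted statement is easy to see in code. Below is a minimal, self-contained C++/OpenMP sketch (not taken from the cited works) of the standard first-touch idiom: on common operating systems, a page is placed on the NUMA node of the thread that first writes it, so initializing an array with the same parallel schedule as the later compute loop keeps each thread's data node-local.

```cpp
// First-touch NUMA placement sketch: initialize and compute with the
// same static schedule so each thread mostly touches node-local pages.
#include <omp.h>
#include <cstdio>
#include <cstdlib>

int main() {
    const long n = 1L << 24;  // ~128 MiB per array of doubles
    double* x = static_cast<double*>(std::malloc(n * sizeof(double)));
    double* y = static_cast<double*>(std::malloc(n * sizeof(double)));

    // First touch: pages of x and y are faulted in here, on the node
    // of whichever thread writes them under this static schedule.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        x[i] = 1.0;
        y[i] = 2.0;
    }

    // Compute loop: the same schedule means each thread revisits the
    // pages it placed, so most accesses stay on the local node.
    double sum = 0.0;
    #pragma omp parallel for schedule(static) reduction(+ : sum)
    for (long i = 0; i < n; ++i)
        sum += x[i] * y[i];

    std::printf("dot = %f\n", sum);
    std::free(x);
    std::free(y);
    return 0;
}
```

If the two loops used different schedules, threads would frequently reach across the interconnect for remote pages, which is exactly the failed "immediate access" assumption the quoted statement refers to.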