Distributed training of massive machine learning models, in particular deep neural networks, via Stochastic Gradient Descent (SGD) is becoming commonplace. Several families of communication-reduction methods, such as quantization, large-batch methods, and gradient sparsification, have been proposed. To date, gradient sparsification methods, in which each node sorts its gradient components by magnitude, communicates only a subset of them, and accumulates the rest locally, are known to yield some of the largest practical gains. Such methods can reduce the amount of communication per step by up to three orders of magnitude while preserving model accuracy. Yet this family of methods currently has no theoretical justification. This is the gap we address in this paper. We prove that, under analytic assumptions, sparsifying gradients by magnitude with local error correction provides convergence guarantees for data-parallel SGD, for both convex and non-convex smooth objectives. The main insight is that, thanks to selection by magnitude, sparsification methods implicitly maintain bounds on the maximum impact of stale updates. Our analysis and empirical validation also reveal that these methods do require analytic conditions to converge well, justifying existing heuristics.
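To make the mechanism concrete, here is a minimal sketch of magnitude-based (top-k) sparsification with local error correction, the scheme analyzed above. The function name and the choice of `k` are illustrative assumptions, not the paper's implementation:

```python
# Illustrative sketch of top-k gradient sparsification with local error
# correction; the names and parameters are assumptions, not the paper's code.
import numpy as np

def topk_sparsify(grad, error, k):
    """Keep the k largest-magnitude components of (grad + error);
    accumulate the unsent remainder into the local error buffer."""
    corrected = grad + error                            # add back previously unsent mass
    idx = np.argpartition(np.abs(corrected), -k)[-k:]   # indices of top-k by magnitude
    sparse = np.zeros_like(corrected)
    sparse[idx] = corrected[idx]                        # components to communicate
    new_error = corrected - sparse                      # stale updates kept locally
    return sparse, new_error

# One data-parallel step: each node communicates only its sparse vector.
grad = np.random.randn(10)
error = np.zeros(10)
sparse, error = topk_sparsify(grad, error, k=2)
```

The error buffer carries the unsent gradient mass forward to later steps; because components are selected by magnitude, this buffer stays bounded, which is exactly the property the convergence analysis exploits.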
Applying machine learning techniques to the quickly growing data in science and industry requires highly scalable algorithms. Large datasets are most commonly processed in a data-parallel fashion, distributed across many nodes, where each node's contribution to the overall gradient is summed using a global allreduce. This allreduce is the single communication step, and thus the scalability bottleneck, for most machine learning workloads. We observe that, frequently, many gradient values are (close to) zero, leading to sparse or sparsifiable communication. To exploit this insight, we analyze, design, and implement a set of communication-efficient protocols for sparse input data, in conjunction with efficient machine learning algorithms that can leverage these primitives. Our communication protocols generalize standard collective operations by allowing processes to contribute arbitrary sparse input data vectors. Our generic communication library, SPARCML, extends MPI to support additional features, such as non-blocking (asynchronous) operations and low-precision data representations. As such, SPARCML and its techniques will form the basis of future highly scalable machine learning frameworks.
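As an illustration of the core primitive, the sketch below merges arbitrary sparse contributions by summing values at matching indices. It is a simplified stand-in for the reduction step of a sparse allreduce, not SPARCML's actual protocol:

```python
# Illustrative sparse reduction: each process contributes a sparse vector
# as {index: value} pairs; matching indices are summed. In a real allreduce,
# every process would end up holding the same merged result.
from collections import defaultdict

def sparse_allreduce(contributions):
    """Sum sparse vectors given as {index: value} dicts."""
    result = defaultdict(float)
    for vec in contributions:
        for idx, val in vec.items():
            result[idx] += val        # overlapping indices accumulate
    return dict(result)

# Two processes with mostly disjoint non-zeros: the merged result stays
# sparse, so far less data moves than in a dense allreduce.
merged = sparse_allreduce([{3: 0.5, 7: -1.0}, {7: 2.0, 42: 0.1}])
# merged == {3: 0.5, 7: 1.0, 42: 0.1}
```

The key design point is that the merged result of sparse inputs is often still sparse, so intermediate messages remain small throughout the reduction.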
Deep neural networks have recently advanced the state of the art in image compression and surpassed many traditional compression algorithms. Training such networks involves carefully trading off the entropy of the latent representation against reconstruction quality. The notion of quality crucially depends on the observer of the images, who, in the vast majority of the literature, is assumed to be human. In this paper, we aim to go beyond this notion of compression quality and consider human visual perception and image classification simultaneously. To that end, we use a family of loss functions that allows us to optimize deep image compression depending on the observer and to interpolate between human-perceived visual quality and classification accuracy, enabling a more unified view of image compression. Our extensive experiments show that using perceptual loss functions to train a compression system preserves classification accuracy much better than traditional codecs such as BPG, without requiring retraining of classifiers on compressed images. For example, compressing ImageNet to 0.25 bpp reduces Inception-ResNet classification accuracy by only 2%. At the same time, when using a human-friendly loss function, the same compression system achieves competitive performance in terms of MS-SSIM. By combining these two objective functions, we show that there is a pronounced trade-off in compression quality between the human visual system and classification accuracy.
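A minimal sketch of such an observer-dependent objective is shown below, interpolating between a perceptual term and a classification term on top of the usual rate penalty. The weighting scheme, the symbol names (`alpha`, `lambda_rate`), and the use of MSE as a stand-in for MS-SSIM are assumptions for illustration, not the paper's exact formulation:

```python
# Illustrative observer-dependent rate-distortion loss; alpha interpolates
# between human-perceived quality (alpha=1) and classification (alpha=0).
import torch
import torch.nn.functional as F

def compression_loss(rate, x, x_hat, logits, labels, alpha, lambda_rate):
    """rate: estimated entropy of the latent representation;
    x, x_hat: original and reconstructed images;
    logits: classifier output on x_hat; labels: ground-truth classes."""
    perceptual = F.mse_loss(x_hat, x)                 # stand-in for MS-SSIM
    classification = F.cross_entropy(logits, labels)  # classifier-friendly term
    distortion = alpha * perceptual + (1.0 - alpha) * classification
    return distortion + lambda_rate * rate            # rate-distortion trade-off
```

Sweeping `alpha` between 0 and 1 traces out the trade-off curve between the two notions of quality described above.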