2022
DOI: 10.48550/arxiv.2210.17357
Preprint

L-GreCo: An Efficient and General Framework for Layerwise-Adaptive Gradient Compression

Cited by 1 publication (1 citation statement)
References 0 publications
“…For instance, exploring data and model parallelism has led to significant advancements in the scalability and efficiency of training large neural networks [9]. Additionally, researchers have been investigating the impact of communication strategies, such as gradient compression [10] and decentralized optimization [11], to reduce the communication overhead and latency associated with the distributed training process. Furthermore, novel approaches, such as federated learning [12], have been proposed to enable collaborative learning among multiple devices while preserving data privacy.…”
Section: Introduction (mentioning)
confidence: 99%
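The citing passage points to gradient compression as a way to reduce communication overhead in distributed training. As a rough illustration only, and not the L-GreCo method described in the indexed preprint, the sketch below shows generic per-layer top-k gradient sparsification with hand-picked, assumed per-layer ratios; layerwise-adaptive schemes differ precisely in how those ratios are chosen.

```python
# A minimal, generic sketch of layerwise top-k gradient sparsification, the kind of
# building block that layerwise-adaptive schemes tune per layer. This is NOT the
# L-GreCo algorithm; the layer names and per-layer ratios below are illustrative
# assumptions, not values produced by the paper's method.
import numpy as np

def topk_compress(grad: np.ndarray, ratio: float):
    """Keep the `ratio` fraction of entries with largest magnitude; drop the rest.

    Returns the kept values and their flat indices (what a worker would transmit).
    """
    flat = grad.ravel()
    k = max(1, int(ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the k largest |g_i|
    return flat[idx], idx

def topk_decompress(values: np.ndarray, idx: np.ndarray, shape) -> np.ndarray:
    """Rebuild a dense gradient from the transmitted (values, indices) pair."""
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values
    return flat.reshape(shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical per-layer gradients and per-layer compression ratios (assumed).
    grads = {"conv1": rng.normal(size=(64, 3, 3, 3)),
             "fc":    rng.normal(size=(10, 512))}
    ratios = {"conv1": 0.05, "fc": 0.01}

    for name, g in grads.items():
        vals, idx = topk_compress(g, ratios[name])
        g_hat = topk_decompress(vals, idx, g.shape)
        sent = vals.size + idx.size
        print(f"{name}: sent {sent} of {g.size} entries, "
              f"relative error {np.linalg.norm(g - g_hat) / np.linalg.norm(g):.3f}")
```

The usage loop prints how many entries each layer would transmit and the resulting approximation error, which is the trade-off a layerwise-adaptive scheme balances when assigning each layer its own compression level.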