Near Optimal Coded Data Shuffling for Distributed Learning

Attia, Mohamed Adel; Tandon, Ravi

doi:10.1109/tit.2019.2926704

Cited by 25 publications (19 citation statements)

References 52 publications (105 reference statements)


“…Attia and Tandon presented a theoretic lower bound on the communication overhead for data shuffling as a function of the number of workers, number of data points, and the available storage per node. They proposed a coded communication scheme to show that the communication overhead is within a multiplicative factor according to the theoretic lower bound. Elmahdy and Mohajer considered the data shuffling problem that a master node communicates a set of files to a set of worker nodes through a shared link.…”

Section: Related Work

mentioning

confidence: 99%

“…They proposed a coded communication scheme to show that the communication overhead is within a multiplicative factor according to the theoretic lower bound [30]. Elmahdy and Mohajer considered the data shuffling problem that a master node communicates a set of files to a set of worker nodes through a shared link. They proposed a deterministic and systematic coded shuffling scheme to find out the exact rate of cache files.…”

Section: Related Work Comparisons

mentioning

confidence: 99%


Performance enhancement for iterative data computing with in‐memory concurrent processing

Wen, Chen, Chiu, et al. 2019. Concurrency and Computation


Summary: The big data era has resulted in the development of several data analysis tools. Spark is an in‐memory processing tool suited to iterative and interactive data mining. This tool possesses higher data‐processing performance than MapReduce, which is an offline storage mechanism. However, some disadvantages of in‐memory processing, such as massive in‐memory data requirements, cause cross‐node data transfers that result in a long computation time. The performance of the process can be improved if the in‐memory process is executed with fewer shuffle instructions. Therefore, this study aims to enhance the performance of iterative applications through instruction replacement. Three empirical research cases with diverse datasets and iterations are used to modify the program. We adopt a strategy of downloading a small resilient distributed dataset and replacing the shuffle‐included instructions to shorten the processing time, with automated code replacement based on exhaustive code matching. The experimental results reveal an improvement of up to 39% in the execution time compared with the existing in‐memory processing programs with various dataset sizes.


“…Inspired by the achievable and converse bounds for the single-bottleneck-link caching problem in [8]-[10], the authors in [11] then proposed a general coded data shuffling scheme, which was shown to be order optimal to within a factor of 2 under the constraint of uncoded storage. Also in [11], the authors improved the performance of the general coded shuffling scheme by introducing an aligned coded delivery, which was shown to be optimal under the constraint of uncoded storage. Recently, inspired by the improved data shuffling scheme in [11], the authors in [12] proposed a linear coding scheme based on interference alignment, which achieves the optimal worst-case communication load under the constraint of uncoded storage for all system parameters. In addition, under the constraint of uncoded storage, the proposed coded data shuffling scheme in [12] was shown to be optimal for any shuffles (not just for the worst-case) when q = 1.…”

mentioning

confidence: 99%


“…Recently, inspired by the improved data shuffling scheme in [11], the authors in [12] proposed a linear coding scheme based on interference alignment, which achieves the optimal worst-case communication load under the constraint of uncoded storage for all system parameters. In addition, under the constraint of uncoded storage, the proposed coded data shuffling scheme in [12] was shown to be optimal for any shuffles (not just for the worst-case) when q = 1.…”

mentioning

confidence: 99%


Fundamental Limits of Decentralized Data Shuffling

Wan, Tuninetti, et al. 2020. IEEE Trans. Inform. Theory


Data shuffling of training data among different computing nodes (workers) has been identified as a core element to improve the statistical performance of modern large scale machine learning algorithms. Data shuffling is often considered as one of the most significant bottlenecks in such systems due to the heavy communication load. Under a master-worker architecture (where a master has access to the entire dataset and only communication between the master and the workers is allowed) coding has been recently proved to considerably reduce the communication load. This work considers a different communication paradigm referred to as decentralized data shuffling, where workers are allowed to communicate with one another via a shared link. The decentralized data shuffling problem has two phases: workers communicate with each other during the data shuffling phase, and then workers update their stored content during the storage phase. For the case of uncoded storage (i.e., each worker directly …

A short version of this paper was presented …

… scheme for which the master simply transmits the missing but required data to the workers by directly broadcasting the missing bits over the shared link.

The centralized coded data shuffling scheme with coordinated (i.e., deterministic) uncoded storage update phase was originally proposed in [6], [7] to further reduce the communication load for the worst-case shuffles compared to [3]. The proposed schemes in [6], [7] are optimal under the constraint of uncoded storage for the cases where there is no extra memory for each worker (i.e., q = 1) or there are less than or equal to three workers in the systems. Inspired by the achievable and converse bounds for the single-bottleneck-link caching problem in [8]-[10], the authors in [11] then proposed a general coded data shuffling scheme, which was shown to be order optimal to within a factor of 2 under the constraint of uncoded storage. Also in [11], the authors improved the performance of the general coded shuffling scheme by introducing an aligned coded delivery, which was shown to be optimal under the constraint of uncoded storage. Recently, inspired by the improved data shuffling scheme in [11], the authors in [12] proposed a linear coding scheme based on interference alignment, which achieves the optimal worst-case communication load under the constraint of uncoded storage for all system parameters. In addition, under the constraint of uncoded storage, the proposed coded data shuffling scheme in [12] was shown to be optimal for any shuffles (not just for the worst-case) when q = 1.

B. Decentralized Data Shuffling

An important limitation of the centralized framework is the assumption that workers can only receive packets from the master. Since the entire data set is stored in a decentralized fashion across the workers at each epoch of the distributed learning algorithm, the master may not be needed in the data shuffling phase if workers can communicate with each other (e.g., [1]). In addition, the communication among workers can be much more ef...
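The difference between uncoded broadcasting and coded delivery in the master-worker setting can be seen in a toy case. The following is a minimal Python sketch, not the scheme of [6], [7], [11], or [12]: it assumes two workers, one equal-size data unit per worker, no extra storage, and a shuffle that swaps the two units. The names `unit_a`, `unit_b`, and `xor_bytes` are hypothetical and chosen only for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("data units must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

# Assumed toy setup: the master holds both units, each worker stores one.
unit_a = b"batch-A!"  # currently stored at worker 1
unit_b = b"batch-B!"  # currently stored at worker 2

# The next epoch swaps the assignment:
# worker 1 needs unit_b and worker 2 needs unit_a.

# Uncoded delivery: the master broadcasts each missing unit separately.
uncoded_broadcast = [unit_b, unit_a]
uncoded_load = sum(len(message) for message in uncoded_broadcast)

# Coded delivery: the master broadcasts a single XOR of the two units.
coded_broadcast = xor_bytes(unit_a, unit_b)
coded_load = len(coded_broadcast)

# Each worker cancels the unit it already stores to recover the one it needs.
worker1_new = xor_bytes(coded_broadcast, unit_a)
worker2_new = xor_bytes(coded_broadcast, unit_b)

print(f"uncoded load: {uncoded_load} bytes, coded load: {coded_load} bytes")
```

In this toy case a single coded broadcast serves both workers, so the load over the shared link is halved. The schemes discussed in the excerpt address general numbers of workers, storage sizes, and shuffles, which this sketch does not attempt to cover.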


Fault‐tolerant quantum chemical calculations with improved machine‐learning models

Yuan, Zhou, et al. 2024. J Comput Chem


Easy and effective usage of computational resources is crucial for scientific calculations. Following our recent work on machine‐learning (ML) assisted scheduling optimization [J. Comput. Chem. 2023, 44, 1174], we further propose (1) improved ML models for better prediction of computational loads, so that more elaborate load‐balancing calculations can be expected; (2) the idea of coded computation, that is, the integration of gradient coding, in order to introduce fault tolerance during the distributed calculations; and (3) their application, together with the re‐normalized exciton model with time‐dependent density functional theory (REM‐TDDFT), to calculating excited states. Illustrative benchmark calculations include the P38 protein and a solvent model with one or several excitable centers. The results show that the improved ML‐assisted coded calculations can further improve load balancing and cluster utilization, owing primarily to the gain in fault tolerance, which is aimed at automated quantum chemical calculations for both ground and excited states.
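The fault tolerance that gradient coding provides can be illustrated with the standard three-worker example, in which the sum of three partial gradients is recoverable from any two workers. The sketch below is a generic Python illustration under assumed values, not the implementation of the cited work: the partial gradients `g1`, `g2`, `g3`, the tables `ENCODE` and `DECODE`, and the helpers `combine` and `recover` are all hypothetical.

```python
from itertools import combinations

def combine(coeffs, vectors):
    """Return the linear combination sum_i coeffs[i] * vectors[i]."""
    return [sum(c * v[k] for c, v in zip(coeffs, vectors))
            for k in range(len(vectors[0]))]

# Assumed partial gradients, one per data partition (illustrative values).
g1, g2, g3 = [1.0, 2.0], [3.0, -1.0], [0.5, 4.0]
full_gradient = combine([1, 1, 1], [g1, g2, g3])

# Each worker holds two of the three partitions and sends one coded message.
ENCODE = {
    1: [0.5, 1.0, 0.0],   # worker 1 sends g1/2 + g2
    2: [0.0, 1.0, -1.0],  # worker 2 sends g2 - g3
    3: [0.5, 0.0, 1.0],   # worker 3 sends g1/2 + g3
}
messages = {w: combine(c, [g1, g2, g3]) for w, c in ENCODE.items()}

# Decoding weights for every pair of surviving workers.
DECODE = {
    (1, 2): [2.0, -1.0],  # 2*m1 - m2 = g1 + g2 + g3
    (1, 3): [1.0, 1.0],   # m1 + m3   = g1 + g2 + g3
    (2, 3): [1.0, 2.0],   # m2 + 2*m3 = g1 + g2 + g3
}

def recover(survivors):
    """Rebuild the full gradient from the messages of two surviving workers."""
    pair = tuple(sorted(survivors))
    return combine(DECODE[pair], [messages[w] for w in pair])

for pair in combinations([1, 2, 3], 2):
    print(pair, recover(pair) == full_gradient)
```

Any single worker can fail or straggle without blocking the computation, at the cost of each worker processing two partitions instead of one.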
