Published: 2019
DOI: 10.1007/s10766-018-00623-w

Improving the Performance of Distributed MXNet with RDMA

Cited by 12 publications (3 citation statements)
References 15 publications
“…MXNet is an open-source DL framework developed by Apache. It provides a wide range of tools and libraries for building and deploying DL models, including CNNs [174,175]. MXNet supports both high-level APIs, such as Gluon, and low-level APIs, which allow for greater control over the model architecture.…”
Section: MXNet (mentioning)
confidence: 99%
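
The distinction the excerpt draws between MXNet's high-level Gluon API and its lower-level interfaces can be made concrete with a short sketch. This is an assumed, minimal example rather than code from the cited works; the layer sizes, optimizer settings, and random dummy batch are placeholders.

```python
# Minimal sketch (assumed setup, not code from the cited papers): the same tiny
# classifier written once with the high-level Gluon API and once with the
# low-level NDArray API that gives direct control over parameters.
import mxnet as mx
from mxnet import nd, autograd, gluon

# --- High-level Gluon API: layers declared, parameters managed automatically ---
net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(64, activation='relu'),
        gluon.nn.Dense(10))
net.initialize(mx.init.Xavier())
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

x = nd.random.uniform(shape=(32, 784))                       # dummy batch
y = nd.random.randint(0, 10, shape=(32,)).astype('float32')  # dummy labels

with autograd.record():
    loss = loss_fn(net(x), y)
loss.backward()
trainer.step(batch_size=32)

# --- Low-level NDArray API: the forward pass written out by hand ---
w = nd.random.normal(shape=(784, 10)) * 0.01
b = nd.zeros(10)
logits = nd.dot(x, w) + b
```
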
“…Under certain permissions, researchers may utilize the codes directly or construct new models based on the codes. Theano, Caffe, TensorFlow, and MXNet [87] can run data-intensive operations 140 times faster on GPUs than on CPUs. TensorFlow [88] is an open-source software framework that uses data flow graphs to do numerical computations.…”
Section: Hardware/Software Tools Used With Deep Learning (mentioning)
confidence: 99%
“…Currently, the most common training mode in machine learning is the iterative-convergence training mode. In the mainstream distributed implementation, each worker performs one iteration of gradient descent, submits its local gradient to the parameter server, and enters a synchronization barrier until all workers have completed the iteration, after which the barrier is released for the next iteration [4]. As shown in Figure 1, this parameter-communication strategy, which adds a synchronization barrier to guarantee global consistency when parameters are updated, is called the overall synchronization parallel strategy.…”
Section: Overall Synchronization Parallel Strategy (mentioning)
confidence: 99%
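
The synchronization-barrier scheme this excerpt describes (often called bulk synchronous parallel) can be sketched in a few lines. The example below is an assumed toy, not code from the cited work: Python threads stand in for distributed workers, worker 0 plays the role of the parameter server, and a quadratic loss replaces a real model.

```python
# Minimal sketch of the synchronous-parallel strategy described above:
# each worker computes a local gradient, a barrier waits for all workers,
# the parameters are updated once per round, and a second barrier releases
# everyone for the next iteration.
import threading
import numpy as np

NUM_WORKERS = 4
LR = 0.1

params = np.array([5.0, -3.0])              # shared model parameters
grads = [None] * NUM_WORKERS                # per-worker local gradients
barrier = threading.Barrier(NUM_WORKERS)    # synchronization barrier

def local_gradient(theta, shard):
    # gradient of 0.5 * ||theta - shard||^2 on this worker's data shard
    return theta - shard

def worker(rank, shard, rounds):
    global params
    for _ in range(rounds):
        grads[rank] = local_gradient(params, shard)     # local iteration
        barrier.wait()                                  # wait for all workers
        if rank == 0:                                   # "parameter server" update
            params = params - LR * np.mean(grads, axis=0)
        barrier.wait()                                  # release for next iteration

shards = [np.array([1.0, 2.0]) + rank for rank in range(NUM_WORKERS)]
threads = [threading.Thread(target=worker, args=(r, shards[r], 20))
           for r in range(NUM_WORKERS)]
for t in threads: t.start()
for t in threads: t.join()
print("converged parameters:", params)
```

Removing the second barrier would let fast workers start the next round with stale parameters, which is exactly the global-consistency guarantee the synchronous strategy trades communication time for.
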