2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI: 10.1109/ipdpsw52791.2021.00019

Practice and Experience in using Parallel and Scalable Machine Learning with Heterogenous Modular Supercomputing Architectures

Cited by 8 publications (5 citation statements) · References: 32 publications
“…Parallel learning [60]: the objective is to accelerate the learning procedure and scale up the scheme.…”
Section: Methods
confidence: 99%
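As context for the statement above, here is a minimal sketch, assumed rather than taken from the cited paper or the citing paper, of the data-parallel idea behind parallel learning: each worker computes a gradient on its own shard of the batch, and the averaged gradient drives the update, so the learning procedure is accelerated by doing the per-shard work concurrently. All names and values below are placeholders.

import numpy as np

def local_gradient(w, x, y):
    # Gradient of the mean squared error for a linear model y ~ x @ w.
    return 2.0 * x.T @ (x @ w - y) / len(x)

rng = np.random.default_rng(0)
x, y = rng.normal(size=(256, 4)), rng.normal(size=(256, 1))
w = np.zeros((4, 1))

n_workers = 4
for step in range(100):
    # Split the batch into equal shards, one per worker.
    shards = zip(np.array_split(x, n_workers), np.array_split(y, n_workers))
    # Each worker's gradient; in a real setup these are computed concurrently.
    grads = [local_gradient(w, xs, ys) for xs, ys in shards]
    # Averaged (all-reduce style) update, equivalent to one full-batch step.
    w -= 0.01 * np.mean(grads, axis=0)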
“…In the GRU model, the kernel_initializer is glorot_uniform, and the learning rate is 0.001. Since the model training runs on the JUWELS-BOOSTER [33] and DEEP-DAM [21] machines, a distribution strategy from the TensorFlow interface is applied to distribute the training across multiple GPUs with custom training loops [34]. The training has been set up to use 1 to 4 GPUs on one node.…”
Section: Forecasting Model Set Up and Parallel Computing
confidence: 99%
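The statement above describes distributing a custom training loop across the GPUs of a single node through TensorFlow's distribution-strategy API. Below is a minimal sketch of that pattern using tf.distribute.MirroredStrategy as one plausible choice; only the glorot_uniform initializer and the 0.001 learning rate come from the quoted text, while the GRU size, sequence shape, and synthetic dataset are placeholder assumptions.

import tensorflow as tf

# Uses all GPUs visible on the node (1 to 4 in the quoted setup).
strategy = tf.distribute.MirroredStrategy()
GLOBAL_BATCH = 64 * strategy.num_replicas_in_sync

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(20, 3)),  # (time steps, features) -- placeholder shape
        tf.keras.layers.GRU(64, kernel_initializer="glorot_uniform"),
        tf.keras.layers.Dense(3),
    ])
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    # No automatic reduction: average per-example losses over the global batch.
    loss_fn = tf.keras.losses.MeanSquaredError(
        reduction=tf.keras.losses.Reduction.NONE)

def compute_loss(y, y_pred):
    per_example = loss_fn(y, y_pred)
    return tf.nn.compute_average_loss(per_example, global_batch_size=GLOBAL_BATCH)

@tf.function
def train_step(dist_inputs):
    def step_fn(inputs):
        x, y = inputs
        with tf.GradientTape() as tape:
            loss = compute_loss(y, model(x, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss
    per_replica_losses = strategy.run(step_fn, args=(dist_inputs,))
    return strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica_losses, axis=None)

# Placeholder random sequences; the cited work trains on particle
# velocity/location time series instead.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([1024, 20, 3]), tf.random.normal([1024, 3]))
).batch(GLOBAL_BATCH)
dist_dataset = strategy.experimental_distribute_dataset(dataset)

for epoch in range(2):
    for batch in dist_dataset:
        loss = train_step(batch)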
“…However, the prediction model relies only on the velocity and location time series, and the training does not include parameters such as particle size, turbulence intensity, gravity, and strain rate. The parallel computing machines JUWELS-BOOSTER and DEEP-DAM [21] from the Jülich Supercomputing Centre are used to accelerate the GRU model training process. Hence, this manuscript is organized as follows.…”
Section: Introduction
confidence: 99%
“…The MinMaxScaler is a type of scaler that scales the minimum and maximum values to be 0 and 1, respectively [30]. Since the modeling was implemented on the DEEP-DAM module [31] parallel computing machine, we applied the distributed-strategy application programming interface from the TensorFlow platform to distribute the training across multiple GPUs with custom training loops [32]. The strategy has been set up with one to four GPUs on one node.…”
Section: LSTM and GRU Model Set Up
confidence: 99%
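For reference, a minimal sketch of the MinMaxScaler step quoted above, using scikit-learn's MinMaxScaler with its default [0, 1] range; the input array is a hypothetical example, not data from the cited work.

import numpy as np
from sklearn.preprocessing import MinMaxScaler

series = np.array([[10.0], [15.0], [20.0], [30.0]])   # hypothetical single-feature series
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(series)       # minimum maps to 0, maximum to 1:
                                            # [[0.0], [0.25], [0.5], [1.0]]
original = scaler.inverse_transform(scaled) # undo the scaling, e.g. for predictions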