Matrix multiplication has been implemented in many programming languages, and improved performance has been reported in numerous articles under a variety of settings. The operation is of central interest to applications ranging from machine learning and animation to lightweight matrix-based key management protocols for IoT networks, so there is a continuing demand for faster implementations. In this work, we compared the run times of matrix multiplication in the popular languages C++, Java, and Python. The analysis showed that the Python implementation was by far the slowest, while Java was noticeably slower than C++. All three languages use a row-major storage scheme, so a naive triple loop incurs many cache misses. We show that simply changing the loop order yields further performance gains, and we evaluated this by comparing execution times under several loop orderings, observing large improvements due to better spatial locality. In addition, we implemented a parallel version of the same algorithm using OpenMP on eight logical cores and achieved a speed-up of roughly seven times over the serial implementation.
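As a minimal illustration of the loop-order effect described above (the flat row-major std::vector layout and the function names are our own assumptions, not the exact code benchmarked here), the sketch below contrasts the naive i-j-k ordering, whose innermost loop strides down a column of B, with the i-k-j ordering, whose innermost loop walks B and C row by row:

#include <cstddef>
#include <vector>

// Naive i-j-k order: the innermost loop reads B[k * n + j] with stride n,
// so consecutive iterations touch different cache lines of the row-major array.
void multiply_ijk(const std::vector<double>& A, const std::vector<double>& B,
                  std::vector<double>& C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

// Reordered i-k-j version: the innermost loop now reads B and writes C
// contiguously, giving much better spatial locality. C is assumed to be
// zero-initialized before the call.
void multiply_ikj(const std::vector<double>& A, const std::vector<double>& B,
                  std::vector<double>& C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double a = A[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[k * n + j];
        }
}

Both functions perform the same n^3 multiply-add operations; only the memory access pattern differs, which is where the locality-related gains come from.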
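For the OpenMP parallelization, a simple sketch under the same assumptions (flat row-major storage, zero-initialized C; again not the paper's exact benchmark code) is to split the outer i loop across threads, since each thread then writes disjoint rows of C:

#include <cstddef>
#include <vector>

// OpenMP variant of the reordered i-k-j loop: rows of C are independent,
// so the outer loop can be distributed across the available logical cores.
void multiply_ikj_omp(const std::vector<double>& A, const std::vector<double>& B,
                      std::vector<double>& C, std::size_t n) {
    #pragma omp parallel for schedule(static)
    for (long long ii = 0; ii < static_cast<long long>(n); ++ii) {
        const std::size_t i = static_cast<std::size_t>(ii);
        for (std::size_t k = 0; k < n; ++k) {
            const double a = A[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[k * n + j];
        }
    }
}

A sketch like this would be compiled with OpenMP enabled (for example, g++ -O2 -fopenmp) and run with OMP_NUM_THREADS set to the number of logical cores available on the machine.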