Yehia Arafa scite author profile

The last decade has seen a shift in the computer systems industry toward heterogeneous computing. Graphics Processing Units (GPUs) are now found in everything from supercomputers to mobile phones and tablets. GPUs are used for graphics operations as well as general-purpose computing (GPGPU) to boost the performance of compute-intensive applications. However, a substantial share of their architectural characteristics remains undisclosed by vendors. In this paper, we introduce a very low-overhead, portable analysis that exposes the latency of each instruction executing in the GPU pipeline(s) and the access overhead of the various memory hierarchies found in GPUs at the micro-architecture level. Furthermore, we show the impact of the various optimizations the CUDA compiler can perform on these latencies. We perform our evaluation on seven high-end NVIDIA GPUs from five generations/architectures: Kepler, Maxwell, Pascal, Volta, and Turing. The results in this paper can give architects an accurate characterization of the latencies of these GPUs, which will help in modeling the hardware accurately. Software developers can also use the results to make informed optimizations to their applications.

Fast, accurate, and scalable memory modeling of GPGPUs using reuse profiles

Arafa, Badawy, Chennupati, et al. 2020

Demystifying the Nvidia Ampere Architecture through Microbenchmarking and Instruction-level Analysis

Abdelkhalik, Arafa, Santhi, et al. 2022

Yehia Arafa

Verified instruction-level energy consumption measurement for NVIDIA GPUs

PPT-GPU: Scalable GPU Performance Modeling

Low Overhead Instruction Latency Characterization for NVIDIA GPGPUs

Fast, accurate, and scalable memory modeling of GPGPUs using reuse profiles

Demystifying the Nvidia Ampere Architecture through Microbenchmarking and Instruction-level Analysis
