Abstract. This paper presents methods for instrumenting and measuring applications' memory allocation behavior over time. It provides background on possible performance problems related to memory allocation and to memory allocator libraries. Then, different methods for data acquisition and representation are discussed. Finally, memory allocation tracing integrated in VampirTrace is demonstrated with a real-world HPC example application from aerodynamic simulation and optimization.
Performance analysis of applications on modern high-end petascale systems is increasingly challenging due to the rising complexity and quantity of the computing units. This paper presents a performance analysis study with the Vampir performance analysis tool suite that examines the application behavior as well as the fundamental system properties. The study is done on ORNL's Cray XT4 system Jaguar, which consists of more than 30,000 CPU cores. We analyze the FLASH simulation code, which is designed to scale towards tens of thousands of CPU cores. At this scale, applying existing performance analysis tools becomes very complex. Yet, the study reveals two classes of performance problems that become relevant with very high CPU counts: MPI communication and scalable I/O. For both, solutions are presented and verified. Finally, the paper proposes improvements and extensions for event tracing tools so that the tools can scale towards higher degrees of parallelism.