This paper gives an overview of the Score-P performance measurement infrastructure, which is being jointly developed by leading HPC performance tools groups. It motivates the advantages of the joint undertaking from both the developer and the user perspectives, and presents the design and components of the newly developed Score-P performance measurement infrastructure. Furthermore, it contains first evaluation results in comparison with existing performance tools and presents an outlook on the long-term cooperative development of the new system.
We have optimized and parallelized the GENEHUNTER-TWOLOCUS program, which performs linkage analysis with two trait loci in the multimarker context. The optimization of the serial program, before parallelization, yields a speedup of more than a factor of 10. The parallelization affects the two-locus-score calculation, which is predominant in terms of computation time. We obtain perfect speedup, that is, the computation time decreases exactly by a factor of the number of processors. In addition, two-locus LOD and NPL scores are now calculated for varying genetic positions of both disease loci, rather than varying one locus while holding the position of the other disease locus fixed, as before. This results in easily interpretable 3-D plots. We have reanalyzed a pedigree with hypercholesterolemia using our new version of GENEHUNTER-TWOLOCUS. Whereas originally two individuals had to be discarded due to excessive computation-time demands, the entire 17-bit pedigree could now be analyzed as a whole. We obtain a two-trait-locus LOD score of 5.49 under a multiplicative model, compared to LOD scores of 3.08 and 2.87 under a heterogeneity and an additive model, respectively. This further increases the evidence for linkage to both the 1p36.1-p35 and 13q22-q32 regions, and corroborates the hypothesis that the two genes act in a multiplicative way on LDL cholesterol level. Furthermore, we compare the computation times for two-trait-locus analysis needed by the programs GENEHUNTER-TWOLOCUS, TLINKAGE, and SUPERLINK. Altogether, our algorithmic improvements of GENEHUNTER-TWOLOCUS allow researchers to analyze complex diseases under realistic two-trait-locus models with pedigrees of reasonable size and using many markers.
The rapidly growing number of cores on modern supercomputers imposes scalability demands not only on applications but also on the software tools needed for their development. At the same time, increasing application and system complexity makes the optimization of parallel codes more difficult, creating a need for scalable performance-analysis technology with advanced functionality. However, delivering such an expensive technology can hardly be accomplished by individual tool developers and requires a higher degree of collaboration within the HPC community. The unified performance-measurement system Score-P is a joint effort of several academic performance-tool builders, funded under the BMBF program HPC-Software für skalierbare Parallelrechner in the SILC project (Skalierbare Infrastruktur zur automatischen Leistungsanalyse paralleler Codes). It is being developed with the objective of creating a common basis for several complementary optimization tools in the service of enhanced scalability, improved interoperability, and reduced maintenance cost.
Version 3.0 of the OpenMP specification introduced the task construct for the explicit expression of dynamic task parallelism. Although automated load-balancing capabilities make it an attractive parallelization approach for programmers, the difficulty of integrating this new dimension of parallelism into traditional models of performance data has so far prevented the emergence of appropriate performance tools. Building on our earlier work, in which we introduced instrumentation for task-based programs, we present initial concepts for analyzing the data delivered by this instrumentation. We define three typical performance problems related to tasking and show how they can be visually explored using event traces. Special emphasis is placed on the event model used to capture the execution of task instances and on how the time consumed by the program is mapped onto tasks in the most meaningful way. We illustrate our approach with practical examples.