Dounia Khaldi scite author profile

We introduce a new parallelization framework for scientific computing based on BDSC, an efficient automatic scheduling algorithm for parallel programs in the presence of resource constraints on the number of processors and their local memory size. BDSC extends Yang and Gerasoulis's Dominant Sequence Clustering (DSC) algorithm; it uses sophisticated cost models and addresses both shared and distributed parallel memory architectures. We describe BDSC, its integration within the PIPS compiler infrastructure and its application to the parallelization of four well-known scientific applications: Harris, ABF, equake and IS. Our experiments suggest that BDSC's focus on efficient resource management leads to significant parallelization speedups on both shared and distributed memory systems, improving upon DSC results, as shown by the comparison of the sequential and parallelized versions of these four applications running on both OpenMP and MPI frameworks.

Optimizing GPU Register Usage: Extensions to OpenACC and Compiler Optimizations

Tian, Khaldi, Eachempati, et al. 2016

Task Parallelism and Data Distribution: An Overview of Explicit Parallel Programming Languages

Khaldi, Jouvelot, Ancourt, et al. 2013

Abstract. Programming parallel machines as effectively as sequential ones would ideally require a language that provides high-level programming constructs to avoid the programming errors frequent when expressing parallelism. Since task parallelism is considered more error-prone than data parallelism, we survey six popular and efficient parallel language designs that tackle this difficult issue: Cilk, Chapel, X10, Habanero-Java, OpenMP and OpenCL. Using a parallel implementation of the Mandelbrot set computation as a single running example, this paper describes how the fundamentals of task parallel programming, i.e., collective and point-to-point synchronization and mutual exclusion, are dealt with in these languages. We discuss how these languages allocate and distribute data over memory. Our study suggests that, even though these languages introduce many keywords and notions, they all boil down, as far as control issues are concerned, to three key task concepts: creation, synchronization and atomicity. Regarding memory models, these languages adopt one of three approaches: shared memory, message passing and PGAS (Partitioned Global Address Space). The paper is designed to give users and language and compiler designers an up-to-date comparative overview of current parallel languages.
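The three task concepts named in the abstract (creation, synchronization and atomicity) can be illustrated with a small shared-memory Mandelbrot sketch. This is not code from the paper, and Python is not one of the six surveyed languages; the function names `escapes` and `count_inside`, the grid size and the iteration cap are all assumptions chosen for illustration. Python threads are used here only to show the control structure, not to demonstrate speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_ITER = 50  # illustrative iteration cap, not taken from the paper


def escapes(c, max_iter=MAX_ITER):
    """Return True if c leaves the radius-2 disk within max_iter steps."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return True
    return False


def count_inside(width=40, height=20, workers=4):
    """Count grid points that stay bounded, using one task per image row."""
    inside = 0
    lock = threading.Lock()

    def row_task(row):
        nonlocal inside
        y = -1.0 + 2.0 * row / (height - 1)
        local = 0
        for col in range(width):
            x = -2.0 + 3.0 * col / (width - 1)
            if not escapes(complex(x, y)):
                local += 1
        # Atomicity: the shared counter is updated under mutual exclusion.
        with lock:
            inside += local

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Creation: each submit() call spawns one task.
        futures = [pool.submit(row_task, r) for r in range(height)]
        # Synchronization: wait for every task; result() re-raises task errors.
        for f in futures:
            f.result()
    return inside
```

Because every update to the shared counter happens under the lock, `count_inside(workers=1)` and `count_inside(workers=4)` return the same value. Accumulating into `local` first keeps the critical section to a single addition per row.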

Dounia Khaldi

Towards Automatic HBM Allocation Using LLVM: A Case Study with Knights Landing

A Comparative Survey of the HPC and Big Data Paradigms: Analysis and Experiments

Parallelizing with BDSC, a resource-constrained scheduling algorithm for shared and distributed memory systems

Optimizing GPU Register Usage: Extensions to OpenACC and Compiler Optimizations

Task Parallelism and Data Distribution: An Overview of Explicit Parallel Programming Languages
