OpenMP directives are the de facto standard for shared-memory parallel programming. However, OpenMP does not guarantee the correctness of the parallel execution of a given loop if runtime data dependences arise. Consequently, many highly parallel regions cannot be safely parallelized with OpenMP because a dependence violation is possible. In this paper, we propose to augment OpenMP capabilities by adding Thread-Level Speculation (TLS) support. Our contribution is threefold. First, we have defined a new speculative clause for variables inside parallel loops. This clause ensures that all accesses to these variables will be carried out according to sequential semantics. Second, we have created a new, software-based TLS runtime library to ensure correctness in the parallel execution of OpenMP loops that include speculative variables. Third, we have developed a new GCC plugin, which seamlessly translates our OpenMP speculative clause into calls to our TLS runtime engine. The result is the ATLaS C Compiler framework, which takes advantage of TLS techniques to expand OpenMP functionalities and guarantees the sequential semantics of any parallelized loop.
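To make the problem concrete, the following minimal C sketch shows a loop whose dependences are known only at run time. Everything in it is an illustrative assumption rather than code from the paper: the function names, the index array `idx`, and the fallback strategy are hypothetical. The fallback shown is a conservative runtime check (parallelize only if no iteration touches another iteration's element), not the TLS mechanism the paper proposes. The abstract does not give the syntax of the speculative clause, so the spelling in the comment is only a guess at how it might look.

```c
#include <assert.h>

/* Sequential reference: iteration i reads a[idx[i]] and writes a[i].
 * Whether iterations depend on each other is decided by the contents
 * of idx, which the compiler cannot see at compile time. */
void update_sequential(int *a, const int *idx, int n) {
    for (int i = 0; i < n; i++) {
        a[i] = a[idx[i]] + 1;
    }
}

/* Returns 1 if some iteration reads an element that a different
 * iteration writes (idx[i] != i), 0 otherwise. */
int has_cross_iteration_dependence(const int *idx, int n) {
    for (int i = 0; i < n; i++) {
        assert(idx[i] >= 0 && idx[i] < n); /* indices must be in range */
        if (idx[i] != i) {
            return 1;
        }
    }
    return 0;
}

/* Conservative alternative to speculation (NOT the paper's approach):
 * run in parallel only when the runtime check proves independence,
 * otherwise fall back to the sequential loop. */
void update_checked(int *a, const int *idx, int n) {
    if (has_cross_iteration_dependence(idx, n)) {
        update_sequential(a, idx, n);
        return;
    }
    /* Safe here: every iteration touches only its own element.
     * With a TLS-enabled compiler, the check above could be dropped
     * and the array marked speculative instead; a HYPOTHETICAL
     * spelling might be: #pragma omp parallel for speculative(a) */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        a[i] = a[idx[i]] + 1;
    }
}
```

Compiled without OpenMP support, the pragma is ignored and both paths run sequentially. With `-fopenmp` under GCC, only the provably independent case runs in parallel. The cost of this conservative check is that a single dependent iteration forces the whole loop to run sequentially, which is the kind of lost parallelism that speculation aims to recover.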
The role of the compiler is fundamental to exploiting the hardware capabilities of a system running a particular application, minimizing the sequential execution time and, in some cases, offering the possibility of parallelizing part of the code automatically. This paper relies on the SPEC CPU2006 v1.1 benchmark suite to evaluate the performance of the code generated by three widely used compilers (Intel C++/Fortran Compiler 11.0, Sun Studio 12 and GCC 4.3.2). Performance is measured in terms of base speed for reference problem sizes. Both sequential and automatically parallelized performance are analyzed, using different hardware architectures and configurations. The study includes a detailed description of the different problems that arise while compiling the SPEC CPU2006 benchmarks with these tools, information that is difficult to obtain elsewhere. Bearing in mind that performance is a moving target in the field of compilers, our evaluation shows that the sequential code generated by the Sun and Intel compilers for the SPEC CPU2006 integer benchmarks performs similarly, while the floating-point code generated by the Intel compiler is faster than that of its competitors. With respect to the auto-parallelization options offered by the Intel and Sun compilers, our study shows that their benefits only apply to some floating-point benchmarks, with an average speedup of 1.2× with four processors. Meanwhile, the GCC suite evaluated is not capable of compiling the SPEC CPU2006 benchmarks with auto-parallelization options enabled.