This paper introduces SPar, an internal C++ Domain-Specific Language (DSL) that supports the development of classic stream parallel applications. The DSL uses standard C++ attributes to introduce annotations tagging the notable components of stream parallel applications: stream sources and stream processing stages. A set of tools process SPar code (C++ annotated code using the SPar attributes) to generate FastFlow C++ code that exploits the stream parallelism denoted by SPar annotations while targeting shared memory multi-core architectures. We outline the main SPar features along with the main implementation techniques and tools. Also, we show the results of experiments assessing the feasibility of the entire approach as well as SPar's performance and expressiveness.
This paper presents a parallel programming methodology that ensures easy programming, efficiency, and portability of programs to different machines belonging to the class of the general-purpose, distributed memory, MIMD architectures. The methodology is based on the definition of a new, high-level, explicitly parallel language, called P3L, and of a set of static tools that automatically adapt the program features for each target architecture. P3L does not require programmers to specify process activations, the actual parallelism degree, scheduling, or interprocess communications, i.e. all those features that need to be adjusted to harness each specific target machine. Parallelism is, on the other hand, expressed in a structured and qualitative way, by hierarchical composition of a restricted set of language constructs, corresponding to those forms of parallelism that are frequently encountered in parallel applications, and that can efficiently be implemented. The efficient portability of P3L applications is guaranteed by the compiler along with the novel structure of the support. The compiler automatically adapts the program features for each specific architecture, accessing the costs (in terms of performance) of the low-level mechanisms exported by the architecture itself. In our methodology, these costs, along with other features of the architecture, are viewed through an abstract machine, whose mechanism interface is used by the compiler to produce the final object code.
This article presents an extension of the Fractal component model targeted at programming applications to be run on computing grids: the Grid Component Model (GCM). First, to address the problem of deployment of components on the Grid, deployment strategies have been defined. Then, as Grid applications often result from the composition of a lot of parallel (sometimes identical) components, composition mechanisms to support collective communications on a set of components are introduced. Finally, because of the constantly evolving environment and requirements for Grid applications, the GCM defines a set of features intended to support component autonomicity. All these aspects are developed in this paper with the challenging objective to ease the programming of Grid applications, while allowing GCM components to also be the unit of deployment and management.
In this work we present Lithium, a pure Java structured parallel programming environment based on skeletons (common, reusable and efficient parallelism exploitation patterns). Lithium is implemented as a Java package and represents both the first skeleton based programming environment in Java and the first complete skeleton based Java environment exploiting macro-data flow implementation techniques. Lithium supports a set of user code optimizations which are based on skeleton rewriting techniques. These optimizations improve both absolute performance and resource usage with respect to original user code. Parallel programs developed using the library run on any network of workstations provided the workstations support plain JRE. The paper describes the library implementation, outlines the optimization techniques used and eventually presents the performance results obtained on both synthetic and real applications.
FastFlow is a programming framework specifically targeting cache-coherent shared-memory multicores. It is implemented as a stack of C++ template libraries built on top of lock-free (and memory fence free) synchronization mechanisms. Its philosophy is to combine programmability with performance. In this paper a new FastFlow programming methodology aimed at supporting parallelization of existing sequential code via offloading onto a dynamically created software accelerator is presented. The new methodology has been validated using a set of simple micro-benchmarks and some real applications.
The use of efficient synchronization mechanisms is crucial for implementing fine grained parallel programs on modern shared cache multicore architectures. In this paper we study this problem by considering Single-Producer/Single-Consumer (SPSC) coordination using unbounded queues. A novel unbounded SPSC algorithm capable of reducing the row synchronization latency and speeding up Producer-Consumer coordination is presented. The algorithm has been extensively tested on a shared-cache multi-core platform and a sketch proof of correctness is presented. The queues proposed have been used as basic building blocks to implement the FastFlow parallel framework, which has been demonstrated to offer very good performance for fine-grain parallel applications.
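As a rough illustration of the Single-Producer/Single-Consumer coordination this abstract refers to, the sketch below shows a simple bounded ring buffer shared by exactly one producer thread and one consumer thread. It is not the unbounded algorithm of the paper and not FastFlow code: the names `SpscRing`, `try_push`, `try_pop` and `sum_through_queue` are hypothetical, and the synchronization relies on standard C++ acquire/release atomics, which is an assumption made for portability.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical sketch: a bounded SPSC ring buffer (NOT the paper's unbounded queue).
// One slot is kept empty so that "full" and "empty" can be told apart.
template <typename T>
class SpscRing {
public:
    explicit SpscRing(std::size_t capacity)
        : buf_(capacity + 1), head_(0), tail_(0) {}

    // Must be called by the producer thread only. Returns false when the ring is full.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % buf_.size();
        if (next == head_.load(std::memory_order_acquire)) return false;
        buf_[tail] = value;                            // write the slot first
        tail_.store(next, std::memory_order_release);  // then publish it
        return true;
    }

    // Must be called by the consumer thread only. Returns false when the ring is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;
        out = buf_[head];                              // read the slot first
        head_.store((head + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> buf_;
    std::atomic<std::size_t> head_;  // next slot to read, written by the consumer
    std::atomic<std::size_t> tail_;  // next slot to write, written by the producer
};

// Sends 0..n-1 from a producer thread to a consumer thread and returns their sum.
std::uint64_t sum_through_queue(std::uint64_t n) {
    SpscRing<std::uint64_t> q(64);
    std::uint64_t sum = 0;
    std::thread producer([&] {
        for (std::uint64_t i = 0; i < n; ++i)
            while (!q.try_push(i)) std::this_thread::yield();
    });
    std::thread consumer([&] {
        std::uint64_t v = 0;
        for (std::uint64_t received = 0; received < n;) {
            if (q.try_pop(v)) { sum += v; ++received; }
            else std::this_thread::yield();
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

Calling `sum_through_queue(1000)` from a `main` function exercises both threads. Because each index is written by only one thread, no locks are needed; an unbounded variant such as the one the abstract describes would additionally have to manage growing storage, which this sketch deliberately leaves out.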