In this paper, we present OmpSs, a programming model based on OpenMP and StarSs that can also incorporate OpenCL or CUDA kernels. We evaluate the proposal on different architectures (SMP, GPU, and hybrid SMP/GPU environments), showing the wide usefulness of the approach. The evaluation uses six benchmarks: Matrix Multiply, BlackScholes, Perlin Noise, Julia Set, PBPI, and FixedGrid. We compare the results with those of the same benchmarks written in OpenCL or OpenMP and run on the same architectures. The results show that OmpSs greatly outperforms both environments. With OmpSs, the programming environment is more flexible than traditional approaches to exploiting multiple accelerators, and the simplicity of the annotations increases programmer productivity.
Programming models for multicore and many-core systems are listed as one of the main near-term challenges for computing research. These programming models should be able to exploit the underlying platform, but they should also offer good programmability to enable programmer productivity. They should take the heterogeneity and hierarchy of the underlying platforms into account while shielding the programmer from the complexity of the hardware. In this paper we present an extension of the StarSs syntax to support task hierarchy. We motivate this hierarchical approach through experimentation with CellSs. We then present a prototype implementation of a hierarchical task-based programming model that combines a first task level with SMPSs and a second task level with CellSs. The preliminary results obtained when executing a matrix multiplication and a Cholesky factorization show the viability and potential of the approach, as well as the issues it currently raises.
The increasing availability of comprehensive experimental datasets and of high-performance computing resources is driving rapid growth in the scale, complexity, and biological realism of computational models in neuroscience. To support the construction, simulation, and sharing of such large-scale models, a broadly applicable, flexible, and high-performance data format is necessary. To address this need, we have developed the Scalable Open Network Architecture TemplAte (SONATA) data format. It is designed for memory and computational efficiency and works across multiple platforms. The format represents neuronal circuits and simulation inputs and outputs via standardized files and provides much flexibility for adding new conventions or extensions. SONATA is used in multiple modeling and visualization tools, and we also provide reference Application Programming Interfaces and model examples to catalyze further adoption. The SONATA format is free and open for the community to use and build upon, with the goal of enabling efficient model building, sharing, and reproducibility.
Restricted access item. This item is not open access because of the publisher's policy. The full-text version is only available from Jaume I University or to users with a subscription to the publisher's contents.
As new heterogeneous systems and hardware accelerators appear, high performance computers can reach a higher level of computational power. Nevertheless, this does not come for free: the more heterogeneity the system presents, the more complex the programming task becomes in terms of resource management. OmpSs is a task-based programming model and framework focused on the runtime exploitation of parallelism from annotated sequential applications. This paper presents a set of extensions to this framework: we show how the application programmer can expose different specialized versions of tasks (i.e., pieces of specific code targeted and optimized for a particular architecture) and how the system can choose between these versions at runtime to obtain the best performance achievable for the given application. From the results obtained in a multi-GPU system, we prove that our proposal gives flexibility to the application's source code and can potentially increase the application's performance.
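The idea of exposing several versions of one task and letting the system pick among them at runtime can be illustrated with a small Python sketch. This is not the OmpSs API or its scheduler, which work through annotations on sequential code; the class `VersionedTask`, its `version` decorator, the `warmup_runs` parameter, and the "lowest mean measured time" policy are all hypothetical names and assumptions chosen for illustration.

```python
import time
from collections import defaultdict


class VersionedTask:
    """Hypothetical helper: several implementations of one task, one chosen per call."""

    def __init__(self, name, warmup_runs=2):
        self.name = name
        self.warmup_runs = warmup_runs      # assumed policy: sample each version first
        self.versions = {}                  # label -> callable
        self.timings = defaultdict(list)    # label -> measured durations in seconds

    def version(self, label):
        """Decorator that registers an implementation under the given label."""
        def register(fn):
            self.versions[label] = fn
            return fn
        return register

    def choose(self):
        """Return the label of the version to run next."""
        # Learning phase: any version with too few samples is run first.
        for label in self.versions:
            if len(self.timings[label]) < self.warmup_runs:
                return label
        # Steady phase: pick the version with the lowest mean measured time.
        return min(
            self.versions,
            key=lambda lbl: sum(self.timings[lbl]) / len(self.timings[lbl]),
        )

    def __call__(self, *args, **kwargs):
        label = self.choose()
        start = time.perf_counter()
        result = self.versions[label](*args, **kwargs)
        self.timings[label].append(time.perf_counter() - start)
        return result


if __name__ == "__main__":
    dot = VersionedTask("dot")

    @dot.version("explicit_loop")
    def dot_loop(a, b):
        total = 0.0
        for x, y in zip(a, b):
            total += x * y
        return total

    @dot.version("builtin_sum")
    def dot_builtin(a, b):
        return sum(x * y for x, y in zip(a, b))

    a = [float(i) for i in range(10_000)]
    b = [1.0] * len(a)
    for _ in range(10):
        dot(a, b)
    print("preferred version:", dot.choose())
```

In this sketch both versions run on the same processor; in a heterogeneous setting the registered callables would stand in for architecture-specific implementations, and a real runtime would also have to weigh data transfers and device availability, which the sketch ignores.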