perpétuo e sem limites geográficos, de arquivar e publicar esta dissertação através de exemplares impressos reproduzidos em papel ou de forma digital, ou por qualquer outro meio conhecido ou que venha a ser inventado, e de a divulgar através de repositórios científicos e de admitir a sua cópia e distribuição com objectivos educacionais ou de investigação, não comerciais, desde que seja dado crédito ao autor e editor. iv Eu dedico esta tese a todos aqueles que, de uma forma ou de outra, contribuiram para que ela se torna-se possível. Esta tese não é so minha, é também vossa. A huge Thank You to Soraia Assis, whose friendship got me by the rough times, and whose smile warmed my own. Without you this thesis would be but a dream.I thank my parents for all the sacrifices that they endured, so that I would be fortunate enough to study, and eventually reach this mark in my life.Thank you to all my family members and friends, whose names were not mention so far. They are the silent heroes, whose word is spoken throughout this thesis. Obrigado a todos! vii viii AbstractThe Graphics Processing Unit (GPU) is gaining popularity as a co-processor to the Central Processing Unit (CPU), due to its ability to surpass the latter's performance in certain application fields. Nonetheless, harnessing the GPU's capabilities is a non-trivial exercise that requires good knowledge of parallel programming. Thus, providing ways to extract such computational power has become an emerging research topic.In this context, there have been several proposals in the field of GPGPU (Generalpurpose Computation on Graphics Processing Unit) development. However, most of these still offer a low-level abstraction of the GPU computing model, forcing the developer to adapt application computations in accordance with the SPMD model, as well as to orchestrate the low-level details of the execution. On the other hand, the higher-level approaches have limitations that prevent the full exploitation of GPUs when the purpose goes beyond the simple offloading of a kernel.To this extent, our proposal builds on the recent trend of applying the notion of algorithmic patterns (skeletons) to GPU computing. We propose Marrow, a high-level algorithmic skeleton framework that expands the set of skeletons currently available in this field. Marrow's skeletons orchestrate the execution of OpenCL computations and introduce optimizations that overlap communication and computation, thus conjoining programming simplicity with performance gains in many application scenarios. Additionally, these skeletons can be combined (nested) to create more complex applications.
SUMMARYCurrent computational systems are heterogeneous by nature, featuring a combination of CPUs and graphics processing units (GPUs). As the latter are becoming an established platform for high-performance computing, the focus is shifting towards the seamless programming of these hybrid systems as a whole. The distinct nature of the architectural and execution models in place raises several challenges, as the best hardware configuration is behavior and workload dependent. In this paper, we address the execution of compound, multi-kernel, open computing language computations in multi-CPU/multi-GPU environments. We address how these computations may be efficiently scheduled onto the target hardware, and how the system may adapt itself to changes in the workload to process and to fluctuations in the CPU's load. An experimental evaluation attests the performance gains obtained by the conjoined use of the CPU and GPU devices, when compared with GPU-only executions, and also by the use of data-locality optimizations in CPU environments.
Abstract. We describe a reference implementation of a multi-threaded run-time system for a core programming language based on a process calculus. The core language features processes running in parallel and communicating through asynchronous messages as the fundamental abstractions. The programming style is fully declarative, focusing on the interaction patterns between processes. The parallelism, implicit in the syntax of the programs, is effectively extracted by the language compiler and explored by the run-time system.
Abstract. This paper outlines a general picture of our ongoing work under EU Mobius and Sensoria projects on a type-based compilation and execution framework for a class of multicore CPUs. Our focus is to harness the power of concurrency and asynchrony in one of the major forms of multicore CPUs based on distributed, non-coherent memory, through the use of type-directed compilation. The key idea is to regard explicit asynchronous data transfer among local caches as typed communication among processes. By typing imperative processes with a variant of session types, we obtain both type-safe and efficient compilation into processes distributed over multiple cores with local memories.
Abstract-The parallel nature of the multi-core architectural design can only be fully exploited by concurrent applications. This status quo pushed productivity to the forefront of the language design concerns. The community is demanding for new solutions in the design, compilation, and implementation of concurrent languages, making this research area one of great importance and impact. To that extent this paper proposes the expression of data parallelism at subroutine level. The calling of a subroutine in this context spawns several execution flows, each operating on distinct partitions of the input dataset. Such computations can be expressed by simply annotating sequential subroutines with data distribution and reduction policies, delegating the management of the parallel execution to a dedicated runtime system. The paper overviews the key concepts of the model, illustrating them with some small programming examples, and describes a Java implementation built on top of the X10 [1] runtime system. A performance evaluation attests that this approach can provide good performance gains without burdening the programmer with the writing of specialized code.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.