Today, it is possible to combine multiple CPUs and multiple GPUs in a single shared-memory architecture. Using these resources efficiently and seamlessly is a challenging issue. In this paper, we propose a parallelization scheme for dynamically balancing the workload between multiple CPUs and GPUs. Most tasks have both a CPU and a GPU implementation, so they can be executed on any processing unit. We rely on a two-level scheduling scheme that combines traditional task graph partitioning with work stealing guided by processor affinity and heterogeneity. These criteria are intended to limit inefficient task migrations between GPUs, since memory transfers are costly, and to favor mapping small tasks to CPUs and large ones to GPUs in order to take advantage of heterogeneity. This scheme has been implemented to support the SOFA physics simulation engine. Experiments show that we can reach speedups of 22 with 4 GPUs and 29 with 4 CPU cores and 4 GPUs. The CPUs relieve the GPUs of small tasks, making the GPUs more efficient and leading to a "cooperative speedup" greater than the sum of the speedups obtained separately on 4 GPUs and 4 CPUs.
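The stealing criteria described above can be illustrated with a minimal sketch. It is not the scheduler used in the paper: the `Task` fields (`cost`, `affinity`), the function `pick_task_to_steal`, and the exact rule ordering (affinity first, then size) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    cost: float                     # hypothetical estimate of the task's size
    affinity: Optional[str] = None  # hypothetical: unit that already holds the task's data


def pick_task_to_steal(thief_id: str, thief_kind: str,
                       victim_queue: List[Task]) -> Optional[Task]:
    """Remove and return one task from the victim's queue, or None if it is empty."""
    if not victim_queue:
        return None
    # Affinity (assumed rule): prefer tasks whose data already lives on the
    # thief, which avoids a costly memory transfer.
    local = [t for t in victim_queue if t.affinity == thief_id]
    candidates = local or victim_queue
    # Heterogeneity (assumed rule): CPUs take the smallest task, GPUs the largest.
    if thief_kind == "cpu":
        chosen = min(candidates, key=lambda t: t.cost)
    else:
        chosen = max(candidates, key=lambda t: t.cost)
    victim_queue.remove(chosen)
    return chosen
```

In this sketch an idle CPU drains the small tasks from a busy queue, while an idle GPU takes the largest task among those whose data it already holds and falls back to the largest task overall only when it holds none.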
Parallel I/O in cluster computing is one of the most important issues to be tackled as clusters grow larger and larger. Many solutions have been proposed for the problem and, while effective in terms of performance, they usually represent a considerable amount of hacking into a "traditional" Beowulf cluster installation. In this paper, we investigate a parallel solution based on NFS, which reduces the level of intrusion in the file server installation, keeps the client side untouched, and still provides an improved level of performance and scalability for parallel applications. We compare our proposal to other existing file systems using known benchmarks, and demonstrate that it is a valid alternative for general-purpose cluster computing.
Leveraging existing storage space in a cluster is a desirable characteristic of a parallel file system. While undoubtedly an advantage from the point of view of resource management, this possibility may confront the administrator with a wide variety of alternatives for configuring the file server, whose optimal layout is not always easy to devise. Given the diversity of parameters, such as the number of processors on each node and the capacity and topology of the network, decisions regarding the placement of server components like metadata servers and I/O servers have a direct impact on performance and scalability. In this paper, we explore the capabilities of the dNFSp file system on a large cluster installation, observing how well the system scales in different scenarios and comparing it to a dedicated parallel file system. Our results show that the design of dNFSp allows for a scalable and resource-saving configuration for clusters with a large number of nodes.
This article presents a model for allocating temporary servers for data storage in the dNFSp file system. This feature allows the distributed system to adapt to the performance and storage needs of different applications. We present a model for handling the dynamic behavior of data servers, along with the implementation of this model, followed by a performance evaluation.