We describe a new suite of computational benchmarks that models applications featuring multiple levels of parallelism. Such parallelism is often available in realistic flow computations on systems of grids, but has not previously been captured in benchmarks. The new suite, named NPB Multi-Zone, is extended from the NAS Parallel Benchmarks suite, and involves solving the application benchmarks LU, BT, and SP on collections of loosely coupled discretization meshes. The solutions on the meshes are updated independently, but after each time step they exchange boundary value information. This strategy provides relatively easily exploitable coarse-grain parallelism between meshes. Three reference implementations are available: one serial, one hybrid using the Message Passing Interface (MPI) and OpenMP, and another hybrid using a shared-memory multi-level programming model (SMP+OpenMP). We examine the effectiveness of hybrid parallelization paradigms in these implementations on three different parallel computers. We also use an empirical formula to investigate the performance characteristics of the multi-zone benchmarks.
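The multi-zone update strategy above can be illustrated with a toy example. The sketch below is a hypothetical one-dimensional, two-zone diffusion problem, not taken from the NPB specification: each zone is advanced one time step independently (the source of coarse-grain parallelism), and only then are boundary values exchanged through ghost cells at the zone interface. The stencil, zone sizes, and all parameter values are illustrative assumptions.

```python
def step_zone(u, dt=0.1):
    """Advance one zone with an explicit diffusion stencil (interior points only)."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + dt * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new

def exchange_boundaries(left, right):
    """Copy neighbouring interior values into each zone's ghost cell."""
    left[-1] = right[1]    # left zone's right ghost <- right zone's first interior point
    right[0] = left[-2]    # right zone's left ghost <- left zone's last interior point

# Two loosely coupled zones; the outermost entry on each shared side is a ghost cell.
zoneA = [1.0, 1.0, 1.0, 0.0]   # last entry is the ghost cell
zoneB = [0.0, 0.0, 0.0, 0.0]   # first entry is the ghost cell

for _ in range(10):
    # The per-zone updates are independent of each other within a step,
    # so they could run concurrently (e.g. one MPI rank per zone).
    zoneA = step_zone(zoneA)
    zoneB = step_zone(zoneB)
    # Boundary information is exchanged only after the step completes.
    exchange_boundaries(zoneA, zoneB)
```

In the benchmarks themselves the per-zone solves would use the full LU, BT, or SP kernels and run on separate processes, with OpenMP exploiting fine-grain parallelism inside each zone; the exchange step corresponds to the boundary-value communication between meshes.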
Many-core chips are changing the way high-performance computing systems are built and programmed. As it is becoming increasingly difficult to maintain cache coherence across many cores, manufacturers are exploring designs that do not feature any cache coherence between cores. Communications on such chips are naturally implemented using message passing, which makes them resemble clusters, but with an important difference. Special hardware can be provided that supports very fast on-chip communications, reducing latency and increasing bandwidth. We present one such chip, the Single-Chip Cloud Computer (SCC). This is an experimental processor, created by Intel Labs. We describe two communication libraries available on SCC: RCCE and Rckmb. RCCE is a light-weight, minimal library for writing message passing parallel applications. Rckmb provides the data link layer for running network services such as TCP/IP. Both utilize SCC's non-cache-coherent shared memory for transferring data between cores without needing to go off-chip. In this paper we describe the design and implementation of RCCE and Rckmb. To compare their performance, we consider simple benchmarks run with RCCE, and MPI over TCP/IP.
The creation of parameter study suites has recently become a more challenging problem as the parameter studies have become multi-tiered and the computational environment has become a supercomputer grid. The parameter spaces are vast, the individual problem sizes are getting larger, and researchers are now seeking to combine several successive stages of parameterization and computation. At the same time, grid-based computing offers great resource opportunity, but at the expense of great difficulty of use. We present an approach to this problem that stresses intuitive visual design tools for parameter study creation and complex process specification, and also offers programming-free access to grid-based supercomputer resources and process automation.

Motivation and Background

Only a decade ago, the solution of the partial differential equations required for the evaluation of aerospace vehicle flow-fields typically involved a single discretization zone and was performed on a single processor of a high-speed compute engine that was usually situated locally. These compute tasks were so costly in CPU cycles that the notion of performing parameter studies was usually ignored. Now, however, the flow-solvers are typically parallel codes, and the problems to be solved involve large numbers of interrelated discretization grids. The compute engines are frequently large parallel machines with multi-gigabyte memories and terabyte disk farms. Researchers have available the resources not only of their own laboratories but also those of other laboratories and the shared resources at large computer centers, which are accessible via fast networks. Parameter studies are now quite feasible and are being performed on a regular basis by many researchers who require solution information throughout a given aerospace vehicle flight regime.
The difficulties have now shifted to the manual creation of these parameter studies and to the challenges of launching and managing the large number of jobs the studies require. Modern aerospace flow-solvers frequently require large sets of discretization grids that describe the geometry of the aerospace vehicle. These solvers then produce as output large collections of data files. Until now, most parameter studies were performed with 2-dimensional flow solvers, but researchers are starting to use 3-dimensional solvers for their parameter studies. Recent developments in grid-based "metacomputing" such as Globus [ref] and Legion [ref] have created opportunities for running parameter studies on remote networked high-performance compute servers that constitute a shared resource for participants. But these opportunities come at a price, and that price is the proliferation of job control language needed to support these capabilities. This has placed an onus on users of these "Information Power Grids", who are typically engineering and research code users but are not generally well prepared for, or enthusiastic about, learning or creating the requisite control language scripts for m...