This report is the PICL user's guide. It contains an overview of PICL and how it is used. Examples in C and Fortran are included. PICL is a subroutine library that can be used to develop parallel programs that are portable across several distributed-memory multiprocessors. PICL provides a portable syntax for key communication primitives and related system calls. It also provides portable routines to perform certain widely used, high-level communication operations, such as global broadcast and global summation. Finally, PICL provides execution tracing that can be used to monitor performance or to aid in debugging.

This document contains examples and information needed for straightforward use of most of PICL's basic features. Full documentation of all PICL options and the various ways the library can be used is contained in a separate report [1]. The library is made up of three distinct sets of routines: a set of low-level communication and system primitives described in section 2, a set of high-level global communication routines whose use is described in section 3, and a set of routines for invoking and controlling the execution tracing facility, which is described in section 4. Each section contains examples in C showing typical uses of the respective routines. In addition, the Appendix contains FORTRAN versions of the examples and instructions for obtaining PICL and ParaGraph.

2. Low-Level Routines

The 12 low-level communication and system interface routines, described in Table 1, provide a portable syntax for message-passing programs. The PICL programming model assumes that the multiprocessor can send messages between arbitrarily chosen pairs of processors. The time required to send a message between two processors is a function of the interprocessor communication network, and the user will need to be aware of such machine dependencies in order to write efficient programs. Our model distinguishes one processor, the host, from the rest.
The user has access to the remaining processors, called node processors (or simply nodes), through the host. Typically, an application code consists of one program that runs on the host and another program that runs on each of the nodes. The host program calls PICL routines to allocate node processors, load the node program (or programs) onto the nodes, send input data required by the node programs, and receive results from the nodes.
The spectral transform method is a standard numerical technique for solving partial differential equations on a sphere and is widely used in atmospheric circulation models. Recent research has identified several promising algorithms for implementing this method on massively parallel computers; however, no detailed comparison of the different algorithms has previously been attempted. In this paper, we describe these different parallel algorithms and report on computational experiments that we have conducted to evaluate their efficiency on parallel computers. The experiments used a testbed code that solves the nonlinear shallow water equations on a sphere; considerable care was taken to ensure that the experiments provide a fair comparison of the different algorithms and that the results are relevant to global models. We focus on hypercube- and mesh-connected multicomputers with cut-through routing, such as the Intel iPSC/860, DELTA, and Paragon, and the nCUBE/2, but also indicate how the results extend to other parallel computer architectures. The results of this study are relevant not only to the spectral transform method but also to multidimensional FFTs and other parallel transforms.
The FACETS (Framework Application for Core-Edge Transport Simulations) project began in January 2007 with the goal of providing core-to-wall transport modeling of a tokamak fusion reactor. This involves coupling previously separate computations for the core, edge, and wall regions. Such a coupling is primarily through connection regions of lower dimensionality. The project has started developing a component-based coupling framework to bring together models for each of these regions. In the first year, the core model will be a 1½-dimensional model (1D transport across flux surfaces coupled to a 2D equilibrium) with fixed equilibrium. The initial edge model will be the fluid model, UEDGE, but inclusion of kinetic models is planned for the out years. The project also has an embedded Scientific Application Partnership that is examining embedding a full-scale turbulence model for obtaining the cross-surface fluxes into a core transport code.
Abstract: The Cray X1 supercomputer is a distributed shared memory vector multiprocessor, scalable to 4096 processors and up to 65 terabytes of memory. The X1's hierarchical design uses the basic building block of the multi-streaming processor (MSP), which is capable of 12.8 GF/s for 64-bit operations. The distributed shared memory (DSM) of the X1 presents a 64-bit global address space that is directly addressable from every MSP, with an interconnect bandwidth per computation rate of one byte per floating point operation. Our results show that this high bandwidth and low latency for remote memory accesses translate into improved performance on important applications, such as an Eulerian gyrokinetic-Maxwell solver. Furthermore, this architecture naturally supports programming models like the Cray shmem API, Unified Parallel C (UPC), and Co-Array Fortran (CAF), and, as our benchmarks demonstrate, it is imperative to select the appropriate model to exploit these features.