Computational grids and grid middleware offer unprecedented computational power and storage capacity and have thus opened the possibility of solving problems that were previously intractable on even the largest single computational resources. These opportunities notwithstanding, developing grid applications that run efficiently remains a challenge because of the heterogeneity of networks and system architectures inherent in such environments. We present grid solutions to two grand challenge problems in computational mechanics. To study the scalability of our solutions, we implemented both as MPI applications and ran them on the TeraGrid using NEKTAR and MPICH-G2. We present the results of our study, which demonstrate near-linear scalability in both applications when run across multiple TeraGrid sites and at a scale of hundreds of processors.
Grid Computing
The National Science Foundation's TeraGrid (TG) (http://www.teragrid.org) integrates the most powerful open resources in the US, which at present amount to about 50 teraflops of processing power and 1.5 petabytes of online storage connected by a 40 Gb/s network. Unlike conventional supercomputers, it offers the opportunity for potentially unlimited scalability. The key question facing computational scientists, however, is how to adapt their applications effectively to such a complex and heterogeneous network. We are, indeed, at a crossroads in parallel scientific computing, similar to the one computational scientists faced about fifteen years ago. The emergence of parallel software (e.g., MPI and OpenMP), and of domain decomposition algorithms and corresponding freeware (e.g., METIS) [14], made parallel computing available to the wider scientific community and allowed first-principles simulations of turbulence at very fine scales, of blood flow in the human heart [15], and of global climate at a resolution of just a few kilometers.
On the other hand, simulations designed to capture detailed physicochemical, mechanical, or biological processes have demonstrated quite different characteristics [2,4,5,17,18]. Some applications are computation intensive, requiring extremely powerful computing systems. Others are data intensive [1,3,16], necessitating the creation or mining of multi-terabyte data archives to extract scientific insight. Large-scale biological and physical simulations are extremely computation intensive and are usually characterized by tightly coupled computations and communications.
To harness the power of grid computing efficiently and effectively, it is necessary to design and adapt applications to exploit ensembles of supercomputers and to match application requirements and characteristics with grid resources. The challenges in developing such grid-enabled applications lie primarily in the high degree of system heterogeneity and the dynamic behavior in architecture and performance of the Grid environment. For example, a grid may have a highly heterogeneous and unbalanced communication network, whose bandwidth and latency charac...