Given a type of resource such as disk units, extra memory modules, connections to the host processor, or software modules, we consider the problem of distributing the resource units to processors in a hypercube computer so that certain performance requirements are met at minimal cost. Typical requirements include the condition that every processor is within a given distance of a resource unit, that every processor is within a given distance of each of several resources, and that every m-dimensional subcube contains a resource unit. The last of these is particularly important in a multiuser system in which different users are given their own subcubes. In this setting, we also consider the problem of meeting the performance requirements at minimal cost when the subcube allocation system cannot allocate all possible subcubes and the requirements apply only to allocable subcubes. We also analyze the problem of partitioning processors with resources into different classes, requiring that every processor is within a given distance of, or in a subcube of given dimension with, a member of each class. Efficient constructive techniques for distributing or partitioning a resource are given for several performance requirements, along with upper and lower bounds on the total number of resource units required.
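To make the distance requirement concrete, the following is a minimal sketch that checks whether a given placement puts every processor within distance d of a resource unit. It assumes processors are labeled by the integers 0 to 2^n - 1 and that distance means Hamming distance between labels; the function names are hypothetical and the check is not taken from the paper's constructions.

```python
def hamming_distance(a, b):
    """Number of bit positions in which node labels a and b differ."""
    return bin(a ^ b).count("1")


def within_distance(n, resources, d):
    """Return True if every node of the n-cube is within Hamming
    distance d of at least one node in `resources`.

    Assumption: nodes are the integers 0 .. 2**n - 1, and two nodes are
    adjacent when their labels differ in exactly one bit.
    """
    return all(
        any(hamming_distance(p, r) <= d for r in resources)
        for p in range(1 << n)
    )


# In the 3-cube, the two antipodal nodes 000 and 111 put every
# processor within distance 1 of a resource unit.
placement = {0b000, 0b111}
print(within_distance(3, placement, 1))
```

A single resource at node 000 does not satisfy the distance-1 requirement in the 3-cube, since node 111 is at distance 3 from it.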
We consider the problem of determining the minimum number of faulty processors, κ(n, m), and of faulty links, λ(n, m), in an n-dimensional hypercube computer so that every m-dimensional subcube is faulty. The best known lower bounds for κ(n, m) and λ(n, m) are proved, several new recursive inequalities and new upper bounds are established, their asymptotic behavior for fixed m and for fixed n − m is analyzed, and their exact values are determined for small n and m. Most of the methods employed show how to construct sets of faults attaining the bounds. An extensive survey of related work is also included, showing connections to resource allocation, k-independent sets, and exhaustive testing.
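For very small n, the definition of κ(n, m) can be illustrated by exhaustive search. The sketch below assumes that a subcube counts as faulty when it contains at least one faulty processor, and that nodes are labeled by the integers 0 to 2^n - 1. The function names are hypothetical, and the brute-force search is an illustration only, not one of the constructive methods the abstract refers to.

```python
from itertools import combinations


def subcubes(n, m):
    """Yield each m-dimensional subcube of the n-cube as a frozenset of nodes."""
    for free in combinations(range(n), m):
        fixed = [i for i in range(n) if i not in free]
        # Choose the values of the fixed coordinates.
        for fixed_bits in range(1 << len(fixed)):
            base = 0
            for j, pos in enumerate(fixed):
                if (fixed_bits >> j) & 1:
                    base |= 1 << pos
            # Let the free coordinates range over all 2**m combinations.
            nodes = set()
            for free_bits in range(1 << m):
                node = base
                for j, pos in enumerate(free):
                    if (free_bits >> j) & 1:
                        node |= 1 << pos
                nodes.add(node)
            yield frozenset(nodes)


def kappa_brute_force(n, m):
    """Smallest number of faulty nodes hitting every m-dimensional subcube.

    Assumption: a subcube is faulty if it contains at least one faulty node.
    Exponential in 2**n, so only usable for very small n.
    """
    cubes = list(subcubes(n, m))
    for size in range(1, (1 << n) + 1):
        for faults in combinations(range(1 << n), size):
            fault_set = set(faults)
            if all(cube & fault_set for cube in cubes):
                return size


print(kappa_brute_force(3, 2))
```

For n = 3 and m = 2 the search returns 2: the antipodal nodes 000 and 111 together meet all six faces of the 3-cube, while a single node meets only three of them.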
Input/Output is a big obstacle to effective use of teraflops-scale computing systems. Motivated by earlier parallel I/O measurements on an Intel TFLOPS machine, we conduct studies to determine the sensitivity of parallel I/O performance on multiprogrammed mesh-connected machines with respect to number of I/O nodes, number of compute nodes, network link bandwidth, I/O node bandwidth, spatial layout of jobs, and read or write demands of applications. Our extensive simulations and analytical modeling yield important insights into the limitations on parallel I/O performance due to network contention, and into the possible gains in parallel I/O performance that can be achieved by tuning the spatial layout of jobs. Applying these results, we devise a new processor allocation strategy that is sensitive to parallel I/O traffic and the resulting network contention. In performance evaluations driven by synthetic workloads and by a real workload trace captured at the San Diego Supercomputing Center, the new strategy improves the average response time of parallel I/O intensive jobs by up to a factor of 4.5.