SUMMARY
The paper presents the basic design decisions taken in the implementation of a novel distributed program design framework, Program Execution Governed by Asynchronous SUpervision of States in Distributed Applications (PEGASUS DA). The framework supports the design of application program execution control based on advanced automated monitoring of program global states. It provides the programmer with a ready-to-use infrastructure for defining and handling local and global application states, which serve as the basis for program execution control decisions. The paper shows how this infrastructure can be used for automated construction of strongly consistent application global states. The use of global states for graphically supported specification of distributed program execution control is also covered, targeting clusters of multicore processors and based on multithreading and message passing. Both the architectural and implementation solutions applied in PEGASUS DA are discussed. In particular, multivariant algorithms for the construction of strongly consistent program global states and methods for their use in the design of distributed program global execution control are presented. The use of PEGASUS DA is illustrated with an example of the traveling salesman problem solved by the branch-and-bound method.
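The notion of a strongly consistent global state mentioned above can be illustrated with a minimal sketch. The code below is not from PEGASUS DA; it assumes, for illustration only, one common time-based definition in which each local state carries a real-time validity interval obtained from clocks synchronized to within a known drift, and a set of local states (one per process) is strongly consistent when all of the intervals overlap:

```python
def strongly_consistent(intervals, drift=0.0):
    """Check whether one local state per process forms a strongly
    consistent global state under an interval-overlap definition.

    intervals -- list of (start, end) real-time validity windows,
                 one per process (illustrative representation)
    drift     -- assumed bound on clock synchronization error
    """
    latest_start = max(start for start, _ in intervals) - drift
    earliest_end = min(end for _, end in intervals) + drift
    # All local states were simultaneously valid (up to clock drift)
    # iff the intersection of their validity windows is non-empty.
    return latest_start <= earliest_end
```

For example, states valid over (0, 5) and (3, 8) overlap and could belong to one strongly consistent global state, while (0, 2) and (3, 4) could not; the paper's multivariant construction algorithms are more elaborate than this single check.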
New architectural solutions for parallel systems built of bus-based shared memory processor clusters are presented. A new interprocessor communication paradigm, called communication on the fly, is proposed. With this paradigm, processors can be dynamically switched between clusters at program run time to bring into their caches data that can be read by many processors in a cluster at the same time the data are written to the cluster memory. A cache-controlled macro data-flow program execution paradigm is also proposed: programs are structured into tasks for which all required data are brought into the processor data cache before task execution. A new graph representation of programs is introduced, which enables modeling of the behaviour of data caches, memories, bus arbiters, processor switching between clusters and parallel reads of data on the fly. This representation is used for realistic simulation of the execution of a numerical algorithm based on distribution of parallel tasks between dynamic SMP clusters and on communication on the fly. Performance evaluation results are presented for different configurations of the programs and of the shared memory clusters in the system.
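The cache-controlled macro data-flow rule described above, in which a task fires only after all of its input data are present in the processor's data cache, can be sketched as follows. All names here are illustrative assumptions, not the paper's API, and the dictionaries merely stand in for a data cache and a cluster memory:

```python
def run_task(task, cache, memory):
    """Execute one macro data-flow task under the rule that all inputs
    must be in the data cache before the task's computation starts.

    task   -- {"inputs": [...], "output": name, "compute": callable}
              (hypothetical task record, for illustration only)
    cache  -- dict modelling the processor's data cache
    memory -- dict modelling the cluster's shared memory
    """
    # Prefetch phase: bring every missing input into the cache first.
    for name in task["inputs"]:
        if name not in cache:
            cache[name] = memory[name]
    # Execution phase: the task reads only from the cache.
    result = task["compute"](*(cache[name] for name in task["inputs"]))
    cache[task["output"]] = result    # result stays cached for reuse
    memory[task["output"]] = result   # and is written to cluster memory
    return result
```

With communication on the fly, other processors of the cluster could snoop such writes to cluster memory into their own caches; this sketch shows only the prefetch-then-execute task discipline.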
Introduction
Scalability of shared memory systems can be much improved by the application of a cluster-based system architecture. Such architectures have become quite common today [1,2]. Some systems are based on shared memory processor clusters with inter-cluster communication done by message passing [3-7]. Communication between clusters is done through networks such as Fast Ethernet, Gigabit Ethernet or Myrinet. Other shared memory cluster systems are CC-NUMA distributed shared memory systems, in which different interconnection means are used to implement intra- and inter-cluster communication. In the GigaMax system of Encore Computer Corporation [9], bus-based shared memory clusters communicated through a global bus. In the Stanford DASH [10], bus-based processor clusters are interconnected using two-dimensional meshes. In the Convex Exemplar [11], shared memory clusters based on crossbar switches are interconnected by a multiple-ring network. Intra-cluster and inter-cluster communication can have different latencies, the former usually being much lower for small clusters. To map a parallel program structure optimally onto the system structure, areas of intensive inter-process communication in programs should be mapped onto shared memory clusters. In current implementations, the size of clusters is fixed, while the physical number of processors in clusters and the optimal cluster sizes requested by programs can differ. In such cases, the fixed system structure can decrease the efficiency of program execution. This paper describes a cluster-based shared memory system architecture oriented towards much more efficient computations and communication during parallel program execution than in existing systems. These goals are achieved by dynamic reconfiguration of shared memory processor clusters, a new paradigm of data cache behaviour and a new typ...