Gyungho Lee scite author profile

1995

Array dataflow information plays an important role for successful automatic parallelization of Fortran programs. This paper proposes a powerful symbolic array dataflow analysis to support array privatization and loop parallelization for programs with arbitrary control flow graphs and acyclic call graphs. Our scheme summarizes array access information using guarded array regions and propagates such regions over a Hierarchical Supergraph (HSG). The use of guards allows us to use the information in IF conditions to sharpen the array dataflow analysis and thereby to handle difficult cases which elude other existing techniques. The guarded array regions retain the simplicity of set operations for regular array regions in common cases, and they enhance regular array regions in complicated cases by using guards to handle complex symbolic expressions.

Reducing coherence overhead in shared-bus multiprocessors

Cho

1996

To reduce the overhead of cache coherence enforcement in shared-bus multiprocessors, we propose a self-invalidation technique as an extension to write-invalidate protocols. The technique speculatively identifies cache blocks to be invalidated and dynamically determines when to invalidate them locally. We also consider enhancing our self-invalidation scheme by incorporating read snarfing, to reduce the cache misses due to incorrect prediction. We evaluate our self-invalidation scheme by simulating SPLASH-2 benchmark programs that exhibit various reference patterns, under a realistic shared-bus multiprocessor model. We discuss the effectiveness and hardware complexity of self-invalidation and its enhancement with read snarfing in our extended protocol.

Relaxing the inclusion property in cache only memory architecture

Kong

1996

Intelligent congestion control in ATM networks

Park

An assessment of COMA multiprocessors

In Cache Only Memory Architecture (COMA) for distributed shared memory multiprocessors, the physical location of a datum is completely decoupled from its address by organizing the memory local to each node as a cache for shared address space. As in traditional cache-coherent multiprocessors, the overhead involved in coherence enforcement strongly affects the performance. The overhead seems more pronounced in a COMA machine than the traditional multiprocessors because of its relatively huge size of the local memory acting as cache and the level of memory hierarchy at which coherence needs to be enforced. Using trace driven simulations of the Perfect Club Benchmark Suite, this paper studies the change in the miss ratio and the network traffic with the two coherence policies, update and invalidate. Our study shows that the two policies provide distinct characteristics which reveal different opportunity of improving COMA multiprocessors for different choice of coherence policy. 1063-7133/95 $4.00 © 1995 IEEE