Array dataflow information plays an important role in the successful automatic parallelization of Fortran programs. This paper proposes a powerful symbolic array dataflow analysis to support array privatization and loop parallelization for programs with arbitrary control flow graphs and acyclic call graphs. Our scheme summarizes array access information using guarded array regions and propagates such regions over a Hierarchical Supergraph (HSG). The use of guards allows us to use the information in IF conditions to sharpen the array dataflow analysis and thereby to handle difficult cases which elude other existing techniques. The guarded array regions retain the simplicity of set operations for regular array regions in common cases, and they enhance regular array regions in complicated cases by using guards to handle complex symbolic expressions.
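To make the idea of a guarded array region concrete, the following is a minimal sketch and not the paper's actual representation. It assumes a one-dimensional region with integer bounds, a guard encoded as a conjunction of named IF conditions, and an iteration in which all writes precede all reads. The names `GuardedRegion`, `subtract`, and `upward_exposed` are hypothetical. An array is treated as a privatization candidate when no read in an iteration is upward exposed, that is, every read is covered by an earlier write in the same iteration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardedRegion:
    """Array section [lo, hi] that is accessed only when `guard` holds.

    Assumption for illustration: the guard is a frozenset of condition
    names that must all be true; the empty set means "always".
    """
    guard: frozenset
    lo: int
    hi: int


def subtract(use, mod):
    """Return the parts of the read `use` not covered by the write `mod`."""
    # The write is guaranteed to happen whenever the read happens only if
    # every condition on the write also appears among the read's conditions.
    if not mod.guard <= use.guard:
        return [use]
    # Disjoint index ranges: the write covers nothing of the read.
    if mod.hi < use.lo or mod.lo > use.hi:
        return [use]
    rest = []
    if use.lo < mod.lo:
        rest.append(GuardedRegion(use.guard, use.lo, mod.lo - 1))
    if mod.hi < use.hi:
        rest.append(GuardedRegion(use.guard, mod.hi + 1, use.hi))
    return rest


def upward_exposed(uses, mods):
    """Reads not covered by writes earlier in the same iteration.

    Simplification: every region in `mods` is assumed to precede every
    region in `uses`.
    """
    exposed = list(uses)
    for m in mods:
        exposed = [r for u in exposed for r in subtract(u, m)]
    return exposed


# IF (p) A(1:100) = ...   followed later by   IF (p) ... = A(1:100)
p = frozenset({"p"})
write = GuardedRegion(p, 1, 100)
read_under_p = GuardedRegion(p, 1, 100)
read_always = GuardedRegion(frozenset(), 1, 100)
```

In this sketch, `upward_exposed([read_under_p], [write])` is empty because both accesses share the guard `p`, whereas `upward_exposed([read_always], [write])` still contains the whole read, since the conditional write cannot be assumed to have happened. An analysis that ignores IF conditions would have to treat both cases as exposed.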
To reduce the overhead of cache coherence enforcement in shared-bus multiprocessors, we propose a self-invalidation technique as an extension to write-invalidate protocols. The technique speculatively identifies cache blocks to be invalidated and dynamically determines when to invalidate them locally. We also consider enhancing our self-invalidation scheme by incorporating read snarfing, to reduce the cache misses due to incorrect prediction. We evaluate our self-invalidation scheme by simulating SPLASH-2 benchmark programs that exhibit various reference patterns, under a realistic shared-bus multiprocessor model. We discuss the effectiveness and hardware complexity of self-invalidation and its enhancement with read snarfing in our extended protocol.
In Cache Only Memory Architecture (COMA) for distributed shared memory multiprocessors, the physical location of a datum is completely decoupled from its address by organizing the memory local to each node as a cache for the shared address space. As in traditional cache-coherent multiprocessors, the overhead involved in coherence enforcement strongly affects performance. The overhead seems more pronounced in a COMA machine than in traditional multiprocessors because of the relatively large size of the local memory acting as a cache and the level of the memory hierarchy at which coherence needs to be enforced. Using trace-driven simulations of the Perfect Club Benchmark Suite, this paper studies the change in the miss ratio and the network traffic with the two coherence policies, update and invalidate. Our study shows that the two policies exhibit distinct characteristics, which reveal different opportunities for improving COMA multiprocessors depending on the choice of coherence policy.
1063-7133/95 $4.00 © 1995 IEEE
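The trade-off between the update and invalidate policies can be illustrated with a toy trace replay. This is a sketch under strong assumptions and not the simulator used in the study: caches are unbounded, cold misses are counted, and each remote copy that is invalidated or updated costs exactly one message. The function `simulate` and the two traces are hypothetical.

```python
def simulate(trace, policy):
    """Replay (node, op, block) references; return (misses, messages).

    Toy model: `policy` is "invalidate" or "update"; `op` is "r" or "w".
    """
    holders = {}  # block -> set of nodes currently holding a valid copy
    misses = 0
    messages = 0
    for node, op, block in trace:
        copies = holders.setdefault(block, set())
        if node not in copies:
            misses += 1          # no valid local copy: fetch the block
            copies.add(node)
        if op == "w":
            others = copies - {node}
            messages += len(others)  # one message per remote copy
            if policy == "invalidate":
                holders[block] = {node}  # remote copies become invalid
            # under "update", remote copies receive the data and stay valid
    return misses, messages


# Producer/consumer sharing: node 0 writes, node 1 reads, three times over.
producer_consumer = [(0, "w", "X"), (1, "r", "X")] * 3

# One stale reader: node 1 reads once, then node 0 writes repeatedly.
stale_reader = [(1, "r", "X")] + [(0, "w", "X")] * 4
```

On the `producer_consumer` trace the update policy avoids the repeated read misses that invalidation causes, while on the `stale_reader` trace it keeps sending messages to a copy that is never read again. Which effect dominates depends on the reference pattern, which is the kind of difference a trace-driven comparison measures.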