Graph-specific computing with the support of dedicated accelerator has greatly boosted the graph processing in both efficiency and energy. Nevertheless, their data conflict management is still sequential in essential when some vertex needs a large number of conflicting updates at the same time, leading to prohibitive performance degradation. This is particularly true for processing natural graphs.In this paper, we have the insight that the atomic operations for the vertex updating of many graph algorithms (e.g., BFS, PageRank and WCC) are typically incremental and simplex. This hence allows us to parallelize the conflicting vertex updates in an accumulative manner. We architect a novel graphspecific accelerator that can simultaneously process atomic vertex updates for massive parallelism on the conflicting data access while ensuring the correctness. A parallel accumulator is designed to remove the serialization in atomic protection for conflicting vertex updates through merging their results in parallel. Our implementation on Xilinx Virtex UltraScale+ XCVU9P with a wide variety of typical graph algorithms shows that our accelerator achieves an average throughput by 2.36 GTEPS as well as up to 3.14x performance speedup in comparison with state-of-the-art ForeGraph (with single-chip version).Input: Graph G = (V , E), root vertex r Output: Distance of each ∈ V , dis[ ] 1 Q ← r ; 2 dis[r ] = 0; 3 while Q is not empty do 4
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.