Keval Vora scite author profile

Large-scale parallel graph analytics involves executing iterative algorithms (e.g., PageRank, Shortest Paths, etc.) that are both data- and compute-intensive. In this work we construct faster versions of iterative graph algorithms from their original counterparts using input graph reduction. A large input graph is transformed into a small graph using a sequence of input reduction transformations. Savings in execution time are achieved using our two-phased processing model that effectively runs the original iterative algorithm in two phases: first, using the reduced input graph to gain savings in execution time; and second, using the original input graph along with the results from the first phase for computing precise results. We propose several input reduction transformations and identify the structural and non-structural properties that they guarantee, which in turn are used to ensure the correctness of results while using our two-phased processing model. We further present a unified input reduction algorithm that efficiently applies a non-interfering sequence of simple local input reduction transformations. Our experiments show that our transformation techniques enable significant reductions in execution time (1.25× to 2.14×) while achieving precise final results for most of the algorithms. For cases where precise results cannot be achieved, the relative error remains very small (at most 0.065).
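The two-phase idea described in this abstract can be illustrated with a minimal sketch for single-source shortest paths. Everything named here is an assumption made for illustration: `drop_sinks` is a hypothetical reduction (it removes edges leading into vertices that have no outgoing edges) and is not claimed to be one of the paper's transformations, and edge weights are assumed non-negative. Because the reduced graph is a subgraph of the original, the phase 1 distances are lengths of real paths, so phase 2 can start from them and still converge to exact results.

```python
import math


def relax_until_stable(edges, dist):
    """Relax every edge repeatedly until no distance changes.

    Returns the number of passes made over the edge list.
    """
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
    return passes


def drop_sinks(edges):
    """Hypothetical reduction: drop edges that lead into sink vertices."""
    has_outgoing = {u for u, _, _ in edges}
    return [(u, v, w) for u, v, w in edges if v in has_outgoing]


def two_phase_sssp(vertices, edges, source):
    dist = {v: math.inf for v in vertices}
    dist[source] = 0.0
    # Phase 1: run the unchanged iterative algorithm on the reduced graph.
    relax_until_stable(drop_sinks(edges), dist)
    # Phase 2: rerun on the original graph, starting from phase 1 values.
    relax_until_stable(edges, dist)
    return dist


vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1), ("b", "e", 4)]
result = two_phase_sssp(vertices, edges, "a")
```

In this toy input the vertices `d` and `e` are sinks, so phase 1 works on three edges instead of five, and phase 2 only has to fill in the values for the dropped vertices.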

Synergistic Analysis of Evolving Graphs

Vora

Gupta

2016

ACM Trans. Archit. Code Optim.

Evolving graph processing involves repeating analyses, which are often iterative, over multiple snapshots of the graph corresponding to different points in time. Since the snapshots of an evolving graph share a great number of vertices and edges, traditional approaches that process these snapshots one at a time without exploiting this overlap waste considerable effort on both data loading and computation, making them extremely inefficient. In this article, we identify major sources of inefficiencies and present two optimization techniques to address them. First, we propose a technique for amortizing the fetch cost by merging fetching of values for different snapshots of the same vertex. Second, we propose a technique for amortizing the processing cost by feeding values computed by earlier snapshots into later snapshots. We have implemented these optimizations in two distributed graph processing systems, namely, GraphLab and ASPIRE. Our experiments with multiple real evolving graphs and algorithms show that, on average, fetch amortization speeds up execution of GraphLab and ASPIRE by 5.2× and 4.1×, respectively. Amortizing the processing cost yields additional average speedups of 2× and 7.9×, respectively.
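The second optimization, feeding values computed by earlier snapshots into later ones, can be sketched in a few lines for connected components computed by min-label propagation. This is an illustrative sketch, not the GraphLab or ASPIRE implementation: the function names are hypothetical, and it assumes each later snapshot only adds edges to the previous one, so labels from the earlier snapshot remain valid starting points.

```python
def propagate_min_labels(edges, labels):
    """Spread the smallest label across each undirected edge until stable.

    Returns the number of passes made over the edge list.
    """
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for u, v in edges:
            low = min(labels[u], labels[v])
            if labels[u] != low or labels[v] != low:
                labels[u] = labels[v] = low
                changed = True
    return passes


def analyze_snapshots(vertices, snapshots):
    """Process snapshots in time order, seeding each from the previous result."""
    results = []
    labels = {v: v for v in vertices}
    for edges in snapshots:
        labels = dict(labels)  # start from the earlier snapshot's values
        propagate_min_labels(edges, labels)
        results.append(labels)
    return results


vertices = range(6)
snapshot_1 = [(0, 1), (2, 3)]
snapshot_2 = snapshot_1 + [(1, 2), (4, 5)]  # assumed: edges are only added
results = analyze_snapshots(vertices, [snapshot_1, snapshot_2])
```

Seeding the second snapshot with the first snapshot's labels means only the vertices affected by the new edges need to change, rather than recomputing every component from scratch.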

Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs

Khorasani

Vora

Gupta

et al. 2017

CoRAL

Vora

Tian

Gupta

et al. 2017

Existing distributed asynchronous graph processing systems employ checkpointing to capture globally consistent snapshots and roll back all machines to the most recent checkpoint to recover from machine failures. In this paper we argue that recovery in distributed asynchronous graph processing does not require the entire execution state to be rolled back to a globally consistent state due to the relaxed asynchronous execution semantics. We define the properties required in the recovered state for it to be usable for correct asynchronous processing and develop CoRAL, a lightweight checkpointing and recovery algorithm. First, this algorithm carries out confined recovery that only rolls back graph execution states of the failed machines to effect recovery. Second, it relies upon lightweight checkpoints that capture locally consistent snapshots with a reduced peak network bandwidth requirement. Our experiments using real-world graphs show that our technique recovers from failures and finishes processing 1.5× to 3.2× faster compared to the traditional asynchronous checkpointing and recovery mechanism when failures impact 1 to 6 machines of a 16-machine cluster. Moreover, capturing locally consistent snapshots significantly reduces intermittent high peak bandwidth usage required to save the snapshots: the average reduction in 99th percentile bandwidth ranges from 22% to 51% while 1 to 6 snapshot replicas are being maintained.

Keval Vora

CuSha

Efficient Processing of Large Graphs via Input Reduction

Synergistic Analysis of Evolving Graphs

Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs

CoRAL
