Kanakagiri, Raghavendra scite author profile

Kanakagiri, Raghavendra

4Publications

9Citation Statements Received

371Citation Statements Given

How they've been cited

How they cite others

238

371

Affiliations

Publications

Order By: Most citations

SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems

Besta¹,

Raghavendra²,

Kwasniewski³

et al. 2021

Preprint

View full text Add to dashboard Cite

Simple graph algorithms such as PageRank have recently been the target of numerous hardware accelerators. Yet, there also exist much more complex graph mining algorithms for problems such as clustering or maximal clique listing. These algorithms are memory-bound and thus could be accelerated by hardware techniques such as Processing-in-Memory (PIM). However, they also come with non-straightforward parallelism and complicated memory access patterns. In this work, we address this with a simple yet surprisingly powerful observation: operations on sets of vertices, such as intersection or union, form a large part of many complex graph mining algorithms, and can offer rich and simple parallelism at multiple levels. This observation drives our cross-layer design, in which we (1) expose set operations using a novel programming paradigm, (2) express and execute these operations efficiently with carefully designed set-centric ISA extensions called SISA, and (3) use PIM to accelerate SISA instructions. The key design idea is to alleviate the bandwidth needs of SISA instructions by mapping set operations to two types of PIM: in-DRAM bulk bitwise computing for bitvectors representing high-degree vertices, and near-memory logic layers for integer arrays representing low-degree vertices. Set-centric SISA-enhanced algorithms are efficient and outperform hand-tuned baselines, offering more than 10× speedup over the established Bron-Kerbosch algorithm for listing maximal cliques. We deliver more than 10 SISA set-centric algorithm formulations, illustrating SISA's wide applicability.

show abstract

Communication-Efficient Jaccard Similarity for High-Performance Distributed Genome Comparisons

Besta¹,

Raghavendra²,

Mustafa³

et al. 2019

Preprint

View full text Add to dashboard Cite

The Jaccard similarity index is an important measure of the overlap of two sets, widely used in machine learning, computational genomics, information retrieval, and many other areas. We design and implement SimilarityAtScale, the first communication-efficient distributed algorithm for computing the Jaccard similarity among pairs of large datasets. Our algorithm provides an efficient encoding of this problem into a multiplication of sparse matrices. Both the encoding and sparse matrix product are performed in a way that minimizes data movement in terms of communication and synchronization costs. We apply our algorithm to obtain similarity among all pairs of a set of large samples of genomes. This task is a key part of modern metagenomics analysis and an evergrowing need due to the increasing availability of high-throughput DNA sequencing data. The resulting scheme is the first to enable accurate Jaccard distance derivations for massive datasets, using largescale distributed-memory systems. We package our routines in a tool, called GenomeAtScale, that combines the proposed algorithm with tools for processing input sequences. Our evaluation on real data illustrates that one can use GenomeAtScale to effectively employ tens of thousands of processors to reach new frontiers in large-scale genomic and metagenomic analysis. While GenomeAtScale can be used to foster DNA research, the more general underlying SimilarityAtScale algorithm may be used for high-performance distributed similarity computations in other data analytics application domains.

show abstract

Motif Prediction with Graph Neural Networks

Besta¹,

Grob²,

Miglioli³

et al. 2021

Preprint

View full text Add to dashboard Cite

Link prediction is one of the central problems in graph mining. However, recent studies highlight the importance of higher-order network analysis, where complex structures called motifs are the first-class citizens. We first show that existing link prediction schemes fail to effectively predict motifs. To alleviate this, we establish a general motif prediction problem and we propose several heuristics that assess the chances for a specified motif to appear. To make the scores realistic, our heuristics consider -among others -correlations between links, i.e., the potential impact of some arriving links on the appearance of other links in a given motif. Finally, for highest accuracy, we develop a graph neural network (GNN) architecture for motif prediction. Our architecture offers vertex features and sampling schemes that capture the rich structural properties of motifs. While our heuristics are fast and do not need any training, GNNs ensure highest accuracy of predicting motifs, both for dense (e.g., k-cliques) and for sparse ones (e.g., k-stars). We consistently outperform the best available competitor by more than 10% on average and up to 32% in area under the curve. Importantly, the advantages of our approach over schemes based on uncorrelated link prediction increase with the increasing motif size and complexity. We also successfully apply our architecture for predicting more arbitrary clusters and communities, illustrating its potential for graph mining beyond motif analysis.

show abstract

Parallel Minimum Spanning Forest Computation using Sparse Matrix Kernels

Tim¹,

Raghavendra²,

Solomonik³

2021

Preprint

View full text Add to dashboard Cite

Formulations of graph algorithms using sparse linear algebra have yielded highly scalable distributed algorithms for problems such as connectivity and shortest path computation. We develop the first formulation of the Awerbuch-Shiloach parallel minimum spanning forest (MSF) algorithm using linear algebra primitives. We introduce a multilinear kernel that operates on an adjacency matrix and two vectors. This kernel updates graph vertices by simultaneously using information from both adjacent edges and vertices. In addition, we explore optimizations to accelerate the shortcutting step in the Awerbuch-Shiloach algorithm. We implement this MSF algorithm with Cyclops, a distributed-memory library for generalized sparse tensor algebra. We analyze the parallel scalability of our implementation on the Stampede2 supercomputer.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.