We use a variety of techniques to optimize the single-threaded operation

    C = C + A * B

A number of optimization techniques yielded significant speedups, including multi-level blocking, copy optimizations, and loop adjustments. However, we also tried several other optimizations that did not improve our program, including prefetching. In this report, we describe each optimization in our final submission, present results with evidence that they work, and describe attempted optimizations that did not noticeably improve our overall performance. Unless otherwise noted, all of our benchmarks operate on matrices of size 1024x1024.

In our attempts at optimizing matrix multiplication, we find that taking full advantage of spatial locality (through techniques such as loop rearrangement and copy optimizations) and of instruction-level parallelism greatly improves performance. Furthermore, techniques like prefetching can be a double-edged sword: prefetching can help performance when used correctly but can also hurt it. Finally, we find that the compiler applies optimizations out of the box, such as loop unrolling, that further improve performance.
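To make the blocking and loop-rearrangement ideas above concrete, here is a minimal sketch of a single-level blocked kernel for C = C + A * B on row-major square matrices. The function name `dgemm_blocked` and the tile size `BLOCK` are illustrative assumptions, not the tuned values or code from the final submission; a real implementation would add further levels of blocking and copy optimization.

```c
#include <stddef.h>

/* Illustrative tile size; the report's tuned value may differ. */
#define BLOCK 64

/* C = C + A * B for n x n row-major matrices.
   Blocking keeps a BLOCK x BLOCK working set of each matrix in cache;
   the i-k-j loop order inside a tile streams B and C rows contiguously,
   exploiting spatial locality. */
static void dgemm_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t kk = 0; kk < n; kk += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t i_end = ii + BLOCK < n ? ii + BLOCK : n;
                size_t k_end = kk + BLOCK < n ? kk + BLOCK : n;
                size_t j_end = jj + BLOCK < n ? jj + BLOCK : n;
                for (size_t i = ii; i < i_end; ++i)
                    for (size_t k = kk; k < k_end; ++k) {
                        double a = A[i * n + k]; /* reused across the j loop */
                        for (size_t j = jj; j < j_end; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}
```

The j-innermost ordering also gives the compiler a unit-stride inner loop that it can unroll and vectorize, which is one reason out-of-the-box compiler optimizations help here.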
Cuts in graphs are a fundamental object of study, and play a central role in the design of graph algorithms. The problem of sparsifying a graph while approximately preserving its cut structure has been extensively studied and has many applications. In a seminal work, Benczúr and Karger (1996) showed that given any n-vertex undirected weighted graph G and a parameter ε ∈ (0, 1), there is a near-linear time algorithm that outputs a weighted subgraph G′ of G of size Õ(n/ε²) such that the weight of every cut in G is preserved to within a (1 ± ε)-factor in G′. The graph G′ is referred to as a (1 ± ε)-approximate cut sparsifier of G.

A natural question is whether such cut-preserving sparsifiers also exist for hypergraphs. Kogan and Krauthgamer (2015) initiated a study of this question and showed that given any weighted hypergraph H in which the cardinality of each hyperedge is bounded by r, there is a polynomial-time algorithm to find a (1 ± ε)-approximate cut sparsifier of H of size Õ(nr/ε²). Since r can be as large as n, in general this gives a hypergraph cut sparsifier of size Õ(n²/ε²), which is a factor of n larger than the Benczúr-Karger bound for graphs. It has been an open question whether the Benczúr-Karger bound is achievable on hypergraphs. In this work, we resolve this question in the affirmative by giving a new polynomial-time algorithm for creating hypergraph cut sparsifiers of size Õ(n/ε²).