Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, 2020

# Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning

**Abstract:** Weight pruning is a popular technique for reducing the size and computational complexity of convolutional neural networks (CNNs). Despite its success in reducing model size, weight pruning has brought limited benefit to CNN inference performance, due to the irregularity introduced in sparse convolution operations. In this work, we aim to improve the performance of sparse convolutions on GPUs by mitigating this irregularity. We find that the existing performance optimization techniques for sparse matr…
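As background for the abstract, the kind of weight pruning it refers to can be illustrated with a minimal, hypothetical sketch of magnitude-based pruning (a common baseline, not necessarily the paper's exact method): weights with the smallest absolute values are zeroed until a target sparsity is reached, which shrinks the model but leaves an irregular nonzero pattern.

```python
# Hypothetical sketch of magnitude-based weight pruning (illustrative only,
# not the paper's algorithm): zero out the weights whose absolute value
# falls below a threshold chosen to reach the target sparsity.

def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the given fraction of entries zeroed."""
    flat = sorted(abs(w) for w in weights)
    k = int(len(flat) * sparsity)                    # number of weights to remove
    threshold = flat[k - 1] if k > 0 else float("-inf")
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = prune_by_magnitude([0.9, -0.05, 0.4, 0.01, -0.7, 0.2], 0.5)
# -> [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]: half the weights are zeroed
```

The surviving nonzeros land at arbitrary positions, which is exactly the irregularity the abstract says hurts sparse convolution performance on GPUs.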


“…To combine the benefits of structured and unstructured pruning, hybrid pruning strategies have been introduced to pursue more general structural sparse patterns that are also amenable to acceleration: for example, convolution kernels with half-regular or pattern-based structural sparsity (Ma et al., 2020), vector-wise regular sparsity (Zhu et al., 2019), and group-wise regular sparsity (Rumi et al., 2020).…”

confidence: 99%


“…In other words, the dense matrices in identified structural patterns have a restricted shape where one dimension must align with the kernel size n, i.e., the product of the number of input channels, the kernel height, and the kernel width. Motivated by Rumi et al. (2020), we introduce a regrouping strategy (Figure 2) to create more fine-grained group-wise structural patterns with flexible shapes for the remaining dense matrices.…”

confidence: 99%
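The regrouping idea quoted above can be sketched with a small, hypothetical example (the matrix and function names are illustrative, not from the cited paper): rows of a pruned weight matrix that share the same nonzero-column pattern are grouped together, so each group forms a small dense block that a GPU kernel can process regularly.

```python
# Hypothetical sketch of regrouping for group-wise structural sparsity:
# rows sharing an identical nonzero-column pattern are collected into one
# group, yielding dense sub-blocks with flexible shapes.
from collections import defaultdict

def regroup_rows(matrix):
    """Map each nonzero-column pattern to the row indices that share it."""
    groups = defaultdict(list)
    for i, row in enumerate(matrix):
        pattern = tuple(j for j, v in enumerate(row) if v != 0)
        groups[pattern].append(i)
    return dict(groups)

W = [
    [1, 0, 2, 0],
    [0, 3, 0, 4],
    [5, 0, 6, 0],
    [0, 7, 0, 8],
]
groups = regroup_rows(W)
# rows 0 and 2 share columns (0, 2); rows 1 and 3 share columns (1, 3),
# so each group can be stored and multiplied as a dense 2x2 block
```

Each resulting group is a dense submatrix, which avoids the restriction that one dimension must align with the kernel size n.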