2020
DOI: 10.48550/arxiv.2002.02949
Preprint

Activation Density driven Energy-Efficient Pruning in Training

Abstract: The process of neural network pruning with suitable fine-tuning and retraining can yield networks with considerably fewer parameters than the original at a comparable level of accuracy. Typically, pruning methods require large, pre-trained networks as a starting point, from which they perform a time-intensive iterative pruning and retraining algorithm. We propose a novel pruning-in-training method that prunes a network in real time during training, reducing the overall training time to achieve an optimal compresse…
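The abstract is truncated above. As a rough illustration of the general idea it describes (tracking how dense a layer's activations are while training, and pruning low-density channels on the fly), a minimal PyTorch-style sketch follows. The class name, momentum scheme, and threshold are assumptions for illustration only, not the authors' implementation.

# Illustrative sketch only -- not the authors' implementation.
# Assumes pruning decisions are driven by per-channel activation density,
# i.e. the fraction of non-zero post-ReLU activations, measured during training.
import torch
import torch.nn as nn

class DensityTracker:
    """Tracks a running estimate of activation density per output channel."""
    def __init__(self, num_channels, momentum=0.9):
        self.density = torch.ones(num_channels)
        self.momentum = momentum

    def update(self, activations):
        # activations: (N, C, H, W), post-ReLU
        batch_density = (activations > 0).float().mean(dim=(0, 2, 3))
        self.density = self.momentum * self.density + (1 - self.momentum) * batch_density

    def channels_to_keep(self, threshold=0.1):
        # Channels whose activations are almost always zero fall below the threshold.
        return (self.density >= threshold).nonzero(as_tuple=True)[0]

# Usage inside a training loop (hypothetical layer `conv` followed by ReLU):
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
tracker = DensityTracker(num_channels=64)
x = torch.randn(8, 3, 32, 32)
act = torch.relu(conv(x))
tracker.update(act)
keep = tracker.channels_to_keep(threshold=0.1)  # indices of channels to retain
# A structured pruning step would then rebuild `conv` with only the `keep` output channels.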

Cited by 1 publication (4 citation statements)
References 10 publications (24 reference statements)
“…There is relative lack of research on networks that have both forms of sparsity (exceptions are [1,94]). In some scenarios networks trained without explicit activation sparsity end up with highly sparse activations anyway [8,19,20,33]. This is encouraging because it suggests that sparse activations may naturally be an optimal outcome.…”
Section: Deploying Complex Sparse-sparse Systems
Mentioning confidence: 84%
“…In our implementation we use k-WTA to achieve activation sparsity. Another approach is to remove entire channels in convolutional layers during training through a structured pruning process [19,44,73]. In [19] they notice that activations naturally become sparse during training and use a measure of sparsity to gradually prune channels.…”
Section: Accelerating Sparse Network On Other Platforms
Mentioning confidence: 99%
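The statement above mentions a k-winners-take-all (k-WTA) activation as one route to activation sparsity, alongside structured channel pruning. For context, a minimal k-WTA sketch is given below; the function name and the flattened, per-sample formulation are assumptions for illustration, not the citing paper's code.

# Minimal k-winners-take-all (k-WTA) sketch -- illustrative only.
import torch

def kwta(x, k):
    """Keep the k largest activations per sample (flattened view); zero the rest."""
    flat = x.flatten(start_dim=1)               # (N, D)
    topk_vals, _ = flat.topk(k, dim=1)
    threshold = topk_vals[:, -1].unsqueeze(1)   # k-th largest value per sample
    mask = (flat >= threshold).float()
    return (flat * mask).view_as(x)

x = torch.randn(4, 16)
sparse_x = kwta(x, k=4)  # only the 4 largest entries per row survive (ties aside)

Ties at the k-th value may keep slightly more than k entries per sample; the exact variant used by the citing work may differ.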