2020
DOI: 10.1609/aaai.v34i04.5938

Structured Sparsification of Gated Recurrent Neural Networks

Abstract: One of the most popular approaches to neural network compression is sparsification: learning sparse weight matrices. In structured sparsification, weights are set to zero in groups corresponding to structural units, e.g., neurons. We further develop the structured sparsification approach for gated recurrent neural networks, e.g., Long Short-Term Memory (LSTM). Specifically, in addition to the sparsification of individual weights and neurons, we propose sparsifying the preactivations of gates. This makes s…
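A minimal sketch of the kind of structured, gate-level pruning the abstract describes is given below. It uses a simple magnitude criterion on the stacked LSTM weight rows in PyTorch, whereas the paper derives sparsity from a Bayesian objective, so the `prune_gate_rows` helper and its threshold are illustrative assumptions rather than the authors' method.

```python
# A rough illustration of structured gate-level pruning for an LSTM.
# Assumptions (not from the paper): a magnitude-based group criterion and
# the threshold value; the paper obtains sparsity via Bayesian sparsification.
import torch
import torch.nn as nn

def prune_gate_rows(lstm: nn.LSTM, threshold: float = 0.5) -> None:
    """Zero the input-to-hidden and hidden-to-hidden weight rows of every
    gate preactivation whose combined L2 norm falls below `threshold`.
    A pruned gate's preactivation then depends only on its bias, i.e. the
    gate becomes constant, which simplifies the LSTM structure."""
    with torch.no_grad():
        w_ih = lstm.weight_ih_l0  # (4*hidden_size, input_size), gate order i, f, g, o
        w_hh = lstm.weight_hh_l0  # (4*hidden_size, hidden_size)
        # One group per (gate, hidden unit): the matching row in both matrices.
        group_norm = torch.sqrt(w_ih.pow(2).sum(dim=1) + w_hh.pow(2).sum(dim=1))
        mask = (group_norm >= threshold).to(w_ih.dtype).unsqueeze(1)
        w_ih.mul_(mask)
        w_hh.mul_(mask)

lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
prune_gate_rows(lstm)
```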

Cited by 10 publications (8 citation statements)
References 14 publications (31 reference statements)
“…Earlier works have focused on inducing sparsity in standard feed-forward neural networks. Yet, Bayesian pruning methods have also been successfully applied to recurrent neural networks (RNNs) [Kodryan et al. 2019; Lobacheva et al. 2018]. Lobacheva et al. [2018] use Sparse VD to prune individual weights of an LSTM, or follow the approach from Louizos et al. [2017] to sparsify neurons or gates, and show results on text classification and language modeling problems.…”
Section: Variational Selection Schemes (mentioning)
confidence: 99%
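The Sparse VD approach referenced in the statement above prunes individual weights according to their learned dropout rates. Below is a minimal, self-contained sketch of that pruning rule; the tensor shapes, the `sparse_vd_mask` name, and the log-alpha threshold of 3 are assumptions for illustration (the threshold is a commonly used value, not one taken from the cited works).

```python
# A minimal sketch of the pruning rule in sparse variational dropout
# (Sparse VD), assuming per-weight posterior means `theta` and log-variances
# `log_sigma2` have already been learned; the log-alpha threshold of 3 is a
# commonly used value, not one taken from the works cited above.
import torch

def sparse_vd_mask(theta: torch.Tensor,
                   log_sigma2: torch.Tensor,
                   log_alpha_threshold: float = 3.0) -> torch.Tensor:
    """Keep weights with a low effective dropout rate alpha, where
    log(alpha) = log(sigma^2) - log(theta^2); a large alpha means the weight
    is dominated by noise and can be set to zero."""
    log_alpha = log_sigma2 - torch.log(theta.pow(2) + 1e-8)
    return (log_alpha < log_alpha_threshold).to(theta.dtype)

# Hypothetical usage on a single LSTM weight matrix:
theta = torch.randn(256, 128)             # learned posterior means
log_sigma2 = torch.randn(256, 128) - 4.0  # learned posterior log-variances
pruned = theta * sparse_vd_mask(theta, log_sigma2)
print(f"kept {pruned.ne(0).float().mean().item():.1%} of the weights")
```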
“…In future work, we intend to investigate other types of priors over the network parameters (e.g., sparse priors (Lobacheva et al., 2017)). We would also like to explicitly quantify the uncertainty captured in our framework under different sampling strategies or MCMC-SG methods (e.g., similar to McClure and Kriegeskorte (2016); Teye et al. (2018)).…”
Section: Discussion (mentioning)
confidence: 99%
“…One more drawback of neural networks is that they are slower than classic ML algorithms such as linear models or boosting, and require more memory to store parameters. But there are several techniques aimed at reducing the time and memory complexity of the trained models [13, 22–24].…”
Section: Neural Network (mentioning)
confidence: 99%