Interspeech 2019
DOI: 10.21437/interspeech.2019-2811
ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning

Abstract: End-to-end automatic speech recognition (ASR) models are increasingly large and complex to achieve the best possible accuracy. In this paper, we build an AutoML system that uses reinforcement learning (RL) to optimize the per-layer compression ratios when applied to a state-of-the-art attention-based end-to-end ASR model composed of several LSTM layers. We use singular value decomposition (SVD) low-rank matrix factorization as the compression method. For our RL-based AutoML system, we focus on practical consi…
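The SVD low-rank factorization named in the abstract replaces a weight matrix with the product of two smaller matrices, keeping only the top singular values. A minimal sketch of this step with NumPy (the function name and example matrix sizes are illustrative, not from the paper):

```python
import numpy as np

def svd_compress(W: np.ndarray, rank: int):
    """Factor W (m x n) into A (m x rank) and B (rank x n) whose product
    approximates W, keeping only the top `rank` singular values."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 256))   # e.g. one LSTM gate matrix
A, B = svd_compress(W, rank=64)

# Parameter count drops from m*n to rank*(m+n).
print(W.size, A.size + B.size)  # 131072 49152
```

In an LSTM layer the single matrix multiply `W @ x` becomes `A @ (B @ x)`, which cuts both parameters and multiply-adds whenever `rank < m*n / (m + n)`.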

Cited by 20 publications (14 citation statements)
References 19 publications
“…In this manner, we remove both the excessive overhead of fine-tuning and the need for labelled data availability, which is crucial for real-world, privacy-aware applications (Wainwright et al., 2012; Shokri & Shmatikov, 2015). Finally, other model compression methods (Fang et al., 2018; Wang et al., 2019a; Dudziak et al., 2019) remain orthogonal to FjORD. System Heterogeneity.…”
Section: Related Work
confidence: 99%
“…NAS aims at automatically designing neural network architectures [15][16][17][18]. For example, He et al. [17] proposed the AutoML for Model Compression (AMC) method and introduced reinforcement learning to learn the optimal parameters of pruning; Dudziak et al. [18] used reinforcement learning to select the per-layer compression ratios based on matrix approximation. However, such methods require massive computation during the search.…”
Section: Related Work
confidence: 99%
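The excerpt above describes RL agents choosing per-layer compression ratios. For the SVD setting, a chosen ratio must still be translated into a concrete rank for each layer; a hedged sketch of one plausible mapping (the helper is an assumption for illustration, not the paper's API):

```python
def ratio_to_rank(m: int, n: int, ratio: float) -> int:
    """Pick an SVD rank so that the factored layer keeps roughly `ratio`
    of the original m*n parameters: rank * (m + n) ~= ratio * m * n."""
    rank = max(1, int(ratio * m * n / (m + n)))
    return min(rank, min(m, n))   # rank cannot exceed min(m, n)

# e.g. a 1024x1024 LSTM gate matrix compressed to ~25% of its parameters:
print(ratio_to_rank(1024, 1024, 0.25))  # 128
```

An RL controller would emit one such ratio per layer, evaluate the compressed model's accuracy as the reward, and iterate; the mapping above simply keeps the per-layer parameter budget consistent with the chosen ratio.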
“…Neural Architecture Search for Super-resolution. Recent SR works aim to build more efficient models using NAS, which has been vastly successful in a wide range of tasks such as image classification [57,55,38], language modeling [56], and automatic speech recognition [12]. We mainly focus on previous works that adopt NAS for SR and refer the reader to Elsken et al. [13] for a detailed survey on NAS.…”
Section: Background and Related Work
confidence: 99%