Interspeech 2016
DOI: 10.21437/Interspeech.2016-128
On the Efficient Representation and Execution of Deep Acoustic Models

Abstract: In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values. The proposed quantization scheme leads to significant memory savings and enables the use of optimized hardware instructions for integer arithmetic, thus significantly reducing the cost of inference. Finally, we propose a 'quantization aware' training process that applies the proposed scheme d…
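The abstract describes mapping 32-bit float parameters onto an 8-bit integer grid. As a minimal sketch of how such a scheme typically works (a min/max affine mapping; the paper's exact formulation may differ, and all names below are illustrative):

```python
import numpy as np

def quantize_uint8(w):
    """Affine quantization of a float32 tensor to 8-bit integers.

    Hypothetical sketch: maps [w.min(), w.max()] linearly onto the
    256 levels of a uint8 grid; not necessarily the paper's formula.
    """
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 or 1.0  # guard against a constant tensor
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize_uint8(q, scale, w_min):
    """Recover an approximate float tensor from its uint8 encoding."""
    return q.astype(np.float32) * scale + w_min

# Round-trip error is bounded by half a quantization step.
w = np.random.randn(500, 200).astype(np.float32)
q, scale, w_min = quantize_uint8(w)
assert np.abs(w - dequantize_uint8(q, scale, w_min)).max() <= scale / 2 + 1e-6
```

Storing q instead of w cuts memory 4x, and integer matrix multiplies can use the optimized hardware instructions the abstract mentions.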

Cited by 35 publications (26 citation statements) | References 14 publications
“…which defines the quantization range of both weight matrices W^(1) and W^(2). However, our optimization goal is not affected by the choice of the other s_j, given the resulting r_j^(1) and r_j^(2) are smaller than r_i^(1) and r_i^(2), respectively. To break the ties between solutions we set ∀i : r_i^(1) = r_i^(2).…”
Section: A. Optimal Range Equalization of Two Layers
confidence: 99%
“…However, our optimization goal is not affected by the choice of the other s_j, given the resulting r_j^(1) and r_j^(2) are smaller than r_i^(1) and r_i^(2), respectively. To break the ties between solutions we set ∀i : r_i^(1) = r_i^(2). Thus the channels' ranges between both tensors are matched as closely as possible and the introduced quantization error is spread equally among both weight tensors.…”
Section: A. Optimal Range Equalization of Two Layers
confidence: 99%
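The excerpts above discuss choosing per-channel scales s_i so that the quantization ranges of two consecutive weight tensors match. A sketch of that idea under common assumptions (a ReLU between the layers, whose positive homogeneity makes the rescaling function-preserving; all names are illustrative):

```python
import numpy as np

def equalize_ranges(w1, w2):
    """Cross-layer range equalization for y = w2 @ relu(w1 @ x).

    Dividing row i of w1 by s_i and multiplying column i of w2 by s_i
    leaves the network function unchanged (relu(a*z) = a*relu(z) for
    a > 0). Choosing s_i = sqrt(r1_i / r2_i) equalizes the per-channel
    ranges, r1_i = r2_i = sqrt(r1_i * r2_i), matching the tie-breaking
    rule quoted above.
    """
    r1 = np.abs(w1).max(axis=1)  # range of each output channel of layer 1
    r2 = np.abs(w2).max(axis=0)  # range of each input channel of layer 2
    s = np.sqrt(r1 / r2)         # assumes strictly positive ranges
    return w1 / s[:, None], w2 * s[None, :]

w1 = np.random.randn(300, 400)
w2 = np.random.randn(200, 300)
w1_eq, w2_eq = equalize_ranges(w1, w2)
assert np.allclose(np.abs(w1_eq).max(axis=1), np.abs(w2_eq).max(axis=0))
```

With the ranges matched, a shared quantization grid wastes less resolution on outlier channels, spreading the quantization error equally across both tensors, as the excerpt notes.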
“…The CTC acoustic model (AM) consists of 5 layers of 500 LSTM cells that predict context-independent phonemes as output targets. The system is heavily compressed, both by quantization [43] and by the application of low-rank projection layers with 200 units between consecutive LSTM layers [44]. The AM consists of 4.6 million parameters in total.…”
Section: Model Details
confidence: 99%
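A rough back-of-the-envelope check of the quoted parameter count; the input feature dimension and number of output targets below are assumptions, not values from the paper:

```python
def lstmp_params(n_in, n_cell, n_proj):
    """Parameters of one LSTM layer with a low-rank output projection.

    The four gates act on the concatenated [input; recurrent projection];
    the projection matrix maps the n_cell outputs down to n_proj units.
    Peephole connections are ignored for simplicity.
    """
    gates = 4 * n_cell * (n_in + n_proj)  # input + recurrent weights
    biases = 4 * n_cell
    projection = n_cell * n_proj
    return gates + biases + projection

n_feat, n_targets = 200, 45  # assumed values, for illustration only
total = sum(lstmp_params(n_feat if i == 0 else 200, 500, 200) for i in range(5))
total += 200 * n_targets + n_targets  # output layer
print(f"~{total / 1e6:.1f}M parameters")  # ~4.5M, near the quoted 4.6M
```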
“…However, ASR accuracy degrades significantly when the model is compressed heavily into even lower bits or the network structure becomes more complex. Therefore, refining with quantization is important for both very low-bit quantization [177], [178] and vector quantization [179], so that training and testing are consistent.…”
Section: Acoustic Models With Efficient Decoding
confidence: 99%
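The train/test consistency this excerpt calls for is usually achieved by simulating quantization in the forward pass while keeping float weights for the update, in line with the 'quantization aware' training the abstract proposes. A minimal sketch (the rounding is treated as the identity in the backward pass, the straight-through estimator; details vary by implementation):

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    """Forward-pass simulation of quantization for quantization-aware training.

    The layer computes with weights snapped onto the integer grid they
    will use at inference, so training sees the same quantization error
    as testing; the underlying float weights are what the optimizer updates.
    """
    levels = 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / levels or 1.0  # guard against a constant tensor
    return np.round((w - w_min) / scale) * scale + w_min

w = np.random.randn(4, 4).astype(np.float32)
w_q = fake_quantize(w)  # use w_q in the forward pass, update w itself
```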