2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016
DOI: 10.1109/icassp.2016.7472780

Highway long short-term memory RNNs for distant speech recognition

Abstract: In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural networks by introducing gated direct connections between memory cells in adjacent layers. These direct links, called highway connections, enable unimpeded information flow across different layers and thus alleviate the gradient vanishing problem when building deeper LSTMs. We further introduce the latency-controlled bidirectional LSTMs (BLSTMs) which can exploit the whole history while keeping the latency under control. Efficien…
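
The gated direct connection between memory cells in adjacent layers can be illustrated with a small sketch. It is an assumption-laden illustration, not the authors' implementation: the function name `highway_cell_update`, its argument names, and the choice to pass precomputed gate activations are all hypothetical. It shows one time step in which an extra carry gate `d` lets the lower layer's cell state flow directly into the upper layer's cell state, alongside the usual forget and input paths of an LSTM.

```python
import math


def sigmoid(x):
    """Logistic function, used here for the carry gate."""
    return 1.0 / (1.0 + math.exp(-x))


def highway_cell_update(c_lower, c_prev, i_gate, f_gate, candidate, d_pre):
    """One time step of a highway-style cell update (illustrative sketch).

    All arguments are equal-length lists of floats:
      c_lower   - cell state of the layer below at the current time step
      c_prev    - this layer's cell state at the previous time step
      i_gate    - input gate activations, assumed already in (0, 1)
      f_gate    - forget gate activations, assumed already in (0, 1)
      candidate - candidate cell values (e.g. tanh outputs)
      d_pre     - pre-activation of the carry gate (hypothetical name)
    """
    new_cell = []
    for cl, cp, i, f, g, dp in zip(c_lower, c_prev, i_gate, f_gate, candidate, d_pre):
        d = sigmoid(dp)  # carry gate: how much of the lower cell passes through
        # highway term + usual LSTM forget and input terms
        new_cell.append(d * cl + f * cp + i * g)
    return new_cell
```

When the carry gate is driven towards zero (a large negative `d_pre`), the update reduces to the ordinary LSTM cell update; when it is open, the lower layer's cell state reaches the upper layer without passing through a squashing non-linearity, which is the sense in which the flow is unimpeded.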

Cited by 264 publications (221 citation statements)
References 18 publications
“…A recent study in advanced acoustic modeling using deep long short-term memory (LSTM) recurrent neural networks reported significant improvement for AMI's single distant microphone (SDM) task with 47.7% WER, even though it does not consider multi-channel inputs [51]. This work may not be directly compared with our results since it used sequence discriminative training with dropout and DNN to force align the training data to generate labels for LSTM training.…”
Section: Multi-channel Integration In Acoustic Modeling
Citation type: contrasting
confidence: 47%
“…Highway Connections: To alleviate the vanishing gradient problem when training deep BiLSTMs, we use gated highway connections (Zhang et al., 2016; Srivastava et al., 2015). We include transform gates r_t to control the weight of linear and non-linear transformations between layers (See Figure 1).…”
Section: Deep BiLSTM Model
Citation type: mentioning
confidence: 99%
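
The transform gate mentioned in the statement above can be read as a per-unit blend between a layer's non-linear output and a linear projection of its input. The sketch below assumes that reading; the function name `highway_mix` and its arguments are hypothetical, and the citing paper's exact parameterization is not reproduced here.

```python
def highway_mix(r, nonlinear, linear):
    """Blend non-linear and linear paths with a transform gate (sketch).

    r         - transform gate values, assumed to lie in [0, 1]
    nonlinear - the layer's non-linear output (e.g. LSTM hidden state)
    linear    - a linear projection of the layer's input
    Returns r * nonlinear + (1 - r) * linear, element by element.
    """
    return [rt * n + (1.0 - rt) * l for rt, n, l in zip(r, nonlinear, linear)]
```

A gate value of 1 keeps only the non-linear path, a value of 0 passes the linear path through unchanged, and intermediate values interpolate between the two.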
“…Following Zhou and Xu (2015), we treat SRL as a BIO tagging problem and use deep bidirectional LSTMs. However, we differ by (1) simplifying the input and output layers, (2) introducing highway connections (Srivastava et al., 2015; Zhang et al., 2016), (3) using recurrent dropout (Gal and Ghahramani, 2016), (4) decoding with BIO constraints, and (5) ensembling with a product of experts. Our model gives a 10% relative error reduction over previous state of the art on the test sets of CoNLL 2005 and 2012.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This includes context-sensitive-chunk BLSTM (CSC-BLSTM) [25] and latency-controlled BLSTM (LC-BLSTM) [26]. Figure 2 shows the differences among these approaches.…”
Section: Local Window BLSTM
Citation type: mentioning
confidence: 99%