2017
DOI: 10.48550/arxiv.1708.04469
Preprint

Comparison of Decoding Strategies for CTC Acoustic Models


Cited by 8 publications (7 citation statements)
References 0 publications
“…We used an RNN-based LM, trained on graphemes as described in [22]. It featured 1 hidden layer with 1024 LSTM cells.…”
Section: Grapheme Based RNN LM (mentioning)
confidence: 99%
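
The cited setup is specific enough to illustrate: a grapheme-level language model with a single hidden layer of 1024 LSTM cells. Below is a minimal PyTorch sketch of such a model; the class name GraphemeLM, the vocabulary size, and the embedding dimension are illustrative assumptions, not details taken from the cited work.

```python
import torch
import torch.nn as nn

class GraphemeLM(nn.Module):
    """Single-layer LSTM language model over graphemes.

    hidden_dim=1024 matches the cited setup; vocab_size and
    embed_dim are assumptions chosen for illustration.
    """
    def __init__(self, vocab_size=50, embed_dim=128, hidden_dim=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=1, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, grapheme_ids, state=None):
        # grapheme_ids: (batch, seq_len) integer ids; returns logits over
        # the next grapheme at every position, plus the LSTM state.
        x = self.embed(grapheme_ids)
        out, state = self.lstm(x, state)
        return self.proj(out), state

# Example: next-grapheme logits for a batch of two 10-grapheme sequences.
logits, _ = GraphemeLM()(torch.randint(0, 50, (2, 10)))
```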
“…Except for the modeling unit, these models are very similar to conventional acoustic models and perform well when combined with an external LM during decoding (beam search) [23,24].…”
Section: Introduction (mentioning)
confidence: 99%
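
Combining CTC posteriors with an external LM in a beam search, as this statement describes, is the standard shallow-fusion recipe: a CTC prefix beam search in which every extension of a hypothesis by a new label is additionally scored by the LM. The sketch below is a minimal pure-Python version under stated assumptions: lm_score is a hypothetical callable mapping (prefix ids, next label id) to a log-probability, and the weight alpha is illustrative.

```python
import math
from collections import defaultdict

NEG_INF = -float("inf")

def logsumexp(*xs):
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))

def ctc_beam_search(log_probs, labels, blank=0, beam_size=8,
                    lm_score=None, alpha=0.5):
    """CTC prefix beam search with optional shallow LM fusion.

    log_probs: T x V per-frame log posteriors from the acoustic model.
    labels:    list mapping label ids to output symbols (blank at index 0).
    lm_score:  hypothetical callable (prefix_ids, next_id) -> log P_lm.
    """
    # Each prefix keeps two scores: probability mass ending in blank (p_b)
    # and ending in a non-blank label (p_nb), both in log space.
    beam = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beam = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beam.items():
            for v, p in enumerate(frame):
                if v == blank:
                    # Blank extends the prefix without emitting a label.
                    nb_b, nb_nb = next_beam[prefix]
                    next_beam[prefix] = (logsumexp(nb_b, p_b + p, p_nb + p), nb_nb)
                    continue
                lm = alpha * lm_score(prefix, v) if lm_score else 0.0
                new_prefix = prefix + (v,)
                nb_b, nb_nb = next_beam[new_prefix]
                if prefix and prefix[-1] == v:
                    # A repeated label only emits again after an intervening
                    # blank; otherwise the frame extends the same prefix.
                    next_beam[new_prefix] = (nb_b, logsumexp(nb_nb, p_b + p + lm))
                    ob_b, ob_nb = next_beam[prefix]
                    next_beam[prefix] = (ob_b, logsumexp(ob_nb, p_nb + p))
                else:
                    next_beam[new_prefix] = (
                        nb_b, logsumexp(nb_nb, p_b + p + lm, p_nb + p + lm))
        # Prune to the most probable prefixes.
        beam = dict(sorted(next_beam.items(),
                           key=lambda kv: logsumexp(*kv[1]),
                           reverse=True)[:beam_size])
    best = max(beam.items(), key=lambda kv: logsumexp(*kv[1]))[0]
    return "".join(labels[v] for v in best)
```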
“…To evaluate our setup, we used the same decoding procedure as in [3], greedily searching for the best path without an external language model, and evaluated our systems by computing the token error rate (TER) as the primary measure. In addition, we trained a character-based neural network language model for English on the training utterances, as described in [50], so that for the recognition of English we could also measure a word error rate (WER) by decoding the network outputs with this language model. As the language model is trained on only a small amount of data, the word error rate obtained with it should indicate whether the improvements in TER of the pure CTC model measured on English also lead to a better word-level speech recognition system.…”
Section: Discussion (mentioning)
confidence: 99%
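
The greedy (best-path) decoding and TER measure used in this statement are simple to state in code: take the argmax label per frame, collapse repeats, drop blanks, and normalize the Levenshtein distance by the reference length. A minimal sketch follows; the function names are illustrative.

```python
def greedy_ctc_decode(log_probs, blank=0):
    """Best-path CTC decoding: per-frame argmax, collapse repeats, drop blanks."""
    best_path = [frame.index(max(frame)) for frame in log_probs]
    decoded, prev = [], blank
    for label in best_path:
        if label != blank and label != prev:
            decoded.append(label)
        prev = label
    return decoded

def token_error_rate(reference, hypothesis):
    """Levenshtein edit distance between two token sequences,
    normalized by the reference length."""
    prev_row = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, 1):
        row = [i] + [0] * len(hypothesis)
        for j, hyp_tok in enumerate(hypothesis, 1):
            row[j] = min(prev_row[j] + 1,                         # deletion
                         row[j - 1] + 1,                          # insertion
                         prev_row[j - 1] + (ref_tok != hyp_tok))  # substitution
        prev_row = row
    return prev_row[-1] / max(len(reference), 1)
```

The WER mentioned in the quoted passage is the same edit-distance computation applied to word sequences rather than grapheme or token sequences.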