Proceedings of the Workshop on Human Language Technology (HLT '94), 1994
DOI: 10.3115/1075812.1075905

A one pass decoder design for large vocabulary recognition

Abstract: To achieve reasonable accuracy in large vocabulary speech recognition systems, it is important to use detailed acoustic models together with good long span language models. For example, in the Wall Street Journal (WSJ) task both cross-word triphones and a trigram language model are necessary to achieve state-of-the-art performance. However, when using these models, the size of a pre-compiled recognition network can make a standard Viterbi search infeasible and hence, either multiple-pass or asynchronous stack …
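
The search problem the abstract describes can be made concrete with a small sketch. The toy below is illustrative only, not the paper's decoder: `ARCS` is a hypothetical static graph and `emission_logp` a made-up score, whereas a real system would use cross-word triphone likelihoods over a dynamically built tree-structured network. It shows the time-synchronous Viterbi token passing with beam pruning that such one-pass decoders build on:

```python
import math

# Minimal time-synchronous Viterbi beam search using token passing.
# state -> list of (next_state, transition log-prob, emitted symbol)
ARCS = {
    0: [(1, math.log(0.6), "a"), (2, math.log(0.4), "b")],
    1: [(1, math.log(0.5), "a"), (3, math.log(0.5), "c")],
    2: [(3, math.log(1.0), "c")],
    3: [],  # final state
}

def emission_logp(symbol, obs):
    # Toy emission score; a real decoder would use acoustic likelihoods
    # from context-dependent (e.g. cross-word triphone) HMM states.
    return math.log(0.8) if symbol == obs else math.log(0.1)

def viterbi_beam(observations, beam=10.0):
    # One best token per state: token = (log score, label history).
    tokens = {0: (0.0, [])}
    for obs in observations:
        new_tokens = {}
        for state, (score, hist) in tokens.items():
            for nxt, trans_lp, sym in ARCS[state]:
                s = score + trans_lp + emission_logp(sym, obs)
                if nxt not in new_tokens or s > new_tokens[nxt][0]:
                    new_tokens[nxt] = (s, hist + [sym])
        if not new_tokens:
            return None
        # Beam pruning keeps the active state set small; this, rather
        # than pre-compiling the full network, is what makes a single
        # time-synchronous pass feasible.
        best = max(s for s, _ in new_tokens.values())
        tokens = {st: th for st, th in new_tokens.items()
                  if th[0] >= best - beam}
    return tokens.get(3)  # best token reaching the final state, if any

print(viterbi_beam(["a", "a", "c"]))
```
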

Cited by 78 publications (55 citation statements)
References 14 publications (10 reference statements)

“…Efficient use of language models in speech recognizers [20,19,17] requires that the context dependent states representing different histories during search can be appropriately shared among multiple paths. This applies to both conventional back-off n-gram and feedforward NNLMs.…”
Section: History Context Clustering for RNNLMs
Confidence: 99%
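
To see why this sharing matters, note that under a back-off trigram only the last two words condition the next-word probability, so paths with different full histories but the same truncated context can map to one search state. A minimal sketch (the helper `lm_state` is hypothetical, not from the cited systems):

```python
# Collapsing word histories to shared n-gram states: under a trigram
# model (order 3), the state is just the last two words, so distinct
# paths that agree on those words share one downstream search state.

def lm_state(history, order=3):
    # The state is the last (order - 1) words of the history.
    return tuple(history[-(order - 1):])

paths = [
    ["the", "cat", "sat", "on", "the"],
    ["a", "dog", "lay", "on", "the"],
]

states = {lm_state(h) for h in paths}
print(states)  # both histories collapse to the single state ('on', 'the')
```

Both histories collapse to one state, so the decoder needs only one copy of the downstream search space for them.
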
“…To deal with this, a number of different architectural approaches have evolved. For Viterbi decoding, the search space can either be constrained by maintaining multiple hypotheses in parallel [173,191,192] or it can be expanded dynamically as the search progresses [7,69,130,132]. Alternatively, a completely different approach can be taken where the breadth-first approach of the Viterbi algorithm is replaced by a depth-first search.…”
Section: Decoding and Lattice Generation
Confidence: 99%
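
The contrast between these two families can be sketched briefly. Below is a toy best-first ("stack") decoder over a hypothetical word graph: unlike the time-synchronous Viterbi sketch earlier, partial hypotheses of different lengths compete on a single priority queue, and the search proceeds depth-first along the most promising path. All names here are illustrative assumptions:

```python
import heapq
import math

# Toy word graph: word -> list of (next word, transition log-prob)
GRAPH = {
    "<s>": [("the", math.log(0.9)), ("a", math.log(0.1))],
    "the": [("cat", math.log(0.5)), ("dog", math.log(0.5))],
    "a":   [("cat", math.log(0.4)), ("dog", math.log(0.6))],
    "cat": [("</s>", math.log(1.0))],
    "dog": [("</s>", math.log(1.0))],
}

def stack_decode():
    # heapq is a min-heap, so push negated log scores.
    heap = [(-0.0, ["<s>"])]
    while heap:
        neg_score, hyp = heapq.heappop(heap)
        if hyp[-1] == "</s>":
            # Log-probs are <= 0, so the first complete hypothesis
            # popped is guaranteed to be the best one.
            return -neg_score, hyp
        for nxt, lp in GRAPH.get(hyp[-1], []):
            heapq.heappush(heap, (neg_score - lp, hyp + [nxt]))
    return None

print(stack_decode())
```
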
“…However, for continuous speech recognition systems using higher order language models the linguistic state cannot be determined locally and the word boundaries are uncertain. Several solutions based on creating copies of the PPT for each unique linguistic context solve this problem [8,9,10]; however, these approaches create redundant sub-tree computations, the number of which corresponds to the number of active linguistic contexts. A computation is redundant when a sub-tree instance is dominated by another instance of that sub-tree.…”
Section: Re-entrant vs. Non-Re-entrant Trees
Confidence: 99%
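
A rough illustration of this redundancy, assuming a toy lexicon and hypothetical helper names: with naive per-context copies of a pronunciation prefix tree (PPT), every active linguistic context pays the full cost of the tree again, which is exactly the duplicated sub-tree work the passage describes.

```python
# Build a small pronunciation prefix tree and count how much work
# naive per-context copies duplicate across active LM histories.

LEXICON = {"cat": "k ae t", "cab": "k ae b", "dog": "d ao g"}

def build_ppt(lexicon):
    root = {}
    for word, pron in lexicon.items():
        node = root
        for phone in pron.split():
            node = node.setdefault(phone, {})
        node["#word"] = word  # leaf marker: a word ends here
    return root

def count_nodes(node):
    return 1 + sum(count_nodes(c) for k, c in node.items() if k != "#word")

ppt = build_ppt(LEXICON)
contexts = ["<s> the", "<s> a", "the big"]  # hypothetical active histories
print("nodes per tree copy:", count_nodes(ppt))
print("total with", len(contexts), "copies:",
      len(contexts) * count_nodes(ppt))
# A dominated copy (one whose root score is beaten by another copy of
# the same sub-tree) can never win, so its evaluation is pure waste.
```
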