1995
DOI: 10.1073/pnas.92.22.9956
State of the art in continuous speech recognition.

Abstract: In the past decade, tremendous advances in the state of the art of automatic speech recognition by machine have taken place. A reduction in the word error rate by more than a factor of 5 and an increase in recognition speeds by several orders of magnitude (brought about by a combination of faster recognition search algorithms and more powerful computers) have combined to make high-accuracy, speaker-independent, continuous speech recognition for large vocabularies possible in real time, on off-the-shelf worksta…

Cited by 59 publications (29 citation statements) · References 19 publications
“…In order to make our choice we carried out a rational decision-making process utilizing the Questions, Options and Criteria (QOC) technique [8]. For this purpose we defined the following criteria for every grammatical property: Learnability, whether the grammatical marking in question would be easy to learn; Expected recognition accuracy, the effect the marking would have on the anticipated word error rate, given that a more constrained grammar (lower perplexity) is better for recognition [9]; Vocabulary size, the effect the marking would have on increasing or decreasing the vocabulary size; Expressive Ability of the language, whether the marking would enable speakers to express concepts they otherwise could not; Efficiency, how many words the marking requires to communicate any single meaning; Acknowledgement within Natural and Artificial Languages, the popularity of the particular marking among each type of language. Appropriate weights were assigned to the criteria based on importance, e.g.…”
Section: Grammar Design
confidence: 99%
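The weighted-criteria comparison described in the statement above can be sketched as a simple scoring table. The criterion names follow the quoted passage, but the weights, the per-option scores, and the option names `marking_A`/`marking_B` are invented for illustration; they are not taken from the cited work.

```python
# Sketch of a QOC-style weighted-criteria comparison (illustrative numbers).
criteria = {  # criterion -> assumed importance weight (sums to 1.0)
    "learnability": 0.25,
    "recognition_accuracy": 0.25,
    "vocabulary_size": 0.15,
    "expressive_ability": 0.15,
    "efficiency": 0.10,
    "naturalness": 0.10,
}

# Hypothetical options for one grammatical property, scored 1-5 per criterion.
options = {
    "marking_A": {"learnability": 4, "recognition_accuracy": 5,
                  "vocabulary_size": 3, "expressive_ability": 3,
                  "efficiency": 4, "naturalness": 2},
    "marking_B": {"learnability": 3, "recognition_accuracy": 3,
                  "vocabulary_size": 4, "expressive_ability": 5,
                  "efficiency": 3, "naturalness": 4},
}

def score(option):
    """Weighted sum of the option's per-criterion scores."""
    return sum(criteria[c] * option[c] for c in criteria)

best = max(options, key=lambda name: score(options[name]))
print(best, round(score(options[best]), 2))  # → marking_A 3.75
```

The option with the highest weighted sum is selected; changing the weights shifts which grammatical marking wins, which is the point of making the criteria explicit.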
“…HMMs are used to model the acoustic properties of the recognition tokens, which may be entire words, sub-word units or combinations of words. In the HMM framework, speech is modeled as a two-step probabilistic process (Rabiner and Juang, 1993; Makhoul and Schwartz, 1994). In the first step, speech is modeled as a sequence of acoustic states.…”
Section: The ASR Framework
confidence: 99%
“…HMMs are doubly stochastic because both transition and output probabilities come into play. The Viterbi search algorithm is used to find the sequence of states most likely to generate a given output sequence, without searching all possible sequences (Makhoul and Schwartz 1994). Most phonetic HMMs are 3-state left-to-right models, as in Figure 5, which represent the beginning, middle, and end stages of an utterance.…”
Section: Hidden Markov Models (HMM)
confidence: 99%
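The Viterbi decoding described in the statements above can be sketched for a toy 3-state left-to-right HMM. All probabilities below are illustrative, not taken from the cited work; the left-to-right topology is enforced by the zero entries in the transition matrix, so a state can only repeat or advance.

```python
# Minimal Viterbi decoding sketch for a 3-state left-to-right HMM
# over a discrete observation alphabet (toy numbers for illustration).
import numpy as np

def viterbi(pi, A, B, obs):
    """Return the most likely state sequence for `obs`.
    pi: initial state probs (S,), A: transitions (S,S), B: emissions (S,V)."""
    S, T = len(pi), len(obs)
    delta = np.zeros((T, S))            # best log-prob of a path ending in each state
    psi = np.zeros((T, S), dtype=int)   # back-pointers to the best predecessor
    with np.errstate(divide="ignore"):  # log(0) -> -inf for forbidden transitions
        logpi, logA, logB = np.log(pi), np.log(A), np.log(B)
    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # (S, S): previous -> current state
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta[-1].argmax())]            # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# 3-state left-to-right topology: begin (0), middle (1), end (2).
pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.6, 0.4, 0.0],
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])
B = np.array([[0.8, 0.2],    # state 0 mostly emits symbol 0
              [0.5, 0.5],
              [0.2, 0.8]])   # state 2 mostly emits symbol 1
print(viterbi(pi, A, B, [0, 0, 1, 1]))  # → [0, 1, 2, 2]
```

Because the dynamic program keeps only the best-scoring predecessor per state at each time step, the cost is O(T·S²) rather than the S^T of enumerating all state sequences, which is the point made in the quoted statement.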