8th European Conference on Speech Communication and Technology (Eurospeech 2003) 2003
DOI: 10.21437/eurospeech.2003-266

Automatic induction of n-gram language models from a natural language grammar

Abstract: This paper details our work in developing a technique which can automatically generate class n-gram language models from natural language (NL) grammars in dialogue systems. The procedure eliminates the need for double maintenance of the recognizer language model and NL grammar. The resulting language model adopts the standard class n-gram framework for computational efficiency. Moreover, both the n-gram classes and training sentences are enhanced with semantic/syntactic tags defined in the NL grammar, such tha…
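As a rough illustration of the "standard class n-gram framework" the abstract refers to, the sketch below estimates a class bigram model, where the probability of a word given its predecessor is approximated as P(class | previous class) × P(word | class). It is a minimal sketch under stated assumptions, not the paper's procedure: the paper derives its classes and tags from the NL grammar, whereas here the hypothetical `word_to_class` map is written by hand, the function name `train_class_bigram` is invented for illustration, and the estimates are unsmoothed maximum-likelihood counts.

```python
from collections import Counter


def train_class_bigram(sentences, word_to_class):
    """Return a function prob(prev_word, word) from unsmoothed counts.

    Assumption: word_to_class is a hand-written map standing in for
    classes that the paper obtains from the NL grammar. Words that are
    not in the map act as their own singleton class.
    """
    class_bigrams = Counter()   # counts of (previous class, class)
    class_context = Counter()   # counts of a class used as left context
    class_counts = Counter()    # counts of each class
    word_counts = Counter()     # counts of (word, class)

    for sentence in sentences:
        classes = ["<s>"] + [word_to_class.get(w, w) for w in sentence]
        for prev_c, cur_c in zip(classes, classes[1:]):
            class_bigrams[(prev_c, cur_c)] += 1
            class_context[prev_c] += 1
        for w in sentence:
            c = word_to_class.get(w, w)
            class_counts[c] += 1
            word_counts[(w, c)] += 1

    def prob(prev_word, word):
        prev_c = word_to_class.get(prev_word, prev_word)
        c = word_to_class.get(word, word)
        if class_context[prev_c] == 0 or class_counts[c] == 0:
            return 0.0  # unseen context or class; no smoothing in this sketch
        p_class = class_bigrams[(prev_c, c)] / class_context[prev_c]
        p_word = word_counts[(word, c)] / class_counts[c]
        return p_class * p_word

    return prob


if __name__ == "__main__":
    # Toy training sentences, invented for illustration only.
    toy_sentences = [
        ["fly", "to", "boston"],
        ["fly", "to", "denver"],
        ["fly", "to", "boston"],
    ]
    toy_classes = {"boston": "CITY", "denver": "CITY"}
    p = train_class_bigram(toy_sentences, toy_classes)
    print(p("to", "boston"), p("to", "denver"))
```

Because the two city names share one class, the model needs only a single `to → CITY` transition estimate; a practical recognizer language model would add smoothing for unseen events.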

Cited by 12 publications (1 citation statement)
References: 11 publications
“…The acoustic models for the speech recognition system, SUMMIT, are trained with a combination of two data corpora, "Yinhe" [8] and "MAT2000" [9], both of which contain Mandarin Chinese speech data from native speakers. The class n-gram language model is trained [10,11] by parsing a corpus using TINA [12]. Since we do not yet have a training corpus from users, we make use of the English corpus from CityBrowser I.…”
Section: Speech Recognition and Synthesis (citation type: mentioning)
confidence: 99%