2006
DOI: 10.1109/tsa.2005.853208

Prosody dependent speech recognition on radio news corpus of American English

Abstract: Does prosody help word recognition? This paper proposes a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that reduces word error rates (WER) relative to a prosody-independent recognizer with comparable parameter count. In the proposed prosody-dependent speech recognizer, word and phoneme models are conditioned on two important prosodic variables: the intonational phrase boundary and the pitch accent. An information-theoretic analysis is provided to show that …

Citations: cited by 38 publications (32 citation statements)
References: 23 publications (39 reference statements)
“…The word to be predicted is more likely to be "witch" instead of "which" if an accent is predicted from the current word-prosody context. In the results reported by (Chen et al., 2006), a prosody-dependent language model can significantly improve word recognition accuracy over a prosody-independent language model, given the same acoustic model. N-gram models can be conveniently used for prosody-dependent language modeling.…”
Section: [ŵ] = Arg Max P(o | Wp) P(wp) = Arg Max P(o | Qh) P(qh…
Citation type: mentioning
confidence: 97%
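
The n-gram idea in the statement above can be illustrated with a minimal sketch: a bigram model whose vocabulary consists of prosody-tagged word tokens. Everything here is an assumption made for illustration and is not taken from Chen et al. (2006): the `word_a` / `word_u` token spelling, the made-up toy corpus, the add-one smoothing, and the function names `train_bigram` and `bigram_prob`.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over prosody-tagged tokens (hypothetical format)."""
    history_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence          # sentence-start marker
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return history_counts, bigram_counts, vocab

def bigram_prob(prev, cur, history_counts, bigram_counts, vocab):
    """Add-one smoothed P(cur | prev); the smoothing choice is an assumption."""
    return (bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + len(vocab))

# Invented toy corpus: suffix _a = pitch-accented, _u = unaccented.
TOY_CORPUS = [
    ["the_u", "witch_a", "laughed_a"],
    ["the_u", "witch_a", "vanished_a"],
    ["the_u", "book_a", "which_u", "vanished_a"],
]

model = train_bigram(TOY_CORPUS)
p_witch = bigram_prob("the_u", "witch_a", *model)
p_which = bigram_prob("the_u", "which_a", *model)
print(p_witch > p_which)
```

Because the accent tag is part of the token, an accented slot after "the_u" favors "witch_a" over an accented "which" in this toy data, which mirrors the example given in the citation statement.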
“…In (Chen et al., 2006), the prosody variable p_m takes 8 possible values composed by 2 discrete prosodic variables: a variable a that marks a word as either "a" (pitch-accented) or "u" (pitch-unaccented), and a variable b that marks a word as "i, m, f, o" (phrase-initial, phrase-medial, phrase-final, one-word phrase) according to its position in an intonational phrase. Thus, in this scheme, a prosody-dependent word transcription may contain prosody-dependent word tokens of the form w_ab.…”
Section: [ŵ] = Arg Max P(o | Wp) P(wp) = Arg Max P(o | Qh) P(qh…
Citation type: mentioning
confidence: 99%
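
The labeling scheme described in the statement above (2 accent values times 4 phrase positions, giving 8 prosody values) can be sketched as follows. The underscore-joined token spelling and the names `prosody_labels` and `tag_word` are hypothetical choices for this illustration; the statement only specifies tokens "of the form w_ab".

```python
from itertools import product

ACCENT = ("a", "u")               # pitch-accented, pitch-unaccented
POSITION = ("i", "m", "f", "o")   # phrase-initial, -medial, -final, one-word phrase

def prosody_labels():
    """Return all combined prosody labels: accent value followed by position value."""
    return [a + b for a, b in product(ACCENT, POSITION)]

def tag_word(word, accent, position):
    """Build a prosody-dependent token; the 'word_ab' spelling is an assumption."""
    if accent not in ACCENT:
        raise ValueError(f"unknown accent value: {accent!r}")
    if position not in POSITION:
        raise ValueError(f"unknown position value: {position!r}")
    return f"{word}_{accent}{position}"

print(len(prosody_labels()))
print(tag_word("witch", "a", "f"))
```

Under this scheme each word in the base vocabulary expands into up to eight prosody-dependent tokens, so the tagged vocabulary is at most eight times larger than the untagged one.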