2007
DOI: 10.1109/tasl.2007.905178

Robust Speech Rate Estimation for Spontaneous Speech

Abstract: In this paper, we propose a direct method for speech rate estimation from acoustic features without requiring any automatic speech transcription. We compare various spectral and temporal signal analysis and smoothing strategies to better characterize the underlying syllable structure to derive speech rate. The proposed algorithm extends the methods of spectral subband correlation by including temporal correlation and the use of prominent spectral subbands for improving the signal correlation essential for syll…

Cited by 93 publications (104 citation statements)
References 34 publications
“…This method can produce spurious peaks in non-speech and unvoiced regions, so a pitch detector is applied to the waveform and all peaks corresponding to unvoiced and non-speech segments are removed. In [4], it was reported that this method correctly placed nuclei in 80.6% of the syllables in a hand transcribed test set. In [3,4], peaks that fall below a minimum threshold are rejected and the result is a binary feature.…”
Section: Use Of Estimated Syllable Nuclei (mentioning)
confidence: 98%
“…In [4], it was reported that this method correctly placed nuclei in 80.6% of the syllables in a hand transcribed test set. In [3,4], peaks that fall below a minimum threshold are rejected and the result is a binary feature. For our experiments we do not make a hard decision, instead we retain all the maxima points and use the actual height value as a feature.…”
Section: Use Of Estimated Syllable Nuclei (mentioning)
confidence: 98%
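
The post-processing described in the citation statements above (drop peaks in unvoiced or non-speech frames, then either threshold the remaining peaks into a binary feature or keep their heights) can be illustrated with a minimal sketch. It is not the paper's implementation: the per-frame `envelope` stands in for whatever correlation envelope the method produces, the `voiced` mask stands in for the output of a pitch detector, and the function names and `threshold` value are hypothetical choices made for illustration only.

```python
from typing import List, Sequence, Tuple


def local_maxima(envelope: Sequence[float]) -> List[int]:
    """Return indices of interior local maxima of a 1-D envelope."""
    return [
        i
        for i in range(1, len(envelope) - 1)
        if envelope[i - 1] < envelope[i] >= envelope[i + 1]
    ]


def nucleus_features(
    envelope: Sequence[float],
    voiced: Sequence[bool],
    threshold: float,
) -> Tuple[List[int], List[float]]:
    """Build two per-frame features from envelope peaks.

    binary: 1 where a voiced peak reaches `threshold` (hard decision).
    soft:   peak height at every voiced peak, 0.0 elsewhere (no hard decision).
    Both `voiced` and `threshold` are assumed inputs, not values from the paper.
    """
    binary = [0] * len(envelope)
    soft = [0.0] * len(envelope)
    for i in local_maxima(envelope):
        if not voiced[i]:
            continue  # discard peaks in unvoiced / non-speech frames
        soft[i] = envelope[i]
        if envelope[i] >= threshold:
            binary[i] = 1
    return binary, soft


if __name__ == "__main__":
    # Toy data: four peaks, the last one falls in an unvoiced region.
    env = [0.0, 0.5, 0.1, 0.9, 0.2, 0.3, 0.1, 0.7, 0.0]
    voi = [True, True, True, True, True, True, True, False, False]
    hard, soft = nucleus_features(env, voi, threshold=0.4)
    print(hard)
    print(soft)
```

The binary variant loses the weak voiced peak at frame 5, while the soft variant keeps it with its height, which is the distinction the citing authors draw when they say they avoid a hard decision.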