Acoustic analysis and recognition of whispered speech

Itoh, Takeda, Itakura

doi:10.1109/icassp.2002.1005758

Cited by 12 publications

(10 citation statements)

References 0 publications

“…Thus, for in vitro measurements, the first resonance frequency of a cylindrical tube tends to increase with glottal aperture and open quotient (Barney et al, 2007). As for in vivo measurements, first resonances have been shown to fall at higher frequencies for whispering than for normal speech, for the same vocal tract shape (the same vowel articulation) Emanuel, 1984a, 1984b; Matsuda and Kasuya, 1999; Itoh et al, 2002; Swerdlin et al, 2008). This can be explained by the larger aperture of the glottis in that mode of phonation (Solomon et al, 1989).…”

Section: Effect Of the Glottis Aperture On Vocal Tract Resonances (mentioning)

confidence: 80%
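
The direction of the effect described in the statement above can be illustrated with the two idealized limits of a uniform tube: closed at the glottis end (quarter-wavelength resonances) and open at the glottis end (half-wavelength resonances). This is only a rough sketch. A real glottis is neither fully closed nor fully open, and the tube length (17.5 cm), speed of sound (350 m/s), and function name are illustrative assumptions, not values from the cited papers.

```python
def tube_resonances(length_m, n_modes=3, speed_of_sound=350.0, closed_at_glottis=True):
    """Resonance frequencies (Hz) of an idealized lossless uniform tube.

    The lip end is always treated as open. The glottis end is either
    ideally closed or ideally open; both are limiting cases.
    """
    c, L = speed_of_sound, length_m
    if closed_at_glottis:
        # Closed-open tube: odd multiples of c / (4L).
        return [(2 * n - 1) * c / (4 * L) for n in range(1, n_modes + 1)]
    # Open-open tube: integer multiples of c / (2L).
    return [n * c / (2 * L) for n in range(1, n_modes + 1)]


# Illustrative comparison for an assumed 17.5 cm tract.
closed = tube_resonances(0.175, closed_at_glottis=True)
opened = tube_resonances(0.175, closed_at_glottis=False)
print("first resonance, closed glottis limit:", closed[0])
print("first resonance, open glottis limit:  ", opened[0])
```

The first resonance of the open-glottis limit lies above that of the closed-glottis limit, which matches the direction of the shift reported for whispering; the size of the shift in real speech is much smaller than the gap between these two extremes.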

“…However, as with the normal voice, the source function is still not known, so the frequencies of the formants may not coincide precisely with those of the resonance. The main drawbacks of these methods are that glottal aperture may be larger for these modes of phonation than for normal speech (discussed later), giving rise to an increase of the first vocal tract resonance frequencies for a similar tract configuration (Matsuda and Kasuya, 1999; Itoh et al, 2002), and that articulation may change from normal to whispered or creak phonations Emanuel, 1984a, 1984b), changing the resonance characteristics as well.…”

Section: Output Sound When Excited At the Glottis (mentioning)

confidence: 99%

“…This area is larger than that used in normal speech, but small compared to that used in breathing. Airflow through it is turbulent, and has a mix of all frequencies (Itoh et al, 2002; Matsuda and Kasuya, 1999). The creak voice or vocal fry has an oscillating glottis, but the variations are not periodic (Hollien and Michel, 1968).…”

Section: Output Sound When Excited At the Glottis (mentioning)

confidence: 99%

Vocal tract resonances in speech, singing, and playing musical instruments

2009

“…Former studies show that without fundamental frequency, formant estimation becomes prominent in its analysis and recognition [2,3]. In real-world environments where background noise is present, the signal-to-noise (SNR) of whispered speech is lower [4].…”

Section: Introduction (mentioning)

confidence: 99%

An Algorithm for Formant Estimation of Whispered Speech

Chenghui, Zhao, Lü et al. 2006

2006 8th International Conference on Signal Processing

Whispered speech, which always has a low SNR, makes formant estimation more difficult. This paper proposes an algorithm for formant estimation of whispered speech. It includes three subroutines: calculation of the autocorrelation function (ACF) of whispered speech, segmentation of the ACF spectrum to get the coefficients of the inverse-filter, inverse-filter control (IFC) of the ACF spectrum and calculation of the formant frequencies. The tests are carried out on Chinese whispered vowels, and the proposed algorithm proves to be efficient. In low SNR especially, it is superior to conventional methods for formant estimation of whispered speech.
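
For orientation, the kind of conventional baseline this abstract compares against can be sketched as autocorrelation-method linear prediction followed by root-finding. The sketch below is not the paper's inverse-filter control (IFC) algorithm. It is a generic LPC formant estimator, and all function names, the model order, the thresholds, and the noise-excited test signal are assumptions chosen for illustration.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """LPC polynomial [1, a1, ..., ap] via autocorrelation + Levinson-Durbin."""
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order.
    r = np.correlate(windowed, windowed, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def estimate_formants(frame, fs, order, min_freq=90.0, max_bandwidth=400.0):
    """Formant candidates (Hz) from the angles of the LPC pole pairs."""
    roots = np.roots(lpc_coefficients(frame, order))
    roots = roots[np.imag(roots) > 0]  # one pole from each conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    keep = (freqs > min_freq) & (bandwidths < max_bandwidth)
    return np.sort(freqs[keep])


def synthesize_noise_excited(formants, bandwidths, fs, n_samples, seed=0):
    """Hypothetical test signal: white noise through an all-pole filter.

    Noise excitation stands in for the turbulent, unvoiced source of a whisper.
    """
    poles = []
    for f, bw in zip(formants, bandwidths):
        radius = np.exp(-np.pi * bw / fs)
        pole = radius * np.exp(2j * np.pi * f / fs)
        poles.extend([pole, np.conj(pole)])
    a = np.real(np.poly(poles))
    x = np.random.default_rng(seed).standard_normal(n_samples)
    y = np.zeros(n_samples)
    for t in range(n_samples):
        past = sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y[t] = x[t] - past
    return y


if __name__ == "__main__":
    fs = 10000
    signal = synthesize_noise_excited([700.0, 1200.0, 2600.0], [80.0] * 3, fs, 4096)
    print(estimate_formants(signal, fs, order=6))
```

On clean synthetic input this baseline tends to recover the resonance frequencies it was built from; in low SNR the pole estimates degrade, which is the regime the abstract says the proposed algorithm targets.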

“…However, there are few researches on whispered speech. Reference [3] recognizes Japanese whispered speech by using Mel-frequency cepstral coefficient (MFCC) feature and Hidden Markov Models (HMM). It finally has a recognition rate of 68% which can be increased by 10% with maximum likelihood linear regression (MLLR) adaptive training approach.…”

Section: Introduction (mentioning)

confidence: 99%

A Primary Research on Gabor Tensor Sparse Features Representation for Whispered Speech Recognition

Chen, Zhao, Yu et al. 2015

Proceedings of the 2015 International Conference on Electrical, Automation and Mechanical Engineering

Due to differences between normal and whispered speech, traditional features perform poorly for whispered speech recognition. In this paper, a novel approach for whispered speech feature representation is proposed based on Gabor filtering and tensor factorization. The sparse feature is extracted by processing the data samples in tensor structure. The simulation results indicate that our proposed feature is able to improve the whispered speech recognition performance.
