1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258) 1999
DOI: 10.1109/icassp.1999.758138

On the limits of speech recognition in noise

Abstract: In this article, we consider the performance of speech recognition in noise and focus on its sensitivity to the acoustic feature set. In particular, we examine the perceived information reduction imposed on a speech signal using a feature extraction method commonly used for automatic speech recognition. We observe that the human recognition rates on noisy digit strings drop considerably as the speech signal undergoes the typical loss of phase and loss of frequency resolution. Steps are taken to ensure that hum…

Citations: cited by 19 publications (14 citation statements)
References: 2 publications
“…Experiments conducted by Peters et al (1999) demonstrate that these conclusions are not correct in case of noisy speech recordings. He suggests that information lost by the conventional acoustic analysis (phase and fine spectral resolution) may become crucial for intelligibility in case of speech distortions (reverberation, environment noise, etc.).…”
Section: Specific Methodologies (mentioning)
confidence: 94%
“…It has been suggested (Demuynck et al, 2004; Leonard, 1984; Peters et al, 1999) that conventional cepstral representation of speech may destroy important information by ignoring the phase (power spectrum estimation) and reducing the spectral resolution (Mel filter bank, LPC, cepstral liftering, etc.).…”
Section: Specific Methodologies (mentioning)
confidence: 99%
“…To account for this uncertainty in the probability that communication actually occurs we weight the SE values throughout the area by a probability-of-recognition term, PR; where PR = 0.5 at SE = 0 dB, and PR = 1.0 at SE = 18 dB. The upper value of 18 dB is assumed based on analogy to recognition thresholds in human speech (Pearsons et al 1977, Tafalla & Evans 1997, Peters et al 1999). We refer to this area weighted by a recognition function as a sender communication space.…”
Section: Communication Masking Terms and Algorithm (mentioning)
confidence: 99%
“…The recognition accuracy based on the majority of three listeners was 99.9%, indicating that signals resynthesized from the spectral envelope of short-time fragments of speech are sufficient in acoustically optimal conditions. Peters et al (1999) carried out a comparison of HSR and ASR recognition performance with unaltered and resynthesized speech. Feature vectors calculated from noisy digits were converted to audible signals based on an analytical processing scheme.…”
Section: Introduction (mentioning)
confidence: 99%