Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audio-coding performance.

auditory analysis | cochlear implant | neural code | phase | scene analysis

Acoustic cues in speech sounds allow a listener to derive not only the meaning of an utterance but also the speaker's identity and emotion. Most traditional research has taken a reductionist approach, investigating the minimal cues required for speech recognition (1). Previous studies using either naturally produced whispered speech (2) or artificially synthesized speech (3, 4) have isolated and identified several important acoustic cues for speech recognition. For example, computers relying primarily on spectral cues and human cochlear-implant listeners relying primarily on temporal cues can achieve a high level of speech recognition in quiet (5-7). As a result, spectral and temporal acoustic cues have been interpreted as built-in redundancy mechanisms in speech recognition (8). However, this redundancy interpretation is challenged by the extremely poor performance of both computers and human cochlear-implant users in realistic listening situations, where noise is typically present (7, 9).

The goal of this study was to delineate the relative contributions of spectral and temporal cues to speech recognition in realistic listening situations. We chose three speech perception tasks that are notoriously difficult for computers and human cochlear-implant users: speech recognition with a competing voice, speaker recognition, and Mandarin tone recognition. We approached the issue by extracting slowly varying amplitude modulation (AM) and frequency modulation (FM) from a number of frequency bands in speech sounds and testing their relative contributions to speech recognition in acoustic and electric hearing. AM-only speech has been used in previous studies (3, 10) and is considered an acoustic simulation of the cochlear implant (5). Different from previous studies using relatively "fast" FM to track formant changes in speech production (4, 11) or fine structure in speech acoustics (12, 13), the "...
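To make the band-wise AM/FM decomposition concrete, the sketch below shows one common way to derive slowly varying AM and FM from a single spectral band using the analytic (Hilbert) signal. This is an illustrative assumption rather than the authors' exact algorithm: the function name, band edges, and the am_cutoff and fm_cutoff values are hypothetical choices, not parameters taken from the paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def extract_am_fm(x, fs, band, am_cutoff=64.0, fm_cutoff=400.0):
    """Derive slowly varying AM and FM from one spectral band of signal x.

    Illustrative sketch only: band-pass filter, form the analytic signal,
    then low-pass the envelope (AM) and the center-frequency-removed
    instantaneous frequency (FM). Cutoff values are assumptions, not the
    paper's parameters.
    """
    lo, hi = band
    # Band-pass filter to isolate one analysis band.
    sos_bp = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    x_band = sosfiltfilt(sos_bp, x)

    analytic = hilbert(x_band)
    envelope = np.abs(analytic)                    # instantaneous amplitude
    phase = np.unwrap(np.angle(analytic))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency, Hz
    inst_freq = np.concatenate([inst_freq[:1], inst_freq])  # restore length

    # Low-pass each signal so only slow modulations remain.
    sos_am = butter(2, am_cutoff, btype="lowpass", fs=fs, output="sos")
    am = sosfiltfilt(sos_am, envelope)

    fc = 0.5 * (lo + hi)                           # nominal band center, Hz
    sos_fm = butter(2, fm_cutoff, btype="lowpass", fs=fs, output="sos")
    fm = sosfiltfilt(sos_fm, inst_freq - fc)       # slow deviation from center

    return am, fm
```

In a full AM+FM stimulus synthesis, each band's slow FM would frequency-modulate a carrier at the band center and the result would be scaled by that band's AM before summing across bands; that resynthesis step is omitted from this sketch.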
The present result suggests that cochlear-implant (CI) users can rely on either temporal or spectral cues to perform tone recognition in quiet but need both cues for tone recognition in noise. Future CI processors need to extract and encode both types of acoustic cue to achieve better performance in tone perception and production.