Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions that support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audio-coding performance.

auditory analysis · cochlear implant · neural code · phase · scene analysis

Acoustic cues in speech sounds allow a listener to derive not only the meaning of an utterance but also the speaker's identity and emotion. Most traditional research has taken a reductionist approach, investigating the minimal cues needed for speech recognition (1). Previous studies using either naturally produced whispered speech (2) or artificially synthesized speech (3, 4) have isolated and identified several important acoustic cues for speech recognition. For example, computers relying primarily on spectral cues and human cochlear-implant listeners relying primarily on temporal cues can achieve a high level of speech recognition in quiet (5-7). As a result, spectral and temporal acoustic cues have been interpreted as built-in redundancy mechanisms in speech recognition (8).
However, this redundancy interpretation is challenged by the extremely poor performance of both computers and human cochlear-implant users in realistic listening situations, where noise is typically present (7, 9). The goal of this study was to delineate the relative contributions of spectral and temporal cues to speech recognition in realistic listening situations. We chose three speech perception tasks that are notoriously difficult for computers and human cochlear-implant users: speech recognition with a competing voice, speaker recognition, and Mandarin tone recognition. We approached the issue by extracting slowly varying amplitude modulation (AM) and frequency modulation (FM) from a number of frequency bands in speech sounds and testing their relative contributions to speech recognition in acoustic and electric hearing. AM-only speech has been used in previous studies (3, 10) and is considered an acoustic simulation of the cochlear implant (5). In contrast to previous studies using relatively "fast" FM to track formant changes in speech production (4, 11) or fine structure in speech acoustics (12, 13), the "...
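The band-wise AM/FM decomposition described above can be sketched in a few lines. This is a minimal illustrative implementation, not the authors' exact processing chain: it assumes the common approach of band-pass filtering followed by a Hilbert-transform split into a slowly varying envelope (AM) and a low-pass-filtered instantaneous-frequency track (FM). The filter orders, band edges, and the 50 Hz modulation cutoff are assumptions for illustration only.

```python
# Hypothetical sketch of slow AM/FM extraction from one frequency band.
# All parameter values (filter order, band edges, modulation cutoff) are
# illustrative assumptions, not the values used in the study.
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def extract_am_fm(x, fs, band=(500.0, 1000.0), mod_cutoff=50.0):
    """Return slowly varying AM (envelope) and FM (frequency deviation
    from the band centre, in Hz) for one analysis band of signal x."""
    # 1. Band-pass filter to isolate one analysis band.
    sos_bp = butter(4, band, btype="bandpass", fs=fs, output="sos")
    sub = sosfiltfilt(sos_bp, x)
    # 2. Analytic signal via the Hilbert transform.
    analytic = hilbert(sub)
    # 3. AM = magnitude envelope; FM = derivative of the unwrapped
    #    phase, expressed as deviation from the geometric band centre.
    am = np.abs(analytic)
    inst_freq = np.gradient(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)
    fm = inst_freq - np.sqrt(band[0] * band[1])
    # 4. Low-pass both tracks so only slow modulations remain.
    sos_lp = butter(4, mod_cutoff, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos_lp, am), sosfiltfilt(sos_lp, fm)
```

In a multi-band simulation, the AM track would modulate a band-limited carrier per band (AM-only condition), with the FM track additionally steering the carrier's instantaneous frequency (AM+FM condition).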
The present results suggest that CI users can rely on either temporal or spectral cues to perform tone recognition in quiet, but need both cues for tone recognition in noise. Future CI processors need to extract and encode both of these acoustic cues to achieve better performance in tone perception and production.
The benefits of bimodal hearing (a cochlear implant and a hearing aid in opposite ears) in children are well documented in English-speaking populations (Ching et al., 2000; Holt et al., 2005), but little evidence has been reported from populations using tonal languages. The lexical tones of tonal languages carry heavy semantic and grammatical loads and are represented essentially by the fundamental frequency (F0) and low-order harmonics of the speech signal. This linguistic feature suggests that tonal-language-speaking CI recipients may gain greater bimodal benefits than their non-tonal-language peers. Twenty Mandarin-speaking children using the Nucleus 24 cochlear implant system and a hearing aid on the non-implant ear were assigned to one of two groups to investigate the head-shadow and binaural-redundancy effects. A computerized speech test, MAPPID-N (Yuen et al., 2007), presented Mandarin lexical tones in monosyllabic and disyllabic words with four- and eight-alternative forced-choice picture-identification tasks, respectively. An individualized signal-to-noise ratio, chosen to keep speech scores in the 30%-70% range, was fixed throughout the CI-alone and bimodal experimental conditions. Hearing aid fitting was optimized before the first test phase, which was followed three months later by a second test phase. Significant head-shadow but not binaural-redundancy benefits were observed, suggesting that this group of Mandarin-speaking paediatric CI recipients had not yet developed the central binaural processing abilities needed to improve speech recognition in the bimodal condition when speech and noise are mixed. No subject showed any degradation of performance in the bimodal versus the CI-only condition. This may be the first study to demonstrate bimodal benefits, particularly in lexical tone perception, in paediatric CI recipients speaking a tonal language.
Hearing aid amplification for the non-implant ear should be standard practice for the paediatric tonal-language CI population.
Background: The GJB2 gene, mapping to chromosome 13q12, encodes the gap junction protein connexin 26 and is responsible for certain forms of congenital deafness, such as DFNB1 and DFNA3. Mutations of this gene account for about half of severe autosomal recessive non-syndromic deafness. Methods: To determine whether GJB2 mutations are a major cause of deafness in Chinese cochlear implant recipients, we enrolled 115 cochlear implant recipients for mutation screening. Results: GJB2 mutations were found in 36.5% (42/115) of all cochlear implant recipients and 41% (41/100) of non-syndromic deafness patients; only one patient with an inner ear malformation carried GJB2 mutations. The study identified 11 different variations in the GJB2 gene. Conclusion: The 235delC mutation was the most prevalent, found in 18.3% (42/230 alleles) of all cochlear implant recipients and 21.0% (42/200 alleles) of the non-syndromic deafness group. Only 0.6% of GJB2 mutations were detected in the inner ear malformation group. The novel 187G→T mutation is likely to be pathogenic.
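The patient-level rates and allele-level frequencies quoted above follow directly from the reported counts, since each of the 115 recipients contributes two GJB2 alleles (230 alleles overall; 200 in the non-syndromic group). A minimal check of that arithmetic, with illustrative variable names:

```python
# Recomputing the percentages reported in the abstract from raw counts.
# Variable names are illustrative; counts are taken from the abstract.
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# Patient-level GJB2 mutation detection rates
rate_all_recipients = pct(42, 115)   # all CI recipients
rate_nonsyndromic   = pct(41, 100)   # non-syndromic deafness patients

# Allele-level 235delC frequencies (two alleles per patient)
freq_235delC_all = pct(42, 2 * 115)  # 42/230 alleles
freq_235delC_ns  = pct(42, 2 * 100)  # 42/200 alleles
```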