In a previous study on plosives, the 3-Dimensional Deep Search (3DDS) method for the exploration of the necessary and sufficient cues for speech perception was introduced [Li et al. (2010). J. Acoust. Soc. Am. 127(4), 2599-2610]. Here, this method is used to isolate the spectral cue regions for perception of the American English fricatives /ʃ, ʒ, s, z, f, v, θ, ð/ in time, frequency, and intensity. The fricatives are analyzed in the context of consonant-vowel utterances, using the vowel /ɑ/. The necessary cues were found to be contained in the frication noise for /ʃ, ʒ, s, z, f, v/. 3DDS analysis isolated the cue regions of /s, z/ between 3.6 and 8 kHz and /ʃ, ʒ/ between 1.4 and 4.2 kHz. Some utterances were found to contain acoustic components that were unnecessary for correct perception, but caused listeners to hear non-target consonants when the primary cue region was removed; such acoustic components are labeled "conflicting cue regions." The amplitude modulation of the high-frequency frication region by the fundamental frequency (F0) was found to be a sufficient cue for voicing. Overall, the 3DDS method allows one to analyze the effects of natural speech components without initial assumptions about where perceptual cues lie in time-frequency space or which elements of production they correspond to.
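The frequency dimension of a 3DDS-style analysis amounts to checking which spectral band a consonant's cue occupies. As a minimal, hypothetical sketch (not the actual 3DDS procedure, which systematically truncates natural utterances in time, frequency, and intensity), the band isolation can be illustrated with an ideal FFT band-pass over the reported /s, z/ region of 3.6-8 kHz; the sampling rate and test tones below are assumptions for the example.

```python
import numpy as np

FS = 22050  # assumed sampling rate in Hz, chosen so the 8 kHz band edge is below Nyquist

def isolate_band(signal, lo_hz, hi_hz, fs=FS):
    """Zero out all spectral components outside [lo_hz, hi_hz] (ideal band-pass)."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    spec[(freqs < lo_hz) | (freqs > hi_hz)] = 0.0
    return np.fft.irfft(spec, n=len(signal))

# Synthetic check: a 5 kHz tone (inside the /s, z/ cue region) survives,
# while a 1 kHz tone (outside it) is removed.
t = np.arange(0, 0.5, 1.0 / FS)
inside = np.sin(2 * np.pi * 5000 * t)
outside = np.sin(2 * np.pi * 1000 * t)
y_in = isolate_band(inside, 3600, 8000)
y_out = isolate_band(outside, 3600, 8000)
print(np.std(y_in) / np.std(inside))    # ~1.0: in-band energy preserved
print(np.std(y_out) / np.std(outside))  # ~0.0: out-of-band energy removed
```

In an experiment, listeners would identify the filtered tokens; a band whose removal destroys identification marks the necessary cue region.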
This study describes procedures for constructing equal-loudness contours (ELCs) in units of phons from categorical loudness scaling (CLS) data and characterizes the impact of hearing loss on these estimates of loudness. Additionally, this study developed a metric, level-dependent loudness loss, which uses CLS data to specify the deviation from normal loudness perception at various loudness levels and as a function of frequency for an individual listener with hearing loss. CLS measurements were made in 87 participants with hearing loss and 61 participants with normal hearing. An assessment of the reliability of CLS measurements, conducted on a subset of the data, showed that the measurements were reliable. There was a systematic increase in the slope of the low-level segment of the CLS functions with increasing degree of hearing loss. ELCs derived from CLS measurements were similar to standardized ELCs (International Organization for Standardization, ISO 226:2003). The presence of hearing loss decreased the vertical spacing of the ELCs, reflecting loudness recruitment and reduced cochlear compression. Representing CLS data in phons may lead to wider acceptance of CLS measurements. Like the audiogram that specifies hearing loss at threshold, level-dependent loudness loss describes deficit for suprathreshold sounds. Such information may have implications for the fitting of hearing aids.
Of increasing importance in the civilian and military population is the recognition of major depressive disorder at its earliest stages and intervention before the onset of severe symptoms. Toward the goal of more effective monitoring of depression severity, we introduce vocal biomarkers that are derived automatically from phonologically based measures of speech rate. To assess our measures, we use a 35-speaker free-response speech database of subjects treated for depression over a 6-week duration. We find that dissecting average measures of speech rate into phone-specific characteristics and, in particular, combined phone-duration measures uncovers stronger relationships between speech rate and depression severity than global measures previously reported for a speech-rate biomarker. Results of this study are supported by correlation of our measures with depression severity and classification of depression state with these vocal measures. Our approach provides a general framework for analyzing individual symptom categories through phonological units, and supports the premise that speaking rate can be an indicator of psychomotor retardation severity.
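The analysis pattern described above, relating per-phone duration statistics to a clinical severity score, can be sketched in a few lines. All data below are invented for illustration (the phone durations, number of speakers, and severity scores are not from the study), and a simple Pearson correlation stands in for the paper's statistical analysis.

```python
import numpy as np

# Hypothetical data: per-speaker phone durations (seconds) for one phone class,
# e.g. from forced alignment of free-response speech, plus a depression-severity
# score for each speaker. Values are illustrative only.
phone_durations = [
    [0.08, 0.09, 0.07],  # speaker 1
    [0.10, 0.11, 0.09],  # speaker 2
    [0.12, 0.12, 0.13],  # speaker 3
    [0.14, 0.13, 0.15],  # speaker 4
    [0.16, 0.17, 0.15],  # speaker 5
]
severity = np.array([5, 9, 14, 18, 23])  # clinical severity scores (hypothetical)

# Phone-specific speech-rate feature: mean duration of this phone per speaker.
mean_dur = np.array([np.mean(d) for d in phone_durations])

# Strength of the relationship between the phone-level feature and severity.
r = np.corrcoef(mean_dur, severity)[0, 1]
print(round(r, 3))
```

In this framework, a global speech-rate measure would pool all phones together; computing the feature per phone class, as above, is what lets phone-specific slowing (e.g., from psychomotor retardation) emerge.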
The consonant recognition of 17 hearing-impaired (HI) ears with sensorineural hearing loss is evaluated for 14 consonants /p, t, k, f, s, ʃ, b, d, g, v, z, ʒ, m, n/ + /ɑ/, under four speech-weighted noise conditions (0, 6, 12 dB SNR, quiet). One male and one female talker were chosen for each consonant, resulting in 28 total consonant-vowel test tokens. For a given consonant, tokens by different talkers were observed to systematically differ, in robustness to noise and/or in the resulting confusion groups. Such within-consonant token differences were observed for over 60% of the tested consonants and all HI ears. Only when HI responses are examined on an individual token basis does one find that the error may be limited to a small subset of tokens with confusion groups that are restricted to fewer than three confusions on average. Averaging different tokens of the same consonant can raise the entropy of a listener's responses (i.e., the size of the confusion group), causing the listener to appear to behave in a less systematic way. Quantifying these token differences provides insight into HI perception of speech under noisy conditions and characterizes each listener's hearing impairment.
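The claim that averaging tokens raises response entropy follows from the concavity of Shannon entropy: the entropy of a mixture of distributions is at least the average of their entropies. The sketch below demonstrates this with two hypothetical confusion distributions for two tokens of the same consonant (the response probabilities are invented for illustration, not data from the study).

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a discrete response distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical confusion distributions for two tokens of the same consonant:
# each token is heard fairly consistently, but confused with a *different*
# small set of consonants (response categories: /t/, /k/, /p/, /d/).
token_a = [0.9, 0.1, 0.0, 0.0]
token_b = [0.0, 0.0, 0.2, 0.8]

h_a, h_b = entropy(token_a), entropy(token_b)
pooled = (np.array(token_a) + np.array(token_b)) / 2  # average over tokens
h_pool = entropy(pooled)
print(h_a, h_b, h_pool)  # pooled entropy exceeds either token's entropy
```

Each token taken alone shows a small, systematic confusion group; the pooled distribution spreads probability over the union of both groups, making the listener look less consistent than they actually are, which is exactly the artifact the token-level analysis avoids.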