An attempt is made to develop a quantitative theory of intensity resolution that is applicable to a wide variety of experiments on discrimination, identification, and scaling. The theory is composed of a Thurstonian decision model, which separates sensitivity from response bias, and an internal-noise model, which separates sensory limitations from memory limitations. It is assumed that the subject has two memory operating modes, a sensory-trace mode and a context-coding mode, and that the use of these two modes is determined by the characteristics of the experiment. In one-interval paradigms, it is assumed that the context-coding mode is used, and the theory relates resolution to the total range of intensities in the stimulus set. In two-interval paradigms, it is assumed that the two modes are combined, and the theory relates resolution to both the total intensity range and the duration between the two intervals. The theory provides, among other things, a new interpretation of the 7 ± 2 phenomenon.
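The Thurstonian separation of sensitivity from response bias described above can be illustrated with the standard signal-detection computation of d′ and the criterion from hit and false-alarm rates. This is a minimal sketch of the general technique, not the paper's full model; the rates used are hypothetical.

```python
# Signal-detection sketch: sensitivity (d') is the difference of the
# inverse-normal transforms of the hit and false-alarm rates, while the
# criterion (c) captures response bias. Example rates are hypothetical.
from statistics import NormalDist

def dprime_and_bias(hit_rate, fa_rate):
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)       # sensitivity, bias-free
    c = -0.5 * (z(hit_rate) + z(fa_rate))    # response bias (criterion)
    return d_prime, c

d, c = dprime_and_bias(0.84, 0.16)
```

With symmetric rates (0.84, 0.16), d′ is about 2 and the criterion is zero, i.e., an unbiased observer.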
Sentences spoken "clearly" are significantly more intelligible than those spoken "conversationally" for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustic properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low-frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
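The two global-level properties identified above can each be reduced to a simple measurement. The following is a hedged sketch of both, run on a synthetic amplitude-modulated tone rather than real speech; the function names and frame length are illustrative, not the analysis procedure used in the study.

```python
# Sketch of the two global-level measures named above, on a synthetic
# signal: (1) fraction of long-term spectral energy in the 1000-3000 Hz
# band, and (2) modulation depth of the intensity envelope, computed
# here as (max - min) / (max + min) of short-frame RMS values.
import numpy as np

fs = 16000
t = np.arange(fs) / fs  # 1 s of signal

# Synthetic "speech": a 2 kHz carrier, amplitude-modulated at 4 Hz.
envelope = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)
signal = envelope * np.sin(2 * np.pi * 2000 * t)

def band_energy_fraction(x, fs, lo=1000.0, hi=3000.0):
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    band = (freqs >= lo) & (freqs <= hi)
    return spectrum[band].sum() / spectrum.sum()

def modulation_depth(x, fs, frame=160):
    # Intensity envelope from 10-ms frame RMS (160 samples at 16 kHz).
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    env = np.sqrt((frames ** 2).mean(axis=1))
    return (env.max() - env.min()) / (env.max() + env.min())
```

For this test signal, essentially all spectral energy lies in the 1000-3000 Hz band, and the measured modulation depth comes out close to the 0.8 modulation index used to build the envelope.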
This paper reports the results of a series of experiments on tone pulses designed to test certain predictions of the preliminary theory of intensity resolution (Durlach and Braida, 1969) relevant to one-interval paradigms. Resolution was measured in identification and scaling experiments as a function of the range, number, and distribution of intensities, and the availability of feedback. Some of the results, such as those on the dependence of resolution on range and number of stimuli in absolute identification, support the theory. Other results, however, such as those comparing resolution in identification with resolution in magnitude estimation for a small common range, indicate that the theory is inadequate and needs to be revised.
The effect of articulating clearly on speech intelligibility is analyzed for ten normal-hearing and two hearing-impaired listeners in noisy, reverberant, and combined environments. Clear speech is more intelligible than conversational speech for each listener in every environment. The difference in intelligibility due to speaking style increases as noise and/or reverberation increase. The average difference in intelligibility is 20 percentage points for the normal-hearing listeners and 26 percentage points for the hearing-impaired listeners. Two predictors of intelligibility are used to quantify the environmental degradations: the articulation index (AI) and the speech transmission index (STI). Both are shown to reliably predict performance levels within a speaking style for normal-hearing listeners. The AI is unable to represent the reduction in intelligibility scores due to reverberation for the hearing-impaired listeners. Neither predictor can account for the difference in intelligibility due to speaking style.
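The articulation index mentioned above is, in its simplest form, a weighted sum of per-band signal-to-noise ratios clipped to the range over which SNR affects intelligibility. The sketch below illustrates that general scheme only; the band SNRs and importance weights are hypothetical, not values from this study, and real AI procedures use standardized band-importance functions.

```python
# Illustrative articulation-index-style computation: each band's SNR is
# clipped to [-15, +15] dB, mapped to a [0, 1] contribution, weighted by
# a band-importance weight, and summed. All numbers are hypothetical.
def articulation_index(band_snrs_db, weights):
    ai = 0.0
    for snr, w in zip(band_snrs_db, weights):
        clipped = max(-15.0, min(15.0, snr))   # SNR range that matters
        ai += w * (clipped + 15.0) / 30.0      # per-band contribution
    return ai

# Five equally weighted bands; SNRs in dB.
ai = articulation_index([20, 10, 0, -5, -20], [0.2] * 5)
```

An AI near 1 predicts near-perfect intelligibility, near 0 predicts none; the abstract's point is that no such single number tracks the clear-versus-conversational difference.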
Basic principles of the theory of harmony reflect physiological and anatomical properties of the auditory nervous system and related cognitive systems. This hypothesis is motivated by observations from several different disciplines, including ethnomusicology, developmental psychology, and animal behavior. Over the past several years, we and our colleagues have been investigating the vertical dimension of harmony from the perspective of neurobiology using physiological, psychoacoustic, and neurological methods. Properties of the auditory system that govern harmony perception include (1) the capacity of peripheral auditory neurons to encode temporal regularities in acoustic fine structure and (2) the differential tuning of many neurons throughout the auditory system to a narrow range of frequencies in the audible spectrum. Biologically determined limits on these properties constrain the range of notes used in music throughout the world and the way notes are combined to form intervals and chords in popular Western music. When a harmonic interval is played, neurons throughout the auditory system that are sensitive to one or more frequencies (partials) contained in the interval respond by firing action potentials. For consonant intervals, the fine timing of auditory nerve fiber responses contains strong representations of harmonically related pitches implied by the interval (e.g., Rameau's fundamental bass) in addition to the pitches of notes actually present in the interval. Moreover, all or most of the partials can be resolved by finely tuned neurons throughout the auditory system. By contrast, dissonant intervals evoke auditory nerve fiber activity that does not contain strong representations of constituent notes or related bass notes. Furthermore, many partials are too close together to be resolved. Consequently, they interfere with one another, cause coarse fluctuations in the firing of peripheral and central auditory neurons, and give rise to perception of roughness and dissonance. The effects of auditory cortex lesions on the perception of consonance, pitch, and roughness, combined with a critical reappraisal of published psychoacoustic data on the relationship between consonance and roughness, lead us to conclude that consonance is first and foremost a function of the pitch relationships among notes. Harmony in the vertical dimension is a positive phenomenon, not just a negative phenomenon that depends on the absence of roughness, a view currently held by many psychologists, musicologists, and physiologists.
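The resolvability argument above can be made concrete by counting pairs of partials that fall within roughly one auditory filter bandwidth of each other. The sketch below uses the Glasberg-Moore equivalent rectangular bandwidth (ERB) as the resolution criterion and compares a consonant perfect fifth (3:2) with a dissonant minor second (approximately 16:15); the function, partial count, and criterion are illustrative assumptions, not the authors' analysis.

```python
# Count "unresolved" partial pairs: pairs of partials from two harmonic
# complex tones whose spacing is smaller than one equivalent rectangular
# bandwidth (ERB) at their mean frequency. Coincident partials (spacing
# zero) are excluded, since they fuse rather than interfere.
import itertools

def erb(f):
    # Glasberg & Moore ERB approximation, in Hz.
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def unresolved_pairs(f0a, f0b, n_partials=6):
    partials = [f0a * k for k in range(1, n_partials + 1)] + \
               [f0b * k for k in range(1, n_partials + 1)]
    count = 0
    for f1, f2 in itertools.combinations(sorted(partials), 2):
        spacing = f2 - f1
        if 0 < spacing < erb((f1 + f2) / 2.0):
            count += 1
    return count

fifth = unresolved_pairs(440.0, 660.0)     # perfect fifth, 3:2
minor2 = unresolved_pairs(440.0, 469.33)   # ~minor second, 16:15
```

Under this criterion the minor second yields many more unresolved pairs than the fifth, consistent with the abstract's account of why dissonant intervals evoke roughness.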