Two interleaved melodies, with their tones alternating as ABAB..., can be individually followed and identified if auditory stream segregation takes place. Stream segregation can occur if the tone conditions are favorable, for example, if the tones of the different melodies are in different octaves. Using an interleaved melody identification task, we have measured the extent to which 12 different tone conditions lead to stream segregation. The purpose of the experiment is to discover whether stream segregation is mediated entirely by channeling that is established in the auditory periphery or whether more complicated principles of source grouping are at work. Peripheral channels are defined as either tonotopic (frequency based) or lateral (localized left or right). The data show that peripheral channeling is of paramount importance, suggesting that a set of rather simple rules can predict whether two interleaved melodies will be perceived as segregated or not. The data also reveal a secondary effect of tone duration. Otherwise, in the absence of peripheral channeling, the experiments find little or no stream segregation, even in those cases where individual tones should clearly evoke images of different sources. Additional experiments show that interleaved melody identification is made more difficult by a transposition that maximizes the number of melodic crossings, even though the transposition may place the interleaved melodies in different keys. An appendix develops an elementary mathematics of melodic crossings and contacts.
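To make the stimulus construction concrete, here is a minimal sketch of an ABAB... interleaving of two pure-tone melodies, with the second melody an octave above the first (a condition favoring segregation). The sample rate, tone duration, ramp, and note choices are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

SR = 44100  # sample rate in Hz; an assumption, not from the paper

def tone(freq, dur=0.15, sr=SR):
    """Pure tone with a short raised-cosine ramp to avoid clicks."""
    t = np.arange(int(dur * sr)) / sr
    x = np.sin(2 * np.pi * freq * t)
    ramp = int(0.01 * sr)
    env = np.ones_like(x)
    env[:ramp] = 0.5 * (1 - np.cos(np.pi * np.arange(ramp) / ramp))
    env[-ramp:] = env[:ramp][::-1]  # symmetric offset ramp
    return x * env

def interleave(melody_a, melody_b):
    """ABAB... interleaving of two equal-length note lists (in Hz)."""
    notes = [f for pair in zip(melody_a, melody_b) for f in pair]
    return np.concatenate([tone(f) for f in notes])

# Example: melody B one octave above melody A, a condition that
# favors segregation into two streams via tonotopic channeling.
a = [262, 294, 330, 349]                    # C4 D4 E4 F4
b = [2 * f for f in [330, 294, 262, 294]]   # E5 D5 C5 D5
signal = interleave(a, b)
```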
The smallest detectable interaural time difference (ITD) for sine tones was measured for four human listeners to determine its dependence on tone frequency. At low frequencies, 250-700 Hz, threshold ITDs were approximately inversely proportional to tone frequency. At mid frequencies, 700-1000 Hz, threshold ITDs were smallest. At high frequencies, above 1000 Hz, thresholds increased faster than exponentially with increasing frequency, becoming unmeasurably high just above 1400 Hz. A model for ITD detection began with a biophysically based computational model for a medial superior olive (MSO) neuron that produced robust ITD responses up to 1000 Hz and demonstrated a dramatic reduction in ITD dependence from 1000 to 1500 Hz. Rate-ITD functions from the MSO model became inputs to binaural display models, both place based and rate-difference based. A place-based, centroid model with a rigid internal threshold reproduced almost all features of the human data. A signal-detection version of this model reproduced the high-frequency divergence but badly underestimated low-frequency thresholds. A rate-difference model incorporating fast contralateral inhibition reproduced the major features of the human threshold data except for the divergence. A combined, hybrid model could reproduce all of the threshold data.
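One step implicit in the low-frequency result is worth spelling out: a delay imposed on a sine tone of frequency f produces an interaural phase difference of 2πf times the delay, so a threshold ITD inversely proportional to frequency is equivalent to a constant threshold interaural phase difference:

```latex
\Delta\phi_{\mathrm{thr}} = 2\pi f\,\Delta t_{\mathrm{thr}},
\qquad
\Delta t_{\mathrm{thr}} \propto \frac{1}{f}
\;\Longleftrightarrow\;
\Delta\phi_{\mathrm{thr}} = \mathrm{constant}.
```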
The ability of a listener to detect a mistuned harmonic in an otherwise periodic tone is representative of the capacity to segregate auditory entities on the basis of steady-state signal cues. This ability was studied using a task in which listeners matched the pitch of the mistuned harmonic, in order to find its dependence on mistuned harmonic number, fundamental frequency, signal level, and signal duration. The results considerably augment the data previously obtained from discrimination experiments and from experiments in which listeners counted apparent sources. Although previous work has emphasized the role of spectral resolution in the segregation process, the present work suggests that neural synchrony is an important consideration; our data show that listeners lose the ability to segregate mistuned harmonics at high frequencies, where synchronous neural firing vanishes. The functional form of this loss is insensitive to the spacing of the harmonics. The matching experiment also permits the measurement of the pitches of mistuned harmonics. The data exhibit shifts of a form that argues against models of pitch shifts based entirely upon partial masking.
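As an illustration of the stimulus class, here is a minimal sketch of a periodic complex tone with one mistuned harmonic. The fundamental, number of harmonics, and mistuning percentage are assumptions chosen for illustration, not the conditions of the experiment.

```python
import numpy as np

def mistuned_complex(f0=200.0, n_harmonics=12, mistuned_n=3,
                     mistuning_pct=4.0, dur=0.4, sr=44100):
    """Equal-amplitude harmonic complex with harmonic `mistuned_n`
    shifted in frequency by `mistuning_pct` percent.
    All parameter values here are illustrative assumptions."""
    t = np.arange(int(dur * sr)) / sr
    x = np.zeros_like(t)
    for n in range(1, n_harmonics + 1):
        f = n * f0
        if n == mistuned_n:
            f *= 1.0 + mistuning_pct / 100.0  # mistune this one harmonic
        x += np.sin(2 * np.pi * f * t)
    return x / n_harmonics  # normalize to avoid clipping
```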
Listeners perceive the sounds of the real world to be externalized: the sound images are compact and correctly located in space. The experiments reported in this article attempted to determine the characteristics of signals appearing in the ear canals that are responsible for the perception of externalization. The experiments used headphones to gain experimental control, and they employed a psychophysical method whereby the measurement of externalization was reduced to discrimination. When the headphone signals were synthesized to best resemble real-world signals (the baseline synthesis), listeners could not distinguish between the virtual image created by the headphones and the real source. Externalization was then studied, using both discrimination and listener rating, by systematically modifying the baseline synthesis. It was found that externalization depends on the interaural phases of low-frequency components but not of high-frequency components, with a boundary near 1 kHz. By contrast, interaural level differences in all frequency ranges appear to be about equally important. Other experiments showed that externalization requires realistic spectral profiles in both ears; maintaining only the interaural difference spectrum is inadequate. It was also found that externalization does not depend on dispersion around the head; a frequency-independent optimum interaural time difference proved to be an adequate phase relationship.
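A sketch of the kind of interaural manipulation this implies: flattening the interaural phase in one frequency band of a binaural (left/right) signal pair. The frequency-domain method, the function itself, and the 1-kHz cut are assumptions for illustration, not the paper's synthesis procedure.

```python
import numpy as np

def flatten_interaural_phase(left, right, sr, f_cut=1000.0, band='high'):
    """Give the right ear the left ear's phase in one band, removing
    interaural phase differences there while preserving each ear's
    magnitude spectrum. A hypothetical helper, not the paper's method."""
    L = np.fft.rfft(left)
    R = np.fft.rfft(right)
    freqs = np.fft.rfftfreq(len(left), d=1.0 / sr)
    sel = freqs >= f_cut if band == 'high' else freqs < f_cut
    # Keep the right ear's magnitudes; copy the left ear's phases.
    R[sel] = np.abs(R[sel]) * np.exp(1j * np.angle(L[sel]))
    return left, np.fft.irfft(R, n=len(right))
```

By the result above, flattening the interaural phase above 1 kHz (band='high') should leave externalization largely intact, while flattening it below 1 kHz should degrade it.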
In a series of source azimuth identification experiments, we studied the ability of human listeners to locate the origin of a sound in a room. All experiments were done in a small rectangular concert hall with variable geometry and acoustical properties. Subjects localized a 50-ms, 500-Hz sine pulse with an rms error of 3.3° (±0.6°) regardless of room reverberation time. Lowering the ceiling from 11.5 to 3.5 m decreased the error to 2.8° (±0.6°). Subjects localized broadband noise without attack transients with an rms error of 2.3° (±0.6°) if the reverberation time was 1 s; the error increased to 3.2° (±0.7°) if the reverberation time was 5 s. For complex tones without attack transients, the localization error increased continuously as the density of spectral components decreased. Performance was nearly random for a 500-Hz sine tone but significantly better than random for a 5000-Hz sine tone. Our azimuth identification experiments revealed significant biases, as large as 2°; such biases are, of course, invisible in minimum audible angle experiments.
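For clarity about the two error measures, here is a minimal sketch of how an rms error and a constant bias are computed from azimuth identification data; the function name, variable names, and example data are assumptions.

```python
import numpy as np

def azimuth_stats(target_deg, response_deg):
    """RMS error and signed bias for source azimuth identification."""
    err = np.asarray(response_deg, float) - np.asarray(target_deg, float)
    bias = err.mean()                  # constant (signed) error
    rms = np.sqrt((err ** 2).mean())   # total error, bias included
    return rms, bias

# A constant bias of a couple of degrees inflates the rms error even
# when responses cluster tightly; a minimum audible angle task, which
# measures only the discriminability of nearby azimuths, cannot see it.
rms, bias = azimuth_stats([0, 10, -10, 20], [2.5, 12.0, -8.5, 21.5])
```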
The technique used to record respiratory sounds was easy to apply, inexpensive, and well tolerated by the horses. Spectrum analysis of respiratory sounds from exercising horses after experimental induction of laryngeal hemiplegia (LH) or dorsal displacement of the soft palate (DDSP) revealed unique sound patterns. If other conditions causing airway obstruction are also associated with unique sound patterns, spectrum analysis of respiratory sounds may prove useful in the diagnosis of airway abnormalities in horses.
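A minimal sketch of the kind of spectrum analysis involved, computing a short-time power spectrum of a recorded sound; the sample rate, window length, and overlap are assumptions, not the study's settings.

```python
import numpy as np
from scipy.signal import spectrogram

def respiratory_spectrogram(x, sr):
    """Short-time power spectrum (in dB) of a recorded respiratory sound."""
    f, t, Sxx = spectrogram(x, fs=sr, nperseg=1024, noverlap=512)
    return f, t, 10.0 * np.log10(Sxx + 1e-12)  # dB; floor avoids log(0)
```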
For as long as we humans have lived on Earth, we have been able to use our ears to localize the sources of sounds. Our ability to localize warns us of danger and helps us sort out individual sounds from the usual cacophony of our acoustical world. Characterizing this ability in humans and other animals makes an intriguing physical, physiological, and psychological study.