Cocktail parties and other natural auditory environments present organisms with mixtures of sounds. Segregating individual sound sources is thought to require prior knowledge of source properties, yet these presumably cannot be learned unless the sources are segregated first. Here we show that the auditory system can bootstrap its way around this problem by identifying sound sources as repeating patterns embedded in the acoustic input. Due to the presence of competing sounds, source repetition is not explicit in the input to the ear, but it produces temporal regularities that listeners detect and use for segregation. We used a simple generative model to synthesize novel sounds with naturalistic properties. We found that such sounds could be segregated and identified if they occurred more than once across different mixtures, even when the same sounds were impossible to segregate in single mixtures. Sensitivity to the repetition of sound sources can permit their recovery in the absence of other segregation cues or prior knowledge of sounds, and could help solve the cocktail party problem.
auditory scene analysis | cocktail party problem | generative models of sound | natural sound statistics | sound segregation
Studies of pitch perception often involve measuring difference limens for complex tones (DLCs) that differ in fundamental frequency (F0). These measures are thought to reflect F0 discrimination and to provide an indirect measure of subjective pitch strength. However, in many situations discrimination may be based on cues other than the pitch or the F0, such as differences in the frequencies of individual components or timbre (brightness). Here, DLCs were measured for harmonic and inharmonic tones under various conditions, including a randomized or fixed lowest harmonic number, with and without feedback. The inharmonic tones were produced by shifting the frequencies of all harmonics upwards by 6.25%, 12.5%, or 25% of F0. It was hypothesized that, if DLCs reflect residue-pitch discrimination, these frequency-shifted tones, which produce a weaker and more ambiguous pitch than harmonic tones, would yield larger DLCs than the harmonic tones. However, if DLCs reflect comparisons of component pitches, or timbre, they should not be systematically influenced by frequency shifting. The results showed larger DLCs and more scattered pitch matches for inharmonic than for harmonic complexes, confirming that the inharmonic tones produced a less consistent pitch than the harmonic tones, and consistent with the idea that DLCs reflect F0 pitch discrimination.
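The frequency-shifting manipulation described above can be illustrated with a minimal sketch. It assumes only what the abstract states: each harmonic n·F0 is moved upward by a fixed fraction of F0. The function name `component_frequencies`, the 200 Hz F0, the lowest harmonic number (3), and the number of components (4) are hypothetical values chosen for illustration; they are not taken from the study.

```python
def component_frequencies(f0, lowest_harmonic, n_components, shift_fraction=0.0):
    """Return the component frequencies (Hz) of a complex tone.

    Each harmonic n * f0 is shifted upward by shift_fraction * f0,
    so shift_fraction=0.0 gives a harmonic tone and, for example,
    shift_fraction=0.25 gives a tone shifted by 25% of F0.
    """
    harmonic_numbers = range(lowest_harmonic, lowest_harmonic + n_components)
    return [(n + shift_fraction) * f0 for n in harmonic_numbers]


# Illustrative values only (not from the study): F0 = 200 Hz, harmonics 3 to 6.
harmonic = component_frequencies(200.0, 3, 4)
shifted = component_frequencies(200.0, 3, 4, shift_fraction=0.25)

print(harmonic)  # [600.0, 800.0, 1000.0, 1200.0]
print(shifted)   # [650.0, 850.0, 1050.0, 1250.0]
```

In this sketch the spacing between adjacent components stays equal to F0 after shifting, but the components are no longer integer multiples of a common fundamental, which is what makes the shifted tone inharmonic.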