Four experiments investigated the acoustical correlates of similarity and categorization judgments of environmental sounds. In Experiment 1, similarity ratings were obtained from pairwise comparisons of recordings of 50 environmental sounds. A three-dimensional multidimensional scaling (MDS) solution showed three distinct clusterings of the sounds: harmonic sounds, discrete impact sounds, and continuous sounds. Furthermore, sounds from similar sources tended to lie in close proximity to each other in the MDS space. The orderings of the sounds on the individual dimensions of the solution were well predicted by linear combinations of acoustic variables, such as harmonicity, amount of silence, and modulation depth. The orderings of sounds also correlated significantly with MDS solutions for similarity ratings of imagined sounds and for imagined sources of sounds, obtained in Experiments 2 and 3, and with free categorization of the 50 sounds (Experiment 4), although the categorization data were less well predicted by acoustic features than were the similarity data.
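The MDS step described above can be sketched in its classical (metric, Torgerson) form; the study used listeners' pairwise ratings with a standard nonmetric MDS procedure, so the code below is only a minimal illustration of the embedding idea, and the toy dissimilarity matrix is invented, not the study's data:

```python
import numpy as np

def classical_mds(D, k=3):
    """Embed an n x n symmetric dissimilarity matrix D into k dimensions
    via double-centering and eigendecomposition (classical MDS)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]            # take the k largest
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Toy example: four points on a line are recovered exactly (up to sign
# and shift) from their pairwise distances with a 1-D embedding.
pts = np.array([[0.0], [1.0], [2.0], [4.0]])
D = np.abs(pts - pts.T)
X = classical_mds(D, k=1)
```

With real rating data, the recovered dimensions would then be regressed on acoustic variables (harmonicity, amount of silence, modulation depth) to interpret the space, as the abstract describes.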
Three experiments tested listeners' ability to identify 70 diverse environmental sounds using limited spectral information. Experiment 1 employed low- and high-pass filtered sounds with filter cutoffs ranging from 300 to 8000 Hz. Listeners were quite good (>50% correct) at identifying the sounds even when severely filtered; for the high-pass filters, performance was never below 70%. Experiment 2 used octave-wide bandpass filtered sounds with center frequencies from 212 to 6788 Hz and found that performance with the higher bandpass filters was 70%-80% correct, whereas with the lower filters listeners achieved 30%-50% correct. To examine the contribution of temporal factors, in Experiment 3 vocoder methods were used to create event-modulated noises (EMN), which had extremely limited spectral information. About half of the 70 EMN were identifiable on the basis of the temporal patterning. Multiple regression analysis suggested that the acoustic features listeners may use to identify EMN include envelope shape, periodicity, and the consistency of temporal changes across frequency channels. Identification performance with high- and low-pass filtered environmental sounds varied in a manner similar to that of speech sounds, except that there seemed to be somewhat more information in the higher frequencies for the environmental sounds used in this experiment.
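Octave-wide bandpass filtering of the kind used in Experiment 2 can be sketched with a crude FFT brick-wall filter. This is a simplification for illustration only; the study's actual filters, stimuli, and cutoff implementations are not reproduced here:

```python
import numpy as np

def octave_bandpass(x, fs, fc):
    """Zero out all spectral components outside +/- half an octave of fc,
    i.e., keep the band [fc/sqrt(2), fc*sqrt(2)]."""
    lo, hi = fc / np.sqrt(2.0), fc * np.sqrt(2.0)  # octave-wide band edges
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    X[(f < lo) | (f > hi)] = 0.0
    return np.fft.irfft(X, n=len(x))

# Toy check: a 1000-Hz + 4000-Hz mixture filtered with fc = 1000 Hz
# retains only the 1000-Hz component.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 4000 * t)
y = octave_bandpass(x, fs, 1000.0)
```

The center-frequency spacing in the study (212 to 6788 Hz) corresponds to stepping such octave-wide bands across the spectrum.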
Performance on 19 auditory discrimination and identification tasks was measured for 340 listeners with normal hearing. Test stimuli included single tones, sequences of tones, amplitude-modulated and rippled noise, temporal gaps, speech, and environmental sounds. Principal components analysis and structural equation modeling of the data support the existence of a general auditory ability and four specific auditory abilities. The specific abilities are (1) loudness and duration (overall energy) discrimination; (2) sensitivity to temporal envelope variation; (3) identification of highly familiar sounds (speech and nonspeech); and (4) discrimination of unfamiliar simple and complex spectral and temporal patterns. Examination of Scholastic Aptitude Test (SAT) scores for a large subset of the population revealed little or no association between general or specific auditory abilities and general intellectual ability. The findings provide a basis for research to further specify the nature of the auditory abilities. Of particular interest are results suggestive of a familiar sound recognition (FSR) ability, apparently specialized for sound recognition on the basis of limited or distorted information. This FSR ability is independent of normal variation in both spectral-temporal acuity and general intellectual ability.
Auditory discrimination abilities of professional musicians were compared with those of nonmusicians. The stimuli for the frequency-discrimination tasks were 300-msec sinusoidal tones, 300-msec square waves, and tone patterns consisting of ten 40-msec tones played sequentially. The musicians' difference thresholds for single tones were between Δf/f = 0.001 and 0.0045. One-half of the nonmusicians attained thresholds almost as low; the rest attained larger thresholds, up to Δf/f = 0.017. The results for the pattern stimuli show a clearer separation between the nonmusicians and the musicians, whose median difference thresholds were about three times smaller. However, nonmusician listeners who had previously trained with patterns not in the test set achieved thresholds substantially smaller than those obtained by the musicians. The appropriateness of preferential recruitment of musicians for psychoacoustic research is discussed. The responses to a musical background survey and correlations between the survey items and discrimination performance are contained in a supplement to this paper [PAPS JASMA 76, xxx-xx].
This is the second in a series of articles on human listeners' abilities to discriminate between word-length tonal sequences, or "patterns." The first article reported that frequency resolution by highly trained listeners is four to five times more accurate for high-frequency, late-occurring components of such sequences than for low-frequency, early-occurring components [Watson, Kelly, and Benbasset, J. Acoust. Soc. Am. 57, 1175–1185 (1975)]. These effects, similar to those described as "recognition masking" or "informational masking" by other authors, have now been shown to be strongly dependent on the degree of trial-to-trial stimulus uncertainty of the psychophysical procedure in which they are measured. When stimulus uncertainty is reduced to its psychophysical minimum, frequency resolution for any component of a tonal sequence is only slightly less accurate than for isolated tones. Previous reports of recognition masking thus may reflect limitations imposed by those more dynamic parts of the sensory process concerned with memory and attention, rather than information loss in the more static peripheral auditory system. Subject Classification: [43]65.22, [43]65.75, [43]65.64, [43]65.58.
Levels of monaural signals at behavioral threshold were determined by a psychophysical method of adjustment for seven highly trained listeners. Thresholds were studied as a function of signal frequency (octave steps, from 0.125 to 8 kHz) and of signal duration (logarithmic steps, from 16 to 1024 msec). Measurements were made in the presence of a contralateral broad-band masking noise with a spectrum level of 30 dB SPL. The time constant, τ, estimated from at least 12 replications of each measurement, was found to range systematically from values considered normal (125-175 msec) by some earlier investigators, at low frequencies, to much lower values (30-70 msec) at high frequencies. Comparison between the performance of listeners with normal audiograms and those with high-frequency hearing loss shows this interaction between frequency and the time constant to be similar for both samples. The data are also compared to the results of a second experiment that employed a two-alternative forced-choice psychophysical method. Psychophysical questions of this kind may facilitate the search for the physiological process relevant to temporal integration.

The general form of the temporal integration function has been discussed by several of the investigators listed above, and different theories lead to the use of different statistics to describe the temporal integration effect. An early assumption was, apparently, that the ear might perform as a perfect integrator and so would show 3 dB of reduction of the auditory threshold for each doubling of signal duration (Garner and Miller, 1947). It was observed that this theory described the data within the range from about 20 to 100 msec, but that very short durations yielded an even steeper slope (about 4.5 dB per doubling, for signals briefer than about 20 msec). The very short signals were shown to deviate from the 3-dB-per-doubling rule because of the spread of energy over the frequency domain (Garner, 1947).
Very long durations also deviated from the simple rule, producing a slope of about 1.5 dB per doubling for signals longer than 100 msec. Therefore, Green et al. described the relationship with three lines rather than one [see Fig. 1(a)]. A summary statistic, the critical duration, was introduced by Harris, Haines, and Myers (1958). This was defined as the intersection
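The three-line temporal integration function described above can be sketched numerically. The slopes (4.5, 3, and 1.5 dB per doubling) and breakpoints (about 20 and 100 msec) are the approximate values quoted in the text; taking the 100-msec threshold as the 0-dB reference is an arbitrary assumption for illustration:

```python
import math

def threshold_db(T_ms):
    """Relative detection threshold (dB re the 100-msec threshold) for a
    signal of duration T_ms, using the three-line description above."""
    if T_ms >= 100.0:
        # long durations: only ~1.5 dB of threshold reduction per doubling
        return -1.5 * math.log2(T_ms / 100.0)
    if T_ms >= 20.0:
        # 20-100 msec: ~3 dB per doubling, as for a perfect energy integrator
        return 3.0 * math.log2(100.0 / T_ms)
    # below ~20 msec: ~4.5 dB per doubling (spread of energy over frequency)
    return threshold_db(20.0) + 4.5 * math.log2(20.0 / T_ms)
```

For example, halving the duration from 100 to 50 msec raises the predicted threshold by 3 dB, while doubling it from 100 to 200 msec lowers it by only 1.5 dB.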
While a large portion of the variance among listeners in speech recognition is associated with the audibility of components of the speech waveform, it is not possible to predict individual differences in the accuracy of speech processing strictly from the audiogram. This has suggested that some of the variance may be associated with individual differences in spectral or temporal resolving power, or acuity. Psychoacoustic measures of spectral-temporal acuity with nonspeech stimuli have been shown, however, to correlate only weakly (or not at all) with speech processing. In a replication and extension of an earlier study [Watson et al., J. Acoust. Soc. Am. Suppl. 1 71, S73 (1982)], 93 normal-hearing college students were tested on speech perception tasks (nonsense syllables, words, and sentences in a noise background) and on six spectral-temporal discrimination tasks using simple and complex nonspeech sounds. Factor analysis showed that the abilities that explain performance on the nonspeech tasks are quite distinct from those that account for performance on the speech tasks. Performance was significantly correlated among speech tasks and among nonspeech tasks. Either (a) auditory spectral-temporal acuity for nonspeech sounds is orthogonal to speech processing abilities, or (b) the appropriate tasks or types of nonspeech stimuli that challenge the abilities required for speech recognition have yet to be identified.
This US version of the digits-in-noise telephone screening test is sufficiently valid to be implemented for use by the general public. Its properties are quite similar to those of the telephone screening tests currently in use in most European countries. Telephone tests provide efficient, easy-to-use, and valid screening for functional hearing impairment. The results of this test are a reasonable basis for advising those who fail to seek a comprehensive hearing evaluation by an audiologist.