Discrimination of time intervals marked by brief acoustic pulses of various intensities and spectra

Three subjects were given extensive practice in discriminating syllables which differed in voice onset time. For these subjects, there were two major findings. First, discrimination of speech follows normal psychophysical laws: long-onset-time stimuli require larger differences than shorter ones for comparable discrimination. Second, the shape of the discrimination function for experienced subjects is more like a leaning W than an inverted V, the usual shape for naive subjects. The data support a model of speech perception with both an acoustic and a phonetic component. The phonetic component is best characterized as a prototype matching process, with the prototype including information on the simultaneity of formant onset. For the last 20 years, the laws governing speech perception have been thought to differ from the laws governing psychophysical perception. In most psychophysical experiments, subjects can discriminate many more stimuli than they can identify. In most speech perception studies, discrimination seems to be bounded by identification; subjects can only discriminate two speech items if they can give them different phonetic labels. Speech perception appears to be categorical. Largely on the basis of this finding, researchers at the Haskins Laboratories spectrographic analysis, these studies established that two acoustic features are most important in the transition from voiced to voiceless stops. In voiced stops, the first formant (F1) begins at the same time as the higher formants. Removing the initial portion of the first formant, and thereby delaying its onset, leads to the perception of voicelessness. A more realistic continuum is obtained if the higher formants are aspirated (energized by a noise source) during the period of F1 cutback. The Haskins researchers have generally tested discrimination with the ABX paradigm. In this paradigm, subjects hear three syllables per trial. The first two (A and B) always differ from each other, while the third (X) is identical to one of the first two. The subject's task is to determine if X is the same as A or B. Liberman et al. (1961) used this paradigm in their study of voicing. The authors synthesized a continuum of speech syllables which varied in voice onset time (VOT) by varying the F1 cutback and aspiration cues in 10-msec steps. The other parameters were appropriate for an alveolar consonant followed by the vowel /o/, yielding a continuum perceived as /do/ at one end (0-msec VOT) and as /to/ at the other (60-msec VOT). The data generally indicate better discrimination between phonetic categories than within them, an example of categorical perception. Early Haskins papers (e.g., Liberman et al., 1967; Liberman et al., Note 1) cited this finding as evidence for a motor theory of speech perception. In more recent work (e.g., , no specific mechanism has been offered which would produce the categorical results, but the general position of a special speech mode has been maintained. As Liberman (1970) puts it, "The [speech] decoder is not merely an extension ...

Section: Methods

supporting

confidence: 80%

The effect of discrimination training on speech perception: Noncategorical perception

Samuel

1977

“…Figure 3 shows that beyond an approximately two-octave separation between the CF of high and low NBNs, there is no further increase in the difficulty of holding on to a single stream. In a review of this paper, Pierre Divenyi (personal communication, December 1998) pointed out that similar observations have been made for the discrimination of unfilled intervals between two tones of different frequencies (Divenyi & Danner, 1977) and for the detection of gaps between narrow-band noise-burst markers (Formby, Barker, Abbey, & Raney, 1993). The similarity suggests to Divenyi that all these observations may be looking at the same process: frequency integration in temporal processing.…”

Section: Discussion

mentioning

confidence: 76%

Stream segregation of narrow-band noise bursts

Bregman

Ahad

Loon

2001

Periodic sounds (tones) represent only a fraction of the sounds that populate our everyday lives. However, most of the research on the perceptual organization of sounds has used tones as stimuli (Bregman, 1990, chap. 2). There are only two studies, to our knowledge, that have addressed the perceptual grouping of noises. The stream segregation of alternating narrow-band noises of higher and lower center frequencies (CFs) was studied by Dannenbring and Bregman (1976) in an experiment that was primarily about the subjective overlap of segregated streams. Their sharply peaked noise bands, created by the filtering of white noise, had two different CF separations, 3.2 and 19.0 semitones. Ratings of stream segregation were higher in sequences that had the 19-semitone separation. Bregman, Colantonio, and Ahad (1999) studied the segregation of band-limited noise bursts of high and low CFs as part of an experiment whose purpose was to demonstrate that several variables (bandwidth [BW], rate of onsets, and separation in CF of the high- and low-pitched noises) would have similar effects on stream segregation and on the continuity illusion ("apparent continuity"). They used noise bands whose spectra were flat between the band edges. Among their other findings, they found that BW had a significant effect on the segregation of narrow-band noises. The present experiment followed up on this experiment, studying the stream segregation of the same types of noise bursts, but controlling their properties more precisely. Some terminology can be introduced with the help of Figure 1. Two bands of narrow-band noise (NBN) are shown. The one that is higher in frequency is referred to as H, and the lower as L. The top edge of H and the bottom edge of L can be called the "outer band edges" of the NBN pair. They represent the two most separated frequency components of H and L. Similarly, the bottom edge of H and the top edge of L represent the two "inner band edges" of the pair, the closest frequencies of the two bands. The CF of each band of noise (on a log-frequency scale) and the BW are also shown, bandwidth being defined here as the difference in frequency between the upper and lower band edges in semitones (a log-frequency scale). Any such flat-spectrum NBNs, in which all frequencies have equal intensities when plotted on a linear scale of frequency, can be described by a pair of parameters. One possible pair consists of the CF and BW. An alternative description is given by a different parameter pair: upper band edge and lower band edge. Although the two descriptions are fully equivalent, they emphasize different properties of the band. The first treats the noise band as a block whose overall frequency can be represented by its CF, whereas the second focuses on the frequencies at the edges. The experiment by Bregman et al. (1999) showed that the segregation of a sequence into separate H and L streams was increased by the difference between the CFs of the two bands of noise. In addition, they found that H and L bursts having greater BWs were seg...
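The equivalence of the two descriptions of a noise band (CF and BW versus upper and lower band edges) can be illustrated with a minimal Python sketch. It assumes, following the excerpt, that CF is the center on a log-frequency scale (the geometric mean of the edges) and that BW is the edge separation in semitones, with 12 semitones per octave. The function names and the example values (a 1000-Hz CF, a 12-semitone BW) are hypothetical and not taken from the paper.

```python
import math

SEMITONES_PER_OCTAVE = 12.0

def edges_from_cf_bw(cf_hz, bw_semitones):
    """Return (lower_hz, upper_hz) for a band given its log-scale CF and BW.

    Assumes the CF sits midway between the edges on a log-frequency axis,
    so each edge lies half the bandwidth (in semitones) away from the CF.
    """
    half = bw_semitones / 2.0
    lower_hz = cf_hz * 2.0 ** (-half / SEMITONES_PER_OCTAVE)
    upper_hz = cf_hz * 2.0 ** (half / SEMITONES_PER_OCTAVE)
    return lower_hz, upper_hz

def cf_bw_from_edges(lower_hz, upper_hz):
    """Return (cf_hz, bw_semitones) for a band given its two edges."""
    cf_hz = math.sqrt(lower_hz * upper_hz)  # geometric mean = log-scale center
    bw_semitones = SEMITONES_PER_OCTAVE * math.log2(upper_hz / lower_hz)
    return cf_hz, bw_semitones

# Hypothetical example: a one-octave band centered at 1000 Hz.
lower, upper = edges_from_cf_bw(1000.0, 12.0)
cf, bw = cf_bw_from_edges(lower, upper)
print(lower, upper, cf, bw)
```

Converting in one direction and then back recovers the starting pair (up to floating-point rounding), which is the sense in which the two parameterizations carry the same information.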

“…There seems to exist a general agreement on the fact that when an observer is asked to discriminate between intervals of short duration, his performance is little affected by such stimulus characteristics as intensity, frequency, and bandwidth (Allan, 1979). More specifically, if the discrimination is performed on empty intervals marked by short auditory pulses, numerous studies have shown that, for intervals longer than 100 msec, performance is not sensitive to variations in marker intensity (Abel, 1972; Carbotte & Kristofferson, 1973; Divenyi & Danner, 1977; Penner, 1976), frequency or spectrum (Divenyi & Danner, 1977; Divenyi & Sachs, 1978), and duration (Abel, 1972; Carbotte & Kristofferson, 1973; Penner, 1976). Moreover, within the range of 0-100 msec, Nilsson (1969) and Oostenbrug, Horst, and Kuiper (1978) have shown the performance to be relatively insensitive to changes in the energy of light pulse markers.…”

mentioning

confidence: 99%

“…In general, discrimination models consider that the encoding of temporal extent is performed by a central timekeeper common to visual and auditory modalities (Allan, 1979). Most current models do assume that duration information is obtained through the accumulation, over the extent of the interval, of pulses originating in some central source (Abel, 1972; Creelman, 1962; Divenyi & Danner, 1977; Kinchla, 1972; Thomas & Brown, 1974; Treisman, 1963). Although some models (Creelman, 1962; Divenyi & Danner, 1977) have formally defined parameters representing nontemporal stimulus variables, their theoretical importance has remained very minor in view of the empirical evidence.…”

mentioning

confidence: 99%

Duration discrimination of empty time intervals marked by intermodal pulses

Rousseau

Poirier

Lemyre

1983

In 1973, Rousseau and Kristofferson reported that short empty intermodal time intervals marked by a light flash and a brief tone were poorly discriminated by subjects, and that ΔT75 was constant over a large range of durations. It led them to suggest that short intramodal empty intervals, marked by stimuli from the same sensory modality, might be handled by a "more efficient mechanism" to which intermodal intervals would not have access. Unfortunately, their study lacked the basic evidence needed to make a strong statement: no direct comparison between inter- and intramodal duration discrimination and no within-subject discrimination function were available. To clarify these two issues, three experiments were performed. The data indicate that intermodal time intervals are discriminated more poorly than intramodal ones, and that intermodal duration discrimination functions follow Weber's law. Analysis of data from different experiments leads to the conclusion that inter- and intramodal intervals are timed by a common timekeeper and that intermodal intervals induce a large noise component in the timekeeping operation.