This study was motivated by the prospective role played by brain rhythms in speech perception. The intelligibility – in terms of word error rate – of natural-sounding, synthetically generated sentences was measured using a paradigm that alters speech-energy rhythm over a range of frequencies. The material comprised 96 semantically unpredictable sentences, each approximately 2 s long (6–8 words per sentence), generated by a high-quality text-to-speech (TTS) synthesis engine. The TTS waveform was time-compressed by a factor of 3, creating a signal with a syllable rhythm three times faster than the original and whose intelligibility is poor (<50% words correct). A waveform with an artificial rhythm was produced by automatically segmenting the time-compressed waveform into consecutive 40-ms fragments, each followed by a silent interval. The parameters varied were the length of the silent interval (0–160 ms) and whether the lengths of silence were equal (‘periodic’) or not (‘aperiodic’). The performance curve (word error rate as a function of mean duration of silence) was U-shaped. The lowest word error rate (i.e., highest intelligibility) occurred when the silence was 80 ms long and inserted periodically. At this silence duration, word error rate increased when the silence was inserted aperiodically rather than periodically. These data are consistent with a model (TEMPO) in which low-frequency brain rhythms affect the ability to decode the speech signal. In TEMPO, optimum intelligibility is achieved when the syllable rhythm is within the range of the high theta-frequency brain rhythms (6–12 Hz), comparable to the rate at which segments and syllables are articulated in conversational speech.
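The fragmentation-and-silence manipulation described above can be sketched in a few lines of signal processing. The sketch below is a minimal illustration, not the authors' code: the function name `insert_silences` and the jitter scheme for the aperiodic condition are assumptions; the study specifies only 40-ms fragments, silent gaps of 0–160 ms, and periodic versus aperiodic gap lengths.

```python
import numpy as np

def insert_silences(signal, sr, frag_ms=40.0, silence_ms=80.0,
                    jitter_ms=0.0, rng=None):
    """Split `signal` into consecutive fragments of `frag_ms` milliseconds
    and append a silent gap after each one.  With jitter_ms == 0 the gaps
    are equal ('periodic'); otherwise each gap length is drawn uniformly
    from silence_ms +/- jitter_ms ('aperiodic', an assumed jitter scheme)."""
    rng = rng or np.random.default_rng(0)
    frag_len = int(round(sr * frag_ms / 1000.0))
    pieces = []
    for start in range(0, len(signal), frag_len):
        pieces.append(signal[start:start + frag_len])
        gap_ms = silence_ms
        if jitter_ms:
            gap_ms += rng.uniform(-jitter_ms, jitter_ms)
        pieces.append(np.zeros(int(round(sr * gap_ms / 1000.0)),
                               dtype=signal.dtype))
    return np.concatenate(pieces)

# Example: a 2-s time-compressed waveform at 16 kHz, periodic 80-ms gaps
sr = 16000
x = np.random.randn(2 * sr).astype(np.float32)
y = insert_silences(x, sr, frag_ms=40.0, silence_ms=80.0)
```

With 40-ms fragments and 80-ms gaps, the output is three times the input duration, restoring roughly the original syllable rate of the uncompressed speech, which is the point of the manipulation.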
Temporal properties associated with the speech signal are potentially important for understanding spoken language. Five hours of spontaneous American English dialogue material (from the SWITCHBOARD corpus) were hand-labeled and segmented at the phonetic-segment level; a forty-five-minute subset was also manually annotated (at the syllabic level) with respect to stress accent.
Current-generation automatic speech recognition (ASR) systems model spoken discourse as a linear sequence of words and phones. Because it is unusual for every phone within a word to be pronounced in a standard ("canonical") way, ASR systems often depend on a multi-pronunciation lexicon to match an acoustic sequence with a lexical unit. Since there are, in practice, many different ways for a word to be pronounced, this standard approach adds a layer of complexity and ambiguity to the decoding process which, if modified, could potentially improve recognition performance. Analysis of pronunciation variation in a corpus of spontaneous English discourse (Switchboard) demonstrates that the observed variation is largely systematic at the level of the syllable. Syllabic onsets are realized in canonical form far more frequently than either coda or nuclear constituents. Prosodic stress also plays an important role in pronunciation. The governing mechanism is likely to involve the informational valence associated with syllable elements, and for this reason pronunciation variation offers a potential window onto the mechanisms responsible for the production and understanding of speech.
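The core analysis here amounts to tallying, for each syllable position, how often the realized phone matches the canonical one. A minimal sketch of that tabulation is shown below; the toy `observations` records are invented for illustration, whereas in the actual study they would come from aligning canonical and hand-labeled phonetic transcriptions of Switchboard syllables.

```python
from collections import defaultdict

# Hypothetical toy records: (syllable position, realized-as-canonical flag).
observations = [
    ("onset", True), ("onset", True), ("onset", True), ("onset", False),
    ("nucleus", True), ("nucleus", False), ("nucleus", False),
    ("coda", True), ("coda", False), ("coda", False), ("coda", False),
]

counts = defaultdict(lambda: [0, 0])  # position -> [canonical count, total]
for pos, canonical in observations:
    counts[pos][1] += 1
    if canonical:
        counts[pos][0] += 1

# Canonical-realization rate per syllable position
rates = {pos: can / tot for pos, (can, tot) in counts.items()}
```

In this toy data the onset rate (0.75) exceeds the nucleus and coda rates, mirroring the pattern the abstract reports for the real corpus.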
1. Amplitude modulation (AM) is a pervasive property of acoustic communication systems. In the present study we investigate neural temporal mechanisms in the auditory nerve and cochlear nuclei of the pentobarbital-sodium-anesthetized cat associated with the neural coding of 100% AM tones, both in quiet and in the presence of wideband, quasi-flat-spectrum noise. The AM carrier frequency was set to the neuron's characteristic frequency (CF), and the sound pressure level (SPL) of acoustic stimuli was varied over a wide dynamic range of intensities (≤40 dB). The temporal AM-encoding capability of auditory neurons was measured by computing the synchronization coefficient (SC) of the neural response to the signal's modulation and carrier frequency. The temporal modulation transfer function (tMTF) of a neuron was then computed by measuring the SC of the response over a range of modulation frequencies (50–2550 Hz). 2. Neurons in the cochlear nuclei synchronize on average more highly to the modulation frequency than fibers of comparable CF, threshold, and spontaneous rate in the auditory nerve. The disparity in performance is greatest at high SPLs and low signal-to-noise ratios. However, there is a significant degree of diversity in AM-encoding capability among neurons in both the cochlear nuclei and auditory nerve. Among auditory nerve fibers (ANFs), low- and medium-spontaneous-rate (SR) units (SR < 18 spikes/s) phase-lock with greater precision than comparable high-SR units at any given frequency, particularly at moderate to high SPLs, consistent with previous studies. 3. The phase-locking capabilities of neurons in the cochlear nucleus are considerably more variable than in the auditory nerve. Moreover, the variability itself depends on two distinct measures of phase-locking performance. Most ANFs are capable of phase-locking to frequencies as high as 3–4 kHz. In the cochlear nucleus, many unit types do not phase-lock to modulation frequencies > 1 kHz.
As a result, phase-locking performance is measured on the basis of two parameters: maximum synchronization, irrespective of stimulus frequency, and the upper frequency limit for significant phase-locking. 4. Cochlear nucleus neurons may be divided into three distinct groups on the basis of maximum synchronization capability. In group 1 are the primary-like (PL) units of the anteroventral division, whose phase-locking capabilities are comparable with those of high-SR ANFs. (ABSTRACT TRUNCATED AT 400 WORDS)
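The synchronization coefficient used throughout this abstract is the standard vector-strength measure: the magnitude of the mean unit phasor of spike times evaluated at the stimulus (modulation or carrier) frequency. A minimal sketch, assuming spike times are given in seconds:

```python
import numpy as np

def synchronization_coefficient(spike_times, freq):
    """Vector strength of spike times relative to a sinusoid of
    frequency `freq` (Hz): |mean of exp(i * phase)| over all spikes.
    1.0 indicates perfect phase-locking; values near 0, no locking."""
    phases = 2.0 * np.pi * freq * np.asarray(spike_times, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Spikes landing at exactly the same phase of a 100-Hz modulator
locked = np.arange(50) / 100.0          # one spike per cycle
sc_locked = synchronization_coefficient(locked, 100.0)   # -> 1.0
```

Sweeping `freq` over modulation frequencies (e.g., 50–2550 Hz as in the study) and plotting the resulting SC values yields the temporal modulation transfer function (tMTF) described above.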