The Benefit Obtained from Visually Displayed Text from an Automatic Speech Recognizer During Listening to Speech Presented in Noise

Zekveld, Adriana A.; Kramer, Sophia E.; Kessens, Judith M.; Vlaming, Marcel S. M. G.; Houtgast, Tammo

doi:10.1097/aud.0b013e31818005bd

Cited by 20 publications

(13 citation statements)

References 23 publications

“…For evaluating such an audiovisual combination it can be researched how the SRT (being the SNR level of speech in noise as measured at the 50% intelligibility threshold) will improve when simultaneously presenting the (imperfectly) transcripted speech as text on a display. Figure 10 shows the mean improvement in SRT for normal hearing persons for the standard Dutch sentences when adding transcripted speech from a speech recognizer at 37%, 55% and 74% accuracy and for delays of zero, 2, 4 or 6 seconds [38]. It can be seen that the required SNR for 50% understanding improves by 0.5 to 3.5 dB.…”

Section: Automatic Speech Recognition To Assist Speech Understanding (mentioning)

confidence: 99%

“…Such a speech to text transcription system can be offered as an online service for (web-)telephony, in which a remote server provides the calculation intensive ASR processing. [Figure 10. Improvement of SRT by presenting transcripted speech from ASR at 3 levels of accuracy and for 0, 2, 4 and 6 secs of processing delay (from [38]).]…”

Section: Automatic Speech Recognition To Assist Speech Understanding (mentioning)

confidence: 99%

HearCom: Hearing in the Communication Society

Vlaming, Kollmeier, Dreschler et al. 2011

Acta Acustica united with Acustica

Self Cite

A group of 28 research partners joined the EU-funded project HearCom with the overall aim to improve hearing communication. One of the main achievements has been the provision of advanced hearing screening tests by telephone and Internet. Next to that it was aimed to harmonize hearing diagnostic tests for European languages. For this the concept of an Auditory Profile was defined on which a number of diagnostic hearing tests were developed in several languages. As hearing problems are also a result of adverse acoustical circumstances such as for room acoustics and telecom systems, these effects have been studied, modelled and evaluated for hearing impaired persons. In the area of rehabilitation a large scale comparison study was performed on signal enhancement techniques for hearing devices. Both objective and subjective benefits were found for specific listening conditions in relation to a chosen signal processing method. As modern technology may assist on hearing and communication it was studied how the use of automatic speech transcription or the use of handheld communication devices may help people with hearing problems. It is shown that communication benefits can be obtained, but that the benefit is limited in practice as processing power of today's handheld devices is still insufficient. An overview is given on the HearCom portal with sections for screening diagnostics, hearing information for the public and professionals, and a new HearCompanion service that provides step-by-step support for the hearing rehabilitation process.

“…There is much evidence that sensory processing of speech in auditory cortex can be modulated by higher order processing, such as syntactic or semantic analysis (Miller and Isard 1963; Kalikow et al 1977; Peelle et al 2012; Peelle 2013), speaker familiarity (Johnsrude et al 2013) or linguistic expectations set up by visual cues (Jacoby et al 1988; Zekveld et al 2008; Sohoglu et al 2012; for review: Peelle et al 2010). Sohoglu and colleagues (using EEG and MEG) showed that a visual cue, which provides prior knowledge of the speech content, increases the perceived speech clarity in a similar manner as altering the physical parameters of the stimulus.…”

Section: 2 Effects Of Linguistic Processing On Sensory Responses (mentioning)

confidence: 99%

The effects of selective attention and speech acoustics on neural speech-tracking in a multi-talker scene

Rimmele, Golumbic, Schröger et al. 2015

Cortex

Attending to one speaker in multi-speaker situations is challenging. One neural mechanism proposed to underlie the ability to attend to a particular speaker is phase-locking of low-frequency activity in auditory cortex to speech’s temporal envelope (“speech-tracking”), which is more precise for attended speech. However, it is not known what brings about this attentional effect, and specifically if it reflects enhanced processing of the fine structure of attended speech. To investigate this question we compared attentional effects on speech-tracking of natural vs. vocoded speech which preserves the temporal envelope but removes the fine-structure of speech. Pairs of natural and vocoded speech stimuli were presented concurrently and participants attended to one stimulus and performed a detection task while ignoring the other stimulus. We recorded magnetoencephalography (MEG) and compared attentional effects on the speech-tracking response in auditory cortex. Speech-tracking of natural, but not vocoded, speech was enhanced by attention, whereas neural tracking of ignored speech was similar for natural and vocoded speech. These findings suggest that the more precise speech tracking of attended natural speech is related to processing its fine structure, possibly reflecting the application of higher-order linguistic processes. In contrast, when speech is unattended its fine structure is not processed to the same degree and thus elicits less precise speech tracking more similar to vocoded speech.

“…However, it is not always possible to see the talker, and visual facial information alone is ambiguous. Orthographic text may serve as an alternate source of visual speech information to supplement aided speech in adverse listening conditions, both in young and in older adults (Zekveld et al 2008, 2009). …”

Section: Introduction (mentioning)

confidence: 99%

Text as a Supplement to Speech in Young and Older Adults

Krull, Humes 2016

Ear & Hearing

Objective: The purpose of this experiment was to quantify the contribution of visual text to auditory speech recognition in background noise. Specifically, we tested the hypothesis that partially accurate visual text from an automatic speech recognizer could be used successfully to supplement speech understanding in difficult listening conditions in older adults, with normal or impaired hearing. Our working hypotheses were based on what is known regarding audiovisual speech perception in the elderly from speechreading literature. We hypothesized that: 1) combining auditory and visual text information will result in improved recognition accuracy compared to auditory or visual text information alone; 2) benefit from supplementing speech with visual text (auditory and visual enhancement) in young adults will be greater than that in older adults; and 3) individual differences in performance on perceptual measures would be associated with cognitive abilities.

Design: Fifteen young adults with normal hearing, fifteen older adults with normal hearing, and fifteen older adults with hearing loss participated in this study. All participants completed sentence recognition tasks in auditory-only, text-only, and combined auditory-text conditions. The auditory sentence stimuli were spectrally shaped to restore audibility for the older participants with impaired hearing. All participants also completed various cognitive measures, including measures of working memory, processing speed, verbal comprehension, perceptual and cognitive speed, processing efficiency, inhibition, and the ability to form wholes from parts. Group effects were examined for each of the perceptual and cognitive measures. Audiovisual benefit was calculated relative to performance on auditory-only and visual-text only conditions. Finally, the relationship between perceptual measures and other independent measures were examined using principal-component factor analyses, followed by regression analyses.

Results: Both young and older adults performed similarly on nine out of ten perceptual measures (auditory, visual, and combined measures). Combining degraded speech with partially correct text from an automatic speech recognizer improved the understanding of speech in both young and older adults, relative to both auditory- and text-only performance. In all subjects, cognition emerged as a key predictor for a general speech-text integration ability.

Conclusions: These results suggest that neither age nor hearing loss affected the ability of subjects to benefit from text when used to support speech, after ensuring audibility through spectral shaping. These results also suggest that the benefit obtained by supplementing auditory input with partially accurate text is modulated by cognitive ability, specifically lexical and verbal skills.
