2016
DOI: 10.1016/j.csl.2015.09.002
Preprocessing for elderly speech recognition of smart devices

Cited by 13 publications (5 citation statements) · References 12 publications
“…These errors may 'propagate' downstream (Errattahi, Hannani, & Ouahmane, 2018). Adapting ASR systems for older voices can help to reduce errors: Zhou et al (2016) found that using a small, domain-specific dataset led to fewer errors than using large, out-of-domain data, and Kwon, Kim & Choeh (2016) improved accuracy by preprocessing data in line with elderly speech patterns. Given that early detection of AD will rely on ASR capabilities in adult voices, as opposed to older ones, current systems may be appropriate.…”
Section: Automatic Speech Recognition
confidence: 99%
“…We note that, although high, the word error rate of the elderly participants in the current study is consistent with other studies using automated speech recognition on elderly speech, 42,43 even in controlled laboratory settings. 44 There are reports that pre-processing of elderly speech can decrease the word error rate by up to 12%, 45 although evidence suggests that natural language processing models are relatively impervious to high word error rate. 30 The robust performance of natural language processing models can be attributed to different normalizations of words between a human transcript and an automated speech recognition transcript, trivial word errors that would not change the meaning of a sentence (e.g., "one" vs. "1"), and that natural language processing models are generally trained with a diverse set of language features and are thus able to retain different facets of the language even in the context of a large word error rate.…”
Section: Discussion
confidence: 99%
“…The diversity of languages, vernaculars, dialects, and people understood and supported by CUIs is an important, yet incredibly difficult challenge. This shouldn't be seen just as a problem of improving speech-to-text accuracy for specific populations (for example, adapting to the slower speech and inter-syllabic silence of elderly users [53]), but more widely of understanding and adapting to how different groups of people speak: their idioms, tropes, and methods for imbuing emotional and social subtlety in language.…”
Section: Breakdowns and Recovery
confidence: 99%