In recent months, there has been increasing interest in developing reliable, cost-effective, immediate, and easy-to-use machine-learning-based tools that can help health care operators, institutions, companies, etc. optimize their screening campaigns. In this vein, several initiatives have emerged that aim at the automatic detection of COVID-19 from speech, breathing, and coughs, with inconclusive preliminary results. The ComParE 2021 COVID-19 Cough Sub-challenge provides researchers from all over the world with a suitable test-bed for the evaluation and comparison of their work. In this paper, we present the INESC-ID contribution to the ComParE 2021 COVID-19 Cough Sub-challenge. We leverage transfer learning to develop a set of three expert classifiers based on deep cough representation extractors. A calibrated decision-level fusion system provides the final classification of cough recordings as either COVID-19 positive or negative. Results show unweighted average recalls of 72.3% and 69.3% on the development and test sets, respectively. Overall, the experimental assessment shows the potential of this approach, although much more research on extended respiratory sound datasets is needed.
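For readers unfamiliar with the metric and the fusion step named in this abstract, the following is a minimal sketch, not the authors' implementation. Unweighted average recall (UAR) is the mean of the per-class recalls, so each class counts equally regardless of its size. The fusion function is a hypothetical stand-in: it assumes each of the three experts outputs an already-calibrated probability of the positive class and simply averages them; the abstract does not specify the calibration or fusion rule actually used, and the names `unweighted_average_recall` and `fuse_expert_scores` are illustrative.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (every class weighted equally)."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def fuse_expert_scores(scores, threshold=0.5):
    """Assumed fusion rule: average the experts' positive-class probabilities.

    `scores` holds one calibrated probability per expert. This plain average
    is an illustrative assumption, not the fusion described in the paper.
    """
    fused = sum(scores) / len(scores)
    label = "positive" if fused >= threshold else "negative"
    return fused, label


if __name__ == "__main__":
    y_true = ["pos", "pos", "neg", "neg", "neg", "neg"]
    y_pred = ["pos", "neg", "neg", "neg", "neg", "pos"]
    # Recall is 1/2 for "pos" and 3/4 for "neg", so UAR is 0.625.
    print(unweighted_average_recall(y_true, y_pred))
    print(fuse_expert_scores([0.8, 0.4, 0.6]))
```

Because UAR averages recalls rather than counting correct samples, a classifier that always predicts the majority class scores only 50% on a two-class task, which is why the metric suits imbalanced screening data.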
Silent Computational Paralinguistics (SCP), the assessment of speaker states and traits from non-audibly spoken communication, has rarely been targeted in the rich body of work on either Computational Paralinguistics or Silent Speech Processing. Here, we provide first steps towards this challenging but potentially highly rewarding endeavour: Paralinguistics can enrich spoken language interfaces, while Silent Speech Processing enables confidential and unobtrusive spoken communication for everybody, including mute speakers. We approach SCP by using speech-related biosignals stemming from facial muscle activities captured by surface electromyography (EMG). To demonstrate the feasibility of SCP, we select one speaker trait (speaker identity) and one speaker state (speaking mode). We introduce two promising strategies for SCP: (1) deriving paralinguistic speaker information directly from EMG of silently produced speech versus (2) first converting EMG into an audible speech signal followed by conventional computational paralinguistic methods. We compare traditional feature extraction and decision making approaches to more recent deep representation and transfer learning by convolutional and recurrent neural networks, using openly available EMG data. We find that paralinguistics can be assessed not only from acoustic speech but also from silent speech captured by EMG.
Automatic detection of speech-affecting (SA) diseases has received significant attention, particularly in clinical scenarios. However, the same task in in-the-wild conditions is often neglected, in part due to the lack of appropriate datasets. In this work, we present the in-the-Wild Speech Medical (WSM) Corpus, a collection of in-the-wild videos featuring subjects potentially affected by an SA disease, specifically depression or Parkinson's disease. The WSM Corpus contains a total of 928 videos and over 131 hours of speech. Each video is accompanied by a crowdsourced annotation for perceived age/gender and the self-reported health status of the speaker. The WSM Corpus is balanced over all the labels. We present a detailed description of the collection and annotation processes of the WSM Corpus. Furthermore, we present several baseline systems for the detection of SA diseases using speech alone, thus motivating the use of this type of in-the-wild data in paralinguistic audiovisual tasks.