Proceedings of the 2021 International Conference on Multimodal Interaction
DOI: 10.1145/3462244.3479967

Multimodal Approach for Assessing Neuromotor Coordination in Schizophrenia Using Convolutional Neural Networks

Abstract: This study investigates speech articulatory coordination in schizophrenia subjects exhibiting strong positive symptoms (e.g. hallucinations and delusions), using two distinct channel-delay correlation methods. We show that schizophrenia subjects with strong positive symptoms who are markedly ill exhibit more complex articulatory coordination patterns in facial and speech gestures than healthy subjects. This distinction in speech coordination pattern is used to train a multimodal convolutional…
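As a rough illustration of the channel-delay correlation features the abstract refers to, the sketch below builds delayed cross-correlation matrices from a multichannel time series (e.g. vocal tract variables or facial action units). The delay set and the plain Pearson correlation are assumptions for illustration only; the paper's two specific channel-delay correlation methods are not reproduced here.

```python
import numpy as np

def channel_delay_correlation(signals, delays=(1, 3, 7, 15)):
    """Build channel-delay correlation matrices from a multichannel
    time series (channels x samples), e.g. vocal tract variables or
    facial action units.

    For each delay d, channel i is correlated with a d-sample shifted
    copy of channel j; stacking these matrices over delays gives a
    coordination feature of the general kind the abstract describes.
    Delay values and correlation form are illustrative assumptions.
    """
    n_ch, n_samp = signals.shape
    feats = []
    for d in delays:
        for i in range(n_ch):
            for j in range(n_ch):
                a = signals[i, : n_samp - d]
                b = signals[j, d:]
                feats.append(np.corrcoef(a, b)[0, 1])
    return np.array(feats).reshape(len(delays), n_ch, n_ch)

# Example: 6 channels (e.g. tract variables), 500 frames
x = np.random.randn(6, 500)
cdc = channel_delay_correlation(x)
print(cdc.shape)  # (4, 6, 6)
```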

Cited by 5 publications (4 citation statements)
References 24 publications
“…Due to the small size of the dataset, we trained our model using leave-one-out cross-validation and averaged the results to validate its generalization ability; all related works were reproduced under the same protocol. Specifically, we re-implemented the bimodal assessment models [7]-[10], as introduced in Section II, by replacing the classification parts with regression heads. To compare the overall performance of our multimodal model against the related works, we calculated the mean absolute error (MAE) and mean squared error (MSE) of each symptom, and then averaged the MAEs and MSEs over the 16 TLC symptoms, the 15 PANSS symptoms, and all 31 symptoms in both scales, respectively; the results are shown in Table I.…”
Section: Results
confidence: 99%
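The evaluation protocol quoted above (leave-one-out cross-validation with per-symptom MAE/MSE averaged over the TLC and PANSS scales) can be sketched as follows. The ridge regressor, feature dimensions, and random data are placeholders, not the citing paper's actual model or features:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error

# X: one feature vector per subject; Y: symptom scores
# (16 TLC + 15 PANSS columns = 31 targets, as in the quote).
# Random stand-ins here; the citing paper uses its own features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 64))
Y = rng.normal(size=(20, 31))

preds = np.zeros_like(Y)
for train_idx, test_idx in LeaveOneOut().split(X):
    model = Ridge().fit(X[train_idx], Y[train_idx])  # placeholder regressor
    preds[test_idx] = model.predict(X[test_idx])

# Per-symptom errors, then averaged per scale and overall, as in the quote
mae = np.array([mean_absolute_error(Y[:, k], preds[:, k]) for k in range(31)])
mse = np.array([mean_squared_error(Y[:, k], preds[:, k]) for k in range(31)])
print("TLC MAE:", mae[:16].mean(), "PANSS MAE:", mae[16:].mean(), "All:", mae.mean())
```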
“…Siriwardena et al. [7] extended the work with deep learning techniques. They adopted two Convolutional Neural Networks (CNNs) as backbone networks to process vocal tract variables and FAUs respectively; the intermediate features from both CNNs were then concatenated and passed through FC layers to predict the output.…”
Section: B. Automatic Assessment Using Multimodalities
confidence: 99%
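A minimal sketch of the late-fusion architecture described in this statement: one CNN per modality, concatenated intermediate features, then FC layers. Channel counts, kernel sizes, and layer widths are illustrative assumptions, not the paper's hyperparameters:

```python
import torch
import torch.nn as nn

class BimodalCNN(nn.Module):
    """Late-fusion sketch: one 1-D CNN branch per modality
    (vocal tract variables, facial action units), features
    concatenated and passed through FC layers."""
    def __init__(self, tv_ch=6, fau_ch=17, n_out=2):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv1d(in_ch, 32, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),  # pool over time
                nn.Flatten(),
            )
        self.tv_net = branch(tv_ch)    # vocal tract variables
        self.fau_net = branch(fau_ch)  # facial action units
        self.head = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, n_out))

    def forward(self, tv, fau):
        z = torch.cat([self.tv_net(tv), self.fau_net(fau)], dim=1)
        return self.head(z)

model = BimodalCNN()
out = model(torch.randn(4, 6, 200), torch.randn(4, 17, 200))
print(out.shape)  # torch.Size([4, 2])
```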
“…The mapping from acoustics to articulation is an ill-posed problem known to be highly non-linear and non-unique [2]. However, in recent years the development of Speech Inversion (SI) systems has gained attention due to its potential in a wide range of applications, from Automatic Speech Recognition (ASR) [3,4], speech synthesis [5,6] and speech therapy [7] to, most recently, detecting mental health disorders like Major Depressive Disorder and Schizophrenia [8,9]. Real articulatory data is obtained using techniques like X-ray microbeam [10], Electromagnetic Articulometry (EMA) [11] and real-time Magnetic Resonance Imaging (rt-MRI) [12].…”
Section: Introduction
confidence: 99%
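As a hedged sketch of what an SI system of the kind discussed above might look like, the model below regresses frame-wise acoustic features (e.g. MFCCs) to tract variables. The recurrent architecture, dimensions, and the name SpeechInversionNet are illustrative assumptions, not the cited systems:

```python
import torch
import torch.nn as nn

class SpeechInversionNet(nn.Module):
    """Frame-wise regression from acoustic features to tract
    variables. Real SI systems are trained on articulatory
    corpora such as X-ray microbeam or EMA recordings."""
    def __init__(self, n_acoustic=13, n_tvs=6, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(n_acoustic, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tvs)

    def forward(self, mfcc):   # (batch, frames, n_acoustic)
        h, _ = self.rnn(mfcc)
        return self.out(h)     # (batch, frames, n_tvs)

net = SpeechInversionNet()
tvs = net(torch.randn(2, 300, 13))
print(tvs.shape)  # torch.Size([2, 300, 6])
```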
“…This mapping from acoustics to articulation is an ill-posed problem known to be highly non-linear and non-unique [3]. However, developing Speech Inversion (SI) systems has gained attention in recent years, mainly due to their potential in a wide range of speech applications such as Automatic Speech Recognition (ASR) [4,5,6], speech synthesis [7,8], speech therapy [9] and, most recently, detecting mental health disorders like Major Depressive Disorder and Schizophrenia [10,11]. Real articulatory data are collected by techniques like X-ray microbeam [12], Electromagnetic Articulometry (EMA) [13] and real-time Magnetic Resonance Imaging (rt-MRI) [14].…”
Section: Introduction
confidence: 99%