Negative symptoms in schizophrenia are associated with significant burden, and clinical practice today offers little to no robust treatment for them. One key obstacle impeding the development of better treatments is the lack of an objective measure. Since negative symptoms almost always adversely affect speech production in patients, speech dysfunction has been considered a viable objective measure. However, researchers have mostly focused on the verbal aspects of speech, with scant attention to non-verbal cues. In this paper, we explore non-verbal speech cues as objective measures of the negative symptoms of schizophrenia. We collected an interview corpus of 54 subjects with schizophrenia and 26 healthy controls. To validate the non-verbal speech cues, we computed the correlation between these cues and the NSA-16 ratings assigned by expert clinicians, and obtained significant correlations between these cues and certain NSA indicators. For instance, the correlation between Turn Duration and Restricted Speech is -0.5, and between Response Time and NSA Communication it is 0.4, indicating that poor communication is reflected in the objective measures. Moreover, certain NSA indices can be classified into observable and non-observable classes from the non-verbal speech cues by means of supervised classification; in particular, the accuracies for Restricted Speech Quantity and Prolonged Response Time are 80% and 70%, respectively. We were also able to classify patients and healthy controls using non-verbal speech features with 81.3% accuracy.
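As a concrete sketch of the validation step, the Pearson correlation between a non-verbal speech cue and an NSA-16 item rating can be computed as below. The numeric values are invented for illustration and are not data from the study; only the direction of the association (longer turns alongside lower restricted-speech ratings) mirrors the reported result.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical values: mean turn duration (seconds) per subject
# versus a clinician's NSA "restricted speech" rating (higher = worse).
turn_duration = [12.1, 8.4, 15.0, 6.2, 9.8, 4.5]
restricted_speech = [2, 4, 1, 5, 3, 5]

r = pearson_r(turn_duration, restricted_speech)  # negative, as in the paper
```

In the study, a correlation of this kind (e.g. -0.5 for Turn Duration vs. Restricted Speech) is what links the objective cue to the clinician's subjective rating.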
Negative symptoms of schizophrenia are often associated with blunting of emotional affect, which creates a serious impediment to patients' daily functioning. Affective prosody is almost always adversely impacted in such cases and is known to exhibit itself through low-level acoustic signals. To automate and simplify the assessment of the severity of emotion-related symptoms of schizophrenia, we utilized these low-level acoustic signals to predict the subjective ratings assigned by a trained psychologist during an interview with the patient. Specifically, we extracted emotion-related acoustic features from the audio recordings of the interviews using the openSMILE toolkit. We analysed the interviews of 78 paid participants (52 patients and 26 healthy controls) in this study. The subjective ratings could be accurately predicted from the objective openSMILE acoustic features with an accuracy of 61-85% using machine learning algorithms with leave-one-out cross-validation. Furthermore, these objective measures can be reliably utilized to distinguish between the patient and healthy groups, as supervised learning methods can classify the two groups with 79-86% accuracy.
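Leave-one-out cross-validation, used throughout these studies, holds out each interview in turn, trains on the rest, and scores the held-out prediction. A minimal sketch is below, with a toy 1-nearest-neighbour classifier standing in for the papers' machine-learning models; the feature vectors and labels are invented, not openSMILE output.

```python
def loocv_accuracy(features, labels, classify):
    """Leave-one-out CV: hold out each sample, train on the rest, score it."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        correct += classify(train_x, train_y, features[i]) == labels[i]
    return correct / len(features)

def nearest_neighbour(train_x, train_y, query):
    """1-NN by squared Euclidean distance -- a stand-in classifier."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(zip(train_x, train_y), key=lambda t: dist(t[0], query))[1]

# Hypothetical 2-D acoustic feature vectors with group labels.
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = ["control", "control", "patient", "patient"]

acc = loocv_accuracy(X, y, nearest_neighbour)  # 1.0 on this separable toy data
```

With a single sample per fold, LOOCV makes the most of small clinical corpora like the 78-participant set described above.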
A real-time system is proposed to quantitatively assess speaking mannerisms and social behavior from audio recordings of two-person dialogs. Speaking mannerisms are quantified by low-level speech metrics such as volume, rate, and pitch of speech. Social behavior is quantified by sociometrics including level of interest, agreement, and dominance. Such quantitative measures can be used to provide real-time feedback to the speakers, for instance, to alert a speaker when their voice is too loud (speaking mannerism), or when the conversation is not proceeding well due to disagreements or numerous interruptions (social behavior). In the proposed approach, machine learning algorithms are designed to compute the sociometrics (level of interest, agreement, and dominance) in real time from combinations of low-level speech metrics. To this end, a corpus of 150 brief two-person dialogs in English was collected, and several experts assessed the sociometrics for each dialog. The resulting annotated dialogs are then used to train the machine learning algorithms in a supervised manner. Through this training procedure, the algorithms learn how the sociometrics depend on the low-level speech metrics and, consequently, are able to compute the sociometrics from speech recordings in an automated fashion, without further help from experts. Numerical tests through leave-one-out cross-validation indicate that the accuracy of the algorithms for inferring the sociometrics is in the range of 80-90%. In the future, such reliable predictions may be the key to real-time sociofeedback, where speakers are given feedback about their behavior in an ongoing discussion. Such technology may be helpful in many contexts, for instance in group meetings, counseling, or executive training.
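One of the low-level speech metrics, volume, is commonly approximated as frame-wise RMS energy. The sketch below shows how a "too loud" alert might be triggered; the threshold value and the audio samples are hypothetical, and a real system would work on streaming frames from a microphone rather than hard-coded lists.

```python
from math import sqrt

def rms_volume(frame):
    """Root-mean-square energy of one audio frame (a rough 'volume' metric).

    `frame` is a list of samples normalized to [-1.0, 1.0].
    """
    return sqrt(sum(s * s for s in frame) / len(frame))

def too_loud(frames, threshold=0.5):
    """Flag frames whose RMS volume exceeds a (hypothetical) feedback threshold."""
    return [rms_volume(f) > threshold for f in frames]

quiet_frame = [0.01, -0.02, 0.015, -0.01]
loud_frame = [0.9, -0.8, 0.85, -0.95]

flags = too_loud([quiet_frame, loud_frame])  # only the loud frame is flagged
```

Metrics like this, computed per frame, are cheap enough to support the real-time feedback loop the abstract envisions.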
Negative symptoms in schizophrenia are associated with significant burden and functional impairment, particularly in speech production. In clinical practice today there are no robust treatments for negative symptoms, and one obstacle to their research is the lack of an objective measure. To this end, we explore non-verbal speech cues as objective measures. Specifically, we extract these cues while patients with schizophrenia are interviewed by psychologists. We analyzed interviews of 15 patients enrolled in an observational study on the effectiveness of Cognitive Remediation Therapy (CRT). The subject group (undergoing CRT) and control group (not undergoing CRT) comprise 8 and 7 individuals, respectively. The patients were recorded during three sessions while being evaluated for negative symptoms over a 12-week follow-up period. To validate the non-verbal speech cues, we computed their correlation with the Negative Symptom Assessment (NSA-16). Our results suggest a strong correlation between certain measures of the two rating sets. Supervised prediction of the subjective ratings from the non-verbal speech features with leave-one-person-out cross-validation achieves reasonable accuracy of 53-80%. Furthermore, the non-verbal cues can be used to reliably distinguish between subjects and controls, as supervised learning methods can classify the two groups with 80-93% accuracy.
Speech disorders are among the salient characteristics of the negative symptoms of schizophrenia. Such impairments are often exhibited through disorganized speech, inappropriate affective prosody, and poverty of speech. The current method of detecting such symptoms requires the expertise of a trained clinician, which may be prohibitive due to cost, stigma, or high patient-to-clinician ratios. An objective method for extracting non-verbal and verbal speech-related cues can help automate and simplify the assessment of the severity of speech-related symptoms of schizophrenia. In this paper, a novel automated method is presented that uses speech from patients with schizophrenia to predict the clinician-assigned subjective ratings of their negative symptoms. Specifically, the interviews of 50 schizophrenia patients were recorded, and features related to acoustics, linguistics, and non-verbal conversation were extracted. The subjective ratings can be accurately predicted from the objective features with an accuracy of 64-82% using machine learning algorithms with leave-one-out cross-validation. Our findings support the utility of automated speech analysis to aid clinician diagnosis, monitoring, and understanding of schizophrenia.
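As a sketch of how non-verbal conversational features might be derived, the function below computes a patient's mean turn duration and mean response latency from turn timestamps. The record layout, feature names, and timestamp values are assumptions for illustration, not the paper's actual feature set.

```python
def conversation_cues(turns):
    """Non-verbal conversational cues from (speaker, start, end) turn records.

    Returns the patient's mean turn duration and mean response latency
    (the gap between the interviewer finishing and the patient starting).
    All times are in seconds.
    """
    durations, latencies = [], []
    prev = None
    for speaker, start, end in turns:
        if speaker == "patient":
            durations.append(end - start)
            if prev is not None and prev[0] == "interviewer":
                latencies.append(start - prev[2])
        prev = (speaker, start, end)
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return {"mean_turn_duration": mean(durations),
            "mean_response_latency": mean(latencies)}

# Hypothetical timeline of a short interview segment.
turns = [("interviewer", 0.0, 4.0), ("patient", 5.5, 8.0),
         ("interviewer", 8.5, 12.0), ("patient", 14.0, 15.0)]

cues = conversation_cues(turns)
```

Cues of this kind would sit alongside acoustic and linguistic features as inputs to the supervised rating-prediction models.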