Background In contrast to all other areas of medicine, psychiatry is still nearly entirely reliant on subjective assessments such as patient self-report and clinical observation. The lack of objective information on which to base clinical decisions can contribute to reduced quality of care. Behavioral health clinicians need objective and reliable patient data to support effective targeted interventions. Objective We aimed to investigate whether reliable inferences—psychiatric signs, symptoms, and diagnoses—can be extracted from audiovisual patterns in recorded evaluation interviews of participants with schizophrenia spectrum disorders and bipolar disorder. Methods We obtained audiovisual data from 89 participants (mean age 25.3 years; male: 48/89, 53.9%; female: 41/89, 46.1%): individuals with schizophrenia spectrum disorders (n=41), individuals with bipolar disorder (n=21), and healthy volunteers (n=27). We developed machine learning models based on acoustic and facial movement features extracted from participant interviews to predict diagnoses and detect clinician-coded neuropsychiatric symptoms, and we assessed model performance using area under the receiver operating characteristic curve (AUROC) in 5-fold cross-validation. Results The model successfully differentiated between schizophrenia spectrum disorders and bipolar disorder (AUROC 0.73) when aggregating face and voice features. Facial action units including cheek-raising muscle (AUROC 0.64) and chin-raising muscle (AUROC 0.74) provided the strongest signal for men. Vocal features, such as energy in the frequency band 1 to 4 kHz (AUROC 0.80) and spectral harmonicity (AUROC 0.78), provided the strongest signal for women. Lip corner–pulling muscle signal discriminated between diagnoses for both men (AUROC 0.61) and women (AUROC 0.62). 
Several psychiatric signs and symptoms were successfully inferred: blunted affect (AUROC 0.81), avolition (AUROC 0.72), lack of vocal inflection (AUROC 0.71), asociality (AUROC 0.63), and worthlessness (AUROC 0.61). Conclusions This study represents an advance in efforts to capitalize on digital data to improve diagnostic assessment and supports the development of a new generation of innovative clinical tools through acoustic and facial data analysis.
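The evaluation protocol above (5-fold cross-validated AUROC) can be sketched as follows. This is a minimal illustration with synthetic data: the random feature matrix stands in for the study's acoustic and facial-movement features, which are not public, and the random forest is a placeholder classifier, as the abstract does not name the model family used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-ins for the acoustic/facial feature matrix and diagnosis labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(89, 20))    # 89 participants x 20 features (placeholder)
y = rng.integers(0, 2, size=89)  # 1 = schizophrenia spectrum, 0 = bipolar disorder

# 5-fold cross-validated AUROC, the performance measure described in Methods.
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                         cv=5, scoring="roc_auc")
print(f"mean AUROC: {scores.mean():.2f}")
```

With random features the mean AUROC hovers near chance (0.5); with informative features it would approach the values reported in Results.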
Background Previous research has shown the feasibility of using machine learning models trained on social media data from a single platform (eg, Facebook or Twitter) to distinguish individuals either with a diagnosis of mental illness or experiencing an adverse outcome from healthy controls. However, the performance of such models on data from novel social media platforms unseen in the training data (eg, Instagram and TikTok) has not been investigated in previous literature. Objective Our study examined the feasibility of building machine learning classifiers that can effectively predict an upcoming psychiatric hospitalization given social media data from platforms unseen in the classifiers’ training data, despite preliminary evidence of identity fragmentation across the investigated social media platforms. Methods Windowed timeline data of patients with a diagnosis of schizophrenia spectrum disorder before a known hospitalization event and healthy controls were gathered from 3 platforms: Facebook (254/268, 94.8% of participants), Twitter (51/268, 19% of participants), and Instagram (134/268, 50% of participants). We then used a 3 × 3 combinatorial binary classification design to train machine learning classifiers and evaluate their performance on testing data from all available platforms. We further compared results from models in intraplatform experiments (ie, training and testing data belonging to the same platform) to those from models in interplatform experiments (ie, training and testing data belonging to different platforms). Finally, we used Shapley Additive Explanation values to extract the top predictive features to explain and compare the underlying constructs that predict hospitalization on each platform.
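The 3 × 3 combinatorial design described in Methods can be sketched as below: train one classifier per platform, then test each classifier on every platform, and compare the diagonal (intraplatform) with the off-diagonal (interplatform) scores. All data here are random stand-ins, and logistic regression is an assumed placeholder model, since the abstract does not specify the classifier.

```python
from itertools import product
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
platforms = ["facebook", "twitter", "instagram"]
# Hypothetical stand-ins for each platform's windowed-timeline feature matrix
# and labels (1 = pre-hospitalization window, 0 = healthy control).
data = {p: (rng.normal(size=(60, 10)), rng.integers(0, 2, size=60))
        for p in platforms}

# 3 x 3 design: train on each platform, test on every platform.
f1 = {}
for train_p, test_p in product(platforms, platforms):
    X_train, y_train = data[train_p]
    X_test, y_test = data[test_p]
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    f1[(train_p, test_p)] = f1_score(y_test, model.predict(X_test))

intra = np.mean([f1[(p, p)] for p in platforms])             # diagonal
inter = np.mean([v for k, v in f1.items() if k[0] != k[1]])  # off-diagonal
print(f"intraplatform mean F1: {intra:.2f}, interplatform mean F1: {inter:.2f}")
```

On the study's real data, this comparison produced the 0.72 versus 0.428 gap reported in Results; on random data the two averages are similar.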
Results We found that models in intraplatform experiments on average achieved an F1-score of 0.72 (SD 0.07) in predicting a psychiatric hospitalization because of schizophrenia spectrum disorder, which is 68% higher than the average of models in interplatform experiments at an F1-score of 0.428 (SD 0.11). When investigating the key drivers for divergence in construct validities between models, an analysis of top features for the intraplatform models showed both low predictive feature overlap between the platforms and low pairwise rank correlation (<0.1) between the platforms’ top feature rankings. Furthermore, low average cosine similarity of data between platforms within participants in comparison with the same measurement on data within platforms between participants points to evidence of identity fragmentation of participants between platforms. Conclusions We demonstrated that models built on one platform’s data to predict critical mental health treatment outcomes such as hospitalization do not generalize to another platform. In our case, this is because different social media platforms consistently reflect different segments of participants’ identities. With the changing ecosystem of social media use among different demographic groups and as web-based identities continue to become fragmented across platforms, further research on holistic approaches to harnessing these diverse data sources is required.
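The identity-fragmentation analysis in Results compares cosine similarity of a participant's data across platforms against similarity between participants within a platform. A minimal sketch of that comparison, using hypothetical hand-picked feature vectors (the study's real vectors are derived from participants' posts):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical per-platform feature vectors for two participants.
p1 = {"facebook": np.array([0.9, 0.1, 0.0]), "instagram": np.array([0.1, 0.9, 0.2])}
p2 = {"facebook": np.array([0.8, 0.2, 0.1]), "instagram": np.array([0.2, 0.8, 0.3])}

# Within-participant, between-platform similarity (low -> identity fragmentation).
within = cosine(p1["facebook"], p1["instagram"])
# Between-participant, within-platform similarity (the reference point).
between = cosine(p1["facebook"], p2["facebook"])
print(f"within-participant across platforms: {within:.2f}")
print(f"between-participant within platform: {between:.2f}")
```

The vectors here are constructed so that the cross-platform similarity is the lower of the two, mirroring the fragmentation signature the study reports.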
BACKGROUND Previous research has shown the feasibility of using social media data from a single platform (e.g., Facebook or Twitter) to distinguish individuals with a diagnosis of mental illness, or those experiencing an adverse outcome, from healthy volunteers. However, the performance of these models on data from other social media platforms unseen in the training data (e.g., Instagram, TikTok) has not been investigated. OBJECTIVE This study aims to explore whether, given online identities fragmented across social media platforms, models would achieve better testing performance on data from social media platforms already seen in training than on data from unseen platforms. It also aims to explain such discrepancies in performance if they are found. METHODS Windowed timeline data with clinically verified labels of hospitalization among patients with a diagnosis of schizophrenia were gathered from three platforms: Facebook (N = 254), Twitter (N = 54), and Instagram (N = 124). Then, we used a 3 x 3 combinatorial binary classification design to evaluate model performance on testing data from all available platforms. We further compared results from models in intra-platform experiments (i.e., training and testing data belong to the same platform) to models in inter-platform experiments (i.e., training and testing data belong to different platforms). Finally, we used SHapley Additive exPlanations (SHAP) to extract top predictive features and explain the underlying constructs that predict hospitalization on each platform. RESULTS We found that models in intra-platform experiments on average achieved an F1-score of 0.72 in detecting a psychiatric hospitalization due to schizophrenia, which is 68% higher than the average of models in inter-platform experiments at an F1-score of 0.428.
We also found that combining the training data of all three platforms yielded a slight average improvement of 0.5% on the testing sets compared with the original intra-platform models. An analysis of top features for the intra-platform models shows low predictive feature overlap between the platforms, with ‘anger’ a top feature unique to Facebook and ‘sad’ a top feature unique to Instagram. CONCLUSIONS We demonstrated that models built on one platform’s data to predict critical mental health treatment outcomes, such as a hospitalization, may not generalize to another platform because each platform offers different construct validity. However, combining data from multiple platforms may offer a more comprehensive view of a patient’s state and situation, and therefore fare better in relapse prediction. With the changing ecosystem of social media use among different demographic groups and as online identities continue to become fragmented across platforms, further research on holistic approaches to harnessing these diverse data sources is required.
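The divergence in top predictive features between platforms (e.g., 'anger' on Facebook, 'sad' on Instagram) can be quantified with a rank correlation over feature-importance scores, as in the published analysis. The importance values below are invented for illustration; 'anger' and 'sad' echo the reported features, while the remaining feature names are hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical mean(|SHAP|) importances for the same feature set on two platforms.
features = ["anger", "sad", "posting_hour", "word_count", "pronoun_i"]
facebook_imp = np.array([0.30, 0.05, 0.20, 0.25, 0.20])
instagram_imp = np.array([0.04, 0.35, 0.22, 0.18, 0.21])

# A low rank correlation between the platforms' feature rankings indicates
# the two intra-platform models rely on different constructs.
rho, _ = spearmanr(facebook_imp, instagram_imp)
print(f"Spearman rank correlation of feature importances: {rho:.2f}")
```

With these illustrative values the correlation is low, consistent with the pairwise rank correlation below 0.1 reported for the real platform feature rankings.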