Background Schizophrenia is a severe psychiatric disorder that causes significant social and functional impairment. Currently, the diagnosis of schizophrenia is based on information gleaned from the patient’s self-report, what the clinician observes directly, and what the clinician gathers from collateral informants, but these elements are prone to subjectivity. Using computer vision to measure facial expressions is a promising approach to adding objectivity to the evaluation and diagnosis of schizophrenia. Method We conducted a systematic review using PubMed and Google Scholar. Relevant publications published up to and including December 2021 were identified and evaluated for inclusion. The objective was to review the use of computer vision for facial behavior analysis in schizophrenia studies, the clinical findings, and the corresponding data processing and machine learning methods. Results Seventeen studies published between 2007 and 2021 were included, with an increasing trend in the number of publications over time. Only 14 articles used interviews to collect data, and these used different combinations of passive to evoked and unstructured to structured interviews. Various types of hardware were adopted and different types of visual data were collected. Commercial, open-access, and in-house developed models were used to recognize facial behaviors, from which frame-level and subject-level features were extracted. Statistical tests and evaluation metrics varied across studies. The number of subjects ranged from 2 to 120, with an average of 38. Overall, facial behaviors appear to have a role in estimating the diagnosis of schizophrenia and psychotic symptoms. When studies were evaluated with a quality assessment checklist, most had low reporting quality. Conclusion Despite the rapid development of computer vision techniques, relatively few studies have applied this technology to schizophrenia research. 
There was considerable variation in the clinical paradigm and analytic techniques used. Further research is needed to identify and develop standardized practices, which will help to promote further advances in the field.
Background Current standards of psychiatric assessment and diagnostic evaluation rely primarily on the clinician’s subjective interpretation of a patient’s outward manifestations of their internal state. While psychometric tools can help to evaluate these behaviors more systematically, the tools still rely on the clinician’s interpretation of what are frequently nuanced speech and behavior patterns. With advances in computing power, increased availability of clinical data, and improving resolution of recording and sensor hardware (including acoustic, video, accelerometer, infrared, and other modalities), researchers have begun to demonstrate the feasibility of cutting-edge technologies in aiding the assessment of psychiatric disorders. Objective We present a research protocol that utilizes facial expression, eye gaze, voice and speech, locomotor, heart rate, and electroencephalography monitoring to assess schizophrenia symptoms and to distinguish patients with schizophrenia from those with other psychiatric disorders and control subjects. Methods We plan to recruit three outpatient groups: (1) 50 patients with schizophrenia, (2) 50 patients with unipolar major depressive disorder, and (3) 50 individuals with no psychiatric history. Using an internally developed semistructured interview, psychometrically validated clinical outcome measures, and a multimodal sensing system utilizing video, acoustic, actigraphic, heart rate, and electroencephalographic sensors, we aim to evaluate the system’s capacity to classify subjects (schizophrenia, depression, or control), to evaluate the system’s sensitivity to within-group symptom severity, and to determine whether such a system can further classify variations in disorder subtypes. Results Data collection began in July 2020 and is expected to continue through December 2022. 
Conclusions If successful, this study will help advance the development of state-of-the-art technology to aid clinical psychiatric assessment and treatment. If our findings suggest that these technologies are capable of resolving diagnoses and symptoms to the level of current psychometric testing and clinician judgment, we would be among the first to develop a system that can eventually be used by clinicians to diagnose and assess schizophrenia and depression more objectively, with potentially less risk of bias. Such a tool has the potential to improve accessibility to care; to aid clinicians in objectively evaluating diagnoses, severity of symptoms, and treatment efficacy through time; and to reduce treatment-related morbidity. International Registered Report Identifier (IRRID) DERR1-10.2196/36417
BACKGROUND Automatic speech recognition (ASR) technology is increasingly being used for transcription in clinical contexts. Although there are numerous HIPAA-compliant transcription services using ASR, few studies have compared the word error rate (WER) between different transcription services among different diagnostic groups in a mental health setting. There has also been little research into the types of words ASR transcriptions mistakenly generate or omit. OBJECTIVE This study compared the WER of three ASR transcription services (Amazon Transcribe, Zoom/Otter AI, and Whisper/Open AI) in interviews across three different clinical categories (controls, participants experiencing depression, and participants experiencing a variety of other mental health conditions). These ASR transcription services were also compared to a commercial human transcription service, REV. Words that were erroneously inserted or omitted in the transcripts were systematically analyzed by their Linguistic Inquiry and Word Count (LIWC) categories. METHODS Participants completed a one-time research psychiatric interview, which was recorded on a secure server. Transcriptions created by the research team were used as the gold standard from which WER was calculated. The interviewees were categorized into either the control group (N = 19), the major depressive disorder (MDD) group (N = 22), or the other group (N = 24) using the Mini-International Neuropsychiatric Interview. The total sample included 65 participants. Brunner-Munzel tests were used for comparing independent sets such as the diagnostic groupings, and Wilcoxon signed-rank tests were used for correlated samples when comparing the total sample between different transcription services. RESULTS The WERs of the ASR transcription services differed significantly from one another (P < .001). Amazon Transcribe’s output exhibited significantly lower WERs compared to the Zoom/Otter AI and Whisper/Open AI ASR. 
ASR performance did not differ significantly across the three clinical categories within each service (P > .05). A comparison between the human transcription service output from REV and the best-performing ASR (Amazon Transcribe) demonstrated a significant difference, with REV having a slightly lower median WER (7.6% versus 8.9%). Heatmaps and spider plots were used to visualize the most common errors in LIWC categories, which were found to be within three overarching categories: Conversation, Cognition, and Function. CONCLUSIONS Overall, these results indicate that the gap in WER between manual and automated transcription services is narrowing as ASR services advance. These advances, coupled with decreased cost and time in receiving transcriptions, may make ASR transcriptions a more viable option within healthcare settings. However, more research is required to determine whether errors in specific types of words impact the analysis and utility of these transcriptions, particularly for specific applications and in a variety of populations in terms of clinical diagnosis, literacy level, accent, and cultural origin.
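For readers unfamiliar with the metric, WER is conventionally defined as the number of word-level substitutions, deletions, and insertions needed to turn the reference transcript into the ASR output, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the study's actual pipeline: the abstract does not state which tool or text normalization was used, so the function name `word_error_rate`, the lowercasing, and the whitespace tokenization are all assumptions, and the sentences in the usage note are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: words are lowercased and split on whitespace; the study's
    own normalization rules are not described in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

As a usage example, comparing the made-up reference "i feel tired most days" with the made-up output "i feel tied most" yields one substitution and one deletion over five reference words, so `word_error_rate` returns 0.4. Because insertions count as errors, WER can exceed 1.0 when the ASR output is much longer than the reference.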