Background and Hypothesis
Despite decades of “proof of concept” findings supporting the use of Natural Language Processing (NLP) in psychosis research, clinical implementation has been slow. One obstacle is the lack of comprehensive psychometric evaluation of these measures. There is overwhelming evidence that criterion and content validity can be achieved for many purposes, particularly using machine learning procedures. However, there has been very little evaluation of test-retest reliability, divergent validity (sufficient to address concerns of a “generalized deficit”), and potential biases from demographics and other individual differences.
Study Design
This article highlights these concerns in the development of an NLP measure for tracking clinically rated paranoia from video “selfies” recorded on smartphones. Patients with schizophrenia or bipolar disorder were recruited and tracked over a week-long epoch. A small NLP-based feature set from 499 language samples was modeled on clinically rated paranoia using regularized regression.
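As a rough illustration of what modeling a small feature set on a clinical rating with regularized regression can look like, the sketch below fits a ridge (L2-penalized) regression in plain Python. The abstract does not state which penalty was used, so ridge is an assumption made here for illustration. The function names and the toy numbers are hypothetical; they are not the study's features or data.

```python
from typing import List, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(
    features: List[List[float]], targets: List[float], penalty: float
) -> Tuple[float, List[float]]:
    """Return (intercept, weights) minimizing squared error + penalty * ||w||^2.

    The intercept is left unpenalized by centering features and targets first.
    """
    n, p = len(features), len(features[0])
    col_means = [sum(row[j] for row in features) / n for j in range(p)]
    y_mean = sum(targets) / n
    xc = [[row[j] - col_means[j] for j in range(p)] for row in features]
    yc = [y - y_mean for y in targets]
    # Normal equations: (X^T X + penalty * I) w = X^T y
    gram = [
        [
            sum(xc[i][j] * xc[i][k] for i in range(n)) + (penalty if j == k else 0.0)
            for k in range(p)
        ]
        for j in range(p)
    ]
    rhs = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    weights = solve_linear_system(gram, rhs)
    intercept = y_mean - sum(w * mu for w, mu in zip(weights, col_means))
    return intercept, weights


# Made-up toy data (NOT from the study): two hypothetical language features
# per sample and a hypothetical rating equal to 1 + 2 * first feature.
toy_features = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
toy_targets = [1.0, 3.0, 5.0, 7.0]

unpenalized = fit_ridge(toy_features, toy_targets, penalty=0.0)
penalized = fit_ridge(toy_features, toy_targets, penalty=10.0)
```

With a penalty of zero the fit reduces to ordinary least squares; raising the penalty shrinks the weights toward zero, which is the usual motivation for regularization when samples are few relative to features. In practice the penalty would be chosen by cross-validation rather than fixed by hand.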
Study Results
While test–retest reliability was high, criterion and convergent/divergent validity were achieved only when considering moderating variables, notably whether a patient was away from home, around strangers, or alone at the time of the recording. Moreover, there were systematic racial and sex biases in the model, in part reflecting whether patients submitted videos when they were away from home, around strangers, or alone.
Conclusions
Advancing NLP measures for psychosis will require deliberate consideration of test-retest reliability, divergent validity, systematic biases, and the potential role of moderators. In our example, a comprehensive psychometric evaluation revealed clear strengths and weaknesses that can be systematically addressed in future research.
Individuals with schizophrenia have higher mortality and shorter lifespans. Many factors contribute to this, but one is poorer physical health, particularly cardiovascular and metabolic health. Many interventions to improve the health of individuals with schizophrenia have been created, but on the whole, they have had limited effectiveness in improving quality of life or lifespan. One potential new avenue for inquiry involves a more patient-centric perspective: understanding which aspects of physical health are most important, and potentially most amenable to change, for individuals based on their life narratives. This study applied topic modeling, a type of Natural Language Processing (NLP), to unstructured speech samples from individuals (n = 366) with serious mental illness, primarily schizophrenia, in order to extract topics. Speech samples were drawn from three studies collected over a decade in two geographically distinct regions of the United States. Several health-related topics emerged, primarily centered around food, living situation, and lifestyle (e.g., routine, hobbies). The implications of these findings for how individuals with serious mental illness and schizophrenia think about their health, and what may be most effective for future health promotion policies and interventions, are discussed.