Purpose:
Over the past decade, the signal processing and machine learning literature has demonstrated notable advancements in automated speech processing with the use of artificial intelligence for medical assessment and monitoring (e.g., depression, dementia, and Parkinson's disease, among others). Meanwhile, the clinical speech literature has identified several interpretable, theoretically motivated measures that are sensitive to abnormalities in the cognitive, linguistic, affective, motoric, and anatomical domains. Both fields have, thus, independently demonstrated the potential for speech to serve as an informative biomarker for detecting different psychiatric and physiological conditions. However, despite these parallel advancements, automated speech biomarkers have not been integrated into routine clinical practice to date.
Conclusions:
In this article, we present opportunities and challenges for the adoption of speech as a biomarker in clinical practice and research. Toward clinical acceptance and adoption of speech-based digital biomarkers, we argue that several properties of speech analytics matter in clinical applications: robustness, specificity, diversity, and physiological interpretability.
This paper introduces an automated posttraumatic stress disorder (PTSD) screening tool that could be used for self-assessment or incorporated into routine medical visits for PTSD diagnosis and treatment. Methods: Using an emotion estimation algorithm that provides arousal (excited to calm) and valence (pleasure to displeasure) levels throughout the discourse, we select the regions of the acoustic signal that are most salient for PTSD detection. Our algorithm was tested on a subset of data from the DVBIC-TBICoE TBI Study, which contains PTSD Check List Civilian (PCL-C) assessment scores. Results: Speech from low-arousal and positive-valence regions provides the best discrimination for PTSD. Our model achieved an area under the curve (AUC) of 0.80 in detecting PCL-C ratings, outperforming models with no emotion filtering (AUC = 0.68). Conclusions: This result suggests that emotion drives the selection of the most salient temporal regions of an audio recording for PTSD detection. Impact Statement: Vocal biomarkers based on temporal regions of low arousal and positive valence achieve an area under the curve of 0.80 in detecting PTSD Check List Civilian (PCL-C) ratings.
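The emotion-filtering idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the segment fields (`arousal`, `valence`, `score`), the zero thresholds, and the function names are all hypothetical, and the per-segment `score` stands in for whatever acoustic model output a real system would produce. The sketch keeps only low-arousal, positive-valence segments, averages their scores per recording, and evaluates discrimination with a rank-based AUC.

```python
from statistics import mean


def select_salient(segments, arousal_max=0.0, valence_min=0.0):
    """Keep segments with low arousal and positive valence.

    Each segment is a dict with hypothetical keys 'arousal', 'valence',
    and 'score'. The thresholds are illustrative assumptions, not values
    taken from the paper.
    """
    return [
        s for s in segments
        if s["arousal"] < arousal_max and s["valence"] > valence_min
    ]


def recording_score(segments, arousal_max=0.0, valence_min=0.0):
    """Average the per-segment scores over the retained segments.

    Returns None when no segment passes the emotion filter.
    """
    kept = select_salient(segments, arousal_max, valence_min)
    if not kept:
        return None
    return mean(s["score"] for s in kept)


def auc(labels, scores):
    """Rank-based AUC: the probability that a positive case scores higher
    than a negative case, with ties counted as one half."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Under these assumptions, comparing `auc` on scores computed with and without `select_salient` mirrors the kind of comparison the abstract reports between the emotion-filtered and unfiltered models.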