Depression Severity Prediction Based on Biomarkers of Psychomotor Retardation

Syed, Zafi Sherhan; Sidorov, Kirill; Marshall, David

doi:10.1145/3133944.3133947

Cited by 30 publications

(21 citation statements)

References 29 publications

“…The second approach to modeling depression attempts to exploit global and/or time varying statistics, independent of the question that prompted the response. Williamson et al utilized correlations of formants and spectral information across different time scales [11], Syed et al developed audio and video features to capture temporal variations [12], while Pampouchidou et al and Nasir et al fused low and high-level features [13,14]. Utilizing emerging techniques, Ma et al used audio to model depression by allowing deep neural networks to learn such associations rather than perform feature engineering [15].…”

Section: Introduction (mentioning)

confidence: 99%

Detecting Depression with Audio/Text Sequence Modeling of Interviews

Alhanai et al.

2018

Medical professionals diagnose depression by interpreting the responses of individuals to a variety of questions, probing lifestyle changes and ongoing thoughts. Like professionals, an effective automated agent must understand that responses to queries have varying prognostic value. In this study we demonstrate an automated depression-detection algorithm that models interviews between an individual and agent and learns from sequences of questions and answers without the need to perform explicit topic modeling of the content. We utilized data of 142 individuals undergoing depression screening, and modeled the interactions with audio and text features in a Long-Short Term Memory (LSTM) neural network model to detect depression. Our results were comparable to methods that explicitly modeled the topics of the questions and answers which suggests that depression can be detected through sequential modeling of an interaction, with minimal information on the structure of the interview.

“…Feature aggregation has been a key component of successful systems for speech paralinguistic tasks [64]- [66], and in recent years we have also had success using these methods [67]- [69]. In this work, we experimented with three types of feature aggregation methods that are based on functionals, Bag of Words [70], and Fisher Vector Encoding [71].…”

Section: Feature Aggregation (mentioning)

confidence: 99%
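The simplest of the three aggregation families named in this statement, functionals, can be illustrated with a minimal sketch. It is not the citing authors' implementation: the choice of mean, standard deviation, minimum, and maximum, the function name `functionals`, and the toy frame values are all assumptions made for illustration.

```python
import statistics


def functionals(frames):
    """Summarise per-frame feature vectors into one fixed-length vector.

    `frames` is a list of equal-length lists, one per time frame.
    For each feature dimension the (assumed) functionals are:
    mean, population standard deviation, minimum, maximum.
    """
    if not frames:
        raise ValueError("need at least one frame")
    summary = []
    # zip(*frames) yields one time series ("track") per feature dimension.
    for track in zip(*frames):
        summary.extend([
            statistics.fmean(track),
            statistics.pstdev(track),
            min(track),
            max(track),
        ])
    return summary


# Three frames of two hypothetical low-level descriptors.
frames = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
vec = functionals(frames)
```

Whatever the recording length, the output has four values per feature dimension, which is what lets a standard classifier or regressor consume recordings of different durations.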

“…The Fisher Vector Encoding (FVE) method for feature aggregation was originally proposed by Perronnin et al [71], [72] as an improvement to the bag-of-visual-words method for computer vision applications. However, this method has been also successfully adapted for applications related to speech paralinguistics [64], [65], [67]. In this approach, a generative model, typically a Gaussian mixture model (GMM) [73], is used to model the background feature space.…”

Section: Feature Aggregation (mentioning)

confidence: 99%
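The idea in this statement, a GMM modelling the background feature space and frames encoded relative to it, can be sketched as follows. This is a simplified illustration under stated assumptions, not the cited method in full: the GMM parameters are supplied by hand rather than fitted on background data, covariances are diagonal, only the gradients with respect to the component means are computed, and the usual power and L2 normalisation steps are omitted. All function and variable names are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Soft assignment of frame x to each GMM component."""
    logs = [
        math.log(w) + gaussian_log_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_vector_means(frames, weights, means, variances):
    """Mean-gradient part of a Fisher vector, flattened to length K * D."""
    n_frames = len(frames)
    n_comp, n_dim = len(weights), len(means[0])
    acc = [[0.0] * n_dim for _ in range(n_comp)]
    for x in frames:
        gamma = posteriors(x, weights, means, variances)
        for k in range(n_comp):
            for d in range(n_dim):
                # Posterior-weighted, variance-normalised deviation from the mean.
                acc[k][d] += gamma[k] * (x[d] - means[k][d]) / math.sqrt(variances[k][d])
    return [
        acc[k][d] / (n_frames * math.sqrt(weights[k]))
        for k in range(n_comp)
        for d in range(n_dim)
    ]


# Toy background model: two 1-D components (assumed values, not fitted).
weights = [0.5, 0.5]
means = [[-1.0], [1.0]]
variances = [[1.0], [1.0]]
fv = fisher_vector_means([[-1.0], [1.0]], weights, means, variances)
```

With K components and D feature dimensions, the encoding has length K * D regardless of how many frames the recording contains; a full Fisher vector would also append the variance gradients.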

Automated Recognition of Alzheimer’s Dementia Using Bag-of-Deep-Features and Model Ensembling

Syed, Lech, et al. 2021

IEEE Access

Self Cite

Alzheimer's dementia is a progressive neurodegenerative disease that causes cognitive and physical impairment. It severely deteriorates the quality of life in affected individuals. An early diagnosis can assist immensely in better management of their healthcare needs. In recent years, there has been a renewed impetus in development of automated methods for recognition of various disorders by leveraging advancements in artificial intelligence. Here, we propose a multimodal system that can identify linguistic and paralinguistic traits of dementia using an automated screening tool. We show that bag-of-deep-neural-embeddings and ensemble learning offer a viable approach to objective assessment of dementia. The developed system is tested on the Alzheimer's Dementia Recognition Challenge dataset, where it achieved a new state-of-the-art (SOTA) performance for the classification task and matched the current SOTA for the regression task. These results highlight the efficacy of our proposed system for facilitating an early diagnosis of dementia.

“…Feature aggregation is an approach through which LLDs are summarised to create features which provide global information about the speech recordings. While several feature aggregation methods exist, such as functionals [8], GMM supervectors [9], Vectors of Locally Aggregated Descriptors (VLADs) [10], i-vectors [11] etc., we opt to use Fisher Vector encoding for aggregating spectral LLDs based on our previous experience: we found them effective for classifying between individuals with and without depression [12], as well as prediction of their depression severity [13].…”

Section: Spectral Modelling With Fisher Vectors (mentioning)

confidence: 99%

“…While FV encoding was originally proposed by [14] for building visual vocabularies, it has become popular for a variety of applications in the field of social signal processing, such as depression recognition [15,16,12,13], emotion recognition [17] as well as recent Interspeech Computational Paralinguistics (ComParE) challenges [18,19,20].…”

Section: Spectral Modelling With Fisher Vectors (mentioning)

confidence: 99%

Computational Paralinguistics: Automatic Assessment of Emotions, Mood and Behavioural State from Acoustics of Speech

et al. 2018

Self Cite

Paralinguistic analysis of speech remains a challenging task due to the many confounding factors which affect speech production. In this paper, we address the Interspeech 2018 Computational Paralinguistics Challenge (ComParE) which aims to push the boundaries of sensitivity to non-textual information that is conveyed in the acoustics of speech. We attack the problem on several fronts. We posit that a substantial amount of paralinguistic information is contained in spectral features alone. To this end, we use a large ensemble of Extreme Learning Machines for classification of spectral features. We further investigate the applicability of (an ensemble of) CNN-GRUs networks to model temporal variations therein. We report on the details of the experiments and the results for three ComParE sub-challenges: Atypical Affect, Self-Assessed Affect, and Crying. Our results compare favourably and in some cases exceed the published state-of-the-art performance.
