Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge 2017
DOI: 10.1145/3133944.3133947

Depression Severity Prediction Based on Biomarkers of Psychomotor Retardation

Cited by 30 publications (21 citation statements)
References 29 publications
“…The second approach to modeling depression attempts to exploit global and/or time varying statistics, independent of the question that prompted the response. Williamson et al utilized correlations of formants and spectral information across different time scales [11], Syed et al developed audio and video features to capture temporal variations [12], while Pampouchidou et al and Nasir et al fused low and high-level features [13,14]. Utilizing emerging techniques, Ma et al used audio to model depression by allowing deep neural networks to learn such associations rather than perform feature engineering [15].…”
Section: Introduction
confidence: 99%
“…Feature aggregation has been a key component of successful systems for speech paralinguistic tasks [64]-[66], and in recent years we have also had success using these methods [67]-[69]. In this work, we experimented with three types of feature aggregation methods that are based on functionals, Bag of Words [70], and Fisher Vector Encoding [71].…”
Section: Feature Aggregation
confidence: 99%
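
As a rough illustration of the Bag of Words aggregation named in the statement above, the following sketch assigns each frame-level feature vector to its nearest codeword and returns a normalised histogram. It is a minimal example under stated assumptions, not the citing authors' implementation: the codebook is assumed to have been learned beforehand (for example by clustering), and the names `nearest_codeword`, `bag_of_words`, `frames`, and `codebook` are hypothetical.

```python
def nearest_codeword(frame, codebook):
    """Return the index of the codeword closest to `frame` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(frame, codebook[k])),
    )


def bag_of_words(frames, codebook):
    """Aggregate frame-level vectors into one fixed-length histogram.

    frames:   list of frame-level feature vectors (assumed low-level descriptors)
    codebook: list of codeword vectors (assumed to be learned in advance)
    """
    if not frames:
        raise ValueError("at least one frame is required")
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    # Normalise by the number of frames so recordings of different length are comparable.
    return [c / len(frames) for c in counts]


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [10.0, 10.0]]
    frames = [[1.0, 1.0], [0.0, 2.0], [-1.0, 0.0], [9.0, 9.0]]
    print(bag_of_words(frames, codebook))
```

The output length equals the codebook size regardless of how many frames the recording contains, which is the property that makes the representation usable with standard classifiers or regressors.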
“…The Fisher Vector Encoding (FVE) method for feature aggregation was originally proposed by Perronnin et al [71], [72] as an improvement to the bag-of-visual-words method for computer vision applications. However, this method has been also successfully adapted for applications related to speech paralinguistics [64], [65], [67]. In this approach, a generative model, typically a Gaussian mixture model (GMM) [73], is used to model the background feature space.…”
Section: Feature Aggregation
confidence: 99%
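
The Fisher Vector idea described in the statement above can be sketched as follows. This is a hedged, minimal example: it assumes a diagonal-covariance GMM that has already been fit to background data, uses the commonly stated form of the gradients with respect to the component means and variances, and omits the power and L2 normalisation steps often applied afterwards. The function and parameter names (`fisher_vector`, `weights`, `means`, `variances`) are hypothetical and do not come from the cited papers.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Soft assignment of frame x to each GMM component (log-sum-exp for stability)."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_vector(frames, weights, means, variances):
    """Encode a variable-length list of frames as a vector of length 2 * K * D."""
    n_frames, n_comp, dim = len(frames), len(weights), len(means[0])
    g_mu = [[0.0] * dim for _ in range(n_comp)]
    g_var = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        gamma = posteriors(x, weights, means, variances)
        for k in range(n_comp):
            for d in range(dim):
                # Deviation of the frame from component k, in units of its standard deviation.
                z = (x[d] - means[k][d]) / math.sqrt(variances[k][d])
                g_mu[k][d] += gamma[k] * z
                g_var[k][d] += gamma[k] * (z * z - 1.0)
    fv = []
    for k in range(n_comp):
        fv.extend(v / (n_frames * math.sqrt(weights[k])) for v in g_mu[k])
    for k in range(n_comp):
        fv.extend(v / (n_frames * math.sqrt(2.0 * weights[k])) for v in g_var[k])
    return fv


if __name__ == "__main__":
    frames = [[0.1, -0.2], [1.9, 2.1], [0.0, 0.3]]
    weights = [0.5, 0.5]
    means = [[0.0, 0.0], [2.0, 2.0]]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    print(len(fisher_vector(frames, weights, means, variances)))
```

With K components and D-dimensional frames the encoding has 2 * K * D entries, so it records how the recording's frames deviate from the background model in both mean and variance rather than only counting assignments.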
“…Feature aggregation is an approach through which LLDs are summarised to create features which provide global information about the speech recordings. While several feature aggregation methods exist, such as functionals [8], GMM supervectors [9], Vectors of Locally Aggregated Descriptors (VLADs) [10], i-vectors [11] etc., we opt to use Fisher Vector encoding for aggregating spectral LLDs based on our previous experience: we found them effective for classifying between individuals with and without depression [12], as well as prediction of their depression severity [13].…”
Section: Spectral Modelling With Fisher Vectors
confidence: 99%
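
The simplest aggregation method listed in the statement above, functionals, can be illustrated with a short sketch that summarises each LLD dimension across all frames of a recording. The particular statistics chosen here (mean, population standard deviation, minimum, maximum) and the name `functionals` are assumptions made for illustration; the cited works may use a different or larger set.

```python
import statistics


def functionals(frames):
    """Summarise frame-level LLDs with per-dimension statistics.

    frames: list of equal-length frame vectors from one recording.
    Returns [mean, std, min, max] for each dimension, concatenated.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    features = []
    # zip(*frames) regroups the data so that each tuple holds one LLD over time.
    for column in zip(*frames):
        features.extend(
            [
                statistics.fmean(column),
                statistics.pstdev(column),
                min(column),
                max(column),
            ]
        )
    return features


if __name__ == "__main__":
    print(functionals([[1.0, 10.0], [3.0, 10.0]]))
```

Each recording is mapped to a vector of length 4 * D, whatever its duration, which gives the global, recording-level view that the statement attributes to feature aggregation.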
“…While FV encoding was originally proposed by [14] for building visual vocabularies, it has become popular for a variety of applications in the field of social signal processing, such as depression recognition [15,16,12,13], emotion recognition [17] as well as recent Interspeech Computational Paralinguistics (ComParE) challenges [18,19,20].…”
Section: Spectral Modelling With Fisher Vectors
confidence: 99%