“…Since an early study [9] showed that mid- and low-level facial attributes (e.g., facial action units (AUs) and facial landmarks) are informative about depression status, a number of recent studies have sought to recognize depression from automatically detected facial attributes such as facial landmarks [10], [11], [12], gaze direction [13], [14], AUs [15], [16], [17], and head poses [10]. While some of these studies compute statistics [18], [19] (e.g., displacement, velocity, acceleration) over facial-attribute time-series as the clip-level representation for depression recognition, recent deep-learning models (e.g., 1D-CNNs [16], [17], LSTMs [20], attention-based temporal CNNs [21], causal CNNs [22]) have also been applied to infer depression from facial-attribute time-series, achieving better results than most hand-crafted approaches.…”
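To make the hand-crafted clip-level representation concrete, the sketch below illustrates one plausible way to summarise a facial-landmark time-series by the statistics mentioned above (displacement, velocity, acceleration). The function name, array shapes, and the choice of mean/std as summary statistics are illustrative assumptions, not the exact pipelines of the cited works.

```python
import numpy as np

def clip_level_features(landmarks: np.ndarray) -> np.ndarray:
    """Summarise a facial-landmark time-series as one clip-level vector.

    landmarks: array of shape (T, D) -- T frames, D flattened coordinates.
    Returns per-dimension mean and std of displacement, velocity and
    acceleration, concatenated into a single feature vector of size 6 * D.
    (Illustrative sketch; the cited papers may use other statistics.)
    """
    disp = landmarks - landmarks[0]        # displacement from the first frame
    vel = np.diff(landmarks, n=1, axis=0)  # first difference: velocity
    acc = np.diff(landmarks, n=2, axis=0)  # second difference: acceleration
    stats = []
    for series in (disp, vel, acc):
        stats.append(series.mean(axis=0))
        stats.append(series.std(axis=0))
    return np.concatenate(stats)

# Toy usage: 100 frames of 68 two-dimensional landmarks (136 dims per frame).
feats = clip_level_features(np.random.randn(100, 136))
print(feats.shape)  # (816,)
```

Such fixed-size vectors can then be fed to any standard classifier or regressor, which is what distinguishes these hand-crafted pipelines from the end-to-end temporal models (1D-CNNs, LSTMs) that consume the raw time-series directly.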