2021
DOI: 10.3389/fnhum.2021.713823

Depression Speech Recognition With a Three-Dimensional Convolutional Network

Abstract: Depression has become one of the main afflictions that threaten people's mental health. However, the current traditional diagnosis methods have certain limitations, so it is necessary to find a method of objective evaluation of depression based on intelligent technology to assist in the early diagnosis and treatment of patients. Because the abnormal speech features of patients with depression are related to their mental state to some extent, it is valuable to use speech acoustic features as objective indicator…

Cited by 17 publications (6 citation statements)
References 39 publications
“…A speech emotion recognition system is helpful in medical practice for detecting changes in mental state and emotions. For example, when a patient has mood swings, the system will react rapidly and examine their current psychological state [ 9 ]. As a result, the depression prediction methods might help design better mental health care software and technologies such as intelligent robots.…”
Section: Introduction
confidence: 99%
“…PLP, and MFCC, called the low-level descriptors, are used to train the multiple classifier systems ( Long et al, 2017 ). The input of the network model is a 3D feature made up of FBANK, the first-order and second-order differences to use the information in speech signals entirely ( Wang et al, 2021 ). The findings of the aforementioned study illustrate that MFCC, PLP, and FBANK as front-end features can refine enough speech details.…”
Section: Related Work
confidence: 99%
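
The 3D input described in the statement above (FBANK plus its first-order and second-order differences) can be illustrated with a minimal sketch. It assumes the log Mel filter-bank features are already available as a frames × bands array. The function names `delta` and `stack_3d`, the regression window of 2 frames, and the edge padding are illustrative assumptions, not details taken from Wang et al. (2021).

```python
import numpy as np


def delta(feat, width=2):
    """Regression-based difference along the time axis.

    feat: array of shape (frames, bands).
    width: frames on each side used in the regression (assumed value).
    """
    feat = np.asarray(feat, dtype=float)
    frames = feat.shape[0]
    # Repeat the first and last frames so every frame has a full window.
    padded = np.pad(feat, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(feat)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + frames]
        behind = padded[width - n : width - n + frames]
        out += n * (ahead - behind)
    return out / denom


def stack_3d(fbank):
    """Stack static, first-order, and second-order features.

    Returns an array of shape (3, frames, bands).
    """
    fbank = np.asarray(fbank, dtype=float)
    d1 = delta(fbank)   # first-order difference
    d2 = delta(d1)      # second-order difference
    return np.stack([fbank, d1, d2], axis=0)


if __name__ == "__main__":
    # Random stand-in for real FBANK features: 100 frames, 40 Mel bands.
    rng = np.random.default_rng(0)
    features = stack_3d(rng.standard_normal((100, 40)))
    print(features.shape)
```

Stacking along a leading axis treats the three feature maps like image channels, which is one common way to feed such features to a convolutional network; the layout actually used in the cited paper may differ.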
“…There are numerous physiological sensors that have been investigated for the estimation of depression. Some of the common physiological measures employed for the recognition of depression include electroencephalography (EEG) [ 13 ], electrocardiography (ECG) [ 14 ], heart rate variability (HRV) [ 15 ], galvanic skin response (GSR) [ 16 ], actigraphy [ 17 ], and speech signals [ 18 ]. Physiological sensors used for analyzing depression offer several compensations over traditional questionnaires developed by psychologists.…”
Section: Introduction
confidence: 99%