Interspeech 2019
DOI: 10.21437/interspeech.2019-1603

A Saliency-Based Attention LSTM Model for Cognitive Load Classification from Speech

Abstract: Cognitive Load (CL) refers to the amount of mental demand that a given task imposes on an individual's cognitive system, and it can affect his/her productivity in very high load situations. In this paper, we propose an automatic system capable of classifying the CL level of a speaker by analyzing his/her voice. Our research on this topic follows two main directions. In the first one, we focus on the use of Long Short-Term Memory (LSTM) networks with different weighted pooling strategies for CL level classific…

citations
Cited by 9 publications
(12 citation statements)
references
References 22 publications
“…In contrast, non-relevant frames should be diminished or even ignored, so the values of the corresponding weights should be small. This approach has been proposed with great success in other automatic learning problems that deal with temporal sequences [14,16,17,[19][20][21]25,45], including our previous works on the estimation of the intelligibility level [8,11].…”
Section: Attention Pooling (citation type: mentioning)
confidence: 99%
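The weighting idea in the statement above (large weights for relevant frames, small weights for non-relevant ones) can be illustrated with a minimal sketch of attention pooling over a sequence of frame vectors. This is not the implementation from the cited or citing papers: the function name `attention_pool`, the scoring vector `u` (standing in for a learned parameter), and the toy frame values are all assumptions made for illustration.

```python
import math

def attention_pool(frames, u):
    """Pool frame vectors into one vector using softmax-normalised weights.

    frames: list of equal-length frame vectors (e.g. per-frame LSTM outputs).
    u: hypothetical scoring vector; in a real model it would be learned.
    """
    # One scalar relevance score per frame (dot product with u).
    scores = [sum(ui * hi for ui, hi in zip(u, h)) for h in frames]
    # Softmax with the maximum subtracted for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the frames: low-weight frames contribute little.
    dim = len(frames[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, frames)) for d in range(dim)]
    return weights, pooled

# Toy input (assumed values): the middle frame scores highest under u.
frames = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
u = [3.0, 1.0]
weights, pooled = attention_pool(frames, u)
```

Because the weights sum to one, the pooled vector is a convex combination of the frames, so frames with small weights are effectively ignored.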
“…More recently, deep learning (DL) methods have been proposed for SIC as they have been proven to be very effective in several audio and speech-related tasks, such as acoustic event detection [14], automatic speech recognition [15], speech emotion recognition [16][17][18], cognitive load classification from speech [19,20], or deception detection from speech [21]. Recent studies propose the use of dense networks fed by features derived from the decomposition of log-mel spectrograms in temporal and frequency basis vectors [22], the use of convolutional neural networks and different spectro-temporal representations as input [23], or long short-term memory (LSTM) networks with MFCC as feature vectors [24] for multilevel or binary speech intelligibility classification.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…For this reason, in recent years, speech technologies are being proposed for the assessment, diagnosis and tracking of different health conditions that affect the subject’s voice [20]. In this area, commonly referred to as Computational Paralinguistic Analysis, current research encompasses the detection of pathological voices due, for example, to laryngeal disorders [21]; the diagnosis and monitoring of neurodegenerative conditions, such as Parkinson’s disease [22, 23], Mild Cognitive Impairment [24], Alzheimer’s disease [24, 25] or Amyotrophic Lateral Sclerosis [26]; the prediction of stress and cognitive load level [27, 28]; and the detection of psychological pathologies, such as autism [29] or depression [30], which is the topic of this paper.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Conventional systems for speech-based health tasks consists of data-driven approaches based on hand-crafted acoustic features, such as pitch, prosody, loudness, rate of speech, and energies, among others, and a machine-learning algorithm such as Logistic Regression, Support Vector Machines (SVM) or Gaussian Mixture models [22, 23, 24, 29]. Nevertheless, very recent works, such as, for example, [20, 21, 25, 26, 27, 28], deal with the use of deep-learning techniques for these tasks, since, presently, these kinds of methods have achieved unprecedented successes in the field of automatic learning applied to signal processing, and particularly in image, video, and audio problems.…”
Section: Related Work (citation type: mentioning)
confidence: 99%