2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2019
DOI: 10.1109/embc.2019.8857691
Detecting emotional valence using time-domain analysis of speech signals

Cited by 9 publications (7 citation statements) | References 23 publications
“…42,43 At the same time, an end-to-end learning from acoustic features like the Mel-frequency cepstral coefficients (MFCCs) suffers from task independence and requires more resources especially in long audio files. 44,45 We note that the performance drop due to the automated transcription is rather modest, 3.5% in AUC for Task I and 6.0% for Task II, when using the STS+RS+Dem. method.…”
Section: Discussion
confidence: 95%
“…Another characteristic of our study is that it relies on semantic features, enabling us to transfer the entire pipeline to other languages, given the existence of transcription tools from any language to English and/or powerful NLP models in different languages 42,43 . At the same time, an end‐to‐end learning from acoustic features like the Mel‐frequency cepstral coefficients (MFCCs) suffers from task independence and requires more resources especially in long audio files 44,45 . We note that the performance drop due to the automated transcription is rather modest, 3.5% in AUC for Task I and 6.0% for Task II, when using the STS+RS+Dem.…”
Section: Discussion
confidence: 99%
“…These were calculated for each speaker turn, that is, when the surgeon was speaking as part of a communication sender or receiver. Successive differences were also calculated for each vocal feature per turn to capture changes of the features over time (Deshpande et al, 2019). Descriptive statistics obtained for these eight measures included minimum, maximum, mean, standard deviation, range, and interquartile range.…”
Section: Methods
confidence: 99%
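The per-turn processing described above can be sketched as follows; this is a minimal NumPy illustration based only on the quoted description (the function names are assumptions, and computing the interquartile range via the 25th/75th percentiles is an assumption, not the cited study's code).

```python
import numpy as np

def successive_differences(feature_values: np.ndarray) -> np.ndarray:
    """Successive differences of one vocal feature within a speaker turn,
    capturing how the feature changes over time."""
    return np.diff(feature_values)

def turn_statistics(feature_values: np.ndarray) -> dict:
    """Descriptive statistics for one vocal feature within a speaker turn:
    minimum, maximum, mean, standard deviation, range, interquartile range."""
    q25, q75 = np.percentile(feature_values, [25, 75])
    return {
        "min": float(feature_values.min()),
        "max": float(feature_values.max()),
        "mean": float(feature_values.mean()),
        "std": float(feature_values.std()),
        "range": float(feature_values.max() - feature_values.min()),
        "iqr": float(q75 - q25),
    }
```

In this reading, each speaker turn yields one vector of summary statistics per feature, so turns of different durations map to a fixed-length representation.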
“…Cook et al [19], [20] explored the structure of the fundamental frequency (F0), extracting dominant pitches in the detection of valence from speech. Deshpande et al [21] proposed a reduced feature set consisting of the autocorrelation of pitch contour, root mean square (RMS) energy and a 10-dimensional time domain difference (TDD) vector. The TDD vector corresponds to successive differences in the speech signal.…”
Section: Improving the Prediction of Valence
confidence: 99%
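The reduced feature set named in the statement above can be sketched as follows; this is a minimal NumPy illustration of the three ingredients (pitch-contour autocorrelation, RMS energy, and a TDD vector), assuming a mono speech frame and a pre-extracted pitch contour. The reduction of the successive differences into exactly 10 dimensions by segment averaging is an assumption, not the authors' implementation.

```python
import numpy as np

def rms_energy(frame: np.ndarray) -> float:
    """Root mean square energy of a speech frame."""
    return float(np.sqrt(np.mean(frame ** 2)))

def pitch_autocorrelation(pitch_contour: np.ndarray, max_lag: int = 20) -> np.ndarray:
    """Normalized autocorrelation of the (mean-removed) pitch contour
    for lags 1..max_lag."""
    p = pitch_contour - pitch_contour.mean()
    denom = np.sum(p ** 2)
    if denom == 0:  # constant contour: no variation to correlate
        return np.zeros(max_lag)
    return np.array(
        [np.sum(p[:-lag] * p[lag:]) / denom for lag in range(1, max_lag + 1)]
    )

def tdd_vector(signal: np.ndarray, dim: int = 10) -> np.ndarray:
    """Time-domain difference (TDD) vector: successive differences of the
    speech signal, summarized into a fixed-length vector by averaging the
    absolute differences over `dim` equal segments (segment averaging is
    an assumed reduction)."""
    diffs = np.abs(np.diff(signal))
    segments = np.array_split(diffs, dim)
    return np.array([seg.mean() for seg in segments])
```

Concatenating the three outputs would give a compact time-domain feature vector per frame, consistent with the statement that the approach avoids heavier spectral features such as MFCCs.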