Interspeech 2018
DOI: 10.21437/interspeech.2018-1959

Neural Network Architecture That Combines Temporal and Summative Features for Infant Cry Classification in the Interspeech 2018 Computational Paralinguistics Challenge

Abstract: This paper describes the application of a novel deep neural network architecture to the classification of infant vocalisations as part of the Interspeech 2018 Computational Paralinguistics Challenge. Previous approaches to infant cry classification have either applied a statistical classifier to summative features of the whole cry, or applied a syntactic pattern recognition technique to a temporal sequence of features. In this work we explore a deep neural network architecture that exploits both temporal and s…

Cited by 8 publications
(8 citation statements)
References 8 publications
“…This result reflects recent developments in the DCASE community, in which virtually all competition systems contain convolutional stages [21], [22], [25], [26], [28], [43]-[46]. However the paralinguistics community still relies primarily on pure recurrent networks [8], [9], [18], [19].…”
Section: Discussion (supporting)
confidence: 63%
“…Recurrent Stage: Gated recurrent units (GRU) and long short-term memory units (LSTM) [54] are the two most common recurrent types in paralinguistics [9], [18], [19], [55], [56]. Unidirectional [9], [18] as well as bidirectional [19], [56] networks are popular. We used the CuDNN implementations to reduce the training time of the recurrent units.…”
Section: Hyperparameter Search Space (mentioning)
confidence: 99%
“…Our previous work [15] has shown that combining weighted prosodic features with MFCC features help improve the classification accuracy in a deep learning model. Other researchers have also found that F0 is critical in identifying infant cry signals [40]. Chittora and Patil used F0 to calculate unvoiced segments ratio and found out unvoiced percentage in a cry is an important parameter for analysis of infant cry [19].…”
Section: Fig 4 Multiple Order Mfcc Features Of Normal and Asphyxiate (mentioning)
confidence: 99%
“…The most popular probabilistic classifier used in infant cry classification is Support Vector Machine (SVM) [26,40,43]. Many machine learning methods have been experimented in infant research.…”
Section: Infant Cry Classification Models 411 Traditional Machine L (mentioning)
confidence: 99%