2017
DOI: 10.1007/978-3-319-71078-5_2

Masked Conditional Neural Networks for Environmental Sound Classification

Abstract: The ConditionaL Neural Network (CLNN) exploits the nature of the temporal sequencing of the sound signal represented in a spectrogram, and its variant, the Masked ConditionaL Neural Network (MCLNN), induces the network to learn in frequency bands by embedding a filterbank-like sparseness over the network's links using a binary mask. Additionally, the masking automates the exploration of different feature combinations concurrently, analogous to handcrafting the optimum combination of features for a recognition t…
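The masking idea in the abstract can be illustrated with a minimal sketch. It assumes a simple banded binary mask multiplied element-wise into a dense layer's weights. The names `band_mask`, `masked_dense`, and the `bandwidth` parameter are hypothetical, and the band layout is an illustrative choice, not the construction used in the paper. The temporal conditioning of the CLNN is not modelled here.

```python
import numpy as np

def band_mask(n_features: int, n_hidden: int, bandwidth: int) -> np.ndarray:
    """Build a binary (n_features, n_hidden) mask.

    Column j enables `bandwidth` consecutive input bins, so each hidden
    unit only sees one frequency band (an assumed, simplified layout).
    """
    if not 1 <= bandwidth <= n_features:
        raise ValueError("bandwidth must be between 1 and n_features")
    mask = np.zeros((n_features, n_hidden), dtype=np.float64)
    n_positions = n_features - bandwidth + 1  # valid band start indices
    for j in range(n_hidden):
        start = j % n_positions  # slide the band, wrapping when it runs out
        mask[start:start + bandwidth, j] = 1.0
    return mask

def masked_dense(x: np.ndarray, weights: np.ndarray,
                 mask: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Dense layer whose weights are zeroed wherever the mask is 0."""
    return x @ (weights * mask) + bias

rng = np.random.default_rng(0)
n_features, n_hidden, bandwidth = 8, 6, 3

mask = band_mask(n_features, n_hidden, bandwidth)
weights = rng.normal(size=(n_features, n_hidden))
bias = np.zeros(n_hidden)

frame = rng.normal(size=(1, n_features))  # one spectrogram frame
out = masked_dense(frame, weights, mask, bias)

# Perturbing a bin outside hidden unit 0's band leaves that unit unchanged.
perturbed = frame.copy()
perturbed[0, 7] += 10.0
out_perturbed = masked_dense(perturbed, weights, mask, bias)
```

In this sketch the mask is fixed and only the weights would be trained, which is what keeps each hidden unit tied to its band.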

Citations: cited by 12 publications (11 citation statements)
References: 23 publications
“…We used the model specified in Table I and the signal representation (60 mel-spec with delta) discussed earlier, which is the same transformation used by Piczak-CNN [7]. The dataset is pre-distributed into 10-folds, which we used to report the mean accuracy in Table II. The shallow MCLNN in combination with a long segment (k=50) achieved an accuracy of 74.22% compared to a deep MCLNN with a shorter segment (k=5) in [24]. The accuracy of the MCLNN surpasses other reported neural networks based attempts using state-of-the-art CNN architectures proposed by Salamon et al in [31] and Piczak in [7].…”
Section: A Urbansound8k (mentioning)
confidence: 73%
“…We have performed the MCLNN evaluation using the Urbansound8k [28], YorNoise [24], ESC-10 [29] and ESC-50 [29] environmental sound datasets. We will discuss the composition of each dataset with the common preprocessing applied, and we will defer the discussion to each dataset's relevant section.…”
Section: Methods (mentioning)
confidence: 99%