2019
DOI: 10.1109/jstsp.2019.2902305

Polyphonic Sound Event Detection by Using Capsule Neural Networks

Abstract: Artificial sound event detection (SED) has the aim to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, deep learning offers valuable techniques for this goal such as convolutional neural networks (CNNs). The capsule neural network (CapsNet) architecture has been recently introduced in the image processing field with the intent to overcome some of the known limitations of CNNs, specifically regarding the scarce robustness to affine transformations (i.e., perspective…

Citations: cited by 52 publications (40 citation statements)
References: 34 publications (46 reference statements)
“…To overcome the shortcomings of traditional deep learning networks, Hinton group (Sabour et al, 2017) proposed new deep learning architectures known as capsule networks (CapsNets), which introduced a novel building block that is used in deep learning to improve the model hierarchical relationships inside the internal knowledge representation of a neural network. CapsNets have shown great potential in some fields (Xi et al, 2017;Afshar et al, 2018;Lalonde and Bagci, 2018;Qiao et al, 2018;Vesperini et al, 2018;Zhao et al, 2018;Wang et al, 2019b;Peng et al, 2019). However, CapsNets have not yet been applied to drug discovery-related studies.…”
Section: Introduction (mentioning)
confidence: 99%
“…Vesperini et al [80] proposed Capsule Neural Network (CapsNet) for polyphonic SED. The introduction of CapsNet is to overcome some limitations of CNN, in particular, the loss of information due to max-pooling operator [81].…”
Section: Figure 9 Difference Between Multi-label and Combined Single (mentioning)
confidence: 99%
“…Using CapsNet, Vesperini et al [80] achieved an ER of 0.36 on TUT-SED 2016 and TUT-SED 2017 development dataset using a binaural spectrogram as the input. On the other hand, results for TUT-SED 2017 evaluation dataset show that CapsNet using log mel energies achieved the lowest ER of 0.58 instead of using a binaural spectrogram as input.…”
Section: Figure 9 Difference Between Multi-label and Combined Single (mentioning)
confidence: 99%
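
The statement above reports ER values without defining the metric. As a rough illustration only, and assuming ER here means the segment-based error rate commonly used in TUT-SED/DCASE evaluations (the page does not confirm this, nor the segment length), the metric can be sketched as follows. The function name and the toy data are hypothetical and are not taken from the cited papers.

```python
def segment_error_rate(reference, predicted):
    """Segment-based error rate, ER = (S + D + I) / N.

    reference, predicted: lists of sets of active event labels,
    one set per fixed-length time segment (assumed format).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must cover the same segments")

    subs = dels = ins = n_ref = 0
    for ref, pred in zip(reference, predicted):
        fn = len(ref - pred)       # events active in the reference but missed
        fp = len(pred - ref)       # events reported but absent from the reference
        subs += min(fn, fp)        # a miss paired with a false alarm is a substitution
        dels += max(0, fn - fp)    # remaining misses are deletions
        ins += max(0, fp - fn)     # remaining false alarms are insertions
        n_ref += len(ref)          # N: total active reference events

    if n_ref == 0:
        raise ValueError("reference contains no active events")
    return (subs + dels + ins) / n_ref


# Toy example: one deletion and one substitution over five reference events.
reference = [{"car", "speech"}, {"car"}, {"car"}, {"speech"}]
predicted = [{"car"}, {"dog"}, {"car"}, {"speech"}]
print(segment_error_rate(reference, predicted))
```

Under this definition lower is better, and ER can exceed 1 when a system produces many insertions.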
“…With CNN's powerful capabilities of learning features and the property of equivariance of CapsNet, Conv-Caps has an advanced performance. Otherwise, CapsNet has been successfully applied to many fields, such as tumors classification [14], sound event detection [15], and remote sensing image classification [16]. The main idea of the CapsNet is that vector capsules are utilized to represent internal attributes, and replace the neuron in the traditional neural network with a set of neurons as a capsule to solve the problem of spatial hierarchies between features effectively.…”
Section: Introduction (mentioning)
confidence: 99%