2020
DOI: 10.1016/j.apacoust.2020.107238

Multi-scale semantic feature fusion and data augmentation for acoustic scene classification

citations
Cited by 25 publications
(13 citation statements)
references
References 19 publications
“…The malaria dataset was split into training 80%, validation 20% and the final model was applied to 700 images (or 10% of the total number images) to test the CNN model. The research used ROI to detect the image boundaries, which does not affect other parts of the image [ 67 , 68 ].…”
Section: Implementation Details (mentioning)
confidence: 99%
“…McDonnell selected a residual network pre-activated CNN and rounded the layer values to reduce memory usage [10]. A modified SegNet [12], fine-resolution CNN (FR-CNN) [13] and a multi-scale feature fusion CNN [14] are other types of modified CNNs that have been used for ASC. The generative adversarial neural networks (GAN) [15], CNN with cross-entropy (CE) as loss function [16], CNN including a semantic neighbors over time (SeNoT) module [17], optimized CNNs [18][19][20] and conditional autoencoders [22] are among the deep learning methods used for audio scene classification.…”
Section: Deep Learning Methods (mentioning)
confidence: 99%
“…Mel based features, such as log-Mel spectrogram, Mel-frequency cepstrum, MFCC, log-Mel delta, and delta-delta, are among the most commonly used features in ASC. For example, the Log-Mel spectrogram has been used in [8, 10-22], with differences between parameters such as filter banks, STFT and windowing function.…”
Section: Feature Extraction and Preprocessing (mentioning)
confidence: 99%
“…The study needs to provide the enough computational requirements. To improve the computation and accuracy of deep learning model, multi-level feature [12,26] or multi-scale semantic features fusion [27,29] methods are used in the model combining with data augmentation.…”
Section: Related Work (mentioning)
confidence: 99%