Multi-scale semantic feature fusion and data augmentation for acoustic scene classification

Yang, Liping; Tao, Lianjie; Chen, Xinxing; Gu, Xianfeng

doi:10.1016/j.apacoust.2020.107238

Cited by 25 publications

(13 citation statements)

References 19 publications


“…The malaria dataset was split into training 80%, validation 20% and the final model was applied to 700 images (or 10% of the total number images) to test the CNN model. The research used ROI to detect the image boundaries, which does not affect other parts of the image [67, 68].…”

Section: Implementation Details (mentioning)

confidence: 99%

Analyzing Malaria Disease Using Effective Deep Learning Approach

2020


Medical tools used to bolster decision-making by medical specialists who offer malaria treatment include image processing equipment and a computer-aided diagnostic system. Malaria images can be employed to identify and detect malaria using these methods, in order to monitor the symptoms of malaria patients, although there may be atypical cases that need more time for an assessment. This research used 7000 images of Xception, Inception-V3, ResNet-50, NasNetMobile, VGG-16 and AlexNet models for verification and analysis. These are prevalent models that classify the image precision and use a rotational method to improve the performance of validation and the training dataset with convolutional neural network models. Xception, using the state of the art activation function (Mish) and optimizer (Nadam), improved the effectiveness, as found by the outcomes of the convolutional neural model evaluation of these models for classifying the malaria disease from thin blood smear images. In terms of the performance, recall, accuracy, precision, and F1 measure, a combined score of 99.28% was achieved. Consequently, 10% of all non-dataset training and testing images were evaluated utilizing this pattern. Notable aspects for the improvement of a computer-aided diagnostic to produce an optimum malaria detection approach have been found, supported by a 98.86% accuracy level.



“…McDonnell selected a residual network pre-activated CNN and rounded the layer values to reduce memory usage [10]. A modified SegNet [12], fine-resolution CNN (FR-CNN) [13] and a multi-scale feature fusion CNN [14] are other types of modified CNNs that have been used for ASC. The generative adversarial neural networks (GAN) [15], CNN with cross-entropy (CE) as loss function [16], CNN including a semantic neighbors over time (SeNoT) module [17], optimized CNNs [18][19][20] and conditional autoencoders [22] are among the deep learning methods used for audio scene classification.…”

Section: Deep Learning Methods (mentioning)

confidence: 99%

“…Mel based features, such as log-Mel spectrogram, Mel-frequency cepstrum, MFCC, log-Mel delta, and delta-delta, are among the most commonly used features in ASC. For example, the Log-Mel spectrogram has been used in [8, 10–22], with differences between parameters such as filter banks, STFT and windowing function.…”

Section: Feature Extraction and Preprocessing (mentioning)

confidence: 99%

Binaural Acoustic Scene Classification Using Wavelet Scattering, Parallel Ensemble Classifiers and Nonlinear Fusion

Hajihashemi, Gharahbagh, Cruz et al. 2022

Sensors


The analysis of ambient sounds can be very useful when developing sound base intelligent systems. Acoustic scene classification (ASC) is defined as identifying the area of a recorded sound or clip among some predefined scenes. ASC has huge potential to be used in urban sound event classification systems. This research presents a hybrid method that includes a novel mathematical fusion step which aims to tackle the challenges of ASC accuracy and adaptability of current state-of-the-art models. The proposed method uses a stereo signal, two ensemble classifiers (random subspace), and a novel mathematical fusion step. In the proposed method, a stable, invariant signal representation of the stereo signal is built using Wavelet Scattering Transform (WST). For each mono, i.e., left and right, channel, a different random subspace classifier is trained using WST. A novel mathematical formula for fusion step was developed, its parameters being found using a Genetic algorithm. The results on the DCASE 2017 dataset showed that the proposed method has higher classification accuracy (about 95%), pushing the boundaries of existing methods.


“…The study needs to provide the enough computational requirements. To improve the computation and accuracy of deep learning model, multi-level feature [12,26] or multi-scale semantic features fusion [27,29] methods are used in the model combining with data augmentation.…”

Section: Related Work (mentioning)

confidence: 99%

Acoustic Scene Classification using Attention based Deep Learning Model

2022

IJIES


Acoustic scene classification is a difficult issue among artificial intelligence, signal processing, and machine learning. Scene recognition performance has a robust relation with feature learning using deep convolutional networks. In the following research, end-to-end deep residual network embedded channel attention is explored to learn the discriminative features from the audio scene. Log-Mel spectrogram is obtained from input raw audios. It is forwarded to proposed attention network. An extracted feature layer is concatenated with the SoftMax classifier in the proposed attention network. The experimentation is carried out on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 and 2017 datasets. The proposed channel-attention-based residual network achieves classification results with an average accuracy of 80.27% and 80.82%, respectively.

