Spectral vs. spectro-temporal features for acoustic event detection

Cotton, Courtenay V.; Ellis, Daniel P. W.

doi:10.1109/aspaa.2011.6082331

Cited by 105 publications

(73 citation statements)

References 5 publications

(6 reference statements)


“…In addition, F and G were implemented using fully-connected DNNs, and the symmetric network architecture of F was used for that of G. Then, F and G were trained to maximize (12) and to minimize (19) and (20), alternately.…”

Section: B Acoustic Feature-extractor Optimization Using Variational (mentioning)

confidence: 99%


Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma

Koizumi, Saito, Uematsu, et al. 2017

2017 25th European Signal Processing Conference (EUSIPCO)


Abstract: We propose a method for optimizing an acoustic feature extractor for anomalous sound detection (ASD). Most ASD systems adopt outlier-detection techniques because it is difficult to collect a massive amount of anomalous sound data. To improve the performance of such outlier-detection-based ASD, it is essential to extract a set of efficient acoustic features that is suitable for identifying anomalous sounds. However, the ideal property of a set of acoustic features that maximizes ASD performance has not been clarified. By considering outlier-detection-based ASD as a statistical hypothesis test, we defined optimality as an objective function that adopts the Neyman-Pearson lemma; the acoustic feature extractor is optimized to extract a set of acoustic features that maximizes the true positive rate under an arbitrary false positive rate. The variational auto-encoder is applied as an acoustic feature extractor and optimized to maximize the objective function. We confirmed that the proposed method improved the F-measure score by 0.02 to 0.06 points compared with conventional methods, and ASD results for a stereolithography 3D printer in a real environment show that the proposed method is effective in identifying anomalous sounds.
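
The criterion named in this abstract, a high true positive rate at a fixed false positive rate, can be illustrated with a small sketch. This is not the authors' variational auto-encoder method; it only shows how a detection threshold might be chosen from anomaly scores of normal sounds so that the false positive rate lands near a target value. The function names, the synthetic scores, and the target rate of 0.1 are assumptions made for illustration.

```python
import numpy as np

def threshold_at_fpr(normal_scores, target_fpr):
    """Pick a threshold so that roughly target_fpr of normal scores exceed it."""
    scores = np.asarray(normal_scores, dtype=float)
    # The (1 - target_fpr) quantile of the normal scores; anything strictly
    # above it is flagged as anomalous.
    return float(np.quantile(scores, 1.0 - target_fpr))

def exceed_rate(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return float(np.mean(np.asarray(scores, dtype=float) > threshold))

# Hypothetical anomaly scores: higher means "more anomalous".
normal_scores = np.arange(1000)          # scores observed on normal sounds
anomalous_scores = np.arange(800, 1200)  # scores observed on anomalous sounds

threshold = threshold_at_fpr(normal_scores, target_fpr=0.1)
fpr = exceed_rate(normal_scores, threshold)     # false positive rate
tpr = exceed_rate(anomalous_scores, threshold)  # true positive rate
```

With a small number of normal scores the empirical false positive rate can deviate from the target, so the quantile rule is only approximate.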



“…To extract a set of acoustic features for the sound-identification problem, feature-extractor-optimization methods have been actively investigated [12], [13], [14]. These studies have revealed that it is necessary to determine both spectral and temporal characteristics to accurately identify various sounds.…”

Section: Introduction (mentioning)

confidence: 99%

Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma

Koizumi, Saito, Uematsu, et al. 2017

2017 25th European Signal Processing Conference (EUSIPCO)


“…Recently, the method of matching pursuit that sparsely decomposes the signal by using over-complete dictionaries has been successfully applied to classify environmental sounds [2,5]. As in matching pursuit, non-negative matrix factorization (NMF) also works on sparse factorization of signals while learning the dictionary; Cotton and Ellis [8] employ NMF to construct acoustic event-based patch features from a spectrogram. Ye et al. [3] utilize the acoustic subspace extracted from sound clips in the kernel-based framework.…”

Section: Introduction (mentioning)

confidence: 99%

Acoustic feature extraction by statistics based local binary pattern for environmental sound classification

Kobayashi 2014

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


Abstract: Classification of environmental sounds is a fundamental procedure for a wide range of real-world applications. In this paper, we propose a novel acoustic feature extraction method for classifying environmental sounds. The proposed method is motivated by the image processing technique local binary pattern (LBP) and works on a spectrogram, which forms two-dimensional (time-frequency) data like an image. Since the spectrogram contains noisy pixel values, it is crucial for classification performance to extract features that are robust to fluctuations in pixel values. We incorporate local statistics, the mean and standard deviation of local pixels, to establish a robust LBP. In addition, we provide an L2-Hellinger normalization technique that is efficiently applied to the proposed features to further enhance discriminative power while increasing robustness. In experiments on environmental sound classification using the RWCP dataset, which contains 105 sound categories, the proposed method achieves superior performance (98.62%) compared with the other methods, exhibiting significant improvements over the standard LBP method as well as robustness to noise and low computation time.
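
As a point of reference for the standard LBP baseline mentioned in this abstract, the sketch below computes a plain 8-neighbour LBP histogram over a spectrogram treated as an image. It is not the statistics-based variant the paper proposes and omits its normalization step; the function name `lbp_histogram`, the neighbour ordering, and the toy input are assumptions for illustration. The input is assumed to have at least three rows and three columns.

```python
import numpy as np

def lbp_histogram(spec):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes."""
    spec = np.asarray(spec, dtype=float)
    rows, cols = spec.shape
    center = spec[1:-1, 1:-1]  # pixels that have all eight neighbours
    # Neighbour offsets (row, column), one bit per neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dr, dc) in enumerate(offsets):
        neighbour = spec[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]
        # Set this bit where the neighbour is at least as large as the center.
        codes += (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

# Toy "spectrogram": a constant patch, so every neighbour equals its center.
flat_hist = lbp_histogram(np.ones((5, 6)))
```

Because each bit comes from a hard comparison against the center pixel, small fluctuations in pixel values can flip codes, which is the sensitivity the abstract aims to reduce.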


“…NMF-related methods can be separated in those that exploit the NMF activations directly to perform event detection [8,11], and in those that employ a classifier trained on these activations [12,13]. Based on the fact that NMF-based approaches can benefit from the creation of a Mixture of Local Dictionaries (MLD) [14], in [15] the authors propose a classifier-based NMF system using MLDs for improved detection performance.…”

Section: Introduction (mentioning)

confidence: 99%

On the Joint Use of NMF and Classification for Overlapping Acoustic Event Detection

Giannoulis, Potamianos, Maragos 2018

International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM)


Abstract: In this paper, we investigate the performance of classifier-based non-negative matrix factorization (NMF) methods for detecting overlapping acoustic events. We provide evidence that the performance of classifier-based NMF systems deteriorates significantly in overlapped scenarios when mixed observations are unavailable during training. To this end, we propose a K-means based method for artificial generation of mixed data. The method of Mixture of Local Dictionaries (MLD) is employed for building the NMF dictionary using both the isolated and artificially mixed data. Finally, an SVM classifier is trained for each of the isolated and mixed event classes, using the corresponding MLD-NMF activations from the training set. The proposed system, tested on two experiments with (a) synthetic and (b) real events, outperforms the state-of-the-art classifier-based NMF system in the overlapped scenarios.
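
The NMF activations that such classifier-based systems feed to a classifier can be sketched as follows. This is a generic illustration, assuming a fixed non-negative dictionary `W` and the standard multiplicative update for the Euclidean cost; it does not reproduce the MLD dictionary construction, the K-means data mixing, or the SVM stage described in the abstract. The function name, matrix sizes, and random data are hypothetical.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H such that V is approximately W @ H.

    V: (n_freq, n_frames) non-negative spectrogram.
    W: (n_freq, n_atoms) fixed non-negative dictionary.
    """
    V = np.asarray(V, dtype=float)
    W = np.asarray(W, dtype=float)
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # positive random start
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update for the squared Euclidean reconstruction error.
        H *= WtV / (WtW @ H + eps)
    return H

# Synthetic example: a spectrogram built from a known dictionary.
rng = np.random.default_rng(1)
W = rng.random((20, 4))
V = W @ rng.random((4, 30))

H = nmf_activations(V, W)
error_after = np.linalg.norm(V - W @ H)
error_before = np.linalg.norm(V - W @ nmf_activations(V, W, n_iter=0))
```

Each column of `H` could then serve as a per-frame feature vector for a downstream classifier.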

