2013
DOI: 10.1007/978-3-642-38847-7_2

NMF-Based Spectral Analysis for Acoustic Event Classification Tasks

Abstract: In this paper, we propose a new front-end for Acoustic Event Classification tasks (AEC). First, we study the spectral contents of different acoustic events by applying Non-Negative Matrix Factorization (NMF) on their spectral magnitude and compare them with the structure of speech spectra. Second, from the findings of this study, we propose a new parameterization for AEC, which is an extension of the conventional Mel Frequency Cepstrum Coefficients (MFCC) and is based on the high pass filtering of ac…

Citations: cited by 3 publications (3 citation statements)
References: 10 publications
“…The Kullback-Leibler divergence results in a non-negative quantity and is unbounded. In this work, the KL divergence is considered because it has recently been used, with good results, in audio processing tasks such as speech enhancement and denoising for automatic speech recognition [21, 28], feature extraction [22] or acoustic event classification [22, 29]. To find a local optimum value for the KL divergence between V and (WH), an iterative scheme with multiplicative update rules can be used as proposed in [27] and stated in Eqs (3) and (4), where 1 is a matrix of size V, whose elements are all ones, and the multiplications ⊗ and divisions are component-wise operations.…”
Section: Methods (mentioning)
confidence: 99%
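
The multiplicative update scheme described in this statement can be sketched as follows. This is a minimal illustration, assuming the standard multiplicative rules for the generalized KL divergence; Eqs (3) and (4) of the citing paper are not reproduced on this page, so the exact form used there is not confirmed. The function names (`nmf_kl`, `kl_divergence`), the random initialization, the fixed iteration count, and the small constant `eps` are all assumptions made for the example.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH), summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize a non-negative matrix V (freq x frames) as W @ H.

    W (freq x rank) holds spectral basis vectors and H (rank x frames)
    holds their activations. Both stay non-negative because every update
    multiplies by a non-negative ratio.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    ones = np.ones_like(V)  # the all-ones matrix "1", same size as V
    history = []
    for _ in range(n_iter):
        # H <- H (x) [W^T (V / WH)] / [W^T 1], all element-wise
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)
        # W <- W (x) [(V / WH) H^T] / [1 H^T], all element-wise
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history
```

Calling `W, H, history = nmf_kl(V, rank=3)` on a magnitude spectrogram `V` returns the factors together with the divergence after each iteration. The scheme only reaches a local optimum, so different seeds can give different factors.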
“…The AEC system is based on a one-against-one SVM with Radial Basis Function (RBF) kernel on normalized features (Ludeña-Choez & Gallardo-Antolín, 2013b, 2015). The system was developed using the LIBSVM software (Chang & Lin, 2011).…”
Section: Database and Baseline System (mentioning)
confidence: 99%
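
A classifier of the kind described in this statement might be set up as below. This is a hedged sketch rather than the authors' configuration: it uses scikit-learn's `SVC` (which is built on LIBSVM and trains one-against-one classifiers for multi-class problems) instead of calling LIBSVM directly, and it assumes zero-mean, unit-variance scaling because the statement does not say which normalization was applied. The helper names, the hyperparameter defaults, and the synthetic features are hypothetical.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_aec_classifier(C=1.0, gamma="scale"):
    """Feature normalization followed by a one-against-one RBF SVM."""
    return make_pipeline(
        StandardScaler(),  # assumed normalization: zero mean, unit variance
        SVC(kernel="rbf", C=C, gamma=gamma, decision_function_shape="ovo"),
    )


def make_toy_features(n_per_class=40, n_classes=4, n_dims=12, seed=0):
    """Synthetic, well-separated feature vectors standing in for real data."""
    rng = np.random.default_rng(seed)
    X = np.vstack([
        rng.normal(loc=3.0 * k, scale=1.0, size=(n_per_class, n_dims))
        for k in range(n_classes)
    ])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y
```

With K classes, the one-against-one scheme trains K(K-1)/2 binary classifiers, so `build_aec_classifier().fit(X, y).decision_function(X)` has one column per class pair when `decision_function_shape="ovo"`.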
“…Many state-of-the-art front-ends are composed of two modules: short-time feature extraction, in which acoustic coefficients are computed on a frame-by-frame basis (typically, the frame period used for speech/audio analysis is about 10-20 ms) from analysis windows of 20-40 ms, and temporal feature integration (Meng et al., 2007), in which features at larger time scales are extracted by combining somehow the short-time characteristics information over a longer time-frame composed of several consecutive frames. The resulting characteristics are often called segmental features (Zhang & Schuller, 2012; Ludeña-Choez & Gallardo-Antolín, 2013a, 2015). In this paper, two techniques which improve the performance of each of these modules by taking into account the specific spectro-temporal structure of acoustic events are presented.…”
Section: Introduction (mentioning)
confidence: 99%
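
The two-module front-end described in this statement can be illustrated with a short sketch. It assumes a 25 ms window and a 10 ms frame period (within the ranges quoted above), uses log energies in uniform frequency bands as a stand-in for real acoustic coefficients such as MFCC, and integrates over time by taking the mean and standard deviation across a block of consecutive frames. These choices, and all function and parameter names, are illustrative assumptions and not the techniques proposed in the cited papers.

```python
import numpy as np


def frame_signal(x, sr, win_ms=25.0, hop_ms=10.0):
    """Slice x into overlapping Hamming-windowed frames (frames x samples)."""
    win = int(round(sr * win_ms / 1000.0))
    hop = int(round(sr * hop_ms / 1000.0))
    n_frames = 1 + (len(x) - win) // hop  # assumes len(x) >= win
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(win)


def short_time_features(frames, n_bands=8):
    """Log energy in n_bands uniform frequency bands, one row per frame."""
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bands = np.array_split(power, n_bands, axis=1)
    energies = np.stack([b.sum(axis=1) for b in bands], axis=1)
    return np.log(energies + 1e-10)


def integrate_segments(feats, seg_len=50, seg_hop=25):
    """Segmental features: per-dimension mean and std over seg_len frames."""
    n_seg = 1 + (feats.shape[0] - seg_len) // seg_hop
    segments = []
    for s in range(n_seg):
        block = feats[s * seg_hop : s * seg_hop + seg_len]
        segments.append(np.concatenate([block.mean(axis=0), block.std(axis=0)]))
    return np.array(segments)
```

For a 2 s signal sampled at 16 kHz, these defaults give 198 frames of 400 samples each and 6 segments, each described by 16 values (8 means and 8 standard deviations).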