2016
DOI: 10.5430/air.v5n2p14

A robust BFCC feature extraction for ASR system

Abstract: An auditory-based feature extraction algorithm named the Basilar-membrane Frequency-band Cepstral Coefficient (BFCC) is proposed to increase the robustness of automatic speech recognition. Compared with the Mel-Frequency Cepstral Coefficient (MFCC) method, which is based on a Fourier spectrogram, the proposed BFCC method uses an auditory spectrogram based on a gammachirp wavelet transform to simulate the auditory response of the human inner ear and improve noise immunity. In addition, the Hidden Markov Model (HMM) is use…

Cited by 7 publications (3 citation statements)
References 20 publications
“…In [31], authors used GTCC feature and pitch at front-end for feature extraction, and passed these features to GMM and KNN to improve the performance of ASV system. Kaun et al [10] applied auditory based BFCC features with AURORA 2 dataset, and compared these features' performance with MFCC using HMM model. Noroozi et al, [32] suggested a methodology for emotion recognition using audio.…”
Section: A Related Work
Type: mentioning
confidence: 99%
“…Hence, researchers tried to modify these techniques to make these noise robust. The other approach to handle the noise during feature extraction is to use features that are already noise robust such as GTCC [8], [9] and BFCC [10], [11]. GTCC employs a non-linear gammatone filter bank [12].…”
Section: Introduction
Type: mentioning
confidence: 99%
“…The proposed feature set contains many attributes computed in time and frequency domain. The feature space includes the energy of the signal, fundamental frequency (F0) (Boersma, Paul, 1993; Boersma, Weenink, 2001), linear prediction coefficients (LPC) (Markel, Gray, 1976), linear predictive cepstral coefficients (LPCC) (Rao et al, 2015), Mel frequency cepstral coefficients (MFCC) (Davis, Mermelstein, 1980), and bark frequency cepstral coefficients (BFCC) (Kuan et al, 2016). The selection of fundamental frequency for the whole spoken sentence seems to be the most promising part of the feature space.…”
Section: Feature Space Analysis
Type: mentioning
confidence: 99%