2016
DOI: 10.1007/978-81-322-3592-7_9

Robust Speaker Verification Using GFCC Based i-Vectors

Cited by 34 publications (18 citation statements)
References 7 publications
“…The time-domain features include Short-Time Energy (STEN), Pitch Frequency (PFCY), Formant Frequency (FFCY) and Average Speech Speed (AVSS) [7]. The cepstrum features include MFCC, Gamma Frequency Cepstrum Coefficient (GFCC) [8], Barker Frequency Cepstrum Coefficient (BFCC) [9], Normalized Gamma Chirped Cepstrum Coefficient (NGCC) [10], Amplitude-based Spectrum Root Cepstral Coefficient (MSRCC), Phase-based Spectrum Root Cepstral Coefficient (PSRCC) [11] and Linear Frequency Cepstrum Coefficient (LFCC) [12].…”
Section: Feature Extraction | Citation type: mentioning
confidence: 99%
“…pyAudioProcessing aims to provide an end-to-end processing solution for converting between audio file formats, visualizing time and frequency domain representations, cleaning with silence and low-activity segments removal from audio, building features from raw audio samples, and training a machine learning model that can then be used to classify unseen raw audio samples (e.g., into categories such as music, speech, etc.). This library allows the user to extract features such as Mel Frequency Cepstral Coefficients (MFCC) [CD14], Gammatone Frequency Cepstral Coefficients (GFCC) [JDHP17], spectral features, chroma features and other beat-based and cepstrum based features from audio to use with one's own classification backend or scikit-learn classifiers that have been built into pyAudioProcessing. The classifier implementation examples that are a part of this software aim to give the users a sample solution to audio classification problems and help build the foundation to tackle new and unseen problems.…”
Section: Core Functionalities | Citation type: mentioning
confidence: 99%
“…The gammatone filter bank is series of overlapping band-pass filters that models the human auditory system [33]. The combination of gammatone filter bank (GF), cubic root and equivalent rectangular bandwidth (ERB) gives the robustness of GFCC features in noisy environments [34].…”
Section: Gammatone Frequency Cepstral Coefficients (GFCC) | Citation type: mentioning
confidence: 99%
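
The three ingredients named in the statement above (gammatone filter bank, ERB spacing, cubic-root compression) can be illustrated with a minimal sketch. It is not the cited paper's implementation. The function names, the filter order of 4, the 1.019 bandwidth factor, the unit-energy normalization, and the default parameter values are all assumptions chosen for illustration. The ERB formula is the commonly used Glasberg and Moore approximation.

```python
import numpy as np


def erb_bandwidth(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * np.asarray(fc_hz, dtype=float) / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(erb, dtype=float) / 21.4) - 1.0) / 0.00437


def erb_spaced_center_freqs(f_low, f_high, n_filters):
    """Center frequencies spaced uniformly on the ERB-rate scale."""
    points = np.linspace(hz_to_erb_rate(f_low), hz_to_erb_rate(f_high), n_filters)
    return erb_rate_to_hz(points)


def gammatone_ir(fc_hz, fs, duration=0.025, order=4):
    """Gammatone impulse response, normalized to unit energy (an assumption)."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb_bandwidth(fc_hz)  # conventional bandwidth scaling
    ir = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc_hz * t)
    return ir / np.sqrt(np.sum(ir ** 2))


def gfcc_frame(frame, fs, n_filters=32, n_ceps=13, f_low=50.0, f_high=None):
    """GFCC-style coefficients for one frame: filter bank, cubic root, DCT-II."""
    if f_high is None:
        f_high = 0.45 * fs  # stay below the Nyquist frequency
    centers = erb_spaced_center_freqs(f_low, f_high, n_filters)
    # Mean energy at the output of each gammatone channel.
    energies = np.array([
        np.mean(np.convolve(frame, gammatone_ir(fc, fs), mode="full") ** 2)
        for fc in centers
    ])
    compressed = np.cbrt(energies)  # cubic-root compression instead of a log
    # Unnormalized DCT-II written out explicitly to decorrelate the channels.
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2.0 * n_filters))
    return basis @ compressed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(400)  # 25 ms of noise at 16 kHz
    print(gfcc_frame(frame, fs=16000).shape)
```

The cubic root is the step that most visibly differs from a typical MFCC pipeline, which applies a logarithm to the filter-bank energies. A production implementation would usually filter the whole signal once and then frame the channel outputs, rather than convolving each frame separately as this sketch does.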