2017
DOI: 10.1121/1.4979841

Robust speaker identification via fusion of subglottal resonances and cepstral features

Abstract: This letter investigates the use of subglottal resonances (SGRs) for noise-robust speaker identification (SID). It is motivated by the speaker specificity and stationarity of subglottal acoustics, and the development of noise-robust SGR estimation algorithms which are reliable at low signal-to-noise ratios for large datasets. A two-stage framework is proposed which combines the SGRs with different cepstral features. The cepstral features are used in the first stage to reduce the number of target speakers for a…
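The abstract's two-stage idea (cepstral features prune the speaker set, SGRs then decide among the surviving candidates) can be illustrated with a minimal sketch. Everything below is hypothetical: the function name two_stage_sid, the top-N candidate pruning, and the weighted score fusion are illustrative assumptions, not the models or fusion rule used in the paper.

```python
import numpy as np

def two_stage_sid(cepstral_scores, sgr_scores, n_candidates=5, alpha=0.7):
    """Hypothetical sketch of a two-stage speaker-ID decision.

    cepstral_scores : (num_speakers,) first-stage scores, e.g. log-likelihoods
                      from a cepstral-feature speaker model.
    sgr_scores      : (num_speakers,) scores from a subglottal-resonance model.
    n_candidates    : how many top speakers the first stage keeps (assumed).
    alpha           : illustrative fusion weight between the two score streams.
    """
    cepstral_scores = np.asarray(cepstral_scores, dtype=float)
    sgr_scores = np.asarray(sgr_scores, dtype=float)

    # Stage 1: cepstral features prune the speaker set to the top-N candidates.
    candidates = np.argsort(cepstral_scores)[::-1][:n_candidates]

    # Stage 2: re-score only the surviving candidates; a simple weighted sum
    # stands in for whatever fusion the paper actually uses.
    fused = alpha * cepstral_scores[candidates] + (1.0 - alpha) * sgr_scores[candidates]
    return int(candidates[np.argmax(fused)])

# Toy usage with random scores for 20 enrolled speakers.
rng = np.random.default_rng(0)
print(two_stage_sid(rng.normal(size=20), rng.normal(size=20)))
```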

Cited by 14 publications (9 citation statements). References 13 publications.
“…Testing is done under clean and noisy conditions to assess the robustness of the proposed feature extraction algorithm; four noise types from the NOISEX-92 dataset (babble, factory 1, pink and white) are added to the test utterances at SNR levels of 0, 5, 10 and 15 dB. The results show that the proposed features outperform the baseline features (PNCC and GFCC) and the methods of Islam et al. (2016), Korba et al. (2018), Guo et al. (2017) and Ajgou et al. (2016), so it is a promising approach for extracting robust features and increasing the speaker identification rate.…”
Section: Discussion (mentioning; confidence: 85%)
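The noise-corruption procedure mentioned in the statement above (NOISEX-92 noise added to test utterances at 0, 5, 10 and 15 dB SNR) follows a standard recipe: scale the noise so the clean-to-noise power ratio matches the target SNR, then add it to the utterance. The sketch below shows only that recipe; mix_at_snr and its signal-handling details are illustrative assumptions, not code from the cited works.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean utterance at a target SNR (in dB). Illustrative only."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)

    # Loop (or trim) the noise so it covers the whole utterance.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]

    # Choose a gain so that 10*log10(P_clean / P_noise_scaled) equals snr_db.
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise

# Toy usage: white noise added to a synthetic "utterance" at 5 dB SNR.
rng = np.random.default_rng(1)
noisy = mix_at_snr(np.sin(np.linspace(0, 100, 16000)), rng.normal(size=8000), 5)
```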
“…Figure 5. Comparison with other studies: (a) with work proposed by [1], (b) with work proposed by [18], (c) with work proposed by [38], and (d) with work proposed by [39]…”
(mentioning; confidence: 83%)
“…After the 1980s, characteristic parameters such as time-domain decomposition, frequency-domain decomposition, and wavelet packet node energy gradually appeared and became widely used [6]. Jinxi Guo et al. studied speaker recognition systems in noisy environments [7] and made notable progress.…”
Section: Introduction (mentioning; confidence: 99%)
“…A majority of automatic speaker recognition systems use only physiological speech features due to their high discriminability and ease of characterization [2]. However, such systems are vulnerable to audio degradations, such as background noise and channel effects [3]. Behavioral speech characteristics, while susceptible to intra-user variation, are considered robust to audio degradations [4].…”
Section: Introduction (mentioning; confidence: 99%)