2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014
DOI: 10.1109/icassp.2014.6853878

Large-scale speaker identification

Cited by 31 publications
(27 citation statements)
References 18 publications
“…Other work by Schmidt et al [4] uses Local Sensitive Hashing (LSH) and fast nearest neighbor search algorithm for speaker indexing. Schmidt proposed an indexing method using i-Vector.…”
Section: Introduction (mentioning)
confidence: 99%
“…This section summarises the current work on I-vector and GMM-UBM approaches and other related work, alongside our previous work and other state of the art methods [14], [22], [12], [13], [23], [24], and [5]. According to Table IV, the handset used was G.712 type at 16 kHz, and all proposed noise measurements in this table were at SNR 30 dB and mixture size 256.…”
Section: Related Work (mentioning)
confidence: 99%
“…Nevertheless, this study lacked a large number of speakers, as only 50 self collected speakers were used. In [13], 1,000 speakers were selected from YouTube to construct an I-vector speaker identification framework, but this non-standard database did not include noisy conditions.…”
Section: Introduction (mentioning)
confidence: 99%
“…where, both $i$ and $j$ take values 1 and 2, therefore $f^{\mathrm{weight}}_{ij}$ takes one of four values $f^{\mathrm{weight}}_{11}$, $f^{\mathrm{weight}}_{12}$, $f^{\mathrm{weight}}_{13}$, and $f^{\mathrm{weight}}_{22}$, and $f^{\mathrm{weight}}_{11}$ is the linear combination of $f_1$ and $g_1$, likewise $f^{\mathrm{weight}}_{12}$ is the linear combination of $f_1$ and $g_2$ and so on. For each $f^{\mathrm{weight}}_{ij}$, $\omega_\beta$ can take on one of four values, namely, $\omega_\beta \in \{0.9, 0.8, 0.77, 0.7\}$ which is chosen to give empirically the best SIA.…”
Section: Fusion Strategies (mentioning)
confidence: 99%
“…However, the identification rate using the NIST 2003 database was poor. In [13], approximately 1000 speakers were selected and recordings were made, including in an acoustics room, with noise, and with varying microphone distance. However, the conditions were perhaps unfair and a non-standard database (derived from YouTube) was used.…”
Section: Introduction (mentioning)
confidence: 99%