2007
DOI: 10.1109/tasl.2007.902870

Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006

Citations: cited by 189 publications (171 citation statements)
References: 24 publications
“…6c and d). In contrast, when the separation between the categories is greater and the amount of data is small, the increases in the ELUB 15 A potential alternative that avoids the sudden truncation could be to fit a sigmoidal function in the logistic space [45]. 16 We believe that this range of amount of sample data and range of separation between the categories is sufficient to gain an understanding of the relative behaviour of the procedures and to conceptually interpolate and extrapolate within and beyond these ranges.…”
Section: Exploration Of the Behaviour Of The Four Procedures Using Si…
Citation type: mentioning
confidence: 99%
“…In this work we choose the high-level fusion approach due to its ease of use for both multi-modal [8] and multi-algorithm [44,45,46] fusion.…”
Section: Bi-modal and Multi-algorithm Authentication Systems
Citation type: mentioning
confidence: 99%
“…We take the well-known statistical linear logistic regression approach, which has been successfully employed for combining heterogeneous speaker and face authentication classifiers [44,45,46] and for bi-modal (face and speaker) authentication [8].…”
Section: Linear Logistic Regression
Citation type: mentioning
confidence: 99%
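
The citation statement above refers to linear logistic regression as a way of combining scores from heterogeneous classifiers. As a rough illustration only (not the code of the cited paper or of the STBU submission), the sketch below assumes that fusion means an affine combination of per-system scores whose weights are trained by minimizing the logistic loss on labelled trials. The function names `train_fusion` and `fuse`, the learning rate, the number of epochs, and the synthetic two-system scores are all assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse(w, b, s):
    # Fused score: offset plus weighted sum of the per-system scores.
    return b + sum(wi * si for wi, si in zip(w, s))


def train_fusion(scores, labels, lr=0.1, epochs=500):
    """Fit fusion weights by batch gradient descent on the logistic loss.

    scores: list of score vectors, one entry per system, one vector per trial.
    labels: 1 for a target trial, 0 for a non-target trial.
    (Hypothetical helper; hyperparameters are illustrative.)
    """
    k = len(scores[0])
    n = len(scores)
    w = [0.0] * k
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(fuse(w, b, s)) - y  # gradient of the loss w.r.t. the fused score
            for i in range(k):
                grad_w[i] += err * s[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic scores from two hypothetical systems: the second is noisier.
rng = random.Random(0)


def make_trial(is_target):
    mean = 1.0 if is_target else -1.0
    return [rng.gauss(mean, 1.0), rng.gauss(mean, 2.0)]


labels = [1] * 200 + [0] * 200
scores = [make_trial(y == 1) for y in labels]

w, b = train_fusion(scores, labels)
fused = [fuse(w, b, s) for s in scores]
mean_target = sum(f for f, y in zip(fused, labels) if y == 1) / 200
mean_nontarget = sum(f for f, y in zip(fused, labels) if y == 0) / 200
print("weights:", w, "offset:", b)
print("mean fused score (target / non-target):", mean_target, mean_nontarget)
```

In this toy setup the fused score is a single number per trial that can be thresholded like any individual system's score; real evaluations would train the weights on held-out development trials rather than on the trials being scored.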
“…But the biggest difference, as highlighted in Brümmer et al (2007), is that especially from SRE 2006, "systems no longer train individual speaker models from some minutes of speech, but whole systems are trained on hundreds of hours of speech in whole NIST SRE databases" (p. 2082), transforming the conceptually simple speaker detection task, classically seen as that of comparing two utterances to determine if they come or not from the same speaker, into a serious big data task where systems are designed to jointly optimize the detection of thousands of speakers in hundreds of thousands of comparisons, where the speech segments in the comparisons are tens of thousands of utterances of varied and mixed channel, speaking style, duration and noise characteristics.…”
Section: Big Data Evaluations (2006-2012): Session Variability Compen…
Citation type: mentioning
confidence: 99%