2000
DOI: 10.1016/s0167-6393(99)00070-9

Localization and selection of speaker-specific information with statistical modeling

Abstract: Statistical modeling of the speech signal has been widely used in speaker recognition. The performance obtained with this type of modeling is excellent in laboratories but decreases dramatically for telephone or noisy speech. Moreover, it is difficult to know which piece of information is taken into account by the system. In order to solve this problem and to improve the current systems, a better understanding of the nature of the information used by statistical methods is needed. This knowledge should allow t…

Cited by 46 publications (50 citation statements)
References 20 publications
“…This result is also consistent with findings of other studies such as [124,125] that the high frequency region contains the most discriminative speaker information.…”
Section: The Effect Of Speech Bandwidth On Accent Recognition
Citation type: supporting
confidence: 93%
“…How to identify the reliable feature parts assuming minimum noise information remains a focus of the research for the MF method. Previous studies have suggested different methods (see, for example, [13], [14], [24]-[26]). In this paper, we study the posterior union model (PUM) [27].…”
Section: Combining Wf Nc With Missing-Feature Technique (Wf+mf Nc+mf)
Citation type: mentioning
confidence: 99%
“…Other techniques rely on a statistical model of the noise, for example, parallel model combination (PMC) [9], [10], or on the use of microphone arrays [11], [12]. Recent studies on the missing-feature method have shown improved robustness for speech data subjected to partial noise corruption (e.g., [13], [14]). …”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…For this application, the watermark bits are embedded at locations where fewer speaker-specific sub-bands are available. Basically, the discriminative speaker features are contained within the low- and high-frequency bands: the glottis frequency range is between 100 and 400 Hz, the piriform fossa range is between 4 and 5 kHz, and the constriction of consonants occurs at 7.5 kHz [12][13][14].…”
Section: Introduction
Citation type: mentioning
confidence: 99%