2006
DOI: 10.1109/tsa.2005.858054

Text-independent speaker recognition based on the Hurst parameter and the multidimensional fractional Brownian motion model

Cited by 46 publications (48 citation statements)
References: 24 publications
“…Two different vector-based distances were employed in the feature space: the classic Euclidean distance and the Minkowski distance. We let the parameter δ vary from 2 to 25, whereas the dimensionality of the feature vector was analyzed in the range [2, 10]. The results are graphically shown in Fig.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…The Hurst exponent H has a clear interpretation: if H = 1/2, then the fBm represents the standard Brownian motion (namely the process has no memory); for H > 1/2, the process is characterized by a persistence (positive or negative), namely the process has a clear trend; finally, for H < 1/2 the process is characterized by an antipersistence. Even if this method has been largely applied in the finance community (mainly to understand some characteristics of the financial series), some works applying the fBm function have also been presented in the pattern recognition context, like in medical imaging [8] or in speech recognition [9]. It is important to note that fBm is related to the fractal theory [10] (a theory largely employed in image analysis [11] or even object classification [12,13]), since it is modeling self-similarity.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
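
The interpretation of H quoted above can be illustrated with a minimal sketch. It is not the estimator used in the cited paper (another statement on this page describes that as wavelet-based). Instead it assumes a simple variance-scaling estimate, based on the fBm property that Var[X(t+τ) − X(t)] grows like τ^(2H). The function names `estimate_hurst` and `random_walk`, the lag set, and the path length are all hypothetical choices made for illustration.

```python
import math
import random


def estimate_hurst(path, lags=(1, 2, 4, 8, 16, 32)):
    """Estimate H from the scaling Var[X(t+lag) - X(t)] ~ lag**(2H).

    Illustrative variance-scaling estimator (an assumption of this sketch),
    not the wavelet-based estimator referenced elsewhere on this page.
    """
    log_lags, log_vars = [], []
    for lag in lags:
        diffs = [path[i + lag] - path[i] for i in range(len(path) - lag)]
        mean = sum(diffs) / len(diffs)
        var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
        log_lags.append(math.log(lag))
        log_vars.append(math.log(var))

    # Least-squares slope of log(variance) against log(lag); slope = 2H.
    n = len(log_lags)
    mx = sum(log_lags) / n
    my = sum(log_vars) / n
    num = sum((x - mx) * (y - my) for x, y in zip(log_lags, log_vars))
    den = sum((x - mx) ** 2 for x in log_lags)
    return (num / den) / 2.0


def random_walk(n, seed=0):
    """Discrete approximation of standard Brownian motion (H = 1/2)."""
    rng = random.Random(seed)
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, 1.0))
    return path


# A memoryless random walk should give an estimate close to 1/2.
h = estimate_hurst(random_walk(20000))
print(f"estimated H = {h:.3f}")
```

On a persistent series the same estimate would be expected to rise above 1/2, and on an antipersistent one to fall below it. The accuracy of this simple estimator depends on the series length and the chosen lags.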
“…Daubechies [25]), which was computed by applying a wavelet-based multidimensional transformation of the short-time input speech in the work by Sant'Ana [124]. Thanks to its statistical definition, pH is robust to channel distortions as it models the stochastic behavior of input speech signal (see Zao et al [125], or Palo et al [126]).…”
Section: Wavelet-based Physical Features
Citation type: mentioning
confidence: 99%
“…Thanks to its statistical definition, pH is robust to channel distortions as it models the stochastic behavior of input speech signal (see Zao et al [125], or Palo et al [126]). pH was originally applied as a means to improve speech-related problems, such as text-independent speaker recognition [124], speech emotion classification [125,126], or speech enhancement [127]. However, it has also been applied to sound source localization in noisy environments recently, as in the work by Dranka and Coelho [128].…”
Section: Wavelet-based Physical Features
Citation type: mentioning
confidence: 99%
“…The goal of the multi-style training is to reduce the mismatch between training and test features by corrupting the speech signals. Features based on discrete wavelet transform, and pH features [94] are fused with the MFCCs. Speech enhancement techniques are also applied.…”
Section: Robust Features Against Additive Noise
Citation type: mentioning
confidence: 99%