2011
DOI: 10.1109/tasl.2011.2109379

The Delta-Phase Spectrum With Application to Voice Activity Detection and Speaker Recognition

Cited by 48 publications (17 citation statements)
References 37 publications
“…In few findings, we see that vector quantization and VAD have been used along with MFCC to increase the efficiency of speaker recognition [2]. Speaker recognition rates are also controlled with the help of Mel-frequency delta phase (MFDP) along with MFCC and it is found that error probability is less in MFCC, but when both MFCC and MFDP are used together, it proves to be more efficient [3]. A review of the use of phase information in speech processing, however, indicates that broadly effective phase domain features remain difficult to extract [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…Nowadays, many speech-related applications are developed to facilitate our daily lives. Voice activity detection (VAD), which detects speech segments in an audio stream, is often included in the front-end of speech-related systems, such as in telecommunication systems [1], [2], robust automatic speech recognition system [3] and speaker recognition systems [4], [5]. Therefore, a robust VAD for any noise condition is greatly needed.…”
Section: Introduction (mentioning)
confidence: 99%
“…According to [10], energy VAD with spectral subtraction enhancement can outperform more advanced statistical model VAD [3]. Alternative ways to tackle noise include alternative features such as periodicity [11] or phase [12].…”
Section: Introduction (mentioning)
confidence: 99%
“…Beyond the simple energy VAD, at the other extreme are methods that adopt an off-the-shelf phone recognizer or trainable models for VAD [13,14,15,16,12]. For instance, phone posterior probabilities can be merged and combined with energy measures [13].…”
Section: Introduction (mentioning)
confidence: 99%