Comparison of energy-based endpoint detectors for speech signal processing

Ganapathiraju, Aravind; Webster, L. D.; Trimble, J. E.; Bush, K.; Kornman, P.

doi:10.1109/secon.1996.510121

Cited by 23 publications

(14 citation statements)

References 8 publications

“…The step-size of the sliding window indicates the resolution of the system. For the purpose of VAD, we need to evaluate the following statistical hypotheses: H_0: (x_1 … Using the log-value of the Generalized Likelihood Ratio Test (GLRT) associated with the defined hypothesis test, the distance between the two segments in Fig. 1 is: [equation lost in extraction]…”

Section: Bayesian Information Criterion (mentioning)

confidence: 99%

“…First, we select a sufficiently big sliding window, model it and its adjacent sub-segments using GΓD instead of GD, and calculate the distance d R associated with the GLRT using (1). Here, as in [9], we are making the assumption that both noise and speech signals have uncorrelated components in the DCT domain.…”

Section: Distbic Using Generalized Gamma Distribution (mentioning)

confidence: 99%
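
The sliding-window GLRT distance described in the two statements above can be sketched as follows. This is a minimal illustration, not the citing authors' method: it assumes each segment is i.i.d. univariate Gaussian (the quoted work models segments with GΓD instead), and the function names `gaussian_glrt_distance` and `sliding_distances` are hypothetical. The distance compares modelling a window as one segment against modelling it as two adjacent sub-segments.

```python
import math


def _variance(samples):
    """Maximum-likelihood (biased) variance estimate."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / n


def gaussian_glrt_distance(x, y, eps=1e-12):
    """Log-GLRT distance between adjacent segments x and y.

    ASSUMPTION: each segment is i.i.d. Gaussian. With ML estimates the
    maximised log-likelihood of n samples is -(n/2) * (log(2*pi*var) + 1),
    so the constant terms cancel and only the log-variances remain.
    """
    z = list(x) + list(y)  # the window modelled as a single segment
    return 0.5 * (
        len(z) * math.log(_variance(z) + eps)
        - len(x) * math.log(_variance(x) + eps)
        - len(y) * math.log(_variance(y) + eps)
    )


def sliding_distances(signal, half_window, step):
    """Distance curve for a window of 2 * half_window samples moved by step."""
    distances = []
    for start in range(0, len(signal) - 2 * half_window + 1, step):
        mid = start + half_window
        left = signal[start:mid]
        right = signal[mid:mid + half_window]
        distances.append(gaussian_glrt_distance(left, right))
    return distances
```

Peaks in the returned curve mark candidate change points between noise and speech. A smaller `step` gives finer time resolution at a higher computational cost, which matches the remark that the step size sets the resolution of the system.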

“…The detection principles of conventional VADs are usually energy-based approaches, which have been proved computationally efficient to such an extent that they allow real-time signal processing [1]. Moreover, these methods work relatively well in high signal to noise ratios (SNR) and for known stationary noise.…”

Section: Introduction (mentioning)

confidence: 99%

Voice Activity Detection with Generalized Gamma Distribution

Almpanidis

Kotropoulos

2006

2006 IEEE International Conference on Multimedia and Expo

“…Infusion of pitch and duration information, use of adaptive thresholds, augmentation of zero crossover rate result in somewhat improved performance [4]. The proposed algorithms, replaces entropy of the speech as the key feature for boundary detection.…”

Section: Entropy-based Speech Segmentation Algorithm (mentioning)

confidence: 99%

“…The most commonly used method of endpoint detection is the use of short-time or spectral energy [1,2,3,4]. Typically an adaptive threshold is employed based on the features of the energy profile to differentiate between the speech segments and the background noise.…”

Section: Introduction (mentioning)

confidence: 99%
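
The energy-based approach mentioned in the statement above can be illustrated with a short sketch. It is one possible reading, not the detector from any of the cited papers: the threshold is adapted to the energy profile by estimating a noise floor from the first few frames, which assumes the recording starts with background noise only. The names `short_time_energy_db` and `detect_endpoints`, and the defaults `noise_frames=10` and `margin_db=6.0`, are hypothetical choices.

```python
import math


def short_time_energy_db(signal, frame_len, hop):
    """Short-time energy per frame, in dB."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))  # avoid log(0)
    return energies


def detect_endpoints(energies_db, noise_frames=10, margin_db=6.0):
    """Return (start_frame, end_frame) pairs, end exclusive.

    ASSUMPTION: the first noise_frames frames contain noise only, so their
    mean energy is a usable noise-floor estimate for the threshold.
    """
    noise_floor = sum(energies_db[:noise_frames]) / noise_frames
    threshold = noise_floor + margin_db
    segments = []
    start = None
    for i, energy in enumerate(energies_db):
        if energy > threshold and start is None:
            start = i  # speech onset
        elif energy <= threshold and start is not None:
            segments.append((start, i))  # speech offset
            start = None
    if start is not None:  # signal ends while still above threshold
        segments.append((start, len(energies_db)))
    return segments
```

Practical detectors usually add hangover or minimum-duration rules on top of this so that short dips inside a word do not split it into several segments.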

A robust algorithm for detecting speech segments using an entropic contrast

Waheed

Weaver

Salam

The 2002 45th Midwest Symposium on Circuits and Systems, 2002. MWSCAS-2002.

This paper addresses the issue of automatic word/sentence boundary detection in both quiet and noisy environments. We propose to use an entropy-based contrast function between the speech segments and the background noise. A simplified data-based scheme of computing the entropy of the speech data is presented. The entropy-based contrast exhibits better-behaved characteristics as compared to the energy-based methods. An adaptive threshold is used to determine the candidate speech segments, which are subjected to word/sentence constraints. Experimental results show that this algorithm outperforms energy-based algorithms. The improved detection accuracy of speech segments results in at least 25% improvement of recognition performance for isolated speech and more than 16% for connected speech. For continuous speech, a preprocessing stage comprising the proposed speech segment detection makes the overall HMM-based scheme more computationally efficient by rejection of silence periods.
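
To make the idea of an entropy profile concrete, here is a rough sketch. The abstract does not specify the authors' simplified entropy computation, so this swaps in a plain histogram-based Shannon entropy per frame as an assumption. The names `frame_entropy` and `entropy_profile` and the choice of 32 bins are hypothetical.

```python
import math


def frame_entropy(frame, lo, hi, bins=32):
    """Shannon entropy (bits) of a frame's amplitude histogram over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in frame:
        idx = min(int((s - lo) / width), bins - 1)  # clamp s == hi into last bin
        counts[idx] += 1
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def entropy_profile(signal, frame_len, hop, bins=32):
    """Per-frame entropy using one histogram range for the whole signal.

    ASSUMPTION: a shared amplitude range is used, so low-level background
    frames fall into few bins (low entropy) and speech frames spread out.
    """
    lo, hi = min(signal), max(signal)
    if hi == lo:  # constant signal: avoid a zero bin width
        hi = lo + 1.0
    return [
        frame_entropy(signal[start:start + frame_len], lo, hi, bins)
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]
```

A threshold on this profile, for example one derived from its minimum and maximum, would then play the role of the adaptive threshold that the abstract describes for selecting candidate speech segments.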