2017
DOI: 10.5370/jeet.2017.12.1.373
Applying the Bi-level HMM for Robust Voice-activity Detection

Abstract: This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. For real-time VAD applications, it is inadequate to employ post-processing to remove burst clippings from the VAD output decision. To tackle this problem, building on the bi-level hidden Markov model, in which a state layer is inserted into a typical hidden Markov model (HMM), we formulated a robust VAD method that requires no additional post-processing. In the method, a forward-inference-ratio t…
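The abstract's description of the decision rule is cut off above. Purely as an illustration of the general idea of HMM-based, post-processing-free VAD, the sketch below runs a standard two-state forward recursion (non-speech vs. speech) and labels each frame by the ratio of its normalized forward probabilities. The transition matrix, priors, single energy feature, observation likelihoods, and threshold are hypothetical placeholders, not the paper's bi-level model.

```python
import numpy as np

# Minimal sketch (not the authors' bi-level implementation): a two-state HMM
# (0 = non-speech, 1 = speech) whose frame-wise decision comes from the ratio
# of forward probabilities, so no separate smoothing pass is applied.

A = np.array([[0.95, 0.05],      # hypothetical state-transition matrix
              [0.05, 0.95]])
pi = np.array([0.5, 0.5])         # hypothetical initial state priors

def frame_likelihoods(frame_energy):
    """Hypothetical observation likelihoods p(o_t | state) from one energy feature."""
    noise_like  = np.exp(-0.5 * ((frame_energy - 0.0) / 1.0) ** 2)
    speech_like = np.exp(-0.5 * ((frame_energy - 5.0) / 2.0) ** 2)
    return np.array([noise_like, speech_like])

def online_vad(frame_energies, threshold=1.0):
    """Normalized forward recursion, frame by frame; decide by the forward ratio."""
    alpha = pi * frame_likelihoods(frame_energies[0])
    alpha /= alpha.sum()
    decisions = [alpha[1] / alpha[0] > threshold]
    for e in frame_energies[1:]:
        alpha = (alpha @ A) * frame_likelihoods(e)   # forward step
        alpha /= alpha.sum()                         # keep it numerically stable
        decisions.append(alpha[1] / alpha[0] > threshold)
    return decisions

# Example: a quiet stretch followed by louder (speech-like) frames.
energies = [0.2, 0.1, 0.3, 4.8, 5.2, 4.9, 0.2, 0.1]
print(online_vad(energies))
```

Because the forward probabilities carry the state history through the transition matrix, isolated noisy frames are less likely to flip the decision, which is the kind of burst-clipping suppression the abstract attributes to the model itself rather than to post-processing.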

Cited by 2 publications (2 citation statements)
References 15 publications
“…On the contrary, the more likely an event is, the greater its probability and the smaller the amount of information it carries. Therefore, the short-term average amount of information provided by the source is expressed as Eq. (11). Since the amplitude of the sound signal has a large dynamic range relative to that of the background noise, its amplitudes spread widely over the range (-M, M) and the information entropy is large, whereas an invalid segment contains only the small, relatively concentrated amplitudes of the background noise, so its information entropy is small. As shown in Figure 5, as the signal-to-noise ratio decreases, the short-term information entropy fluctuates to a certain extent, but the start and end points of the sound can still be distinguished.…”
Section: Information Entropy
confidence: 99%
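As a rough illustration of the short-term information entropy the citing work describes, the sketch below histograms one frame's amplitudes over (-M, M) and computes the Shannon entropy of the bin frequencies. The frame length, bin count, value of M, and synthetic signals are arbitrary choices for the example, not values taken from the cited paper.

```python
import numpy as np

# Minimal sketch of frame-level information entropy for endpoint detection:
# amplitudes are histogrammed over (-M, M), bin frequencies are treated as
# probabilities, and Shannon entropy is computed per frame.

def short_term_entropy(frame, M=1.0, n_bins=50):
    hist, _ = np.histogram(frame, bins=n_bins, range=(-M, M))
    p = hist / max(hist.sum(), 1)   # empirical bin probabilities
    p = p[p > 0]                    # drop empty bins (0·log 0 := 0)
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
noise  = 0.02 * rng.standard_normal(400)                                # concentrated amplitudes
speech = 0.6 * np.sin(2 * np.pi * 7 * np.linspace(0, 1, 400)) + noise   # widely spread amplitudes
print(short_term_entropy(noise), short_term_entropy(speech))
# The speech-like frame occupies many more bins, so its entropy is larger,
# which is the contrast the quoted passage uses to separate sound from noise.
```
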
“…Zero-crossing-rate detection algorithms and detection algorithms based on spectral entropy. Literature [9][10][11] combined frequency-domain and time-domain features and studied detection algorithms based on short-term energy and information entropy as well as algorithms based on Mel-frequency cepstral coefficients. Literature [12] uses a wavelet-transform convolution method for detection.…”
Section: Introduction
confidence: 99%