2020
DOI: 10.1016/j.apacoust.2020.107344

Voice activity detection with quasi-quadrature filters and GMM decomposition for speech and noise

Cited by 9 publications (3 citation statements)
References 18 publications
“…Semantic analysis procedures in radio content mainly involve voice detection, speech recognition and speaker identification tasks. Machine learning approaches based on clustering techniques that determine speech/non-speech frames were implemented for voice activity detection via Gaussian mixture models, Laplacian similarity matrices, expectation maximization algorithms, hidden Markov chains and artificial neural networks [26][27][28][29][30]. A more specific and interesting audio pattern that can be detected in audio signals, i.e., in broadcast programs, refers to phone line voices, due to the contained particular spectral audio properties [1,25,26].…”
Section: Background Work and Problem Definition (mentioning)
confidence: 99%
“…There has been a set of research works on machine learning approaches applied to the broad voice analysis research area such as pathological voice detection [25,26], voice activity detection [27,28]. A number of studies have investigated voice analysis based on specific machine learning algorithms such as decision trees [29], support vector machine (SVM) [30,31], hidden Markov model (HMM) [32,33], Gaussian mixture model (GMM) [34,35], artificial neural networks (ANN) [36,37] and have reported high accuracy and performance [38][39][40].…”
Section: Introduction (mentioning)
confidence: 99%
“…Finally, the VAD decision is based on a threshold derived from the parameter contour at each utterance. This method is also used in [15], which proposed changing envelope calculation and utilized histograms and estimators of probability distributions to determine the detection threshold; citing that the SFF method is also used in the enhancement of speech intelligibility [16].…”
Section: Introduction (mentioning)
confidence: 99%