2010
DOI: 10.1016/j.specom.2009.08.003
Noise robust voice activity detection based on periodic to aperiodic component ratio

Cited by 56 publications (41 citation statements)
References 72 publications
“…The maximum value can be employed to detect periodicity [30]. Normalizing the maximum value based on an estimate of the aperiodic components increases the robustness of the feature as described in [25] and similarly in [35].…”
Section: Pitch and Harmonicity
confidence: 99%
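The statement above describes normalizing the autocorrelation maximum by an estimate of the aperiodic components to obtain a noise-robust periodicity feature. The sketch below illustrates that general idea; it is an assumption-laden simplification, not the exact periodic-to-aperiodic component ratio (PAR) defined by Ishizuka et al. (2010), and the lag range and normalization are illustrative choices.

```python
import numpy as np

def periodicity_score(frame, fs, f0_min=60.0, f0_max=400.0):
    """Normalized autocorrelation peak as a simple periodicity feature.

    A hedged sketch of the idea in the citing papers: the maximum of the
    normalized autocorrelation within a plausible pitch-lag range measures
    how periodic the frame is, and dividing the periodic part by the
    residual (aperiodic) part increases robustness. NOT the exact PAR
    method of Ishizuka et al. (2010).
    """
    frame = frame - np.mean(frame)
    energy = np.dot(frame, frame)
    if energy == 0.0:
        return 0.0
    # Full autocorrelation, keep non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / ac[0]  # normalize so the lag-0 value is 1
    lag_min = int(fs / f0_max)                    # shortest pitch period
    lag_max = min(int(fs / f0_min), len(ac) - 1)  # longest pitch period
    if lag_min >= lag_max:
        return 0.0
    peak = np.max(ac[lag_min:lag_max + 1])
    # Crude periodic-to-aperiodic style normalization: peak vs. residual.
    return peak / max(1.0 - peak, 1e-12)
```

A strongly periodic frame (e.g. a voiced vowel) yields a large score because the residual 1 − peak is small, while broadband noise yields a small score; thresholding this ratio gives a simple periodicity-based voicing decision.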
“…The second stage is the recognition of whether the detected voice is part of a conversational utterance or not. Numerous automatic human voice detection algorithms have been proposed, including ones based on periodicity [13], power ratio in the frequency domain [14], and frequency deviation [15]-[17]. To correctly recognize the conversation period and the end of the conversation, the voice of the talking partner, not only that of the target person of the estimation, has to be detected.…”
Section: Automatic Conversational Voice Detection
confidence: 99%
“…These methods utilize the fact that vowels exhibit strong (quasi-)periodicity and apply it to discriminate speech from silence. Periodicity-based approaches are usually more robust in noisy environments; however, they require more computational effort than energy-based ones (Ishizuka et al., 2010). Finally, in the Broadcast News field, most systems discriminate between acoustic classes like speech, music, music and speech, and silence.…”
Section: Front-end Features and Preprocessing Steps
confidence: 99%
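The statement above contrasts periodicity-based detection with cheaper energy-based detection. For context, the sketch below shows a minimal energy-based VAD baseline of the kind being compared against; the frame/hop sizes and the fixed threshold relative to the loudest frame are illustrative assumptions, and practical systems adapt the threshold to an estimated noise floor instead.

```python
import numpy as np

def energy_vad(signal, fs, frame_ms=25.0, hop_ms=10.0, thresh_db=-40.0):
    """Minimal energy-based VAD baseline (illustrative parameters).

    A frame is flagged as speech when its log energy exceeds a fixed
    threshold relative to the loudest frame in the signal. This is the
    cheap baseline that periodicity-based methods improve on in noise.
    """
    frame_len = int(fs * frame_ms / 1000.0)
    hop = int(fs * hop_ms / 1000.0)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    # Per-frame energies, with a tiny floor to keep log10 defined.
    energies = [
        float(np.dot(signal[i * hop:i * hop + frame_len],
                     signal[i * hop:i * hop + frame_len])) + 1e-12
        for i in range(n_frames)
    ]
    ref = max(energies)
    return [10.0 * np.log10(e / ref) > thresh_db for e in energies]
```

The per-frame cost here is one dot product, versus a full autocorrelation for periodicity features, which is the computational trade-off the quoted passage refers to.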