2014
DOI: 10.1109/taslp.2014.2352451

Detection of Glottal Closure Instants Based on the Microcanonical Multiscale Formalism

Abstract: This paper presents a novel algorithm for automatic detection of Glottal Closure Instants (GCI) from the speech signal. Our approach is based on a novel multiscale method that relies on precise estimation of a multiscale parameter at each time instant in the signal domain. This parameter quantifies the degree of signal singularity at each sample from a multi-scale point of view and thus its value can be used to classify signal samples accordingly. We use this property to develop a simple algorithm for detectio…

Cited by 30 publications (30 citation statements)
References 32 publications
“…The instant of significant excitation (within each period) is termed as Epoch which coincides with instant of closure of the glottis [3]. The problem of detecting the precise locations of such Glottal Closure Instants (GCIs) from speech signal has been studied for decades given its importance in several speech processing tasks [4,5,6,7,8,9,10,11,12]. Most of the successful GCI detectors adopt a two-stage approach - (i) Obtaining an intermediate representation from the speech signal, which explicitly manifests GCIs as discontinuities, impulses, extremas or as other perceptual events and (ii) detecting precise temporal location of glottal closures using custom-made algorithms.…”
Section: Background and Previous Work (mentioning)
confidence: 99%
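
The two-stage structure described in the statement above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy signal, the use of a plain first difference as the intermediate representation, and the function names and parameter values are hypothetical and are not taken from the cited papers or from the MMF-based detector itself.

```python
def toy_voiced_signal(n_samples, period, decay=0.9):
    """Hypothetical test signal: a decaying response that is
    re-excited by a unit impulse every `period` samples."""
    signal = []
    value = 0.0
    for n in range(n_samples):
        value *= decay
        if n % period == 0:
            value += 1.0  # idealized excitation impulse
        signal.append(value)
    return signal


def intermediate_representation(signal):
    """Stage 1 (stand-in): first difference of the signal, which
    turns abrupt onsets into positive spikes. Index 0 is set to 0.0
    because no previous sample exists."""
    return [0.0] + [signal[n] - signal[n - 1] for n in range(1, len(signal))]


def pick_peaks(rep, threshold, min_distance):
    """Stage 2: keep local maxima above `threshold` that lie at
    least `min_distance` samples after the previously kept peak."""
    peaks = []
    for n in range(1, len(rep) - 1):
        is_local_max = rep[n] > rep[n - 1] and rep[n] >= rep[n + 1]
        if is_local_max and rep[n] > threshold:
            if not peaks or n - peaks[-1] >= min_distance:
                peaks.append(n)
    return peaks


# Assumed values: 400 samples, one excitation every 80 samples.
signal = toy_voiced_signal(n_samples=400, period=80)
rep = intermediate_representation(signal)
candidates = pick_peaks(rep, threshold=0.5, min_distance=40)
print(candidates)
```

In this sketch the excitation at sample 0 is not reported, since the first difference is undefined there. Real detectors replace stage 1 with a far more informative representation and stage 2 with refinement heuristics or dynamic programming.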
“…For instance, [5,6,10,13] choose either linear-prediction residual or glottal flow derivative as the representative signal. Other class of algorithms do not explicitly make any model assumption for speech production rather indirectly use the properties of excitation signal (such as its impulsive nature) and estimate appropriate representations (E.g., zero-frequency filtered signal [7], mean-based signal [8], wavelet-decompositions [14], singularity exponents [11]). During the second stage, aforementioned algorithms employ several heuristics to extract (or refine) the GCIs.…”
Section: Background and Previous Work (mentioning)
confidence: 99%
“…Then, dynamic-programming or peak-picking is used to select the GCIs among the detected candidates. Examples of such approaches are the SEDREAMS [15], DYPSA [13], DPI [16], YAGA [17], ZFR [14], or MMF [18] algorithms. Although such approaches have been shown to perform reasonably well, they rely on different processing techniques that require manual tuning of parameters (e.g.…”
Section: Introduction (mentioning)
confidence: 99%
“…From the studies in [18], it was observed that most of the epoch detection methods were shown to provide good accuracy on the speech data collected in the lab environments. Also, some attempts were made to see the effectiveness of these methods for additive noise degraded conditions [19][20][21][22][23]. However, there are not many attempts in GCI detection for the degraded conditions like telephone quality speech.…”
Section: Introductionmentioning
confidence: 99%
“…An optimal LoMA is computed within a pitch period using a dynamic programming to locate GCIs. In [19], a nonlinear formalism, namely microcanonical multiscale formalism was used to highlight the impulses present in the speech signal directly.…”
Section: Introductionmentioning
confidence: 99%