1999
DOI: 10.1109/89.784109

Modeling of the glottal flow derivative waveform with application to speaker identification

Abstract: Speech production has long been viewed as a linear filtering process, as described by Fant in the late 1950s [10]. The vocal tract, which acts as the filter, is the primary focus of most speech work. This thesis develops a method for estimating the source of speech, the glottal flow derivative. Models are proposed for the coarse and fine structure of the glottal flow derivative, accounting for nonlinear source-filter interaction, and techniques are developed for estimating the parameters of these models. The i…

Citations: cited by 236 publications (185 citation statements)
References: 36 publications
“…For speech signals, the zeros are introduced by pitch, nasals and the short-time analysis window function. Further, zeros are also introduced by the glottal return phase (the time interval between the most negative value of the glottal flow derivative and glottal closure [Plumpe et al (1999)]) when the glottal flow is modelled as a linear filter [Doval et al (2003)]. The presence of zeros makes the denominator term in Equation 3 vanish, leading to an ill-behaved group delay function.…”
Section: Representations Of Group Delay Functions (mentioning)
confidence: 99%
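
The quoted statement refers to an "Equation 3" that is not reproduced on this page. As a rough illustration only, the sketch below assumes the common textbook form of the group delay, (X_R Y_R + X_I Y_I) / |X|^2, where X is the Fourier transform of x[n] and Y is the Fourier transform of n·x[n]. The function name `group_delay`, the FFT size, and the two-sample test signals are hypothetical choices made for this example, not taken from the citing paper. It shows how a zero close to the unit circle shrinks |X|^2 and makes the result blow up.

```python
import numpy as np

def group_delay(x, nfft=512):
    """Group delay via the (assumed) standard form (Xr*Yr + Xi*Yi) / |X|^2."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, nfft)        # spectrum of x[n]
    Y = np.fft.rfft(n * x, nfft)    # spectrum of n * x[n]
    return (X.real * Y.real + X.imag * Y.imag) / np.abs(X) ** 2

# A pure delay of 5 samples has a constant group delay of 5.
delta5 = np.zeros(16)
delta5[5] = 1.0
gd_delay = group_delay(delta5)

# One real zero far from the unit circle (radius 0.5) versus one very
# close to it (radius 0.999). Both zeros sit at frequency 0.
gd_far = group_delay([1.0, -0.5])
gd_near = group_delay([1.0, -0.999])

print(gd_delay[0], gd_far[0], gd_near[0])
```

At frequency 0 the well-separated zero gives a modest value, while the near-unit-circle zero gives a value about three orders of magnitude larger, because the denominator |X|^2 is almost zero there.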
“…The linear predictive coding (LPC) coefficients are readily estimated using classical methods. In addition to providing a good approximation to the vocal tract filter (VTF), the prediction error waveform is a first-order approximation to the voice source waveform (VSW), which approximates the derivative of the glottal flow [2]. Refer to Figure 1, which is a block-diagram of GLOMM.…”
Section: Review Of GLOMM (mentioning)
confidence: 99%
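
The relationship described in this statement, where the linear prediction error serves as a rough stand-in for the voice source, can be sketched as follows. This is a minimal illustration assuming the autocorrelation method of LPC on a synthetic signal (an impulse train passed through a known two-pole filter). It is not the GLOMM method or the estimation procedure of the cited paper, and the names `lpc_autocorr` and `prediction_error` are hypothetical.

```python
import numpy as np

def lpc_autocorr(frame, order):
    """LPC by the autocorrelation method; returns [1, a1, ..., ap]."""
    w = frame * np.hamming(len(frame))
    # Autocorrelation at lags 0..order of the windowed frame
    r = np.array([np.dot(w[:len(w) - k], w[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def prediction_error(frame, a):
    """Inverse-filter the frame with A(z) to get the prediction error."""
    return np.convolve(frame, a)[:len(frame)]

# Synthetic "speech": impulse train (period 80 samples) through the
# all-pole filter 1 / (1 - 1.3 z^-1 + 0.8 z^-2). Assumed toy values.
N, period = 400, 80
exc = np.zeros(N)
exc[::period] = 1.0
x = np.zeros(N)
for n in range(N):
    x[n] = exc[n]
    if n >= 1:
        x[n] += 1.3 * x[n - 1]
    if n >= 2:
        x[n] -= 0.8 * x[n - 2]

a = lpc_autocorr(x, order=2)
e = prediction_error(x, a)
print("estimated A(z):", np.round(a, 3))
print("residual/signal energy:", np.sum(e ** 2) / np.sum(x ** 2))
```

In this toy setting the residual concentrates its energy at the excitation instants. For real speech the excitation is not an impulse train, which is one reason the statement above calls the prediction error only a first-order approximation.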
“…It is logical, then, that much speaker-related information is missing from MFCC. Attempts to add voice source information back into speaker identification systems to improve them have met limited success [1], [2], [3], [4], probably due to the difficulty and reliability of estimating the voice source waveform itself. We previously introduced the GLOMM method [5], which was based on detecting glottal events (glottal opening/closing) by detecting times of high linear prediction error.…”
Section: Introduction and Previous Work (mentioning)
confidence: 99%
“…While techniques for modeling the vocal tract are rather well-established, it is not the case for the glottal source representation. However the characterization of this latter has been shown to be advantageous in speaker recognition [1], speech disorder analysis [2], speech recognition [3] or speech synthesis [4]. These reasons justify the need of developing algorithms able to robustly and reliably estimate and parametrize the glottal signal.…”
Section: Introduction (mentioning)
confidence: 99%