Interspeech 2016
DOI: 10.21437/interspeech.2016-1401

Robust Estimation of Fundamental Frequency Using Single Frequency Filtering Approach

Abstract: A new method for robust estimation of fundamental frequency (F0) from the speech signal is proposed in this paper. The method exploits the high SNR regions of speech in time and frequency domains in the outputs of single frequency filtering (SFF) of the speech signal. The high resolution in the frequency domain brings out the harmonic characteristics of speech clearly. The harmonic spacing in the high SNR regions of the spectrum determines the F0. The concept of root cepstrum is used to reduce the effects of vocal tract re…
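
The abstract names the root cepstrum as one ingredient of the method. As a rough illustration of that concept only, the sketch below computes a root cepstrum from an ordinary DFT magnitude spectrum (not from SFF outputs, and not the authors' full procedure) and reads F0 from the strongest peak. The function name `root_cepstrum_f0`, the Hann window, the exponent `gamma = 0.5`, and the 60 to 400 Hz search range are all assumptions made for this example.

```python
import cmath
import math


def root_cepstrum_f0(frame, fs, fmin=60.0, fmax=400.0, gamma=0.5):
    """Estimate F0 (Hz) of one frame from the peak of its root cepstrum.

    Illustrative sketch: root cepstrum = inverse DFT of |X(k)|**gamma.
    """
    n = len(frame)
    if n % 2 != 0:
        raise ValueError("frame length must be even")
    half = n // 2

    # Hann window to limit spectral leakage between harmonics.
    xw = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
          for i, s in enumerate(frame)]

    # Compressed magnitude spectrum for bins 0..n/2 (naive DFT, O(n^2)).
    mag = []
    for k in range(half + 1):
        acc = sum(xw[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        mag.append(abs(acc) ** gamma)

    # Candidate pitch periods (in samples) for the F0 search range.
    lag_min = max(1, int(fs / fmax))
    lag_max = min(half, int(fs / fmin))

    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Inverse DFT of a real, even spectrum, evaluated at this lag only.
        val = mag[0] + mag[half] * math.cos(math.pi * lag)
        val += 2.0 * sum(mag[k] * math.cos(2.0 * math.pi * k * lag / n)
                         for k in range(1, half))
        val /= n
        if val > best_val:
            best_lag, best_val = lag, val

    return fs / best_lag
```

Harmonics spaced F0 apart in the spectrum produce a peak at a lag of fs / F0 samples, which is why the harmonic spacing determines the estimate. Raising the magnitude to a power below 1 flattens the spectral envelope so that the harmonic comb, rather than the vocal tract shape, dominates the result.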

Cited by 19 publications (15 citation statements)
References 24 publications
“…Performance of the proposed method is compared with eight standard methods. The eight standard methods are SWIPE [32], YIN [9], RAPT [7] and SHRP [16], YAAPT [8], SRH [15], PEFAC [19] and SFF-CEP [33]. For all the methods, F0 search range was set between 60 − 1500 Hz according the study on singing voice in [3].…”
Section: Methods For Comparison (mentioning)
confidence: 99%
“…The SFF method is used to derive the amplitude envelope of the speech signal at every sample for a given frequency [32]. The SFF spectrum has been shown to be useful in finding burst-onset points [29] and glottal closure instants [30], and it has been demonstrated to exhibit high spectral resolution for important speech features such as harmonics and resonances [27].…”
Section: A. SFF (mentioning)
confidence: 99%
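
The statement above describes SFF as giving an amplitude envelope at every sample for a chosen frequency. A minimal sketch of one common way to formulate this is shown below: the signal is frequency-shifted so that the frequency of interest lands at half the sampling rate, then passed through a single-pole filter whose pole sits close to the unit circle. This formulation, the function name `sff_envelope`, and the pole radius `r = 0.995` are assumptions for illustration and are not taken from this page.

```python
import cmath
import math


def sff_envelope(x, fs, fk, r=0.995):
    """Per-sample amplitude envelope of x at frequency fk (Hz).

    Illustrative sketch: shift fk to fs/2, then apply the single-pole
    filter y[n] = -r * y[n-1] + x_shifted[n] (pole at z = -r).
    """
    # Shift (radians/sample) that moves the component at fk to pi.
    w_shift = math.pi - 2.0 * math.pi * fk / fs

    env = []
    y_prev = 0j
    for n, sample in enumerate(x):
        shifted = sample * cmath.exp(1j * w_shift * n)
        y_prev = -r * y_prev + shifted
        env.append(abs(y_prev))  # envelope is the magnitude of the output
    return env


# Usage: a 200 Hz tone gives a much larger envelope at 200 Hz than at 1 kHz.
fs = 8000
tone = [math.cos(2.0 * math.pi * 200.0 * n / fs) for n in range(fs)]
env_at_200 = sff_envelope(tone, fs, 200.0)
env_at_1000 = sff_envelope(tone, fs, 1000.0)
```

Calling the function over a grid of frequencies yields one envelope per frequency, which together form a time-frequency representation with a value at every sample. A pole radius closer to 1 narrows the filter bandwidth at the cost of a longer settling time.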
“…This architecture was chosen in the current study because it was shown in [25] to be the best performing system in dialect classification compared to two reference techniques. The spectrum computed by single frequency filtering (SFF) has been shown to give good spectral resolution to indicate harmonics and resonances [27] and good temporal resolution to model speech excitation features such as impulse-like events [28]. The SFF spectrum has also shown promising performance in determining burst-onset points related to voice-onset time (VOT) and glottal closure instances compared to the short-time Fourier transform (STFT) spectrum [28]-[30].…”
Section: Introduction (mentioning)
confidence: 99%
“…The instantaneous energy for a speech segment (Figure 2 (a)) is shown in Figure 2 (c). The equation for instantaneous energy E[n] is given below [28],…”
Section: Parameters Used For Feature Extraction (mentioning)
confidence: 99%