1976
DOI: 10.1109/tassp.1976.1162846

A comparative performance study of several pitch detection algorithms

Abstract: A comparative performance study of seven pitch detection algorithms was conducted. A speech data base, consisting of eight utterances spoken by three males, three females, and one child, was constructed. Telephone, close talking microphone, and wideband recordings were made of each of the utterances. For each of the utterances in the data base, a "standard" pitch contour was semiautomatically measured using a highly sophisticated interactive pitch detection program. The "standard" pitch contour was the…

Citations: cited by 580 publications (271 citation statements)
References: 20 publications
“…The speaker identification stage is based on a Gaussian Mixture Model (GMM) classifier [13] with some modifications to improve robustness and computation efficiency for use on the mobile phone. We use as our feature vector pitch [11] and the Mel-frequency cepstral coefficients (MFCCs) [20] computed for each admitted frame. SpeakerSense computes 20-dimensional MFCCs, and then ignores the first coefficient, which represents the energy of the frame, and instead focuses on the spectral shape, represented by the 2nd through 20th coefficients.…”
Section: Frame Admission and Speaker Identification On The Phone (citation type: mentioning)
confidence: 99%
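The feature construction described in this citation statement can be illustrated with a minimal sketch. It is not the SpeakerSense implementation: the function name `build_feature_vector`, the placement of pitch as the first element, and the sample values are all assumptions made for illustration. The statement itself only says that pitch and the 2nd through 20th MFCCs are used.

```python
from typing import List, Sequence


def build_feature_vector(mfcc: Sequence[float], pitch_hz: float) -> List[float]:
    """Combine a pitch estimate with MFCCs 2..20 for one admitted frame.

    Hypothetical helper: the ordering (pitch first) is an assumption.
    """
    if len(mfcc) != 20:
        raise ValueError("expected 20 MFCCs per frame")
    # Skip mfcc[0], which represents frame energy, and keep the 19
    # coefficients that describe the spectral shape.
    return [pitch_hz] + list(mfcc[1:])


# Placeholder values standing in for real MFCC and pitch estimates.
frame_mfcc = [float(i) for i in range(20)]
features = build_feature_vector(frame_mfcc, 120.0)
print(len(features))
```

Under these assumptions each frame yields one pitch value plus 19 spectral-shape coefficients.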
“…Similar to both of these systems, our goal is not to design new speaker identification algorithms. Instead we leverage well-established techniques such as the MFCCs feature set [20], pitch tracking [11], and GMM classifiers [e.g., 13,14], which have been proven effective for speaker identification. Our focus is on adapting these techniques to a mobile platform and addressing challenges that arise when using speaker identification on energy constrained mobile phones.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The fundamental frequency extraction is an important task in many speech processing systems [1] [2]. Therefore, extracting the fundamental frequency of a speech signal is essential for research in speech processing and many methods have been proposed [1]-[6].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Therefore, extracting the fundamental frequency of a speech signal is essential for research in speech processing and many methods have been proposed [1]-[6]. However, a performance improvement in noisy environments is still desired.…”
Section: Introduction (citation type: mentioning)
confidence: 99%