The main aim of this paper is to describe the differences between cepstral and non-cepstral feature extraction techniques. We cover most of the comparative characteristics of Mel Frequency Cepstral Coefficients (MFCC) and prosodic features. In speaker recognition, two types of feature extraction techniques are available: short-term features, i.e. MFCC, and long-term (prosodic) features. We explore the usefulness of prosodic features for syllable classification and of MFCC for feature extraction from a speech signal, followed by a comparison between them. MFCC is one of the most important feature extraction techniques and is used across many kinds of speech applications. The MFCC features are extracted from the speaker's phonemes in pre-segmented speech sentences. Prosodic features are currently used in most emotion recognition algorithms; they are relatively simple in structure and known for their effectiveness in some speech recognition tasks. Various ways of generating prosodic syllable contour features have recently been applied to enhance speaker recognition systems.
General Terms: Speaker Recognition, Mel Frequency Cepstral Coefficient (MFCC), Prosodic.
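The distinction between short-term and long-term features can be illustrated with a minimal sketch. It is not the authors' pipeline and does not compute full MFCCs. It only shows the mel-scale mapping that underlies MFCC front ends, a per-frame (short-term) measurement, and an utterance-level (long-term) summary of a contour, which is the kind of statistic prosodic features are built from. The helper names, the 25 ms frame and 10 ms hop, and the synthetic test signal are all assumptions made for illustration.

```python
import math

def hz_to_mel(f_hz):
    """Widely used mel-scale formula (one of several variants in the literature)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def frame_signal(samples, frame_len, hop):
    """Split a signal into overlapping short-term frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_energy(frame):
    """Short-term log energy of one frame (small floor avoids log(0))."""
    return math.log(sum(x * x for x in frame) + 1e-12)

def contour_stats(contour):
    """Long-term summary of a per-frame contour: mean, range, least-squares slope."""
    n = len(contour)
    mean = sum(contour) / n
    t_mean = (n - 1) / 2.0
    num = sum((t - t_mean) * (c - mean) for t, c in enumerate(contour))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den if den else 0.0
    return {"mean": mean, "range": max(contour) - min(contour), "slope": slope}

# Hypothetical test signal: 1 s of a 200 Hz tone at 8 kHz with rising amplitude.
sr = 8000
signal = [(0.1 + 0.9 * n / sr) * math.sin(2 * math.pi * 200 * n / sr)
          for n in range(sr)]

frames = frame_signal(signal, frame_len=200, hop=80)  # 25 ms frames, 10 ms hop
energy_contour = [log_energy(f) for f in frames]      # one value per frame (short-term)
summary = contour_stats(energy_contour)               # one vector per utterance (long-term)
```

A short-term front end such as MFCC emits one feature vector per frame, whereas a prosodic description condenses contours such as energy or pitch over a syllable or utterance into a few statistics like those in `summary`.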
<p>Automatic speaker recognition is the process of automatically recognizing a speaker from their voice on the basis of specific characteristics of the speech signal. These voice-specific characteristics are called speech features. Over the past six decades many advances in speaker recognition have been achieved, but many problems remain to be solved or require better solutions. The main problems in speaker recognition are session variability, channel mismatch and the recording conditions of the voice. Developing an efficient speaker recognition system requires identifying voice feature parameters that are stable over time, unaffected by variation in speaking, background noise and channel distortion, and robust against variation caused by physical problems. This paper gives an overview of recent advances and general ideas in speaker recognition technology.</p>
Automatic Speaker Recognition (ASR) is used to recognize persons from their voice. No two human voices are the same, because vocal tract shapes, larynx sizes and other parts of the voice production system differ between people. Automatic speaker recognition is a procedure for automatically recognizing who is speaking from the individual information contained in the speech signal. This technique makes it possible to use a speaker's speech to verify their identity. It has many applications, for example controlling access to services such as voice mail, voice dialing, telephone banking, remote access to computers, telephone shopping, information services and database access services, as well as security control for confidential information areas.