Feature-based recognition systems have long been an area of intense research. Reliable, robust and sufficiently efficient recognition systems have been attempted using features from several sources, including text and images. Speech has also been used as a source for such systems. However, variation arising from individual speaker characteristics, changes in mood and intermingled noise makes such a system very difficult to realize. This paper proposes a recognition system that identifies the speaker, the language and the words spoken. The system uses the Adaptive Neuro-Fuzzy Inference paradigm for this purpose. First, the sampling frequency and the speech features are extracted from the speech database to form speech feature vectors. The features used are LPC, LPCC, RC, LAR, LSF and ARCSIN. The speech database is prepared using 25 speakers, both male and female. Five speaking texts with the same meaning, each in a different language, are used to get the best speaker identification accuracy. The languages spoken by the speakers are English, Hindi, Punjabi, Sanskrit and Telugu. The feature vectors thus prepared are fed to an Adaptive Neuro-Fuzzy Inference System for speaker, language and word recognition. The experimental results show the system to be efficient and successful in the recognition tasks involved.
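Several of the features named above come from the same linear prediction analysis of a speech frame. The following is a minimal sketch, not the authors' implementation, of how LPC coefficients, reflection coefficients (RC), log area ratios (LAR) and arcsine coefficients could be computed from one frame using the Levinson-Durbin recursion. The function names, the sign convention A(z) = 1 + a1 z^-1 + ... and the LAR convention log((1 - k) / (1 + k)) are assumptions; other texts use the opposite signs. Framing, windowing, LPCC, LSF and the Adaptive Neuro-Fuzzy Inference System itself are not shown.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation of one speech frame for lags 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, k, err) where a[0] == 1 and the prediction error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order (assumed sign convention),
    k holds the reflection coefficients and err is the final prediction error.
    """
    if r[0] <= 0.0:
        raise ValueError("frame has no energy; r[0] must be positive")
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        # Correlation between the current prediction error and lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k_i = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k_i * prev[i - j]
        a[i] = k_i
        k.append(k_i)
        err *= 1.0 - k_i * k_i
    return a, k, err


def log_area_ratios(k):
    """LAR under the assumed convention log((1 - k) / (1 + k)); needs |k| < 1."""
    return [math.log((1.0 - ki) / (1.0 + ki)) for ki in k]


def arcsine_coefficients(k):
    """Arcsine of each reflection coefficient."""
    return [math.asin(ki) for ki in k]
```

A feature vector for one frame could then be formed by concatenating, for example, `a[1:]`, `k`, `log_area_ratios(k)` and `arcsine_coefficients(k)`; how the paper actually assembles its vectors is not stated in the abstract.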
Biometric authentication techniques are more consistent and efficient than conventional authentication techniques, and speaker identification in particular has a wide range of applications such as monitoring, transaction authentication, information retrieval, access control and forensics. In this paper, we present a detailed comparative analysis of Principal Component Analysis (PCA) and Independent Component Analysis (ICA) for feature extraction, evaluated with different Artificial Neural Networks (ANNs): Back Propagation (BP), Radial Basis Function (RBF) and Learning Vector Quantization (LVQ). The methods are applied to the TULIPS1 database (Movellan, 1995), a small audiovisual database of 12 subjects saying the first four digits in English. Six geometric lip features that carry identity-relevant information are considered: height of the outer corners of the mouth, width of the outer corners of the mouth, height of the inner corners of the mouth, width of the inner corners of the mouth, height of the upper lip, and height of the lower lip. After comprehensive analysis and evaluation, a maximum speaker recognition accuracy of 91.07% is achieved using PCA with RBF, and 87.36% using ICA with RBF.
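As an illustration of the PCA step only, the sketch below projects feature vectors onto their leading principal components before they would be passed to a classifier. It is a hedged example under stated assumptions, not the paper's pipeline: the names `pca_fit` and `pca_project` are hypothetical, power iteration with deflation is chosen here only to keep the code dependency-free, and the data in the usage note is a toy set rather than lip measurements. ICA and the BP, RBF and LVQ networks are not shown.

```python
import math
import random


def pca_fit(rows, n_components, iterations=200, seed=0):
    """Estimate the mean and leading principal directions of the rows.

    Uses power iteration with deflation on the sample covariance matrix
    (an illustrative choice; a library eigensolver would normally be used).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        # Positive random start vector, normalized to unit length.
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # no variance left in the deflated matrix
            v = [x / norm for x in w]
        # Rayleigh quotient gives the variance along v.
        eigenvalue = sum(
            v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
        )
        components.append(v)
        # Deflate: remove the variance explained by this component.
        cov = [
            [cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
            for i in range(d)
        ]
    return means, components


def pca_project(row, means, components):
    """Coordinates of one feature vector in the principal-component basis."""
    return [
        sum((row[j] - means[j]) * c[j] for j in range(len(row)))
        for c in components
    ]
```

For example, with toy rows `[[3 * t, 4 * t] for t in (-2, -1, 0, 1, 2)]`, calling `pca_fit(rows, 1)` recovers a direction close to `[0.6, 0.8]`. In a setting like the one described, each row would instead hold the six geometric lip features of one sample, and the projected vectors would be the classifier inputs.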