In recent years, owing to their convenient and unique features, bio-authentication techniques have been applied to identify and authenticate a person based on his/her spoken words and/or sentences. Among these techniques, speaker recognition/identification is the most convenient one, providing a secure and strong authentication solution viable for a wide range of applications. In this paper, to safeguard real-world objects, like buildings, we develop a speaker identification system named the mel frequency cepstral coefficients (MFCC)-based speaker identification system for access control (MSIAC for short), which identifies a speaker U by first collecting U's voice signals and converting the signals to the frequency domain. An MFCC-based human auditory filtering model is utilized to adjust the energy levels of different frequencies as U's quantified voice features. Next, a Gaussian mixture model is employed to represent the distribution of the logarithmic features as U's specific acoustic model. When a person, e.g., x, would like to access a real-world object protected by the MSIAC, x's acoustic model is compared with the acoustic models of known people. Based on the identification result, the MSIAC determines whether the access is accepted or denied.

KEYWORDS
acoustic model, Fourier transformation, Gaussian mixture model, mel frequency cepstral coefficients, speaker identification

1 | INTRODUCTION

In this information era, numerous high-tech products have gradually entered our everyday lives and significantly changed our living habits and patterns. Biometric identification technology, which provides us with easier and more convenient methods to identify people, has gradually replaced some existing authentication techniques that use passwords or PINs to authenticate users. Such passwords or PINs may be forgotten or forged and are no longer considered to offer a high level of security.
The face recognition systems used at airport halls [1] and the voice assistant Siri on the iPhone [2] are two examples of biometric identification systems.

On the one hand, voice has been the most direct and natural method for us to express ideas, communicate with others, and interact. Therefore, recognizing people's identities from their voices and the contents of their dialogue, and then providing the corresponding services, should be a better way to make our daily lives easier in practice. To date, speech recognition technology [3] has been well developed and applied to our daily activities. But speaker recognition technology [4] is still far from practical application. The reasons are that (i) too many parameters need to be processed for speaker recognition; (ii) it is hard to collect voice features completely; and (iii) the identification process is complicated and takes a long time to compute. Thus, it is difficult to apply to applications that need an immediate response. Furthermore, current studies of speaker identification address only parts of the problem rather than the whole. For example, the Hidden ...
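The pipeline summarized in the abstract (framing the voice signal, converting it to the frequency domain, applying a mel-scale filter bank, taking logarithms, and fitting a Gaussian mixture model per speaker) can be illustrated with a minimal NumPy sketch. This is not the MSIAC implementation: the frame sizes, the number of filters and mixture components, the diagonal-covariance EM routine, the `threshold` parameter, and the synthetic `toy_voice` signals are all assumptions made for illustration. The sketch also scores a probe's features against each enrolled model by average log-likelihood, which is one common way to realize the comparison step and may differ from the paper's own comparison of acoustic models.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Frame -> Hamming window -> power spectrum -> mel energies -> log -> DCT-II."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energy @ basis.T

def _log_joint(X, weights, means, variances):
    """log(weight_k) + log N(x | mean_k, diag(var_k)) for every frame and component."""
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    return np.log(weights) + log_gauss

def _logsumexp(a):
    m = a.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.exp(a - m).sum(axis=1))

def fit_gmm(X, n_components=4, n_iter=30, seed=0, var_floor=1e-3):
    """Diagonal-covariance GMM fitted with plain EM (illustrative, not tuned)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + var_floor, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        log_p = _log_joint(X, weights, means, variances)
        resp = np.exp(log_p - _logsumexp(log_p)[:, None])       # E-step
        nk = resp.sum(axis=0) + 1e-10                           # M-step
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, var_floor)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    return float(np.mean(_logsumexp(_log_joint(X, *gmm))))

def identify(features, models, threshold):
    """Return the best-scoring enrolled speaker, or None if below the threshold."""
    scores = {name: avg_log_likelihood(features, g) for name, g in models.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores

def toy_voice(f0, seed, sample_rate=16000, seconds=1.0):
    """Synthetic stand-in for a recording: a few harmonics of f0 plus noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sample_rate * seconds)) / sample_rate
    tone = sum(np.sin(2.0 * np.pi * f0 * h * t) / h for h in range(1, 6))
    return tone + 0.1 * rng.standard_normal(t.size)

# Enroll two hypothetical speakers, then score a new "recording".
models = {"alice": fit_gmm(mfcc(toy_voice(120.0, seed=1))),
          "bob": fit_gmm(mfcc(toy_voice(220.0, seed=2)))}
who, scores = identify(mfcc(toy_voice(120.0, seed=3)), models, threshold=-1e9)
```

In an access-control setting, `identify` returning `None` would correspond to denying access. A real deployment would need recorded speech, a threshold calibrated on held-out data, and most likely a tested GMM implementation rather than this hand-written EM loop.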