In this paper, we present our initial study with a recently collected speech database for developing robust speaker recognition systems in the Indian context. The database contains speech data collected from 200 speakers across different sensors, languages, speaking styles, and environments. The speech data is collected across five different sensors in parallel, in English and multiple Indian languages, in reading and conversational speaking styles, and in office and uncontrolled environments such as laboratories, hostel rooms, and corridors. The collected database is evaluated using an adapted Gaussian mixture model based speaker verification system following the NIST 2003 speaker recognition evaluation protocol, and gives performance comparable to that obtained using NIST data sets. Our initial study exploring the impact of mismatch between training and test conditions with the collected data finds that mismatch in sensor, speaking style, and environment results in significant degradation in performance compared to the matched case, whereas for language mismatch the degradation is found to be relatively smaller.