Speaker identification is a recognition problem that entails identifying a speaker from consecutive time-series information. Because voice is a continuous one-dimensional time stream, most recent experimental techniques use convolutional neural networks (CNNs) or deep neural networks (DNNs). Because audio can be represented as a spectrogram, utterances have spatial attributes (corresponding to the speech spectra), and CNNs are well suited to extracting such spatial characteristics. At the same time, the signal is time-series data, and deep models capture long utterances better than shallow ones. This work presents a DNN model for speaker identification using a jump-connected (skip-connected) one-dimensional convolutional neural network (1-D CNN) with a focus module (FM). In the presented model, the 1-D convolutional layers integrated with the FM extract speaker characteristics and reduce heterogeneity in the temporal and spatial domains, allowing for faster layer processing. Furthermore, stacked CNN jump connections are employed to overcome connectivity glitches, and a solution based on a combined softmax loss and smooth L1-norm regularization is presented to increase efficiency. The suggested network model was tested on the ELSDSR, TIMIT, NIST, 16000PCM, and experimental audio datasets. According to the experimental data, the Equal Error Rate (EER) of the end-to-end CNN for voiceprint identification improves by 9.02% compared with baseline approaches. Our proposed DNN model, which we term the deep FM-1D CNN, achieved a high recognition accuracy of 99.21% in the experiments. At the same time, the observations show that the suggested network model outperforms other models in terms of robustness. With further optimization, this method could be applied to other tasks, such as language modelling.
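
As a concrete illustration, the following PyTorch sketch shows one plausible reading of the deep FM-1D CNN: stacked 1-D convolutional blocks over the raw waveform, each carrying an identity jump (skip) connection and a focus module. The abstract does not specify the FM's internal design, so it is assumed here to be a channel-attention-style gate; all class names (FocusModule, SkipBlock, FM1DCNN), layer sizes, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FocusModule(nn.Module):
    """Hypothetical channel-attention gate standing in for the paper's
    focus module (FM); the exact design is not given in the abstract."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),                       # squeeze the time axis
            nn.Conv1d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):          # x: (batch, channels, time)
        return x * self.gate(x)    # re-weight channels by learned importance

class SkipBlock(nn.Module):
    """1-D conv block with an identity jump (skip) connection."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=kernel_size // 2)
        self.bn = nn.BatchNorm1d(channels)
        self.fm = FocusModule(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Residual path: conv -> batch norm -> focus module, then add input.
        return self.act(x + self.fm(self.bn(self.conv(x))))

class FM1DCNN(nn.Module):
    """Sketch of the deep FM-1D CNN: a strided stem, stacked skip blocks,
    and temporal average pooling into a speaker classification head."""
    def __init__(self, n_speakers, channels=64, depth=4):
        super().__init__()
        self.stem = nn.Conv1d(1, channels, kernel_size=7, stride=2, padding=3)
        self.blocks = nn.Sequential(*[SkipBlock(channels) for _ in range(depth)])
        self.head = nn.Linear(channels, n_speakers)

    def forward(self, wav):                 # wav: (batch, 1, samples)
        h = self.blocks(self.stem(wav))
        return self.head(h.mean(dim=-1))    # average over time, then classify

# Example usage on a batch of 1-second, 16 kHz waveforms:
# model = FM1DCNN(n_speakers=630)            # e.g. TIMIT has 630 speakers
# logits = model(torch.randn(8, 1, 16000))   # -> shape (8, 630)
```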
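The combined objective can likewise be sketched, assuming the softmax loss is standard cross-entropy over speaker logits and the smooth L1-norm acts as a weight penalty; the mixing coefficient reg_weight is a hypothetical hyperparameter, since the abstract does not give one.

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, labels, model, reg_weight=1e-4):
    """Softmax (cross-entropy) loss plus a smooth L1-norm penalty on the
    network weights, as one reading of the paper's combined regulation."""
    ce = F.cross_entropy(logits, labels)
    # Smooth L1 distance of each parameter tensor from zero: quadratic
    # near zero, linear for large weights, so it is a differentiable
    # compromise between L1 and L2 regularization.
    reg = sum(F.smooth_l1_loss(p, torch.zeros_like(p), reduction="sum")
              for p in model.parameters())
    return ce + reg_weight * reg
```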