2023
DOI: 10.3390/app13127008

Combined Bidirectional Long Short-Term Memory with Mel-Frequency Cepstral Coefficients Using Autoencoder for Speaker Recognition

Abstract: Recently, neural network technology has shown remarkable progress in speech recognition, including word classification, emotion recognition, and identity recognition. This paper introduces three novel speaker recognition methods to improve accuracy. The first method, called long short-term memory with mel-frequency cepstral coefficients for triplet loss (LSTM-MFCC-TL), utilizes MFCC as input features for the LSTM model and incorporates triplet loss and cluster training for effective training. The second method…
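
The abstract names triplet loss as part of LSTM-MFCC-TL but the visible text does not give its form. As a rough illustration only, the sketch below uses the standard margin-based triplet loss on plain Python lists; the Euclidean distance, the `margin` value, and the function names are assumptions for this example and are not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (assumed form, not the paper's).

    The loss is zero once the anchor is closer to the positive (same
    speaker) than to the negative (different speaker) by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Hypothetical 2-D speaker embeddings, chosen only for illustration.
    anchor = [0.0, 0.0]
    same_speaker = [0.1, 0.0]
    other_speaker = [3.0, 4.0]
    print(triplet_loss(anchor, same_speaker, other_speaker))
```

In a setup like this, the loss only pushes on triplets that violate the margin, which is why well-separated speakers contribute nothing to the gradient.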

Cited by 2 publications (1 citation statement)
References 34 publications
“…The mel-scale filter bank extracts MFCCs from audio signals [29]. MFCCs represent the short-term power spectrum of sound.…”
Section: Mel-scale Filter Bank (mentioning)
confidence: 99%
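
The citation statement refers to a mel-scale filter bank without saying how the mel scale is defined. The sketch below uses one widely used convention, mel = 2595 · log10(1 + f / 700), to place filter center frequencies evenly on the mel axis. The formula choice, the function names, and the example parameters are assumptions for illustration and may differ from what the citing or cited paper uses.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (assumed 2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) of n_filters spaced evenly on the mel axis.

    The two band edges f_min and f_max are excluded, so the interval is
    divided into n_filters + 1 equal mel steps.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * i) for i in range(1, n_filters + 1)]


if __name__ == "__main__":
    # Hypothetical settings: 10 filters between 0 Hz and 8 kHz.
    for f in mel_center_frequencies(10, 0.0, 8000.0):
        print(round(f, 1))
```

Because the spacing is uniform in mels rather than in Hz, the centers crowd together at low frequencies and spread out at high ones, which is the property a mel filter bank is meant to provide.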