2021
DOI: 10.3390/s21175892

Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Multi-Attention Module through Speech Spectrograms

Abstract: Speech signals are being used as a primary input source in human–computer interaction (HCI) to develop several applications, such as automatic speech recognition (ASR), speech emotion recognition (SER), and gender and age recognition. Classifying speakers according to their age and gender is a challenging task in speech processing owing to the limitations of the current methods of extracting salient high-level speech features and of the current classification models. To address these problems, we introduce a novel end-to-end age…

Cited by 59 publications (28 citation statements)
References 40 publications (44 reference statements)
“…In the traditional non-automatic method, the age estimation formula or trend has been obtained separately because males and females have different progression of tooth and bone development and aging processes 11 , 27 , 28 . However, using data augmentation and a CNN, age and gender can be automatically distinguished beyond that achievable by a human observer 26 , 29 , 30 . Technological innovation in artificial intelligence has led to an era in which it is no longer necessary to divide the dataset by gender, and it is possible to easily estimate the age and gender of unidentified persons.…”
Section: Discussion (mentioning)
confidence: 99%
“…Today, the speech emotion recognition system (SER) assesses the emotional state of the speaker by examining his/her speech signal [24][25][26]. Work [27] proposes key technologies for recognition of speech emotions based on neural networks and recognition of facial emotions based on SVM, and in paper [28], they show a system of emotion recognition based on an artificial neural network (ANN) and its comparison with a system based on the scheme Hidden Markov Modeling (HMM). Both systems were built on the basis of probabilistic pattern recognition and acoustic phonetic modeling approaches.…”
Section: Introduction (mentioning)
confidence: 99%
“…In recent times, security surveillance system has employed visual-based tracking and detection technologies for enhancing safety and convenience for human beings [1]. Human tracking and detection systems are important topics in a surveillance scheme.…”
Section: Introduction (mentioning)
confidence: 99%