2021
DOI: 10.1016/j.apacoust.2020.107823

Voice gender recognition under unconstrained environments using self-attention

Cited by 20 publications
(11 citation statements)
References 14 publications
“…MFCC is the coefficient of the short-time windowed signal obtained by fast Fourier transformation (FFT), which has better results than the time domain operation. MFCC feature extraction mainly includes six steps 18 : pre-weighting, framing, windowing, FFT, Meyer filter bank and discrete cosine transform (DCT), as shown in Fig. 1 .…”
Section: Proposed Models (mentioning)
confidence: 99%
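
The six steps named in the statement above can be sketched in NumPy as follows. This is a minimal illustration, not the cited authors' implementation: "pre-weighting" is read here as pre-emphasis and "Meyer filter bank" as a Mel filter bank, which is an interpretation of the quoted wording. The function names and every parameter value (16 kHz sample rate, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients, pre-emphasis factor 0.97) are assumptions chosen for the example and do not come from the source.

```python
import numpy as np

def hz_to_mel(f):
    # Common O'Shaughnessy-style mel formula (one of several conventions).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    # Unnormalized DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13, pre_emph=0.97):
    # Assumes len(signal) >= frame_len; all defaults are illustrative.
    signal = np.asarray(signal, dtype=float)
    # 1. Pre-emphasis: boost high frequencies with a first-order difference.
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])
    # 2. Framing: overlapping frames of frame_len samples, hop samples apart.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # 3. Windowing: taper each frame with a Hamming window.
    frames = frames * np.hamming(frame_len)
    # 4. FFT: power spectrum of each zero-padded frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    # 5. Mel filter bank: log energy in each triangular band.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # 6. DCT: decorrelate the log energies and keep the first coefficients.
    return dct_ii(log_energies, n_ceps)

# Usage: one second of a 220 Hz tone yields one row of coefficients per frame.
t = np.arange(16000) / 16000.0
feats = mfcc(np.sin(2 * np.pi * 220.0 * t))
```

The result is a two-dimensional array with one row per frame and `n_ceps` columns. Production code would more likely rely on an established audio library than on a hand-written filter bank.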
“…Jung et al used short-time Fourier transform and MFCC to extract the features of lung sounds, revealed the relationship between lung sounds and pulmonary mechanism, and employed the depth separable CNN to effectively classify four types of lung sounds 17 . Nasef et al reported a recognition technique to distinguish gender using MFCC features and Logistic Regression (LG) classifier, which can be carried out in the presence of background noise and different language, accent, age and emotional states 18 . It can be seen that MFCC is a powerful method to represent intrinsic characteristics of the sound signals.…”
Section: Introduction (mentioning)
confidence: 99%
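
The pairing of MFCC features with a logistic regression classifier described in the statement above can be illustrated with a small sketch. It is not the cited method: the training loop is plain batch gradient descent, the function names are hypothetical, and the data are synthetic 13-dimensional vectors standing in for per-utterance MFCC averages, generated only so that the example runs.

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.1, n_iter=500):
    # Batch gradient descent on the mean cross-entropy loss.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probability of class 1
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def predict(X, w, b):
    # Class 1 when the linear score is positive (probability above 0.5).
    return (np.asarray(X, dtype=float) @ w + b > 0).astype(int)

# Synthetic stand-in data (an assumption, not real speech features):
# two Gaussian clusters of 13-dimensional vectors, one per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(100, 13)),
               rng.normal(1.0, 1.0, size=(100, 13))])
y = np.array([0] * 100 + [1] * 100)

w, b = train_logistic_regression(X, y)
accuracy = np.mean(predict(X, w, b) == y)
```

With real recordings, each row of `X` would instead be derived from the MFCC matrix of one utterance, for example by averaging over frames, and accuracy would be measured on held-out speakers rather than on the training set.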
“…[35]), voice processing (e.g. [36], [37]) and hash-based cross-modal retrieval of images and texts (e.g. [38]).…”
Section: Related Work (mentioning)
confidence: 99%
“…For example, average frequency, mode and standard deviation. The second approach is to use the spectral properties of the sound, like MFCC's and Log-Mel features [18].…”
Section: Related Work (mentioning)
confidence: 99%
“…One of the main uses of spectrograms is sound analysis. The [18] display of signals in the time-frequency field provides many benefits in terms of sound classification. First, the time-frequency conversion is reversible.…”
Section: Figure 2 MFCC Feature Inference Steps (mentioning)
confidence: 99%