2021
DOI: 10.1007/s10772-021-09888-y

A deep learning approach for robust speaker identification using chroma energy normalized statistics and mel frequency cepstral coefficients

Cited by 9 publications (3 citation statements)
References 35 publications
“…2) Chroma is a feature that focusing on music oriented audio tones. 26 This feature can provide a distribution of tonal variations in audio. The Chroma feature's result is a chromagram built based on 12 (twelve) tone levels.…”
Section: Designed System
Citation type: mentioning
confidence: 99%
“…2) Chroma is a feature extraction focusing on music-oriented audio tones [21]. This feature can provide a distribution of tonal variations in audio in the form of a simple feature.…”
Section: B Feature Extraction 1) Mel Frequency Cepstral Coefficients ...
Citation type: mentioning
confidence: 99%
“…The way people hear pitch is periodic, meaning that two pitches that are different by one or more octaves are heard as having the same color, or harmonic role (where, in our scale, an octave is defined as the distance of 12 pitches). The main idea behind chroma features is to combine all spectral information about a given pitch class into a single coefficient [34]. One of the most important things about chroma features is that they capture the harmony and melody of music.…”
Section: MFCC (Mel-Frequency Cepstral Coefficients)
Citation type: mentioning
confidence: 99%
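
The pitch-class folding described in the last statement can be illustrated with a minimal sketch. It assumes an equal-tempered scale with A4 = 440 Hz and a magnitude spectrum that has already been computed; the function name `chroma_from_spectrum` and its parameters are hypothetical and are not taken from the cited papers.

```python
import numpy as np

def chroma_from_spectrum(magnitudes, freqs, ref_freq=440.0):
    """Fold a magnitude spectrum into 12 pitch-class energies.

    Assumption for this sketch: equal temperament with A4 = ref_freq.
    """
    magnitudes = np.asarray(magnitudes, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    valid = freqs > 0  # log2 is undefined at 0 Hz
    # MIDI-style pitch number, where 69 corresponds to A4.
    pitch = 69 + 12 * np.log2(freqs[valid] / ref_freq)
    # Pitches an octave apart share a class: 0 = C, ..., 9 = A.
    pitch_class = np.round(pitch).astype(int) % 12
    chroma = np.zeros(12)
    # Sum the energy of every bin that falls in the same pitch class.
    np.add.at(chroma, pitch_class, magnitudes[valid] ** 2)
    return chroma

# Three A notes in different octaves plus one middle C.
chroma = chroma_from_spectrum([1, 1, 1, 2], [220.0, 440.0, 880.0, 261.63])
print(chroma)
```

In this example the three A components (220, 440, and 880 Hz) all land in the same coefficient, which is the octave periodicity the statement refers to. A full chromagram would repeat this per short-time frame.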