Singing voice identification using harmonic spectral envelope

Loni, Deepali Yoginath; Subbaraman, Shaila

doi:10.1109/infop.2015.7489362

Cited by 2 publications

(2 citation statements)

References 13 publications

“…Further, the preprocessing of the images is comparatively small to other approaches since the voice recordings must only be transformed into histograms without additional feature extraction (i.e., no information loss) [50], allowing the approach to be highly objective and reproducible.…”

Section: Discussion (mentioning)

confidence: 99%

“…Another limitation of spectrograms is the need for feature extraction methods such as Fourier transformation in order to determine and extract the most relevant sub-bands [35]. On the other side, histograms allow identifying the probability distribution of different such as frequencies in the case of voice signals [50]. Furthermore, histograms address the limitations of spectrograms as no feature extraction is necessary, and long-term temporal dependencies can be depicted.…”

Section: Histogram-based Visualization Of Voice Signals (mentioning)

confidence: 99%
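
The histogram idea in the statement above can be illustrated with a minimal sketch. It assumes the histogram is taken over raw sample amplitudes scaled to [-1, 1]; the cited paper's exact construction is not given in these excerpts, so the function name `amplitude_histogram`, the bin count, and the synthetic test tone are all hypothetical choices made for illustration.

```python
import numpy as np

def amplitude_histogram(signal, bins=64, value_range=(-1.0, 1.0)):
    """Return a normalized histogram (summing to 1) of sample amplitudes.

    Assumption: samples are already scaled to value_range. This is an
    illustrative sketch, not the cited paper's preprocessing pipeline.
    """
    samples = np.asarray(signal, dtype=np.float64)
    counts, edges = np.histogram(samples, bins=bins, range=value_range)
    total = counts.sum()
    # Guard against an empty input to avoid division by zero.
    probs = counts / total if total > 0 else counts.astype(np.float64)
    return probs, edges

# Synthetic stand-in for a voice recording: one second of a 220 Hz tone.
rate = 16000
t = np.arange(rate) / rate
tone = 0.5 * np.sin(2 * np.pi * 220.0 * t)

probs, edges = amplitude_histogram(tone)
```

The resulting `probs` vector is an empirical probability distribution over amplitude bins. Rendering it as an image would be a separate step that this sketch does not attempt.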

High-Performance Fake Voice Detection on Automatic Speaker Verification Systems for the Prevention of Cyber Fraud with Convolutional Neural Networks

Buettner, Gross, Roessler et al. 2022

Proceedings of the Annual Hawaii International Conference on System Sciences

This study proposes a highly effective data analytics approach to prevent cyber fraud on automatic speaker verification systems by classifying histograms of genuine and spoofed voice recordings. Our deep learning-based lightweight architecture advances the application of fake voice detection on embedded systems. It sets a new benchmark with a balanced accuracy of 95.64% and an equal error rate of 4.43%, contributing to adopting artificial intelligence technologies in organizational systems and technologies. As fake voice-related fraud causes monetary damage and serious privacy concerns for various applications, our approach improves the security of such services, being of high practical relevance. Furthermore, the post-hoc analysis of our results reveals that our model confirms image texture analysis-related findings of prior studies and discovers further voice signal features (i.e., textural and contextual) that can advance future work in this field.
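
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how balanced accuracy and equal error rate are commonly computed for a binary genuine-versus-spoofed task. The labels, scores, and the 0.5 decision threshold are made-up toy values, and the threshold sweep is one simple way to approximate the equal error rate; none of this is taken from the cited study's evaluation code.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tpr = np.mean(y_pred[y_true])      # genuine correctly accepted
    tnr = np.mean(~y_pred[~y_true])    # spoofed correctly rejected
    return 0.5 * (tpr + tnr)

def equal_error_rate(y_true, scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Returns the average of false-acceptance and false-rejection rates at
    the threshold where the two are closest.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=np.float64)
    best_gap, eer = np.inf, 1.0
    for thr in np.unique(scores):
        accept = scores >= thr
        far = np.mean(accept[~y_true])   # spoofed samples accepted
        frr = np.mean(~accept[y_true])   # genuine samples rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, 0.5 * (far + frr)
    return eer

# Toy data (hypothetical): 1 = genuine, 0 = spoofed.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
preds = [s >= 0.5 for s in scores]

ba = balanced_accuracy(labels, preds)
eer = equal_error_rate(labels, scores)
```

On this toy data both error rates meet at a threshold of 0.6, so the sweep finds an exact crossing. With real score distributions the crossing usually falls between two thresholds, and interpolation gives a finer estimate.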
