ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2019.8682743
"Hello? Who Am I Talking to?" A Shallow CNN Approach for Human vs. Bot Speech Classification

Cited by 23 publications (15 citation statements) · References 13 publications
"As an additional remark on the binary setup, it is worth noting that we also tested the purely data-driven method proposed in [8]. However, due to the heterogeneous nature of the datasets used, and the limited amount of available data when considering balanced classes, we could not achieve an accuracy higher than 0.72 on D_dev and 0.71 on D_eval."
Section: Binary Results
Confidence: 99%
"NNs have been proposed for both the feature-learning and classification steps. For example, in [8], a time-frequency representation of the speech signal is presented at the input of a shallow CNN architecture. A similar framework is tested in [14]."
Section: Fake Speech Detection
Confidence: 99%
"Requires precise static calibration during enrollment/testing. Artifact detection [16,17,32,52,54,91] trains models to recognize the spectral characteristics of synthetic speech."
Section: Defense
Confidence: 99%
"An important task within the NLP area is automated translation, which usually involves feature analysis (based on neural networks as well as Markov chains, for which software tools such as HTK or SPHINX are available [22]), unit detection (such as phonemes [23]), syntactic analysis (for example, for word validation [24]), and the translation proper via a look-up procedure in a database. Remarkably, to the best of our knowledge, voice-to-voice translation by direct learning is absent from the literature, with a few exceptions for classification [25][26][27][28] and some for translation: [2], which consists of an 8-layer LSTM neural network, an attention model for phoneme recognition, and a voice-synthesis module; and [3], which uses a codebook of phonemes for translation in three stages: BiLSTM layers encode the input data into phonemes, convolutional middle layers follow, and final LSTM layers decode the phonemes into an audio signal."
Section: Introduction
Confidence: 99%