Interspeech 2020
DOI: 10.21437/interspeech.2020-2723

Siamese Convolutional Neural Network Using Gaussian Probability Feature for Spoofing Speech Detection

Cited by 23 publications (9 citation statements)
References 15 publications
“…Fig. 5 shows the detailed performance of LPS in different attacks of the evaluation set.…”

Table text captured inside the quoted statement (the first row's system name is cut off in the source):

[13]                                  0.1000   5.06
FFT-LCNN [13]                         0.1028   4.53
LFCC-Siamese CNN [15]                 0.0930   3.79
FFT-LCGRNN [7]                        0.0776   3.03
RW-Resnet [19]                        0.0820   2.98
Ling et al [16]                       0.0510   1.87
FFT-L-SENet [38]                      0.0368   1.14
AASIST [7]                            0.0347   1.13
LPS(F0) (ours)                        0.0358   1.21

(b) Primary systems
System                                t-DCF    EER%
T05 [28]                              0.0069   0.22
T45 [13]                              0.0510   1.84
T60 [3]                               0.0755   2.64
GMM fusion [26]                       0.0740   2.92
T24 [28]                              0.0953   3.45
T50 [36]                              0.1671   3.56
(Imag(L)+Real(H)) + LPS(F0) (ours)    0.0143   0.43
Section: Effectiveness Of F0 Subband (mentioning)
confidence: 99%
Table text captured inside the quoted statement (the last row is cut off in the source):

System                                     min-tDCF   EER
Spec+LFCC+CQT+SE-Res2Net [18]              0.0452     1.89
LFCC-GMM+GAT-S+GAT-T+RawNet2               0.0476     1.68
LFCC+LFCC-CMVN+CQT+FFT+LCNN+LFCC-GMM [9]   0.0510     1.86
ResNet18+LMCL+FM [17]                      0.0520     1.81
GAT-S+GAT-T+RawNet2                        0.0635     2.21
LFCC-GMM+RawNet2                           0.0643     2.33
GAT-S+RawNet2                              0.0692     2.29
Ensemble model [10]                        0.0755     2.64
GAT-S+GAT-T                                0.0844     4.30
GAT-T+RawNet2                              0.0854     2.61
GAT-T                                      0.0894     4.71
LFCC-GMM [7]                               0.0904     3.50
GAT-S                                      0.0914     4.48
Siamese CNN [11]                           0.0930     3.79
FG-CQT+LCNN+CE [41]                        0.1020     4.07
LFB-ResNet18 [17]                          0

“…ture and origins of the artefacts being detected with spectral and temporal attention and then to link these to specific spoofing attacks and the algorithmic origins.…”
Section: System (mentioning)
confidence: 99%
“…CNNs are particularly appealing because of their capacity to extract localised artefacts within spectro-temporal decompositions such as a spectrogram. For both the ASVspoof 2017 [8] and ASVspoof 2019 [9][10][11] challenges, CNN-based approaches were among the best performing systems. More elaborate systems, such as those based upon ResNet architectures, are now attracting greater interest, and enable the learning of deeper networks using residual blocks with skip connections [12][13][14][15][16][17][18][19].…”
Section: Introduction (mentioning)
confidence: 99%
“…Since not all study was evaluated using the same criterion, this creates a problem when comparing the works. Hence, some recent works such as [62,145], and [140] provided more than one evaluation criteria for performance comparison. A fair comparison of voice PAD in terms of performance may be made through the standardization of evaluation criteria.…”
Section: Evaluation Criteria (mentioning)
confidence: 99%
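
The citation statements above compare systems by equal error rate (EER), the operating point at which the false acceptance rate on spoofed trials equals the false rejection rate on bona fide trials. As a rough illustration of that criterion only, the following is a minimal sketch in plain Python. The function name `compute_eer`, the score lists, and the convention that a higher score means "more likely bona fide" are assumptions made for this example; they are not taken from any of the cited papers or from the ASVspoof evaluation tooling.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the equal error rate by sweeping observed scores as thresholds.

    Assumption: a higher score means the trial looks more like bona fide speech.
    Returns (eer, threshold), where eer is a fraction in [0, 1].
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None  # (gap between FAR and FRR, EER estimate, threshold)
    for t in thresholds:
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Average the two rates at the closest crossing point.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical scores, invented purely for illustration.
    bonafide = [0.9, 0.8, 0.7, 0.4]
    spoof = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = compute_eer(bonafide, spoof)
    print(f"EER = {eer * 100:.2f}% at threshold {threshold}")
```

This sketch takes the threshold where the two error rates are closest and averages them, which is a simplification; evaluation toolkits typically interpolate along the error curve instead. It also does not cover t-DCF, which additionally depends on the error rates of the speaker verification system and on cost and prior parameters.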