Interspeech 2018
DOI: 10.21437/interspeech.2018-1651

Auditory Filterbank Learning for Temporal Modulation Features in Replay Spoof Speech Detection

Abstract: In this paper, we present a standalone replay spoof speech detection (SSD) system to classify the natural vs. replay speech. The replay speech spectrum is known to be affected in the higher frequency range. In this context, we propose to exploit an auditory filterbank learning using Convolutional Restricted Boltzmann Machine (ConvRBM) with the pre-emphasized speech signals. Temporal modulations in amplitude (AM) and frequency (FM) are extracted from the ConvRBM subbands using the Energy Separation Algorithm (E…
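
The AM and FM extraction step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract is truncated and does not say which energy separation variant is used, so the DESA-2 form below is an assumption, as are the function names and the pre-emphasis coefficient of 0.97. The ConvRBM filterbank is not modelled; a synthetic cosine stands in for a single subband signal.

```python
import math

def pre_emphasis(x, alpha=0.97):
    """First-order high-pass y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def teager(x):
    """Teager-Kaiser energy psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def desa2(x, eps=1e-12):
    """DESA-2 estimates of instantaneous amplitude and frequency (rad/sample).

    Returns two lists aligned with x[2:-2].
    """
    # Symmetric difference y[n] = x[n+1] - x[n-1], aligned with x[1:-1].
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager(x)  # aligned with x[1:-1]
    psi_y = teager(y)  # aligned with x[2:-2]
    amp, freq = [], []
    for k, py in enumerate(psi_y):
        px = psi_x[k + 1]  # shift by one so both energies refer to the same sample
        if px <= eps or py <= eps:
            # Energy too small for a stable estimate; emit zeros as a placeholder.
            amp.append(0.0)
            freq.append(0.0)
            continue
        arg = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        freq.append(0.5 * math.acos(arg))
        amp.append(2.0 * px / math.sqrt(py))
    return amp, freq

# Synthetic stand-in for one subband: amplitude 0.8, frequency 0.3 rad/sample.
tone = [0.8 * math.cos(0.3 * n + 0.1) for n in range(200)]
amp, freq = desa2(tone)
print(round(amp[0], 3), round(freq[0], 3))
```

For a pure cosine the estimates recover the amplitude and frequency at every interior sample. On real subband signals they vary over time, and those trajectories are what temporal modulation features would be built from.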

Cited by 33 publications (18 citation statements)
References 33 publications

“…The ConvRBM subband filters in temporal-domain (a) without, and (b) with pre-emphasis, respectively. After [139].…”
Section: Representation Of Learning Approaches (mentioning)
confidence: 99%
“…Since the ASVspoof 2017 challenge [1,2], more and more researchers begin to focus on playback speech detection [3][4][5][6][7][8][9][10]. Similar to many speech signal processing systems, most of all playback speech detection systems usually consist of front-end feature and back-end classifier [11][12][13][14][15][16][17][18]. For the end-to-end systems such as…”
Section: Introduction (mentioning)
confidence: 99%
“…In some other efforts, DNN framework is used [24,25,26,27,28]. For replay detection, algorithms were proposed using Electronic Network Frequency (ENF), MFCC and fundamental frequency, linear predictive residual signal, time envelope, stratified scattering decomposition coefficient and Inverse MFCC (IMFCC) respectively [29,30,31,32,33,34,35].…”
Section: Introduction (mentioning)
confidence: 99%