2021
DOI: 10.48550/arxiv.2104.03617
Preprint

Half-Truth: A Partially Fake Audio Detection Dataset

Abstract: Diverse promising datasets, such as the ASVspoof databases, have been designed to advance the development of fake audio detection. However, previous datasets ignore an attack scenario in which the hacker hides some small fake clips in real speech audio. This poses a serious threat, since it is difficult to distinguish a small fake clip from the whole speech utterance. Therefore, this paper develops such a dataset for half-truth audio detection (HAD). Partially fake audio in the HAD dataset involves onl…

Cited by 2 publications (3 citation statements)
References 25 publications
“…Gao et al [208] presented a novel method for the detection of audio deep-fake that used long-range spectrotemporal modulation features. Using a 2D discrete cosine transform (DCT) on a log-mel spectrogram, the system outperforms traditional feature methods such as CQCC [209]. The model leverages spectrum augmentation and feature normalisation to reduce overfitting, resulting in a state-of-the-art system for spoof detection and demonstrating its effectiveness on two external datasets.…”
Section: Methods Using Handcrafted Features
Citation type: mentioning
confidence: 99%
“…The replace part is semantically complete, guaranteed by text alignment techniques. This way is similar to the generation process of HAD dataset [15].…”
Section: Clean Fake Audios Generation
Citation type: mentioning
confidence: 97%
“…For the datasets used to spoof the human auditory system, FoR dataset [13] contains fake audios from 7 open resources and real audios from 4 resources. HAD dataset [15] is designed for partially fake audio detection, generated by manipulated the original utterances with genuine or synthesized audio segments. WaveFake [14] collects ten sample sets from six different network architectures across two languages.…”
Section: Related Work
Citation type: mentioning
confidence: 99%