Electric Network Frequency Based Audio Forensics Using Convolutional Neural Networks

Mao, Maoyu; Xiao, Zhongcheng; Kang, Xiangui; Li, Xiang; Xiao, Liang

doi:10.1007/978-3-030-56223-6_14

Cited by 8 publications

(5 citation statements)

References 21 publications

“…The experimental results demonstrate that there is no proportional enhancement in the model's performance with an increase …”

mentioning

confidence: 89%

“…The study conducted experiments on audio tampering detection by inserting 1-second, 2-second, and 3-second segments; the results showed the highest detection accuracy for 3-second insert tampering. Mao et al. [19] proposed a two-dimensional convolutional neural network model for binary classification of original audio and tampered audio. Zeng et al. [27] proposed an audio tampering detection method based on ENF phase sequence representation learning.…”

Section: Related Work

mentioning

confidence: 99%

“…The advent of deep convolutional neural networks provides a compelling alternative, enabling the automatic extraction of latent features from audio without enabling the. Mao et al. put forward a two-dimensional convolutional neural network for binary classification of raw and tampered audio [19]. However, their approach focused solely on the fundamental ENF wave, and neglected the higher-order harmonic components, thereby omitting some distinctive characteristics.…”

Section: Introduction

mentioning

confidence: 99%

1D-CNN-based Audio Tampering Detection Using ENF Signals

Shen

2024

Preprint

The extensive adoption of digital audio recording has revolutionized its application in digital forensics, particularly in civil litigation and criminal prosecution. Electric Network Frequency (ENF) has emerged as a reliable technique in the field of audio forensics. However, the absence of comprehensive ENF reference datasets limits current ENF-based methods. To address this, this study introduces ATD, a blind audio forensics framework based on a One-Dimensional Convolutional Neural Network (1D-CNN) model. ATD can identify phase mutations and waveform discontinuities within the tampered ENF signal, without relying on an ENF reference database. To enhance feature extraction, the framework incorporates characteristics of the fundamental harmonics of ENF signals. In addition, a denoising method termed ENF Noise Reduction (ENR) based on the Variational Mode Decomposition (VMD) and Robust Filtering Algorithm (RFA) is proposed to reduce the impact of external noise on embedded Electric Network Frequency signals. This study investigates three distinct types of audio tampering—deletion, insertion, and replacement—culminating in the design of binary-class tampering detection scenarios and four-class tampering detection scenarios tailored to these tampering types. ATD achieves a tampering detection accuracy of over 93% in the four-class scenario and exceeds 96% in the binary-class scenario. The effectiveness, efficiency, adaptability, and robustness of ATD in the two and four classification scenarios have been confirmed by extensive experiments.
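The phase mutations mentioned in the abstract above can be illustrated with a small classical sketch. This is not the ATD framework or its 1D-CNN; it only shows why a cut in a recording tends to leave a visible jump in the ENF phase. The synthetic 50 Hz signal, the sampling rate, the frame length, the threshold, and the function names `frame_phases` and `find_phase_jumps` are all assumptions made for the illustration.

```python
import cmath
import math

FS = 1000        # sampling rate in Hz (assumed for this sketch)
F0 = 50.0        # nominal ENF in Hz (assumed; some grids use 60 Hz)
FRAME_LEN = 200  # 0.2 s per frame, i.e. 10 full cycles at 50 Hz


def frame_phases(signal, fs, f0, frame_len):
    """Estimate the phase of the f0 component in each non-overlapping frame.

    Uses a single-bin DFT referenced to absolute sample time, so an
    untampered sinusoid at f0 yields the same phase in every frame.
    """
    phases = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        acc = 0j
        for n in range(start, start + frame_len):
            acc += signal[n] * cmath.exp(-2j * math.pi * f0 * n / fs)
        phases.append(cmath.phase(acc))
    return phases


def find_phase_jumps(phases, threshold):
    """Return frame indices whose wrapped phase differs from the previous frame by more than threshold (radians)."""
    jumps = []
    for i in range(1, len(phases)):
        diff = (phases[i] - phases[i - 1] + math.pi) % (2 * math.pi) - math.pi
        if abs(diff) > threshold:
            jumps.append(i)
    return jumps


# Synthetic "tampered" hum: the phase shifts by pi/2 at sample 1000,
# as it might after a segment is deleted from a recording.
signal = [
    math.cos(2 * math.pi * F0 * n / FS + (0.0 if n < 1000 else math.pi / 2))
    for n in range(2000)
]
phases = frame_phases(signal, FS, F0, FRAME_LEN)
jumps = find_phase_jumps(phases, threshold=0.5)
print(jumps)  # prints [5]
```

In this toy setting the splice point coincides with a frame boundary and there is no noise. Real recordings need band-pass filtering and a more robust detector, which is the gap that learned models such as the one described above aim to fill.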

“…Lin and Kang [8] proposed a wavelet-filtered ENF signal to highlight the abnormal ENF variations and employed autoregressive coefficients to train the classifier under a supervised-learning framework. Mao et al [9] utilized the multiple ENF features as input eigenvectors to the convolutional neural networks for detecting spliced audio. Meng et al [4] used the spectral entropy method to determine the length of each syllable and calculated the variance of the background noise of each syllable, then judged whether there is an operation of the heterogeneous splicing tampering in the audio by comparing the similarities between the variance of the background noise of each syllable.…”

Section: Audio Splicing Detection

mentioning

confidence: 99%

“…However, when the signal-to-noise ratio between the spliced segments is close or even the same, the performance of the noise levels based audio splicing detection methods will decrease sharply. In addition, based on the fact that inserting an audio segment into another audio recording leads to anomalous variations of the electric network frequency (ENF) signal, several kinds of research [7][8][9] have shown that it is an efficient way to detect spliced audio via the analysis of ENF signal. Whereas due to legal restrictions, it is difficult to obtain concurrent reference datasets of power systems, which makes the ENF based audio splicing detection methods difficult to implement [10].…”

Section: Introduction

mentioning

confidence: 99%

ASLNet: An Encoder-Decoder Architecture for Audio Splicing Detection and Localization

Zhang

Zhao

2022

Security and Communication Networks

Audio splicing means inserting an audio segment into another audio, which presents a great challenge to audio forensics. In this paper, a novel audio splicing detection and localization method based on an encoder-decoder architecture (ASLNet) is proposed. Firstly, an audio clip is divided into several small audio segments according to the size of the smallest localization region L_slr, and the acoustic feature matrix and corresponding binary ground truth mask are created from each audio segment. Then, we concatenate acoustic feature matrices from all segments of an audio clip into an acoustic feature matrix and send it to a fully convolutional network (FCN) based encoder-decoder architecture which consists of a series of convolutional, pooling and transposed convolutional layers to get a binary output mask. Next, the binary output mask is divided into small segments according to L_slr, and the ratio ρ of the number of elements equal to one to the number of all elements in a small segment is calculated. Finally, we compare ρ with the predetermined threshold T to determine whether the corresponding audio segment is spliced. We evaluate the effectiveness of the proposed ASLNet on four datasets produced from publicly available speech corpora. Extensive experiments show that the best detection accuracy of ASLNet for the intra-database and cross-database evaluation can achieve 0.9965 and 0.9740 respectively, which outperforms the state-of-the-art method.
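The last two steps of the abstract above (computing the ratio ρ per segment and comparing it with the threshold T) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the binary output mask has been flattened to a 1-D list, that a segment counts as spliced when ρ is strictly greater than T, and the names `segment_decisions`, `l_slr`, and `threshold` are hypothetical.

```python
from typing import List


def segment_decisions(mask: List[int], l_slr: int, threshold: float) -> List[bool]:
    """Flag each segment of a flat binary mask as spliced (True) or not (False).

    The mask is cut into consecutive segments of l_slr elements. For each
    segment, rho is the fraction of elements equal to one; the segment is
    flagged when rho exceeds the threshold (strict comparison is an assumption).
    """
    if l_slr <= 0:
        raise ValueError("l_slr must be positive")
    decisions = []
    for start in range(0, len(mask), l_slr):
        segment = mask[start:start + l_slr]
        rho = sum(segment) / len(segment)
        decisions.append(rho > threshold)
    return decisions


# Toy mask with three segments of four elements each.
mask = [0, 0, 0, 0,   1, 1, 1, 0,   0, 1, 0, 0]
decisions = segment_decisions(mask, l_slr=4, threshold=0.5)
print(decisions)  # prints [False, True, False]
```

Here the three segments have ρ = 0, 0.75, and 0.25, so only the middle one exceeds the assumed threshold of 0.5.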

Towards Unconstrained Audio Splicing Detection and Localization with Neural Networks

Hirsch

Rieß

2023

Pattern Recognition, Computer Vision, and Image Processing. ICPR 2022 International Workshops and Challenges
