Steganalysis of adaptive multi-rate (AMR) speech is an active research topic for countering cybercrime that relies on steganography in AMR speech streams. In this paper, we first present a novel AMR steganalysis model that uses extreme gradient boosting (XGBoost) as the classifier instead of the support vector machines (SVM) adopted in previous schemes. Compared with the SVM-based model, the new model makes it easier to extract latent information from high-dimensional features and helps avoid overfitting. Moreover, to strengthen the existing features, which are based on the statistical characteristics of pulse pairs, we present a convergence feature based on the Markov chain that reflects the global characteristics of pulse pairs; it is essentially the final state of the Markov transition matrix. Combining the convergence feature with the existing features, we propose an XGBoost-based steganalysis scheme for AMR speech streams. Finally, we conducted a series of experiments to assess the proposed scheme and compare it with previous schemes. The experimental results demonstrate that the proposed scheme is feasible and detects existing AMR-based steganography methods more accurately than the previous schemes.
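As a rough illustration of the convergence feature, the sketch below estimates a transition matrix from a sequence of discrete states and raises it to a high power until its rows stop changing. It is a minimal sketch under stated assumptions, not the authors' implementation: the encoding of pulse pairs as integer states, the add-one smoothing, and the names `transition_matrix` and `convergence_feature` are all hypothetical.

```python
import numpy as np

def transition_matrix(states, n_states, alpha=1.0):
    """Estimate a row-stochastic transition matrix from a state sequence.

    `states` is assumed to be a sequence of integers in [0, n_states).
    `alpha` is an assumed smoothing constant that keeps every entry positive,
    so the chain is guaranteed to converge to a single limit.
    """
    counts = np.full((n_states, n_states), alpha, dtype=float)
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def convergence_feature(P, tol=1e-12, max_squarings=64):
    """Return one row of the limit of P^n, computed by repeated squaring."""
    M = P.copy()
    for _ in range(max_squarings):
        nxt = M @ M
        # Renormalise rows to counter floating-point drift.
        nxt /= nxt.sum(axis=1, keepdims=True)
        converged = np.abs(nxt - M).max() < tol
        M = nxt
        if converged:
            break
    # In the limit all rows are identical, so any row serves as the feature.
    return M[0]

rng = np.random.default_rng(0)
states = rng.integers(0, 4, size=500)   # stand-in for encoded pulse pairs
P = transition_matrix(states, n_states=4)
pi = convergence_feature(P)
```

The resulting vector `pi` has one entry per state and could be concatenated with other features before being passed to a classifier.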
To prevent the abuse of low‐rate speech‐based steganography from threatening cyberspace security, corresponding steganalysis approaches have been developed and have received significant attention from the research community. However, most existing steganalysis methods assume that the steganography method is known in advance, which is rarely the case in practice. In this paper, we therefore present three blind detection schemes suitable for steganography in low‐bit‐rate speech streams. The first is based on mixed sample data augmentation: it randomly selects a certain proportion of steganographic samples from the sample set of each steganographic method and combines them with the original carrier samples to form a training set, which enhances the robustness of the model. The second relies on decision fusion: it first trains a dedicated classification model for each steganography method and then, in the detection stage, uses a majority voting mechanism to fuse the outputs of the models into the final detection result. Unlike the other two schemes, the third designs the detection model as a self‐paced ensemble that follows the distribution characteristics of the speech samples. Its main idea is to fully train multiple base classifiers through multiple iterations of under‐sampling and to fuse them into a powerful ensemble classifier. In each iteration, unlike traditional ensemble classifiers, which select steganographic samples at random, the under‐sampling of the steganography set (composed of samples from multiple steganography methods) pays more attention to the steganographic samples at the decision boundary. These samples are found using the classification hardness given by the ensemble classifier trained in the previous iteration; they are more informative and more conducive to improving the performance of the base classifiers.
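The boundary-focused under-sampling step can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes that hardness can be approximated by how close the previous ensemble's predicted stego probability is to 0.5, and the function name `boundary_undersample` and the parameter `eps` are hypothetical.

```python
import numpy as np

def boundary_undersample(stego_proba, n_keep, rng, eps=1e-3):
    """Pick `n_keep` stego samples, favouring those near the decision boundary.

    `stego_proba` holds the previous ensemble's predicted probability of the
    stego class for each steganographic sample (an assumed hardness proxy).
    Samples with probability close to 0.5 receive larger sampling weights.
    """
    stego_proba = np.asarray(stego_proba, dtype=float)
    distance = np.abs(stego_proba - 0.5)      # 0 means on the boundary
    weights = 1.0 / (distance + eps)          # closer to boundary, larger weight
    weights /= weights.sum()
    return rng.choice(len(stego_proba), size=n_keep, replace=False, p=weights)

rng = np.random.default_rng(0)
# 900 confidently classified samples and 100 borderline ones.
proba = np.concatenate([np.full(900, 0.99), np.full(100, 0.5)])
idx = boundary_undersample(proba, n_keep=50, rng=rng)
```

In a full training loop, the selected indices would be combined with the cover samples to train the next base classifier, and the probabilities would be recomputed from the updated ensemble.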
The experimental results show that all three proposed schemes achieve efficient blind detection of low‐bit‐rate speech‐based steganography, and that the scheme based on the self‐paced ensemble performs best. Specifically, at an embedding rate of 30%, the accuracy of the self‐paced ensemble scheme is more than 85%, while the accuracy of the other two schemes is less than 80%. Additionally, the self‐paced ensemble scheme even outperforms dedicated detectors for specific steganographic methods in terms of recall for steganographic samples.
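The majority voting used by the decision-fusion scheme can be illustrated with a short sketch. It assumes each dedicated detector outputs 0 for cover and 1 for stego, and that ties are resolved as cover; both conventions and the name `majority_vote` are assumptions, not details taken from the paper.

```python
import numpy as np

def majority_vote(predictions):
    """Fuse binary predictions from several detectors by majority vote.

    `predictions` has shape (n_models, n_samples) with 0 = cover, 1 = stego.
    A sample is labelled stego only if more than half of the models say so,
    so ties fall back to cover (an assumed tie-breaking rule).
    """
    votes = np.asarray(predictions)
    stego_votes = votes.sum(axis=0)
    return (2 * stego_votes > votes.shape[0]).astype(int)

# Three hypothetical detectors, each trained for one steganography method.
preds = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
]
fused = majority_vote(preds)
```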