Deep neural networks (DNNs) have achieved remarkable success in various tasks (e.g., image classification, speech recognition, and natural language processing). However, research has shown that DNN models are vulnerable to adversarial examples, which cause incorrect predictions by adding imperceptible perturbations to normal inputs. Adversarial examples have been studied extensively in the image domain, but research on texts remains limited, and a comprehensive survey of this field is lacking. In this paper, we aim to present a comprehensive understanding of adversarial attacks and the corresponding mitigation strategies in texts. Specifically, we first give a taxonomy of adversarial attacks and defenses in texts from the perspective of different natural language processing (NLP) tasks, and then introduce how to build a robust DNN model via testing and verification. Finally, we discuss the existing challenges of adversarial attacks and defenses in texts and present future research directions in this emerging field.
Social networks have become the primary medium of rumor propagation. Because manual identification of rumors is extremely time-consuming and laborious, it is crucial to identify rumors automatically. Machine learning is widely used to identify and detect misinformation on social networks. However, traditional machine learning methods rely heavily on feature engineering and domain knowledge, and their ability to learn temporal features is insufficient. Furthermore, the features used by deep learning methods based on natural language processing are heavily limited. Therefore, it is of great significance and practical value to study rumor detection methods that are independent of feature engineering and that effectively aggregate heterogeneous features to adapt to complex and variable social networks. In this paper, a deep neural network (DNN) based feature aggregation modeling method is proposed, which makes full use of the propagation pattern features and text content features of social network events without feature engineering or domain knowledge. The experimental results show that the feature aggregation model achieves 94.4% accuracy, the best performance among recent works.
State‐of‐the‐art adversarial attacks in the text domain have shown their power to induce machine learning models to produce abnormal outputs. The samples generated in these attacks have three important attributes: attack ability, transferability, and imperceptibility. However, compared with the other two attributes, the imperceptibility of adversarial examples has not been well investigated. Unlike pixel‐level perturbations in images, adversarial perturbations in text are usually traceable, appearing as changes in characters, words, or sentences, so generating imperceptible samples is more difficult in texts than in images. Constraining the adversarial perturbations added to the text is therefore a crucial step in constructing more natural adversarial texts. Unfortunately, recent studies merely select measurements to constrain the added adversarial perturbations; none of them explains where these measurements are suitable, which one is better, or how they perform in different kinds of adversarial attacks. In this paper, we fill this gap by comparing the performance of these metrics in various attacks. Furthermore, we propose a stricter constraint for word‐level attacks to obtain more imperceptible samples. It also helps enhance existing word‐level attacks for adversarial training.
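To make the idea of constraining a perturbation concrete, here is a minimal sketch of one way a word-level budget could be checked. The abstract does not name the metrics it compares, so the choice of word-level edit distance, the function names, and the `max_rate` threshold of 0.2 are all assumptions made for illustration, not the authors' constraint.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def word_modification_rate(original, adversarial):
    """Word-level edit distance divided by the original word count."""
    orig_words = original.split()
    adv_words = adversarial.split()
    if not orig_words:
        return 0.0 if not adv_words else 1.0
    return levenshtein(orig_words, adv_words) / len(orig_words)


def within_budget(original, adversarial, max_rate=0.2):
    """Hypothetical constraint: accept only lightly modified texts."""
    return word_modification_rate(original, adversarial) <= max_rate
```

For instance, replacing one word in a five-word sentence gives a modification rate of 0.2, which this hypothetical budget would still accept. A purely surface-level measure like this says nothing about semantic or grammatical change, which is one reason the choice of metric matters.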
Fuzzing is considered an essential approach to ensuring the reliability of systems based on deep neural networks (DNNs). DNN fuzzing leverages various input prioritization methods to guide the testing process. Current research mainly focuses on constructing testing metrics that symbolize the logical representation of the DNN to guide the generation of test cases, neglecting the potential performance gains from heuristic algorithms. Moreover, a straightforward queue structure cannot represent the metamorphic relationships between generated inputs in DNN fuzzing. Therefore, developing an appropriate heuristic algorithm‐based input prioritization method is critical to improving the performance of DNN fuzzers. In this paper, we propose a Monte Carlo Tree Search (MCTS) based input prioritization method called $Ex^{2}$ (Exploration and Exploitation) that formulates DNN testing exploration as a sequential decision process. The technique introduces an innovative tree‐structure design that schedules inputs from a statistical perspective. Unlike in traditional DNN testing, the batch pool is maintained in the form of nodes in the MCTS tree. The links between nodes precisely represent the metamorphic relationships between input batches, which indicate the potential value of in‐depth search. Furthermore, a novel simulation mechanism is implemented to adapt MCTS to DNN testing, which attains better coverage feedback. The effectiveness of our method is comprehensively investigated on six popular deep learning models from the LeNet and VGG families. Comparison experiments against DeepHunter, TensorFuzz, and DeepSmartFuzzer demonstrate its efficacy on various testing metrics. The experimental results show that $Ex^{2}$ significantly enhances the coverage gain of DNN fuzzing, by up to 30% over the best performance in the comparison groups.
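The tree-based scheduling described above can be illustrated with a generic MCTS loop. This is a minimal sketch using the standard UCB1 rule, not the paper's algorithm: the `Node` class, the `max_children` limit, and the `mutate` and `coverage_gain` callbacks are hypothetical, and the paper's simulation mechanism is replaced here by a single coverage measurement on the new batch.

```python
import math


class Node:
    """One input batch; the parent link records which batch it was mutated from."""

    def __init__(self, batch, parent=None):
        self.batch = batch
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0


def ucb1(child, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(max(parent_visits, 1)) / child.visits)
    return exploit + explore


def select(root, max_children=2):
    """Descend through fully expanded nodes, following the best UCB1 score."""
    node = root
    while len(node.children) >= max_children:
        parent = node
        node = max(parent.children, key=lambda ch: ucb1(ch, parent.visits))
    return node


def expand(node, mutate):
    """Create a child batch by mutating the selected batch."""
    child = Node(mutate(node.batch), parent=node)
    node.children.append(child)
    return child


def backpropagate(node, reward):
    """Propagate the reward from the new node up to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent


def fuzz_step(root, mutate, coverage_gain, max_children=2):
    """One iteration: select, expand, measure coverage, backpropagate."""
    leaf = select(root, max_children)
    child = expand(leaf, mutate)
    backpropagate(child, coverage_gain(child.batch))
    return child
```

In a real fuzzer, `mutate` would apply a metamorphic transformation to a batch of inputs and `coverage_gain` would run the model and report newly covered units under some coverage criterion. Compared with a plain queue, the tree keeps the derivation history, so batches whose descendants keep yielding coverage are revisited more often.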