Sentiment analysis plays an important role in the way companies, organizations, and political campaigns are run, making it an attractive target for attacks. In integrity attacks, an attacker influences the data used to train the sentiment analysis classification model in order to decrease its accuracy. Previous work did not consider the practical constraints dictated by the characteristics of data generated by a sentiment analysis application, and relied on synthetic or preprocessed datasets inspired by spam, intrusion detection, or handwritten digit recognition. We identify and demonstrate integrity attacks against document-level sentiment analysis that take such practical constraints into account. Our attacks, while inspired by existing work, require novel improvements to function in a realistic environment where a victim performs typical steps such as data cleaning, labeling, and feature extraction prior to training the classification model. We demonstrate the effectiveness of the attacks on three datasets: two Twitter datasets and an Android dataset.