The use of bot messaging, that is, artificially created messages, has increased since 2010. While not all bots are harmful, many have been used to spread extreme and divisive views on a range of topics, from policy discussion to brand-related electronic word of mouth. The prevalence of bot messaging is a problem because it can distort researchers’ understanding of a topic. For example, if 25% of a dataset is fabricated, decisions based on it may lead to lost profit or poor policy formation. To counteract the use of bots, this research note offers a framework that mitigates the potentially destructive effect of bot data and ensures that data cleaning is thorough and beneficial to decision-making based on social media commentary. The framework is a four-step process that includes thematic, automated, and characteristic identification stages. We provide three case studies to demonstrate the approach and conclude with key practical implications.