The COVID-19 pandemic has taken the world by storm. Almost the entire world went into lockdown to protect people from the deadly virus. Scientists around the world have developed several vaccines, among which Pfizer, Moderna, and AstraZeneca have become the best known. Members of the public, however, have been expressing their views on the safety and effectiveness of these vaccines on social media platforms such as Twitter. In this study, such tweets are extracted from Twitter using a Twitter API authentication token. The raw tweets are stored and preprocessed using natural language processing (NLP). The processed data is then classified with a supervised k-nearest neighbors (KNN) algorithm into three classes: positive, negative, and neutral. These classes refer to the sentiment of the people whose tweets are extracted for analysis. The analysis shows that Pfizer tweets are 47.29% positive, 37.5% negative, and 15.21% neutral; Moderna tweets are 46.16% positive, 40.71% negative, and 13.13% neutral; and AstraZeneca tweets are 40.08% positive, 40.06% negative, and 13.86% neutral.
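The classification step described above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not state the features, distance measure, or value of k, so the bag-of-words vectors, cosine similarity, `k=3`, the stopword list, and the tiny labeled `TRAIN` set below are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the study's preprocessing is not specified.
STOPWORDS = {"the", "is", "a", "and", "it", "this", "i", "my",
             "so", "on", "for", "do", "of", "to"}

def tokenize(text):
    """Lowercase a tweet, drop URLs and @mentions, and remove stopwords."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [t for t in re.findall(r"[a-z']+", text) if t not in STOPWORDS]

def vectorize(text):
    """Represent a tweet as a bag-of-words count vector."""
    return Counter(tokenize(text))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def knn_predict(train, text, k=3):
    """Return the majority label among the k most similar training tweets."""
    query = vectorize(text)
    scored = sorted(((cosine(query, vec), label) for vec, label in train),
                    key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

def sentiment_shares(labels):
    """Percentage of predictions falling into each sentiment class."""
    counts = Counter(labels)
    return {c: round(100 * counts[c] / len(labels), 2)
            for c in ("positive", "negative", "neutral")}

# Made-up labeled examples, for illustration only (not the study's data).
TRAIN = [(vectorize(t), y) for t, y in [
    ("the vaccine is safe and effective", "positive"),
    ("got my shot today feeling great and grateful", "positive"),
    ("so happy the vaccine works", "positive"),
    ("terrible side effects after the shot", "negative"),
    ("i do not trust this vaccine it is unsafe", "negative"),
    ("worried the vaccine is dangerous", "negative"),
    ("vaccine appointments open on monday", "neutral"),
    ("the clinic is distributing doses this week", "neutral"),
    ("second dose scheduled for next month", "neutral"),
]]

if __name__ == "__main__":
    tweets = ["The vaccine is safe and it works",
              "Terrible side effects, I do not trust it",
              "Clinic doses open next week"]
    predictions = [knn_predict(TRAIN, t) for t in tweets]
    print(predictions)
    print(sentiment_shares(predictions))
```

Per-vaccine percentages like those reported would come from running such a classifier over each vaccine's tweets separately and passing the predicted labels to `sentiment_shares`.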
Topic models were proposed to detect the underlying semantic structure of large collections of text documents, making it easier to browse and access documents with similar ideas and topics. Applying topic models to short text documents to extract meaningful topics is challenging. The problem becomes even more complicated when dealing with short and noisy micro‐posts on Twitter that all concern one general topic. In such a case, the goal of applying topic models is to extract subtopics. This results in topics represented by similar sets of keywords, which in turn makes topic interpretation more confusing. In this paper, we propose a new method that incorporates Twitter‐LDA, WordNet, and hashtags to enhance the keyword labels that represent each topic. We weight the importance of different keywords to different topics based on the semantic relationships between keywords and their co‐occurrences in hashtags. We also propose a method to find the best number of topics to represent the text document collection. Experiments on two real‐life Twitter datasets on fashion suggest that our method performs better than the original Twitter‐LDA in terms of perplexity, topic coherence, and the quality of keywords for topic labeling.
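Topic coherence, one of the evaluation criteria named above, can be made concrete with a short sketch. The abstract does not say which coherence measure was used; the code below assumes the widely used UMass formulation, which sums log((D(w_i, w_j) + 1) / D(w_j)) over pairs of a topic's top keywords, where D counts the documents containing the given words. The function name and the toy fashion tweets are hypothetical.

```python
import math

def umass_coherence(top_words, documents, eps=1.0):
    """UMass-style coherence of one topic.

    top_words: the topic's keywords, ordered from most to least probable.
    documents: tokenized documents (lists of words).
    Scores closer to zero indicate keywords that co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            w_i, w_j = top_words[i], top_words[j]
            denom = doc_freq(w_j)
            if denom == 0:
                continue  # skip keywords that never occur in the corpus
            score += math.log((doc_freq(w_i, w_j) + eps) / denom)
    return score

# Toy tokenized tweets, made up for illustration only.
DOCS = [
    ["summer", "dress", "floral"],
    ["dress", "floral", "print"],
    ["sneakers", "running", "shoes"],
    ["shoes", "sneakers", "sale"],
    ["summer", "dress", "sale"],
]

if __name__ == "__main__":
    coherent = umass_coherence(["dress", "floral", "summer"], DOCS)
    mixed = umass_coherence(["dress", "sneakers", "running"], DOCS)
    print(coherent, mixed)  # the coherent topic scores higher
```

A score like this could also serve as one criterion when comparing models trained with different numbers of topics, although the paper's own selection method is not described in the abstract.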