The Covid-19 pandemic, in Indonesia and around the world, has not yet ended. The Indonesian government has made various efforts to minimize the spread of the virus, such as a lockdown, Large-Scale Social Restrictions (PSBB), and a ban on traveling home during the Eid al-Fitr holiday. One of the government's newer policies is the vaccination program, which has been running since early 2021 and aims to increase antibodies among the people of Indonesia against the Covid-19 virus. Sentiment analysis can be used to find out the opinions, comments, and feedback the public has given on this policy. The process begins with data collection, in which tweets are crawled from Twitter. The data is then selected and pre-processed so that it is clean and ready for classification. Next, sentiment weighting is carried out to label the data using a lexicon dictionary and negative words. The terms are then weighted with tf-idf, followed by feature selection using Information Gain. Finally, the Naive Bayes Classifier algorithm classifies the data into three classes: positive, negative, and neutral sentiment. The resulting model achieves an accuracy of 78%, a recall of 80%, and an AUC score of 0.904.
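The pipeline described in this abstract (lexicon labeling, tf-idf weighting, Information Gain selection, Naive Bayes classification) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The toy lexicon, the tweets, the cutoff of 8 selected terms, and all function names are hypothetical. Reading "negative words" as negation words that flip polarity is also an assumption.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon and negation list (not taken from the study).
LEXICON = {"good": 1, "safe": 1, "effective": 1,
           "bad": -1, "afraid": -1, "dangerous": -1}
NEGATIONS = {"not", "no", "never"}

def label_sentiment(tokens):
    """Sum lexicon scores, flipping a word's sign if a negation precedes it."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: (c / len(doc)) * math.log(n / df[t])
             for t, c in Counter(doc).items()} for doc in docs]

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """Label entropy minus the weighted entropy after splitting on term presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    cond = sum(len(part) / len(labels) * entropy(part)
               for part in (with_term, without) if part)
    return entropy(labels) - cond

def train_nb(weighted_docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace smoothing over tf-idf weights."""
    class_counts = Counter(labels)
    term_weights = {c: defaultdict(float) for c in class_counts}
    for doc, y in zip(weighted_docs, labels):
        for t, w in doc.items():
            if t in vocab:
                term_weights[y][t] += w
    model = {}
    for c, n_c in class_counts.items():
        denom = sum(term_weights[c].values()) + len(vocab)
        log_likelihood = {t: math.log((term_weights[c][t] + 1) / denom)
                          for t in vocab}
        model[c] = (math.log(n_c / len(labels)), log_likelihood)
    return model

def predict_nb(model, tokens):
    """Return the class with the highest log prior plus log likelihoods."""
    def score(c):
        prior, log_likelihood = model[c]
        return prior + sum(log_likelihood[t] for t in tokens if t in log_likelihood)
    return max(model, key=score)

# Hypothetical, already pre-processed tweets.
tweets = [
    "vaccine is safe and effective",
    "vaccine is not safe",
    "afraid of dangerous side effects",
    "vaccination schedule starts monday",
    "the vaccine is good",
    "registration opens monday",
]
docs = [t.split() for t in tweets]
labels = [label_sentiment(d) for d in docs]
weighted = tfidf(docs)

# Keep the 8 terms with the highest Information Gain (arbitrary cutoff).
all_terms = sorted({t for d in docs for t in d})
ranked = sorted(all_terms,
                key=lambda t: information_gain(docs, labels, t), reverse=True)
vocab = set(ranked[:8])

model = train_nb(weighted, labels, vocab)
print(predict_nb(model, "afraid of side effects".split()))
```

In practice a library such as scikit-learn would replace the hand-written tf-idf and Naive Bayes steps. The sketch only makes the order of the stages explicit: labels come from the lexicon, features from tf-idf, and Information Gain prunes the vocabulary before the classifier is trained.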
In the current era of Industrial Revolution 4.0, the internet has become a necessity of daily life. Heavy internet use in the community causes information to spread widely and quickly. This rapid spread goes hand in hand with the growth of digital data, which makes the public opinions contained in that data important. Such data can be processed with sentiment analysis to obtain useful information about issues developing in the community or to find out public opinion on a company's product. Because many sentiment analysis studies apply the Naive Bayes algorithm, the researchers were interested in studying the use of feature selection with this algorithm. This research was therefore conducted to determine which feature selection method is most optimal when combined with the Naive Bayes algorithm, using the Systematic Literature Review (SLR) research method. The study concluded that the most optimal feature selection method when combined with the Naive Bayes algorithm is Particle Swarm Optimization (PSO), with an average accuracy of 89.08%.