Social networks such as Twitter have emerged as platforms where people share their ideas and perspectives on various topics and issues with friends and family, creating a massive knowledge base. Sentiment analysis based on machine learning has been successful in discovering people's opinions from this abundantly available data. However, recent studies have pointed out that imbalanced data can have a negative impact on the results. In this paper, we propose a framework for improved sentiment analysis that combines a sequence of ordered preprocessing steps with resampling of the minority classes to achieve better performance. The performance of the technique can vary depending on the dataset, as its initial focus is on feature selection and feature combination. Multiple machine learning algorithms are used to classify tweets as positive, negative, or neutral. The results show that random minority oversampling can improve performance and can tackle the issue of class imbalance.
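To make the resampling step concrete, the following is a minimal sketch of random minority oversampling, assuming the tweets are held as a list of strings with a parallel list of labels. The function name `random_oversample`, the `seed` parameter, and the toy tweets are hypothetical and introduced only for illustration; this is not the paper's own implementation, and the preprocessing and feature steps of the framework are not shown.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # Every class is grown to the size of the majority class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up data: 4 positive, 2 negative, 1 neutral tweet.
tweets = [
    "love this phone", "great service", "what a nice day", "best match ever",
    "terrible delay", "worst update so far",
    "the store opens at nine",
]
labels = ["positive"] * 4 + ["negative"] * 2 + ["neutral"]

balanced_tweets, balanced_labels = random_oversample(tweets, labels)
print(Counter(balanced_labels))
```

In practice, oversampling of this kind is usually applied to the training split only, so that duplicated tweets do not leak into the evaluation data.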