Advances in technology mean that an enormous amount of data is generated every day. One of the main challenges this poses is that users are overloaded by the sheer volume of data. Effective methods are therefore needed to help users comprehend large amounts of data. This research work proposes methods to extract and represent such data. Summarization provides a brief overview of a text, and sentiment analysis computationally identifies the emotions expressed in it. A combination of text summarization and sentiment analysis is proposed for BBC news articles. A pronoun-replacement-based text summarization method is developed, and the VADER sentiment analyzer is used to determine sentiment information. 3-D visualization schemes are provided to represent the sentiment information. Sentiment analysis and classification are performed on the original BBC news articles as well as on the summarized articles, using classifiers such as Logistic Regression, Random Forest and AdaBoost. The highest classification rate observed on the original news articles is 84.93%; with summarization ratios of 25%, 50% and 75%, the highest classification rates are 78.73%, 83.06% and 83.23%, respectively.
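To make the notion of a summarization ratio concrete, the following is a minimal sketch of extractive summarization that keeps a given fraction of a document's sentences. It is not the pronoun-replacement method described in the abstract: the word-frequency scoring, the naive sentence splitter, and the names `split_sentences` and `summarize` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, ratio=0.5):
    """Keep roughly `ratio` of the sentences, ranked by average word frequency."""
    sentences = split_sentences(text)
    if not sentences:
        return ""

    # Document-level word frequencies act as a crude importance signal.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Round up so that at least one sentence is always kept.
    keep = max(1, math.ceil(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)

    # Restore the original sentence order for readability.
    chosen = sorted(ranked[:keep])
    return " ".join(sentences[i] for i in chosen)
```

Calling `summarize(article, 0.25)`, `summarize(article, 0.5)` and `summarize(article, 0.75)` would yield summaries at the three ratios mentioned above; each summary could then be passed to a sentiment analyzer and a classifier in the same way as the original article.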
Abstract: The Internet has a wide reach, leading many users to buy products online through e-commerce websites. Users usually share their opinions, comments, and reviews about products on social media, e-commerce websites, blogs, etc. The review comments provided by customers contain rich information about how they use the products they bought and about their sentiments towards those products. In this research, we collected reviews from Amazon.com and performed sentiment analysis to gather sentiment information. We propose 3D visualizations to represent sentiment information, such as sentiment scores and statistics about the words used in the reviews. The 3D visualizations are useful for representing large amounts of sentiment-related information and for gaining an in-depth understanding of users' sentiments. We developed a combined classifier using Logistic Regression, Decision Tree and Support Vector Machine. From the reviews, we formed N-gram features using a bag of words and performed sentiment classification with the combined classifier. With 10-fold cross-validation, the combined classifier achieved a maximum sentiment classification rate of 90.22%.
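The two ingredients named above, N-gram bag-of-words features and a combined classifier, can be sketched in a few lines. The abstract does not say how the three classifiers are combined, so the majority vote below is an assumption, and `ngram_counts` and `majority_vote` are hypothetical helper names. The base classifiers themselves are not implemented here; only their predicted labels are combined.

```python
from collections import Counter


def ngram_counts(text, n_max=2):
    """Bag-of-words counts over all n-grams with 1 <= n <= n_max."""
    tokens = text.lower().split()  # whitespace tokenization (assumption)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def majority_vote(predictions):
    """Combine one label per base classifier by simple majority.

    With three classifiers and two labels a tie cannot occur; otherwise
    the label that appears first among the tied ones is returned.
    """
    return Counter(predictions).most_common(1)[0][0]


# Example: labels that three base classifiers might emit for one review.
features = ngram_counts("not good not bad")
label = majority_vote(["positive", "negative", "positive"])
```

Bigrams such as "not good" capture negation that unigram counts alone would miss, which is one common motivation for N-gram features in sentiment classification.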