With the growth of social media, document sentiment classification has become an active area of research in this decade. It can be viewed as a special case of topical classification applied only to the subjective portions of a document (the sources of sentiment). Hence, the key task in document sentiment classification is extracting subjectivity. Existing approaches to extracting subjectivity rely heavily on linguistic resources such as sentiment lexicons and on complex supervised patterns based on part-of-speech (POS) information. This makes subjective feature extraction complex and resource-dependent. In this work, we try to minimize the dependency on linguistic resources in sentiment classification. We propose a simple statistical methodology called review summary (RSUMM) and use it in combination with well-known feature selection methods to extract subjectivity. Our experimental results on a movie review dataset demonstrate the effectiveness of the proposed methodology.
Many knowledge workers increasingly use online resources to find the latest developments in their specialty and articles of interest. Summarization is used to extract relevant information from these multiple online sources. Current summarization systems produce a single, uniform summary for all users. However, generic summaries do not cater to the user's background and interests. In this paper, we propose to make the summarization process user-specific and present a design for generating personalized summaries of online articles that are tailored to each person's interests. The user's data available on the web is used to model their background and interests. A controlled, user-centered qualitative evaluation carried out on news articles from the science and technology domain indicates better user satisfaction with personalized summaries than with generic summaries.
Social media has become so popular in recent years that people now express their views and thoughts about products and movies through reviews. Reviews have a great influence on people and on the decisions they make. This has led researchers and market analysts to analyze the opinions of users in reviews and to model their preferences. Reviews are sometimes accompanied by a customer's satisfaction score for the product or movie (a rating). These ratings usually range from one to five stars, or from very bad to excellent. In this paper, we address the problem of assigning a numerical score (one to five stars) to a review. We view it as a multi-label classification (supervised learning) problem and present two approaches, using Naïve Bayes (NB) and Support Vector Machines (SVMs). We focus on the feature representations widely used for reviews and the problems associated with them, and we present solutions that address these problems.
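To make the rating-prediction setup concrete, the following is a minimal sketch of a Naïve Bayes classifier that assigns a star rating to a review. It is not the authors' implementation: the class name `NaiveBayesRater`, the whitespace tokenizer, the unigram bag-of-words features, and the Laplace smoothing parameter `alpha` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesRater:
    """Multinomial Naive Bayes over unigram counts (illustrative assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing, hypothetical default

    def fit(self, reviews, stars):
        # Count how many reviews carry each star rating and which words they use.
        self.class_counts = Counter(stars)
        self.word_counts = defaultdict(Counter)
        for text, star in zip(reviews, stars):
            self.word_counts[star].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        words = [w for w in text.lower().split() if w in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_star, best_score = None, -math.inf
        for star in sorted(self.class_counts):
            total = sum(self.word_counts[star].values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_counts[star] / n_docs)
            for w in words:
                score += math.log((self.word_counts[star][w] + self.alpha) / total)
            if score > best_score:
                best_star, best_score = star, score
        return best_star


# Toy usage with made-up reviews (not data from the paper).
model = NaiveBayesRater().fit(
    ["terrible boring waste", "okay average fine", "great wonderful excellent"],
    [1, 3, 5],
)
print(model.predict("wonderful great film"))
```

Unigram counts are only one of the feature representations such a classifier could use; the sketch does not attempt to reproduce the alternatives the paper discusses.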
Automatic document summarization is proving to be an increasingly important task for overcoming information overload. The primary task of the document summarization process is to pick a subset of sentences as a representative of the whole document set. We treat this as a decision-making problem and estimate the risk involved in making this decision. We calculate the risk of information loss associated with each sentence and extract sentences in ascending order of their risk. The experimental results show that the proposed approach performs better than various state-of-the-art approaches.
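The extract-by-ascending-risk idea can be sketched as follows. The abstract does not define the risk function, so this example assumes a hypothetical one: the KL divergence between the document's unigram distribution and a smoothed unigram distribution of the sentence. The function names, the tokenizer, and the smoothing constant `alpha` are likewise illustrative assumptions, not the authors' method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def sentence_risk(sentence_tokens, doc_dist, alpha=0.1):
    """Assumed risk: KL(document || sentence) with additive smoothing."""
    counts = Counter(sentence_tokens)
    total = len(sentence_tokens) + alpha * len(doc_dist)
    risk = 0.0
    for word, p_doc in doc_dist.items():
        p_sent = (counts[word] + alpha) / total
        risk += p_doc * math.log(p_doc / p_sent)
    return risk


def summarize(sentences, k=2):
    """Return the k lowest-risk sentences, kept in their original order."""
    tokenized = [tokenize(s) for s in sentences]
    doc_counts = Counter(tok for toks in tokenized for tok in toks)
    n_tokens = sum(doc_counts.values())
    if n_tokens == 0:
        return []
    doc_dist = {w: c / n_tokens for w, c in doc_counts.items()}
    # Score every sentence, then sort by ascending risk of information loss.
    scored = sorted(
        (sentence_risk(toks, doc_dist), i) for i, toks in enumerate(tokenized)
    )
    chosen = sorted(i for _, i in scored[:k])
    return [sentences[i] for i in chosen]


# Toy usage: the first sentence covers the whole vocabulary, so it has lower risk.
print(summarize(["apple banana cherry", "apple"], k=1))
```

Under this assumed risk, a sentence whose word distribution is close to that of the whole document loses little information and is selected early; any other risk estimate could be substituted in `sentence_risk` without changing the selection loop.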