Information on the web is growing at an exponential pace, and online movie reviews are becoming a substantial information resource for online users. However, users post millions of movie reviews on a regular basis, and it is not feasible for readers to summarize them manually. Movie review classification and summarization is one of the challenging tasks in natural language processing. An automatic approach is therefore needed to summarize the vast number of movie reviews, allowing users to quickly distinguish the positive and negative aspects of a movie. This study proposes an approach for movie review classification and summarization. For classification, a bag-of-words feature extraction technique is used to extract unigrams, bigrams, and trigrams as the feature set from the given review documents and to represent the review documents as a vector space model. Next, the Naïve Bayes algorithm is employed to classify the movie reviews (represented as feature vectors) into positive and negative reviews. For summarization, the Word2vec feature extraction technique is used to extract features from the classified movie review sentences, and a semantic clustering technique is then used to cluster semantically related review sentences. Different text features are used to calculate the salience score of each review sentence in the clusters. Finally, the top-ranked sentences are chosen based on the highest salience scores to produce the extractive summary of the movie reviews. Experimental results reveal that the proposed machine learning approach is superior to other state-of-the-art approaches.
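The classification step can be illustrated with a minimal sketch, assuming whitespace tokenization, a multinomial Naïve Bayes variant with add-one smoothing, and a four-review toy training set. None of these details come from the abstract; the names `ngrams`, `NaiveBayes`, `train`, and `clf` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=3):
    """Return the unigrams, bigrams and trigrams of a lower-cased text."""
    tokens = text.lower().split()  # assumption: plain whitespace tokenization
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            feats = ngrams(doc)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            for feat in ngrams(doc):
                if feat in self.vocab:  # features unseen in training are skipped
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train = [
    ("a great and moving film", "pos"),
    ("great acting and a wonderful story", "pos"),
    ("a boring and terrible film", "neg"),
    ("terrible acting and a dull story", "neg"),
]
clf = NaiveBayes().fit([text for text, _ in train], [label for _, label in train])
print(clf.predict("a wonderful and moving story"))
print(clf.predict("a dull and boring film"))
```

Keeping bigrams and trigrams alongside unigrams lets short phrases such as "and moving" contribute evidence that single words would miss, at the cost of a much larger vocabulary.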
With the growth of information on the web, online movie reviews are becoming a significant information resource for Internet users. However, online users post thousands of movie reviews on a daily basis, and it is hard for readers to summarize them manually. Movie review mining and summarization is one of the challenging tasks in natural language processing. An automatic approach is therefore desirable to summarize lengthy movie reviews, allowing users to quickly recognize the positive and negative aspects of a movie. This study employs a feature extraction technique called bag of words (BoW) to extract features from movie reviews and represent the reviews as a vector space model (feature vectors). The next phase uses the Naïve Bayes machine learning algorithm to classify the movie reviews (represented as feature vectors) into positive and negative. Next, an undirected weighted graph is constructed from the pairwise semantic similarities between the classified review sentences, such that the graph nodes represent review sentences and the edge weights indicate the semantic similarity between them. The weighted graph-based ranking algorithm (WGRA) is applied to compute a rank score for each review sentence in the graph. Finally, the top-ranked sentences (graph nodes) are chosen based on the highest rank scores to produce the extractive summary. Experimental results reveal that the proposed approach is superior to other state-of-the-art approaches.
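The graph construction and ranking steps can be sketched as follows. This is not the paper's WGRA, whose exact update rule is not given in the abstract: the sketch substitutes a weighted PageRank-style iteration, and uses Jaccard word overlap as a stand-in for the semantic similarity measure. The function names, the damping factor of 0.85, and the sample sentences are all assumptions.

```python
def jaccard(a, b):
    """Word-overlap similarity; a stand-in for the paper's semantic similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def build_graph(sentences, sim=jaccard):
    """Symmetric weight matrix: nodes are sentences, weights are similarities."""
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = sim(sentences[i], sentences[j])
    return weights


def rank(weights, damping=0.85, iterations=50):
    """Weighted PageRank-style scores (assumed update rule, not the paper's)."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to edge weight.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    """Pick the k highest-ranked sentences, returned in document order."""
    scores = rank(build_graph(sentences))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Toy sentences, invented for illustration only.
reviews = [
    "the plot is gripping and the acting is superb",
    "the acting is superb",
    "the plot is gripping",
    "popcorn was expensive",
]
scores = rank(build_graph(reviews))
print(summarize(reviews, k=1))
```

A sentence that shares content with many others accumulates score from its neighbours, while an isolated sentence such as the last one keeps only the baseline term `(1 - damping) / n`.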
A large share of the data on the web comes from discussion forums, which contain millions of threads. Discussion threads are a valuable source of knowledge for Internet users, as they hold information about numerous topics. A discussion thread on a single topic can comprise a huge number of reply posts, which makes it hard for forum users to scan all the replies and determine the most relevant ones. It is equally hard for forum users to manually summarize the bulk of reply posts in order to get the gist of the discussion thread. Automatically extracting the most relevant replies from a discussion thread and combining them into a summary is therefore a challenging task. With this motivation, this study proposes a sentence embedding based clustering approach for discussion thread summarization. The proposed approach works as follows. First, a word2vec model is employed to represent the reply sentences in the discussion thread as sentence embeddings (sentence vectors). Next, the K-medoid clustering algorithm is applied to group semantically similar reply sentences in order to reduce overlap among them. Finally, different quality text features are used to rank the reply sentences within the clusters, and the high-ranked reply sentences are picked from all clusters to form the thread summary. Two standard forum datasets are used to assess the effectiveness of the proposed approach. Empirical results confirm that the proposed sentence embedding based clustering approach outperforms other summarization methods in terms of mean precision, recall, and F-measure.
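The embedding and clustering steps can be sketched as follows, under several assumptions that are not in the abstract: sentence vectors are the average of word vectors, distance is cosine distance, the medoids are initialized to the first `k` points, and the two-dimensional `word_vectors` table is a hand-made stand-in for a trained word2vec model. The ranking by text features is not shown because the abstract does not list those features.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def sentence_vector(sentence, word_vectors, dim):
    """Average the vectors of known words (assumed pooling strategy)."""
    vecs = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def k_medoids(points, k, max_iter=100):
    """Alternate between assigning points and re-picking medoids until stable."""
    medoids = list(range(k))  # assumption: deterministic init with the first k points
    clusters = {}
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: cosine_distance(p, points[m]))
            clusters[nearest].append(i)
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # keep the old medoid if its cluster emptied out
                new_medoids.append(m)
                continue
            # The medoid is the member with the smallest total distance to the rest.
            new_medoids.append(min(
                members,
                key=lambda c: sum(cosine_distance(points[c], points[o]) for o in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return clusters  # maps medoid index -> list of member indices


# Hand-made 2-D "embeddings", invented for illustration only.
word_vectors = {
    "battery": [1.0, 0.1], "charger": [0.9, 0.2],
    "screen": [0.1, 1.0], "display": [0.2, 0.9],
}
replies = [
    "the battery drains fast",
    "the screen flickers",
    "replace the charger",
    "the display is dim",
]
points = [sentence_vector(r, word_vectors, dim=2) for r in replies]
clusters = k_medoids(points, k=2)
print(sorted(clusters.values()))
```

Unlike k-means, k-medoids always uses an actual sentence as the cluster center, which is convenient for extractive summarization because every center is itself a candidate summary sentence.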
Online forums have become a main source of knowledge on the Internet, as data are constantly flowing into them. In most cases, a question in a web forum receives several responses, making it difficult for the question poster to find the most suitable answer. An important problem is therefore how to automatically extract the most appropriate, high-quality answers in a thread. Prior studies have used different combinations of lexical and nonlexical features to retrieve the most relevant answers from discussion forums; as a result, there is no standard, general set of features that can be used effectively for relevant answer (reply post) classification. This study proposes an answer detection model that relies exclusively on lexical features and employs a random forest classifier to classify answers in discussion boards. Experimental results show that the proposed answer detection model outperforms the baseline technique and other state-of-the-art machine learning algorithms in terms of classification accuracy on benchmark forum datasets.
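To make "lexical features" concrete, here is a minimal sketch of the kind of feature vector that could be computed for a question and reply pair before it is passed to a classifier such as a random forest. The abstract does not list the features used, so the three below (term-frequency cosine similarity, shared-word count, reply length) and the names `tokenize` and `lexical_features` are assumptions; the classifier itself is not shown.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?;:()\"'"


def tokenize(text):
    """Lower-case whitespace tokens with surrounding punctuation stripped."""
    tokens = (t.strip(PUNCTUATION).lower() for t in text.split())
    return [t for t in tokens if t]


def lexical_features(question, reply):
    """Hypothetical lexical feature set for one (question, reply) pair."""
    q, r = Counter(tokenize(question)), Counter(tokenize(reply))
    shared = set(q) & set(r)
    dot = sum(q[t] * r[t] for t in shared)
    norm = (math.sqrt(sum(c * c for c in q.values()))
            * math.sqrt(sum(c * c for c in r.values())))
    return {
        "cosine": dot / norm if norm else 0.0,  # term-frequency cosine similarity
        "overlap": len(shared),                 # number of distinct shared words
        "reply_length": sum(r.values()),        # reply length in tokens
    }


# Toy forum posts, invented for illustration only.
question = "How do I reset my router password?"
answer = "Hold the reset button on the router for ten seconds."
chatter = "Thanks, same problem here!"
answer_feats = lexical_features(question, answer)
chatter_feats = lexical_features(question, chatter)
print(answer_feats)
print(chatter_feats)
```

Features of this kind need only the text of the posts, which is what distinguishes them from nonlexical signals such as author reputation or post position.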
In this paper, we solve some fifth- and sixth-order boundary value problems (BVPs) by the improved residual power series method (IRPSM). IRPSM extends the residual power series method (RPSM) to BVPs without requiring the exact solution. The presented method is capable of handling both linear and nonlinear BVPs effectively. The solutions provided by IRPSM are compared with the exact solutions and with existing solutions. The results demonstrate that the approach is highly accurate and dependable.
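For readers unfamiliar with the underlying idea, the following worked example shows the classical RPSM step on a first-order toy problem, $u'(x) = u(x)$ with $u(0) = 1$, whose exact solution is $e^x$. The toy problem is an assumption chosen for brevity; it is not one of the paper's fifth- or sixth-order problems, and how IRPSM treats the boundary conditions is not described in the abstract.

```latex
% Classical RPSM on the toy problem u'(x) = u(x), u(0) = 1.
\begin{align*}
u_k(x) &= \sum_{n=0}^{k} c_n x^n, \qquad c_0 = u(0) = 1, \\
\mathrm{Res}_k(x) &= u_k'(x) - u_k(x), \\
\left.\frac{d^{\,k-1}}{dx^{\,k-1}} \mathrm{Res}_k(x)\right|_{x=0} &= 0,
  \qquad k = 1, 2, \dots \\
k = 1:\quad \mathrm{Res}_1(x) &= c_1 - 1 - c_1 x,
  \quad \mathrm{Res}_1(0) = c_1 - 1 = 0 \;\Rightarrow\; c_1 = 1, \\
k = 2:\quad \mathrm{Res}_2(x) &= (2c_2 - 1)\,x - c_2 x^2,
  \quad \mathrm{Res}_2'(0) = 2c_2 - 1 = 0 \;\Rightarrow\; c_2 = \tfrac{1}{2}, \\
u_2(x) &= 1 + x + \tfrac{x^2}{2}.
\end{align*}
```

In general the condition gives $k!\,c_k - (k-1)!\,c_{k-1} = 0$, so $c_k = 1/k!$ and the truncated series reproduces the Taylor expansion of $e^x$ term by term.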