Reviews posted on product webpages not only motivate a company to improve quality but also help users decide whether to purchase the product. Researchers have classified such reviews at the subjectivity, entity, or aspect level to determine polarity, using supervised or unsupervised techniques. However, classification into interrogatives and non-interrogatives has not yet been addressed. Existing analyses of interrogative datasets focus on tasks such as identifying answer-seeking questions in Arabic tweets, distinguishing questions that convey information from those that do not, and detecting rhetorical questions; here, classifying sentences into interrogatives and non-interrogatives is the preliminary step and a core contribution of the proposed work. If detected questions are answered, and answered in real time, this could not only encourage a user to buy the product but also give users a sense of full-duplex communication. In this work, we formulate the problem by proposing linguistic and heuristic rules that automatically detect interrogatives and answer them promptly based on the aspect mentioned in the question. If the question contains no aspect, LSI (Latent Semantic Indexing) generates an answer from the classified non-interrogatives. LSI is an efficient information retrieval algorithm that finds the document closest to a given query. Experimental results on two publicly available datasets show precisions of 95% and 96%, a 10% improvement over the alternative machine learning methods Meta Filtered Classifier and Naive Bayes.
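The abstract does not list the linguistic and heuristic rules themselves, so the following is only a minimal sketch of what a rule-based split into interrogatives and non-interrogatives might look like. The cue-word lists and the function names `is_interrogative` and `split_reviews` are assumptions made for illustration, not the rules used in the paper.

```python
import re

# Hypothetical cue lists (assumptions): the actual rules are not given in the abstract.
WH_WORDS = {"what", "which", "who", "whom", "whose", "when", "where", "why", "how"}
AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did", "can", "could",
               "will", "would", "should", "has", "have", "had", "may", "might"}


def is_interrogative(sentence: str) -> bool:
    """Heuristically decide whether a review sentence is a question."""
    text = sentence.strip().lower()
    if not text:
        return False
    # Rule 1: an explicit question mark at the end.
    if text.endswith("?"):
        return True
    # Rule 2: the sentence opens with a wh-word or an auxiliary verb,
    # which catches questions typed without punctuation.
    tokens = re.findall(r"[a-z']+", text)
    if not tokens:
        return False
    return tokens[0] in WH_WORDS or tokens[0] in AUXILIARIES


def split_reviews(sentences):
    """Partition sentences into (interrogatives, non_interrogatives)."""
    questions, statements = [], []
    for sentence in sentences:
        if is_interrogative(sentence):
            questions.append(sentence)
        else:
            statements.append(sentence)
    return questions, statements
```

Rules this crude will misfire on exclamations such as "What a great phone" and on statements that open with an auxiliary, which is presumably why a fuller rule set is needed in practice. The non-interrogative list returned by `split_reviews` corresponds to the pool from which an answer would be retrieved when the question names no aspect.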
The word is a major attribute in opinion and text mining. Based on this attribute, one decides whether a word is a keyword, aspect, feature, entity, title, or topic. Much work has been done to detect such targets using both supervised and unsupervised approaches. These targets can be used in further processing such as text analytics, sentiment analysis, information retrieval, and search. Latent Dirichlet allocation (LDA) and nonnegative matrix factorization (NMF) are the major models used for detecting topics. Understanding these algorithms in depth is necessary for those who want to extend the models, yet the opinion/text mining research community largely uses them as black boxes. There remains a question as to which model is more accurate for detecting topics. Latent semantic indexing (LSI) is the best approach for finding the document that best matches a given query. In this study, we analyzed the LDA and NMF models using LSI to determine the better model for opinion/text mining and found that both perform very well, with NMF slightly better than LDA.
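To make the idea of LSI finding the best-matching document for a query concrete, here is a small sketch that uses only the Python standard library. It is not the authors' implementation. In place of a full singular value decomposition from a linear algebra library, it obtains the top singular directions by power iteration with deflation on the document Gram matrix, which is adequate only for toy corpora. The names `lsi_rank`, `top_eigenpairs`, and `DOCS`, and the sample sentences, are all assumptions.

```python
import math
from collections import Counter

# Toy corpus (an assumption for illustration only).
DOCS = [
    "battery life long",
    "battery charge fast",
    "battery drains fast",
    "screen display bright",
    "screen resolution sharp",
]


def term_doc_matrix(docs):
    """Raw term counts: rows are terms, columns are documents."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    row = {w: i for i, w in enumerate(vocab)}
    A = [[0.0] * len(docs) for _ in vocab]
    for j, d in enumerate(docs):
        for w, c in Counter(d.lower().split()).items():
            A[row[w]][j] = float(c)
    return row, A


def top_eigenpairs(S, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(S)
    S = [r[:] for r in S]  # work on a copy so the caller's matrix is untouched
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(n)]  # deterministic start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        Sv = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * Sv[i] for i in range(n))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                S[i][j] -= lam * v[i] * v[j]
    return pairs


def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def lsi_rank(docs, query, k=2):
    """Rank documents by cosine similarity to the query in a k-dim latent space."""
    row, A = term_doc_matrix(docs)
    m, n = len(A), len(docs)
    # Eigenvectors of A^T A are the right singular vectors of A.
    G = [[sum(A[t][i] * A[t][j] for t in range(m)) for j in range(n)]
         for i in range(n)]
    q = [0.0] * m
    for w in query.lower().split():
        if w in row:  # query words outside the vocabulary are ignored
            q[row[w]] += 1.0
    doc_vecs = [[] for _ in range(n)]
    q_vec = []
    for lam, v in top_eigenpairs(G, k):
        sigma = math.sqrt(max(lam, 0.0))
        if sigma < 1e-9:
            continue
        # Left singular vector u = A v / sigma; project query and documents onto it.
        u = [sum(A[t][j] * v[j] for j in range(n)) / sigma for t in range(m)]
        q_vec.append(sum(u[t] * q[t] for t in range(m)))
        for j in range(n):
            doc_vecs[j].append(sigma * v[j])
    scores = [(j, cosine(q_vec, doc_vecs[j])) for j in range(n)]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Calling `lsi_rank(DOCS, "life")` ranks all three battery sentences above the two screen sentences, even though only the first one contains the word "life". Matching through shared latent structure instead of exact word overlap is the property that makes LSI usable as a common yardstick for comparing other models' topics.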