Word embedding models, which embed words into a high-dimensional space, have become increasingly important. These models have been widely used to extract semantic and syntactic features for sentiment analysis. However, word embedding models alone are not sufficient for sentiment analysis tasks because they do not encode sentiment features. They therefore do not adequately meet the needs of sentiment analysis applications that rely on recognizing the polarity of a sentence. In this paper, we propose a sentiment embedding model (Word2Sent) to address the weaknesses of existing word embedding models for sentiment analysis applications. We developed this model based on the Continuous Bag-of-Words model and the SentiWordNet lexicon to learn a sentiment embedding for each word from its surrounding context words. It preserves semantic and syntactic features and implicitly captures sentiment features. Moreover, it can predict sentiment features with a much lower embedding dimension than traditional models. The proposed method improves sentiment classification performance and lowers computational complexity. Both the accuracy and the processing-time results indicate that the proposed model is particularly promising.
<p>Semantic indexing and document similarity are important information retrieval problems in Big Data with broad applications. In this paper, we investigate the MapReduce programming model as a framework for managing distributed processing over a large number of documents. We then survey the state of the art in approaches for computing document similarity. Finally, we propose semantic similarity measures that use WordNet as an external semantic network resource. For evaluation, we compare the proposed approach with previously presented approaches using our new MapReduce algorithm. Experimental results reveal that our proposed approach outperforms state-of-the-art approaches in running time and improves the measurement of semantic similarity.</p>