With the rapid development of Internet technology and social networks, a large number of comment texts are generated on the Web. In the era of big data, mining the emotional tendency of comments through artificial intelligence technology helps in understanding network public opinion in a timely manner. Sentiment analysis is a branch of artificial intelligence, and research on it is valuable for obtaining the sentiment trend of comments. Sentiment analysis is essentially a text classification task, and different words contribute differently to classification. Current sentiment analysis studies mostly use distributed word representations. However, distributed word representations consider only the semantic information of a word and ignore its sentiment information. In this paper, an improved word representation method is proposed, which integrates the contribution of sentiment information into the traditional TF-IDF algorithm and generates weighted word vectors. The weighted word vectors are input into a bidirectional long short-term memory (BiLSTM) network to capture context information effectively, so that the comment vectors are better represented. The sentiment tendency of the comment is obtained by a feedforward neural network classifier. Under the same conditions, the proposed sentiment analysis method is compared with sentiment analysis methods based on RNN, CNN, LSTM, and NB. The experimental results show that the proposed method achieves higher precision, recall, and F1 score, indicating that it is effective and accurate on comments. INDEX TERMS Sentiment analysis, artificial intelligence, social network, weighted word vectors, BiLSTM.
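The idea of folding sentiment information into TF-IDF weights can be sketched as follows. This is a minimal illustration, not the paper's formula: the abstract does not give the weighting scheme, so the smoothed IDF, the multiplicative `boost` for lexicon words, and all names (`tf_idf`, `weighted_vectors`, `sentiment_words`) are assumptions, and the toy embeddings stand in for pretrained word vectors.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: tf-idf weight} dict per tokenized document.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, so that words
    appearing in every document still get a non-zero weight.
    """
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in counts.items()
        })
    return weights


def weighted_vectors(doc, weights, embeddings, sentiment_words, boost=2.0):
    """Scale each word vector by its TF-IDF weight.

    ASSUMPTION: words found in the sentiment lexicon get their weight
    multiplied by `boost`; the paper's actual contribution term may differ.
    """
    vectors = []
    for word in doc:
        weight = weights[word]
        if word in sentiment_words:
            weight *= boost
        vectors.append([weight * x for x in embeddings[word]])
    return vectors


# Toy data (hypothetical): two comments and 2-dimensional embeddings.
docs = [["good", "movie"], ["bad", "movie"]]
embeddings = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "movie": [0.0, 1.0]}
sentiment_words = {"good", "bad"}

weights = tf_idf(docs)
vectors = weighted_vectors(docs[0], weights[0], embeddings, sentiment_words)
print(vectors)
```

In a full pipeline, the resulting sequence of weighted vectors would be the input to the BiLSTM; that stage is left out here to keep the sketch dependency-free.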
Text sentiment analysis based on a sentiment dictionary often suffers because the dictionary does not contain enough sentiment words or omits field-specific sentiment words. In addition, some polysemic sentiment words can be positive, negative, or neutral depending on the field, so their polarity cannot be expressed accurately, which reduces the accuracy of text sentiment analysis to some extent. In this paper, an extended sentiment dictionary is constructed. It contains basic sentiment words, field sentiment words, and polysemic sentiment words, which improves the accuracy of sentiment analysis. A naive Bayesian classifier is used to determine the field of the text in which a polysemic sentiment word occurs, and the sentiment value of that word in the field is then obtained. The sentiment of the text is computed using the extended sentiment dictionary and the designed sentiment score rules. The experimental results indicate that the proposed sentiment analysis method based on the extended sentiment dictionary is feasible and reasonably accurate. The research is meaningful for sentiment recognition in comment texts. INDEX TERMS Chinese text sentiment analysis, text classification, naive Bayesian, sentiment dictionary.
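The two-stage approach described here (classify the field of the text, then look up field-dependent polarity) could be sketched as below. Everything concrete is an assumption made for illustration: the English toy lexicons, the fields `hotel` and `phone`, the negation and degree rules, and the names `NaiveBayesField` and `score_text`. The paper targets Chinese text and defines its own score rules, which the abstract does not list.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesField:
    """Tiny multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, fields):
        self.field_counts = Counter(fields)
        self.word_counts = defaultdict(Counter)
        for doc, field in zip(docs, fields):
            self.word_counts[field].update(doc)
        self.vocab = {word for doc in docs for word in doc}
        return self

    def predict(self, doc):
        total_docs = sum(self.field_counts.values())

        def log_posterior(field):
            counts = self.word_counts[field]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.field_counts[field] / total_docs)
            for word in doc:
                if word in self.vocab:  # unseen words are skipped
                    score += math.log((counts[word] + 1) / denom)
            return score

        return max(self.field_counts, key=log_posterior)


# Hypothetical toy lexicons; a real extended dictionary would be far larger.
BASIC = {"good": 1.0, "bad": -1.0}
POLYSEMIC = {"high": {"hotel": -1.0, "phone": 1.0}}  # polarity depends on field
NEGATIONS = {"not"}
DEGREES = {"very": 2.0}


def score_text(tokens, field):
    """Sum sentiment values; negations and degree adverbs modify the
    next sentiment word that follows them (an assumed score rule)."""
    total, modifier = 0.0, 1.0
    for token in tokens:
        if token in NEGATIONS:
            modifier *= -1.0
        elif token in DEGREES:
            modifier *= DEGREES[token]
        else:
            value = POLYSEMIC.get(token, {}).get(field, BASIC.get(token))
            if value is not None:
                total += modifier * value
                modifier = 1.0
    return total


classifier = NaiveBayesField().fit(
    [["room", "price", "bed"], ["screen", "battery", "price"]],
    ["hotel", "phone"],
)
text = ["room", "price", "very", "high"]
field = classifier.predict(text)
print(field, score_text(text, field))
```

The same word "high" scores negatively once the text is assigned to the hotel field and positively in the phone field, which is the behavior the polysemic part of the dictionary is meant to capture.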
With the rapid development of the Internet, the amount of data has grown exponentially. On the one hand, the accumulation of big data provides basic support for artificial intelligence. On the other hand, how to extract knowledge of interest from such a huge volume of data has become a matter of general concern. Topic tracking can help people explore how topics develop within huge and complex collections of network text. By effectively organizing large-scale news documents, this paper proposes a method for modeling the evolution of news topics over time, in order to track topics and their evolution in a news text set. First, the LDA (latent Dirichlet allocation) model is used to extract topics from news texts, and Gibbs sampling is used to infer the parameters. Topic mining with the K-means method is used as a comparison to highlight the advantages of LDA for topic discovery. Second, an improved single-pass algorithm is used to track news topics. The JS (Jensen-Shannon) divergence is used to measure topic similarity, and a time decay function is introduced to raise the similarity between topics that are close in time. Finally, the strength of each news topic and the change in its content across different time windows are analyzed. The experiments show that the proposed method can effectively detect and track topics and clearly reflect the trend of topic evolution.
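A single-pass tracker that combines JS divergence with a time decay might look like the sketch below. The abstract does not state the decay function or the threshold, so the similarity form `(1 - JS) * exp(-decay * |t1 - t2|)`, the parameter values, and the function names are assumptions; the input distributions stand in for topic-word distributions produced by LDA.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence; with base-2 logs it lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def similarity(p, t_p, q, t_q, decay=0.1):
    """ASSUMED form: content similarity damped by the gap between time
    windows, so topics close in time score relatively higher."""
    return (1.0 - js_divergence(p, q)) * math.exp(-decay * abs(t_p - t_q))


def single_pass(topics, threshold=0.8, decay=0.1):
    """Assign each (time_window, distribution) topic to a thread.

    A topic joins the most similar existing thread if the similarity
    reaches `threshold`; otherwise it starts a new thread. Each thread
    is represented by its most recent topic.
    """
    threads = []      # latest (time_window, distribution) per thread
    assignments = []  # thread index for each input topic, in order
    for time, dist in topics:
        best, best_sim = None, threshold
        for idx, (t_last, d_last) in enumerate(threads):
            sim = similarity(dist, time, d_last, t_last, decay)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            threads.append((time, dist))
            best = len(threads) - 1
        else:
            threads[best] = (time, dist)
        assignments.append(best)
    return assignments


# Hypothetical topic-word distributions over a 3-word vocabulary.
topics = [
    (0, [0.70, 0.20, 0.10]),
    (1, [0.65, 0.25, 0.10]),  # close in content and time to the first
    (1, [0.05, 0.05, 0.90]),  # different content
]
assignments = single_pass(topics)
print(assignments)
```

Representing a thread by its latest topic lets the tracked content drift gradually across time windows, which is one simple way to follow topic evolution; averaging over all members would be an alternative design.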
With the rapid development of the Internet, the number of Internet users has grown rapidly, and the Internet has become increasingly influential in people's lives. As a result, the amount of network text is increasing rapidly, and it is difficult to extract event information of interest from it by manual reading alone. Event extraction techniques, which automatically extract useful information from large amounts of unstructured text, are therefore becoming increasingly important. Event detection is the first step of the event extraction task and plays a vital role in it. However, current event detection research lacks comprehensive consideration of the context of trigger words. This paper proposes a Chinese event detection method based on multi-feature fusion and BiLSTM. In the method, the contextual information of a word is divided into sentence-level and document-level information, and it is captured with a BiLSTM model. A word representation method suited to trigger word classification is also proposed. The word representation incorporates the semantic information, grammatical information, and document-level context information of a word. The word vectors in a sentence are input sequentially into the BiLSTM model to obtain output vectors containing sentence-level contextual information. Finally, the output vectors of the BiLSTM are input into a Softmax classifier to identify the trigger words. The experimental results show that the proposed Chinese event detection method based on multi-feature fusion and BiLSTM achieves high accuracy.
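The multi-feature word representation and the final Softmax step can be illustrated with a small dependency-free sketch. It deliberately omits the BiLSTM itself, which would sit between the two functions in a real model. The specific features are assumptions: a one-hot part-of-speech tag stands in for grammatical information, the mean of the document's word vectors stands in for document-level context, and the names `fuse_features`, `POS_TAGS`, and the English toy sentence are hypothetical.

```python
import math

POS_TAGS = ["noun", "verb", "other"]  # hypothetical, simplified tag set


def one_hot(tag):
    """Encode a part-of-speech tag as a one-hot grammar feature."""
    return [1.0 if tag == known else 0.0 for known in POS_TAGS]


def mean_vector(vectors):
    """Element-wise mean, used here as a crude document-level context."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def fuse_features(sentence, document_vectors, embeddings):
    """Concatenate semantic, grammatical, and document-level features
    for every (word, tag) pair in the sentence."""
    doc_context = mean_vector(document_vectors)
    return [
        embeddings[word] + one_hot(tag) + doc_context
        for word, tag in sentence
    ]


def softmax(scores):
    """Numerically stable softmax over per-class scores."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Toy data (hypothetical): 2-dimensional embeddings for a 2-word sentence.
embeddings = {"troops": [0.1, 0.9], "attacked": [0.8, 0.2]}
sentence = [("troops", "noun"), ("attacked", "verb")]

fused = fuse_features(sentence, list(embeddings.values()), embeddings)
print(len(fused[0]))  # 2 semantic + 3 grammar + 2 document-level values

# In a full model the fused vectors would pass through a BiLSTM and a
# linear layer to produce per-class scores; the scores below are made up.
probabilities = softmax([2.0, 0.5])
print(probabilities)
```

Concatenation keeps each feature group in its own slice of the vector, so the downstream network can learn how much weight to give semantic, grammatical, and document-level evidence when deciding whether a word is a trigger.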