Purpose
The purpose of this paper is to identify salient topic categories and outline their evolution patterns and temporal trends in microblogs on a public health emergency across different stages. Comparisons were also made to reveal the similarities and differences between those patterns and trends on microblog platforms of different languages and from different nations.
Design/methodology/approach
A total of 459,266 microblog entries about the Ebola outbreak in West Africa in 2014 on Twitter and Weibo were collected for nine months after the inception of the outbreak. Topics were detected by the latent Dirichlet allocation model and classified into several categories. The daily tweets were analyzed with the self-organizing map technique and labeled with the most salient topics. The investigated time span was divided into three stages, and the most salient topic categories were identified for each stage.
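The final step described above (dividing the time span into stages and finding the most salient topic category in each) can be sketched as follows. This is a minimal illustration, not the study's procedure: it assumes each day has already been labeled with one topic category, and it assumes three stages of equal length, whereas the abstract does not say how the stage boundaries were chosen. The function name `most_salient_by_stage` is hypothetical.

```python
from collections import Counter
from datetime import date


def most_salient_by_stage(daily_labels, n_stages=3):
    """Return the most frequent topic category in each stage.

    daily_labels: list of (date, category) pairs, one per labeled day.
    Assumption: stages are equal-length slices of the observed time span;
    the original study may have delimited its stages differently.
    """
    ordered = sorted(daily_labels)
    start, end = ordered[0][0], ordered[-1][0]
    span_days = (end - start).days + 1
    stage_len = -(-span_days // n_stages)  # ceiling division

    counts = [Counter() for _ in range(n_stages)]
    for day, category in ordered:
        # Clamp so the last day always falls into the final stage.
        idx = min((day - start).days // stage_len, n_stages - 1)
        counts[idx][category] += 1

    # A stage with no labeled days yields None.
    return [c.most_common(1)[0][0] if c else None for c in counts]
```

Called with a list of `(date, category)` pairs, the function returns one category name per stage, in chronological order.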
Findings
In total, 14 salient topic categories were identified in microblogs about the Ebola outbreak and were summarized as increasing, decreasing, fluctuating or ephemeral types. The topical evolution patterns of microblogs and temporal trends for topic categories vary on different microblog platforms. Twitter users were keen on the dynamics of the Ebola outbreak, such as status description, secondary events and so forth, while Weibo users focused on background knowledge of Ebola and precautions.
Originality/value
This study revealed evolution patterns and temporal trends of microblog topics on a public health emergency. The findings can help administrators of public health emergencies and microblog communities work together to better satisfy the information needs and physical demands of the public when public health emergencies are in progress.
Purpose: Online reviews of tourist attractions are an important reference for potential tourists choosing destinations. The main goal of this study is to conduct sentiment analysis that helps users digest large volumes of reviews, based on comments about Chinese attractions from the Japanese tourism website 4Travel.
Design/methodology/approach: Different statistics-based and rule-based methods are used to analyze the sentiment of the reviews. Three groups of novel statistics-based methods are proposed that combine feature selection functions with the traditional term frequency-inverse document frequency (TF-IDF) method. We also construct seven groups of rule-based methods. The macro-average and micro-average values for the best classification results of the methods are calculated, and the performance of the methods is reported.
Findings: We compare the statistics-based and rule-based methods separately and then compare the overall performance of the two kinds of methods. The results show that combining feature selection functions with weightings can strongly improve overall performance. The emotional vocabulary in the field of tourism (EVT), kaomojis, negative words and transitional words can notably improve performance in all three categories. The rule-based methods outperform the statistics-based ones by a narrow margin.
Research limitation: There are two limitations: 1) the empirical studies verifying the validity of the proposed methods were conducted only on Japanese; and 2) deep learning technology has not been incorporated into the methods.
Practical implications: The results help to elucidate the intrinsic characteristics of the Japanese language and their influence on sentiment analysis. These findings also provide practical usage guidelines within the field of sentiment analysis of Japanese online tourism reviews.
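To illustrate the difference between macro-averaged and micro-averaged scores, here is a minimal sketch. It assumes the averaged metric is F1 and that each review carries exactly one sentiment label; the abstract states neither, and the label names in the usage note are hypothetical.

```python
from collections import Counter


def macro_micro_f1(gold, predicted):
    """Return (macro_f1, micro_f1) for single-label classification.

    Assumption: F1 is the averaged metric; the abstract does not name it.
    """
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # true class gains a false negative

    def f1(t, f_pos, f_neg):
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, so every review counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro
```

With gold labels `["pos", "pos", "pos", "pos", "neg", "neu"]` and predictions that mislabel only the single `"neg"` review as `"pos"`, the micro average stays high because most reviews are classified correctly, while the macro average drops because one of the three classes scores zero.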
With the rapid development of the Internet, the computational analysis of social networks has grown to be a salient issue. Various studies analyse social network topics, and a considerable amount of attention has been devoted to the issue of link prediction. Link prediction aims to predict the interactions that might occur between two entities in the network. To this aim, this study proposed a novel path and node combined approach and constructed a methodology for measuring node similarities. The method was illustrated with five real datasets obtained from different types of social networks. An extensive comparison of the proposed method against existing link prediction algorithms was performed to demonstrate that the path and node combined approach achieved much higher mean average precision (MAP) and area under the curve (AUC) values than those that only consider common nodes (e.g. Common Neighbours and Adamic/Adar) or paths (e.g. Random Walk with Restart and FriendLink). The results imply that two nodes are more likely to establish a link if they have more common neighbours of lower degrees. The weight of the path connecting two nodes is inversely proportional to the product of degrees of nodes on the pathway. The combination of node and topological features can substantially improve the performance of similarity-based link prediction, compared with node-dependent and path-dependent approaches. The experiments also demonstrate that the path-dependent approaches outperform the node-dependent approaches. This indicates that topological features of networks may contribute more to improving performance than node features.
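The similarity ideas named above can be sketched in a few lines. The first two functions are the standard Common Neighbours and Adamic/Adar scores. The third, `path_weight_score`, is only an illustration of the stated principle that a path's weight is inversely proportional to the product of the degrees of the nodes on it; it is not the authors' formula. It assumes that only intermediate nodes count and that paths of up to three edges are summed, and both the function name and the `max_len` parameter are hypothetical.

```python
import math


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours(adj, x, y):
    """Number of neighbours shared by x and y."""
    return len(adj.get(x, set()) & adj.get(y, set()))


def adamic_adar(adj, x, y):
    """Shared neighbours weighted by 1 / log(degree): rare hubs count less."""
    shared = adj.get(x, set()) & adj.get(y, set())
    # A shared neighbour has degree >= 2, so the logarithm is never zero.
    return sum(1.0 / math.log(len(adj[z])) for z in shared)


def path_weight_score(adj, x, y, max_len=3):
    """Sum 1 / (product of intermediate-node degrees) over simple x-y paths.

    Assumptions (not from the source): endpoints are excluded from the
    product, the direct edge x-y is ignored, and paths longer than
    max_len edges are not considered.
    """
    total = 0.0

    def walk(node, visited, weight, depth):
        nonlocal total
        for nxt in adj.get(node, ()):
            if nxt == y:
                if depth >= 1:  # skip the direct edge x-y itself
                    total += weight
            elif nxt not in visited and depth + 1 < max_len:
                walk(nxt, visited | {nxt}, weight / len(adj[nxt]), depth + 1)

    walk(x, {x}, 1.0, 0)
    return total
```

In a toy graph where `a` and `b` share a degree-2 neighbour and a degree-4 neighbour, the path through the low-degree neighbour contributes twice as much to `path_weight_score` as the path through the hub, which mirrors the observation that common neighbours of lower degree are stronger evidence for a future link.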