Online posts have gradually become a major carrier of public opinion on social media, and social network hotspots are an important basis for studying that opinion. Extracting hotspots from the textual big data of online posts is therefore significant for monitoring Internet public opinion. However, current hotspot extraction methods focus on user features derived from textual big data that contains spam and low-quality content, and they seldom consider the time span of posts or the popularity of users. Accordingly, this article presents a hybrid solution for extracting hotspot information from the textual data of online posts. First, a filtering strategy is designed to obtain higher-quality textual data. Second, a topic hot degree is defined that considers the average number of replies and the popularity of the participants. Third, an improved co-word analysis technique is used to find posts on the same topic, and a bisecting k-means clustering algorithm that uses repliers' popularity and key posts is designed for studying and monitoring the hotspots of online posts in a valid big data environment. Finally, the proposed algorithms are verified experimentally by extracting the hotspots of online posts from a dataset. The results show that the data filtering strategy helps to obtain more valuable information and reduces computing time. In comparison with traditional methods, they also demonstrate that the proposed solution helps to obtain hotspots and that the hot degree reflects the trend of online posts.
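The abstract above names bisecting k-means as its clustering step but gives no details. As a rough illustration only, the following is a minimal sketch of the standard, generic bisecting k-means procedure on plain numeric vectors. It does not include the repliers' popularity or key-post weighting that the authors describe, and all names (`bisecting_kmeans`, `two_means`, `trials`, `seed`) are hypothetical choices for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = mean(points)
    return sum(sq_dist(p, center) for p in points)


def two_means(points, rng, iters=20):
    """Split points into two groups with ordinary 2-means (Lloyd iterations)."""
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            nearer = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[nearer].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller discards it
        new_centers = [mean(g) for g in groups]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return groups


def bisecting_kmeans(points, k, trials=5, seed=0):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    rng = random.Random(seed)
    clusters = [[tuple(p) for p in points]]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) >= 2]
        if not candidates:
            break
        target = max(candidates, key=sse)
        best = None
        for _ in range(trials):
            left, right = two_means(target, rng)
            if not left or not right:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, left, right)
        if best is None:
            break  # the target could not be split (e.g. all points identical)
        clusters = [c for c in clusters if c is not target]
        clusters.extend([best[1], best[2]])
    return clusters


# Example input: three well-separated groups of 2-D points (made-up data).
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
clusters = bisecting_kmeans(points, k=3)
```

Choosing the cluster with the largest sum of squared errors as the next one to split is one common convention; splitting the largest cluster by size is another. Which rule the authors use is not stated in the abstract.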
It is well established that the number of internet users has increased rapidly in the past few years. Meanwhile, various types of fake information (such as fake news or rumors) have been flooding social media platforms and online communities. The effective containment or control of fake news and rumors has drawn wide attention, from academia to social media platforms. For that reason, numerous studies have approached the subject from different perspectives, such as complex networks and spreading models. However, in a real online community, misinformation usually spreads to thousands of users within minutes. Conventional approaches are too theoretical or complicated for practical application, respond slowly, and contain misinformation poorly. Therefore, this work proposes a hybrid strategy that exploits multi-dimensional data on users and content for the fast containment of fake information in online communities. The strategy consists of three main steps: detecting fake information quickly by continuously updating the content comparison dataset according to the specific hot topic and the fake content; creating spreading force models and user divisions from historical data; and limiting the propagation of fake information based on the content and user divisions. Finally, an online experiment was set up on a BBS (Bulletin Board System), and the results were compared with those of other methods on different metrics. The results demonstrate that the proposed solution clearly outperforms traditional methods.