The proliferation of mobile networked devices has made it easier and faster than ever for people to obtain and share information. However, this sometimes results in the propagation of erroneous information that can be difficult to distinguish from the truth, and the widespread diffusion of such information can lead to irrational and poor decision-making on important issues. In 2020, this coincided with the global outbreak of Coronavirus Disease (COVID-19), a highly contagious and deadly disease. The spread of misinformation about COVID-19 on social media has already been identified as an “infodemic” by the World Health Organization (WHO), posing significant challenges for governments seeking to manage the pandemic and driving an urgent need for methods to detect such misinformation automatically. This research uses multiple deep learning frameworks to detect misinformation in Chinese and English and compares them across different text feature selections. Each model learns the textual characteristics of true and false information for subsequent true/false prediction. The long short-term memory (LSTM), gated recurrent unit (GRU), and bidirectional long short-term memory (BiLSTM) models were selected for fake news detection. BiLSTM produced the best results, with detection accuracy reaching 94% for short-sentence English texts and 99% for long-sentence English texts, while accuracy for Chinese texts was 82%.
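To illustrate the bidirectional recurrence behind the BiLSTM model the abstract names, the sketch below runs a forward and a backward LSTM pass over a toy sequence of word embeddings and concatenates them per timestep. All dimensions, initializations, and the final scoring layer are illustrative assumptions; the paper's actual architecture and hyperparameters are not given here.

```python
import numpy as np

# Hypothetical dimensions (not from the paper): embedding, hidden, sequence length.
EMB, HID, SEQ = 8, 4, 5
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b                      # shape (4*HID,)
    i = sigmoid(z[0 * HID:1 * HID])
    f = sigmoid(z[1 * HID:2 * HID])
    g = np.tanh(z[2 * HID:3 * HID])
    o = sigmoid(z[3 * HID:4 * HID])
    c = f * c + i * g                          # new cell state
    h = o * np.tanh(c)                         # new hidden state
    return h, c

def run_lstm(xs, W, U, b):
    """Run one direction over the whole sequence, collecting hidden states."""
    h, c = np.zeros(HID), np.zeros(HID)
    outs = []
    for x in xs:
        h, c = lstm_step(x, h, c, W, U, b)
        outs.append(h)
    return np.stack(outs)                      # (SEQ, HID)

# Separate randomly initialized parameters for the forward and backward LSTMs.
params = [(rng.normal(size=(4 * HID, EMB)) * 0.1,
           rng.normal(size=(4 * HID, HID)) * 0.1,
           np.zeros(4 * HID)) for _ in range(2)]

def bilstm(xs):
    """Concatenate forward states with re-aligned backward states per timestep."""
    fwd = run_lstm(xs, *params[0])
    bwd = run_lstm(xs[::-1], *params[1])[::-1]
    return np.concatenate([fwd, bwd], axis=1)  # (SEQ, 2*HID)

# A toy "sentence" of SEQ random word embeddings stands in for real text.
sentence = rng.normal(size=(SEQ, EMB))
features = bilstm(sentence)

# A sigmoid over the last timestep's features yields a true/fake score in (0, 1).
w_out = rng.normal(size=2 * HID) * 0.1
score = sigmoid(w_out @ features[-1])
```

In practice such a model would be trained end to end with a framework like TensorFlow or PyTorch; the point here is only that each timestep sees context from both directions, which is what distinguishes BiLSTM from LSTM and GRU.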
Since the beginning of 2020, the COVID-19 pandemic has killed millions of people around the world, and the resulting panic has fueled the rapid, widespread dissemination of COVID-19-related disinformation on social media. This phenomenon, described by the World Health Organization (WHO) as an "infodemic", presents a serious challenge to governments and public health authorities, because misinformation spreads faster than humans can detect it. While many studies have developed automated detection techniques for COVID-19 fake news, they typically report high accuracy but rarely the model's detection time. This research uses fuzzy theory to extract features and multiple deep learning frameworks to detect Chinese and English COVID-19 misinformation. With fewer text features, the model's detection time drops significantly while accuracy does not fall excessively. The study designs two feature extraction methods based on fuzzy classification and compares their results across different deep learning models. BiLSTM provided the best detection results for COVID-19 misinformation when applying deep learning models directly, with 99% accuracy in English and 86% in Chinese. Applying fuzzy clustering to the English COVID-19 fake news features maintains 99% accuracy while reducing detection time by 10%; for Chinese misinformation, detection time is reduced by 15% at the cost of an 8% drop in accuracy.
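The abstract does not specify which fuzzy clustering algorithm was used to reduce the feature set, but a common choice is fuzzy c-means. The sketch below shows that idea on toy data: features are grouped by soft membership, and only one representative feature is kept per cluster. The data, cluster count, and selection rule are all assumptions for illustration.

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means: soft membership matrix U (n, c) and cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    for _ in range(iters):
        Um = U ** m
        # Centers are membership-weighted means of the points.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Update memberships from distances to each center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

# Toy stand-in for text-derived feature descriptors (two well-separated groups);
# the paper's actual features are not described here.
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0, 0.2, (10, 3)), rng.normal(3, 0.2, (10, 3))])
U, centers = fuzzy_cmeans(feats, c=2)

# Reduce: keep only the single feature with the highest membership per cluster,
# shrinking the input the downstream detector must process.
keep = [int(np.argmax(U[:, k])) for k in range(2)]
```

Fewer input features means fewer multiplications per recurrent step, which is one plausible mechanism for the reported 10–15% reduction in detection time.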
Text mining has long been a common research field. With the emergence of Web 2.0 and the development of social software, the amount of text generated every day has increased dramatically. These texts contain a great deal of valuable information, so how to extract and analyze that information is very important. Many studies have therefore explored text mining methods across various subfields, such as sentiment analysis, text clustering, and text summarization. However, unlike numerical data, text cannot be computed on directly: it must first be converted into vectors, and words may be polysemous, which poses additional challenges for natural language processing. Because of these challenges, various techniques are used to preprocess the data before analysis. In addition to common statistical and discrete methods for text data, methods based on fuzzy logic provide another option for effective natural language analysis; in recent years, a growing number of studies have added fuzzy logic to additionally capture the contextual semantics of individual words and support more accurate natural language processing. This survey discusses multiple text mining methods, subfields, and application areas, covering the literature published between 2010 and 2022. It is organized by the subtasks to be performed, the methods and natural language processing techniques used, and the application scenarios. The survey closes with a discussion of key points and relevant suggestions for text mining combined with fuzzy logic.
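The vector conversion the abstract mentions can be as simple as tf-idf weighting, one of the "common statistical methods" it refers to. The toy documents and weighting below are illustrative assumptions, not examples from the survey itself.

```python
import math
from collections import Counter

# Hypothetical toy corpus; any tokenized documents would do.
docs = ["covid vaccine is safe", "vaccine causes covid", "masks stop covid spread"]

# Vocabulary and raw term-frequency counts per document.
vocab = sorted({w for d in docs for w in d.split()})
tf = [Counter(d.split()) for d in docs]

def tfidf(doc_tf):
    """Classic tf-idf: term frequency times log inverse document frequency."""
    vec = []
    for w in vocab:
        df = sum(1 for t in tf if w in t)      # documents containing w
        idf = math.log(len(docs) / df)
        vec.append(doc_tf[w] * idf)
    return vec

vectors = [tfidf(t) for t in tf]
```

Note that a term appearing in every document (here "covid") gets zero weight, while the polysemy problem the abstract raises is untouched: tf-idf assigns one weight per surface form regardless of meaning, which is one motivation for the fuzzy-logic approaches the survey covers.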