Background
Coronavirus disease (COVID-19) has affected more than 200 countries and territories worldwide. This disease poses an extraordinary challenge for public health systems because screening and surveillance capacity is often severely limited, especially during the beginning of the outbreak; this can fuel the outbreak, as many patients can unknowingly infect other people.
Objective
The aim of this study was to collect and analyze posts related to COVID-19 on Weibo, a popular Twitter-like social media site in China. To our knowledge, this infoveillance study employs the largest, most comprehensive, and most fine-grained social media data to date to predict COVID-19 case counts in mainland China.
Methods
We built a Weibo user pool of 250 million people, approximately half the entire monthly active Weibo user population. Using a comprehensive list of 167 keywords, we retrieved and analyzed around 15 million COVID-19–related posts from our user pool from November 1, 2019 to March 31, 2020. We developed a machine learning classifier to identify “sick posts,” in which users report their own or other people’s symptoms and diagnoses related to COVID-19. Using officially reported case counts as the outcome, we then estimated the Granger causality of sick posts and other COVID-19 posts on daily case counts. For a subset of geotagged posts (3.10% of all retrieved posts), we also ran separate predictive models for Hubei province, the epicenter of the initial outbreak, and the rest of mainland China.
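As an illustration of the Granger causality step, the following is a minimal sketch, not the study's actual model. It asks whether lagged post counts improve a prediction of daily case counts beyond what lagged case counts already provide, and summarizes the improvement as an F statistic. The function names, the lag length of 3, and the synthetic series are all assumptions made for this example; the abstract does not state the lag structure, any transformations, or the software used.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    n, k = len(X), len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2
               for r in range(n))

def granger_f(target, predictor, lags):
    """F statistic for 'predictor Granger-causes target' with the given lag count."""
    rows = range(lags, len(target))
    y = [target[t] for t in rows]
    # Restricted model: intercept plus the target's own lags.
    own = [[1.0] + [target[t - l] for l in range(1, lags + 1)] for t in rows]
    # Unrestricted model: additionally includes the predictor's lags.
    full = [o + [predictor[t - l] for l in range(1, lags + 1)]
            for o, t in zip(own, rows)]
    rss_r, rss_u = ols_rss(own, y), ols_rss(full, y)
    dof = len(y) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Synthetic data (hypothetical): "cases" respond to "posts" two days earlier.
random.seed(0)
posts = [random.gauss(0, 1) for _ in range(150)]
cases = [0.0, 0.0]
for t in range(2, 150):
    cases.append(0.5 * cases[t - 1] + 0.8 * posts[t - 2] + random.gauss(0, 0.5))

f_forward = granger_f(cases, posts, lags=3)  # do posts help predict cases?
f_reverse = granger_f(posts, cases, lags=3)  # do cases help predict posts?
```

In this toy setup the forward statistic should be much larger than the reverse one, because the synthetic case series was built from lagged posts. A real analysis would compare the statistic against an F distribution to obtain a p-value and would check stationarity of both series first.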
Results
We found that reports of symptoms and diagnosis of COVID-19 significantly predicted daily case counts up to 14 days ahead of official statistics, whereas other COVID-19 posts did not have similar predictive power. For the subset of geotagged posts, we found that the predictive pattern held true for both Hubei province and the rest of mainland China regardless of the unequal distribution of health care resources and the outbreak timeline.
Conclusions
Public social media data can be usefully harnessed to predict infection cases and inform timely responses. Researchers and disease control agencies should pay close attention to the social media infosphere regarding COVID-19. Beyond monitoring overall search and posting activity, combining machine learning with a theoretical understanding of information-sharing behaviors is a promising way to identify true disease signals and improve the effectiveness of infoveillance.
Fake or manipulated images propagated through the Web and social media have the capacity to deceive, emotionally distress, and influence public opinions and actions. Yet few studies have examined how individuals evaluate the authenticity of images that accompany online stories. This article details a 6-batch large-scale online experiment using Amazon Mechanical Turk that probes how people evaluate image credibility across online platforms. In each batch, participants were randomly assigned to 1 of 28 news-source mockups featuring a forged image, and they evaluated the credibility of the images based on several features. We found that participants' Internet skills, photo-editing experience, and social media use were significant predictors of image
This article proposes an empirical test of whether aggregate economic behavior maps from the real to the virtual. Transaction data from a large commercial virtual world (the first such data set provided to outside researchers) is used to calculate metrics for production, consumption and money supply based on real-world definitions. Movements in these metrics over time were examined for consistency with common theories of macroeconomic change. The results indicated that virtual economic behavior follows real-world patterns. Moreover, a natural experiment occurred, in that a new version of the virtual world with the same rules came online during the study. The new world's macroeconomic aggregates quickly grew to be nearly exact replicas of those of the existing worlds,