Highlights Searches related to COVID-19 and face masks in Taiwan increased rapidly following the announcement of Taiwan's first imported case and peaked as local cases were reported. Searches for handwashing gradually increased during the face mask shortage in Taiwan. Google Trends provides information on the knowledge users most commonly seek and on the location of searches. In response to the ongoing outbreak, our results demonstrated that Google Trends could potentially define the proper timing and location for practicing appropriate risk communication strategies to the affected population. Abstract Objective: An emerging outbreak of COVID-19 has been detected in at least 26 countries worldwide. Given this pandemic situation, robust risk communication is urgently needed, particularly in affected countries. Therefore, this study explored the potential use of Google Trends (GT) to monitor public restlessness toward the COVID-19 epidemic in Taiwan. Methods: We retrieved GT data for Taiwan nationwide and for its subregions using defined search terms related to coronavirus, handwashing, and face masks. Results: Searches related to COVID-19 and face masks in Taiwan increased rapidly following the announcement of Taiwan's first imported case and peaked as local cases were reported. However, searches for handwashing gradually increased during the face mask shortage. Moreover, high to moderate correlations between Google relative search volume (RSV) and COVID-19 cases were found in Taipei (lag-3), New Taipei (lag-2), Taoyuan (lag-2), Tainan (lag-1), Taichung (lag0), and Kaohsiung (lag0). Conclusion: In response to the ongoing outbreak, our results demonstrated that GT could potentially define the proper timing and location for practicing appropriate risk communication strategies to the affected population.
Background: Digital traces have been increasingly used for health monitoring purposes in recent years. This approach is growing as a consequence of the increased use of mobile phones, the Internet, and machine learning. Many studies have reported the use of Google Trends data as a potential data source to assist traditional surveillance systems. The rise of Internet penetration (54.7%) and the high utilization of Google (98%) indicate the potential use of Google Trends in Indonesia. No study has measured the correlation between country-wide official dengue reports and Google Trends data in Indonesia. Objective: This study aims to measure the correlation between Google Trends data on dengue fever and the Indonesian national surveillance report. Methods: This research was a quantitative study using time series data (2012–2016). Two sets of data were analyzed using Moving Average analysis in Microsoft Excel. Pearson and time-lag correlations were also used to measure the correlation between those data. Results: Moving Average analysis showed that Google Trends data have a linear time series pattern with the official dengue report. Pearson correlation indicated high correlation for three defined search terms, with R-values ranging from 0.921 to 0.937 (p ≤ 0.05, overall period), which showed an increasing trend in epidemic periods (2015–2016). Time-lag correlation also indicated that Google Trends data can potentially be used as an early warning system and a novel tool to monitor public reaction before the increase of dengue cases and during the outbreak. Conclusions: Google Trends data have a linear time series pattern and are statistically correlated with annual official dengue reports. Identification of information-seeking behavior is needed to support the use of Google Trends for disease surveillance in Indonesia.
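The two analysis steps named in the Methods above, moving-average smoothing followed by a Pearson correlation, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used Microsoft Excel, the window length of 3 is arbitrary, and the two series below are invented placeholder values, not the study's data.

```python
from math import sqrt


def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical monthly values, invented for illustration only.
search_index = [10, 12, 15, 30, 55, 80, 60, 35, 20, 14, 11, 10]
reported_cases = [100, 110, 160, 290, 560, 790, 610, 340, 210, 150, 120, 100]

# Smooth both series with the same (assumed) 3-month window, then correlate.
smoothed_search = moving_average(search_index, 3)
smoothed_cases = moving_average(reported_cases, 3)
r = pearson_r(smoothed_search, smoothed_cases)
print(f"Pearson r on smoothed series: {r:.3f}")
```

Smoothing both series with the same window keeps them aligned point for point, which is why the correlation can be computed directly on the smoothed outputs.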
Background South Korea is among the best-performing countries in tackling the coronavirus pandemic by using mass drive-through testing, face mask use, and extensive social distancing. However, understanding the patterns of risk perception could also facilitate effective risk communication to minimize the impacts of disease spread during this crisis. Objective We attempt to explore patterns of community health risk perceptions of COVID-19 in South Korea using internet search data. Methods Google Trends (GT) and NAVER relative search volumes (RSVs) data were collected using COVID-19–related terms in the Korean language and were retrieved according to time, gender, age groups, types of device, and location. Online queries were compared to the number of daily new COVID-19 cases and tests reported in the Kaggle open-access data set for the time period of December 5, 2019, to May 31, 2020. Time-lag correlations calculated by Spearman rank correlation coefficients were employed to assess whether correlations between new COVID-19 cases and internet searches were affected by time. We also constructed a prediction model of new COVID-19 cases using the number of COVID-19 cases, tests, and GT and NAVER RSVs in lag periods (of 1-3 days). Single and multiple regressions were employed using backward elimination and a variance inflation factor of <5. Results The numbers of COVID-19–related queries in South Korea increased during local events including local transmission, approval of coronavirus test kits, implementation of coronavirus drive-through tests, a face mask shortage, and a widespread campaign for social distancing as well as during international events such as the announcement of a Public Health Emergency of International Concern by the World Health Organization. Online queries were also stronger in women (r=0.763-0.823; P<.001) and age groups ≤29 years (r=0.726-0.821; P<.001), 30-44 years (r=0.701-0.826; P<.001), and ≥50 years (r=0.706-0.725; P<.001). 
In terms of spatial distribution, internet search data were higher in affected areas. Moreover, greater correlations were found in mobile searches (r=0.704-0.804; P<.001) compared to those of desktop searches (r=0.705-0.717; P<.001), indicating changing behaviors in searching for online health information during the outbreak. These varied internet searches related to COVID-19 represented community health risk perceptions. In addition, as a country with a high number of coronavirus tests, results showed that adults perceived coronavirus test–related information as being more important than disease-related knowledge. Meanwhile, younger and older age groups had different perceptions. Moreover, NAVER RSVs can potentially be used for health risk perception assessments and disease predictions. Adding COVID-19–related searches provided by NAVER could increase the performance of the model compared to that of the COVID-19 case–based model and potentially be used to predict epidemic curves. Conclusions The use of both GT and NAVER RSVs to explore patterns of community health risk perceptions could be beneficial for targeting risk communication from several perspectives, including time, population characteristics, and location.
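The time-lag Spearman correlation described in the Methods of this abstract can be illustrated with a short, dependency-free Python sketch. The lag convention used here (a positive lag pairs searches on day t with cases on day t + lag) and the two series are assumptions made for the example; the abstract does not specify either.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def lagged_spearman(searches, cases, lag):
    """Assumed convention: lag > 0 pairs searches[t] with cases[t + lag]."""
    if len(searches) != len(cases):
        raise ValueError("series must have equal length")
    if lag >= 0:
        x, y = searches[:len(searches) - lag], cases[lag:]
    else:
        x, y = searches[-lag:], cases[:len(cases) + lag]
    return spearman_rho(x, y)


# Hypothetical daily series: cases repeat the search pattern two days later.
searches = [1, 3, 2, 8, 5, 9, 4, 7, 6, 10]
cases = [5, 5, 1, 3, 2, 8, 5, 9, 4, 7]

results = {lag: lagged_spearman(searches, cases, lag) for lag in range(-3, 4)}
best_lag = max(results, key=results.get)
print(f"strongest correlation at lag {best_lag}: rho = {results[best_lag]:.3f}")
```

Scanning a small window of lags and keeping the strongest coefficient is one simple way to ask whether search interest tends to lead or trail reported cases.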
Background Predictions in pregnancy care are complex because of interactions among multiple factors. Hence, pregnancy outcomes are not easily predicted by a single predictor using only one algorithm or modeling method. Objective This study aims to review and compare the predictive performances between logistic regression (LR) and other machine learning algorithms for developing or validating a multivariable prognostic prediction model for pregnancy care to inform clinicians’ decision making. Methods Research articles from MEDLINE, Scopus, Web of Science, and Google Scholar were reviewed following several guidelines for a prognostic prediction study, including a risk of bias (ROB) assessment. We report the results based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Studies were primarily framed as PICOTS (population, index, comparator, outcomes, timing, and setting): Population: men or women in procreative management, pregnant women, and fetuses or newborns; Index: multivariable prognostic prediction models using non-LR algorithms for risk classification to inform clinicians’ decision making; Comparator: the models applying an LR; Outcomes: pregnancy-related outcomes of procreation or pregnancy outcomes for pregnant women and fetuses or newborns; Timing: pre-, inter-, and peripregnancy periods (predictors), at the pregnancy, delivery, and either puerperal or neonatal period (outcome), and either short- or long-term prognoses (time interval); and Setting: primary care or hospital. The results were synthesized by reporting study characteristics and ROBs and by random effects modeling of the difference of the logit area under the receiver operating characteristic curve of each non-LR model compared with the LR model for the same pregnancy outcomes. We also reported between-study heterogeneity by using τ² and I².
Results Of the 2093 records, we included 142 studies for the systematic review and 62 studies for a meta-analysis. Most prediction models used LR (92/142, 64.8%) and artificial neural networks (20/142, 14.1%) among non-LR algorithms. Only 16.9% (24/142) of studies had a low ROB. A total of 2 non-LR algorithms from low ROB studies significantly outperformed LR. The first algorithm was a random forest for preterm delivery (logit AUROC 2.51, 95% CI 1.49-3.53; I²=86%; τ²=0.77) and pre-eclampsia (logit AUROC 1.2, 95% CI 0.72-1.67; I²=75%; τ²=0.09). The second algorithm was gradient boosting for cesarean section (logit AUROC 2.26, 95% CI 1.39-3.13; I²=75%; τ²=0.43) and gestational diabetes (logit AUROC 1.03, 95% CI 0.69-1.37; I²=83%; τ²=0.07). Conclusions Prediction models with the best performances across studies were not necessarily those that used LR; models using random forest and gradient boosting also performed well. We recommend a reanalysis of existing LR models for several pregnancy outcomes by comparing them with those algorithms that apply standard guidelines. Trial Registration PROSPERO (International Prospective Register of Systematic Reviews) CRD42019136106; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=136106
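To make the pooled quantities in this abstract concrete, the following Python sketch computes a difference in logit AUROC and pools per-study differences with a random effects model, reporting τ² and I². The abstract does not name the estimator, so the DerSimonian-Laird method used here is an assumption, as are the 1.96 normal quantile for the 95% CI and the per-study effects and variances, which are invented for illustration.

```python
from math import log, sqrt


def logit(p):
    """Log-odds transform of a proportion such as an AUROC."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return log(p / (1 - p))


def logit_auroc_difference(auroc_non_lr, auroc_lr):
    """Effect size: logit AUROC of the non-LR model minus that of LR."""
    return logit(auroc_non_lr) - logit(auroc_lr)


def random_effects_pool(effects, variances):
    """DerSimonian-Laird pooling (an assumed estimator, see lead-in)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q measures dispersion of study effects around the fixed mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random effects weights add the between-study variance to each study.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = sqrt(1 / sum(re_weights))
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical per-study logit AUROC differences and their variances.
study_effects = [0.2, 1.0, 1.8]
study_variances = [0.04, 0.04, 0.04]

summary = random_effects_pool(study_effects, study_variances)
print(summary)
```

A pooled difference above zero with a confidence interval that excludes zero would correspond to the non-LR algorithm outperforming LR on the logit AUROC scale.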
Public health agencies have suggested nonpharmaceutical interventions to curb the spread of the COVID-19 infections. The study intended to explore the information-seeking behavior and information needs on preventive measures for COVID-19 in the Philippine context. The search interests and related queries for COVID-19 terms and each of the preventive measures for the period from December 31, 2019 to April 6, 2020 were generated from Google Trends. The search terms employed for COVID-19 were coronavirus, ncov, covid-19, covid19 and “covid 19.” The search terms of the preventive measures considered for this study included “community quarantine”, “cough etiquette”, “face mask” or facemask, “hand sanitizer”, handwashing or “hand washing” and “social distancing.” Spearman’s correlation was employed between the new daily COVID-19 cases, COVID-19 terms and the different preventive measures. The relative search volume for the coronavirus disease showed an increase up to the pronouncement of the country’s first case of COVID-19. An uptrend was also evident after the country’s first local transmission was confirmed. A strong positive correlation (rs = .788, p < .001) was observed between the new daily cases and search interests for COVID-19. The search interests for the different measures and the new daily cases were also positively correlated. Similarly, the search interests for the different measures and the COVID-19 terms were all positively correlated. The search interests for “face mask” or facemask, “hand sanitizer” and handwashing or “hand washing” were more correlated with the search interest for COVID-19 than with the number of new daily COVID-19 cases. The search interests for “cough etiquette”, “social distancing” and “community quarantine” were more correlated with the number of new daily COVID-19 cases than with the search interest for COVID-19. 
The public sought additional details such as type, directions for proper use, and where to purchase, as well as do-it-yourself alternatives for personal protective items. Personal protective or community measures were expected to be accompanied by definitions and guidelines and to be available in translated versions. Google Trends could be a viable option to monitor and address the information needs of the public during a disease outbreak. Capturing and analyzing the search interests of the public could support the design and timely delivery of appropriate information essential to drive preventive measures during a disease outbreak.
Background Given the ongoing COVID-19 pandemic situation, accurate predictions could greatly help in the health resource management for future waves. However, as a new entity, COVID-19’s disease dynamics seemed difficult to predict. External factors, such as internet search data, need to be included in the models to increase their accuracy. However, it remains unclear whether incorporating online search volumes into models leads to better predictive performances for long-term prediction. Objective The aim of this study was to analyze whether search engine query data are important variables that should be included in the models predicting new daily COVID-19 cases and deaths in short- and long-term periods. Methods We used country-level case-related data, NAVER search volumes, and mobility data obtained from Google and Apple for the period of January 20, 2020, to July 31, 2021, in South Korea. Data were aggregated into four subsets: 3, 6, 12, and 18 months after the first case was reported. The first 80% of the data in all subsets were used as the training set, and the remaining data served as the test set. Generalized linear models (GLMs) with normal, Poisson, and negative binomial distribution were developed, along with linear regression (LR) models with lasso, adaptive lasso, and elastic net regularization. Root mean square error values were defined as a loss function and were used to assess the performance of the models. All analyses and visualizations were conducted in SAS Studio, which is part of the SAS OnDemand for Academics. Results GLMs with different types of distribution functions may have been beneficial in predicting new daily COVID-19 cases and deaths in the early stages of the outbreak. Over longer periods, as the distribution of cases and deaths became more normally distributed, LR models with regularization may have outperformed the GLMs. This study also found that models performed better when predicting new daily deaths compared to new daily cases. 
In addition, an evaluation of feature effects in the models showed that NAVER search volumes were useful variables in predicting new daily COVID-19 cases, particularly in the first 6 months of the outbreak. Searches related to logistical needs, particularly for “thermometer” and “mask strap,” showed higher feature effects in that period. For longer prediction periods, NAVER search volumes were still found to constitute an important variable, although with a lower feature effect. This finding suggests that search term use should be considered to maintain the predictive performance of models. Conclusions NAVER search volumes were important variables in short- and long-term prediction, with higher feature effects for predicting new daily COVID-19 cases in the first 6 months of the outbreak. Similar results were also found for death predictions.
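The evaluation setup described in the Methods of this abstract, training on the first 80% of each series and scoring the rest with root mean square error, can be sketched as follows. The study ran its models in SAS Studio; this Python version is only an illustration, the case counts are invented, and the persistence forecast is a hypothetical stand-in baseline, not one of the study's GLM or regularized regression models.

```python
from math import sqrt


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series without shuffling: first part trains."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical daily new-case counts, invented for illustration only.
daily_cases = [5, 8, 13, 21, 34, 55, 60, 58, 50, 45]

train, test = chronological_split(daily_cases)

# Stand-in baseline (an assumption): repeat the last training value.
persistence_forecast = [train[-1]] * len(test)
baseline_rmse = rmse(test, persistence_forecast)
print(f"train={len(train)} test={len(test)} baseline RMSE={baseline_rmse:.2f}")
```

Splitting by time order rather than at random keeps future observations out of the training set, which matters when the goal is forecasting.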