2021
DOI: 10.1371/journal.pone.0256858

A general method for estimating the prevalence of influenza-like-symptoms with Wikipedia data

Abstract: Influenza is an acute respiratory seasonal disease that affects millions of people worldwide and causes thousands of deaths in Europe alone. Estimating in a fast and reliable way the impact of an illness on a given country is essential to plan and organize effective countermeasures, which is now possible by leveraging unconventional data sources like web searches and visits. In this study, we show the feasibility of exploiting machine learning models and information about Wikipedia’s page views of a selected g…

Cited by 3 publications (7 citation statements)
References: 38 publications

“… 39 Caldwell et al evaluated internet traffic to influenza‐related pages on the US Centers for Disease Control (CDC) website against the percentage of visits for ILI across three influenza seasons. 37 They found strong correlations for some seasons, particularly at national and regional levels but no correlation in many other cases, particularly when looking at the state level. Depending on which pages were considered, the geographical unit and the season under study, optimal correlations were achieved either with no lag or using ILI data 1 week in advance of online search data.…”
Section: Results (mentioning)
confidence: 96%
“…Our search identified 10 articles addressing internet search queries or webpage views. 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 Four articles studied Baidu, a popular internet search engine in China. 38 , 39 , 41 , 44 Three of these provided data on both correlative value and timeliness.…”
Section: Results (mentioning)
confidence: 99%
“…The specific topics explored by the reviewed studies included Dementia (Brigo et al, 2015; Alibudbud, 2023b), fencing response (Roe et al, 2023), tumors (e.g., pancreatic tumors, brain tumors, colorectal cancer) (Naik et al, 2021; Mondia et al, 2022; Gianfredi et al, 2023), substance use disorder (Alibudbud and Cleofas, 2022), epilepsy (Brigo et al, 2015; Okumura et al, 2016), schizophrenia (Adams et al, 2020), diabetes mellitus (Potapov et al, 2021), pain (e.g., migraine, low back pain, inflammation, sciatica) (Brigo et al, 2015; Szmuda et al, 2020; Ciaffi et al, 2021; Potapov et al, 2021), cardiovascular diseases (Potapov et al, 2021), gastrointestinal conditions (Potapov et al, 2021), dermatological agents (Potapov et al, 2021), viral infections (e.g., coronavirus, COVID-19, influenza, Chikungunya) (Laurent and Vickers, 2009; Mahroum et al, 2018; Provenzano et al, 2019; Qiu et al, 2019; Gozzi et al, 2020; O'Leary and Storey, 2020; De Toni et al, 2021; Gianfredi et al, 2021; Rutovic et al, 2021; Storey and O'Leary, 2022), autoimmune conditions (e.g., Systemic Lupus Erythematosus) (Sciascia and Radin, 2017), various medications (e.g., Abacavir, Zidovudine) (Sciascia and Radin, 2017; Apollonio et al, 2018; Darrow and Borisova, 2022), different diets (e.g., vegetarian) (Nucci et al, 2021), frostbite (Laurent and Vickers, 2009), hypothermia (Laurent and Vickers, 2009), carbon monoxide poisoning (Laurent and Vickers, 2009), hyperthermia (Laurent and Vickers, 2009), sunburn (Laurent and Vickers, 2009), insect bites (Laurent and Vickers, 2009), and women's health-related topics (e.g., discrimination) (Wang and Zhang, 2020). Therefore, the reviewed studies have been predominantly used in understanding health information utilization for various communicable and non-communicable diseases.…”
Section: Results (mentioning)
confidence: 99%
“…To further improve prediction accuracy, numerous methods often combine various data sources, including influenza surveillance data [14], web search data [26,27], temperature data [16], air pollutant data [28,29], Wikipedia access data [30], Twitter posts [31], and electronic medical records [32]. For instance, the ARGO method combines Google search term data and the LASSO model to capture people's dynamic search behavior over time; it has been noted to exhibit excellent influenza prediction performance.…”
Section: Introduction (mentioning)
confidence: 99%