Accurate real-time systems for monitoring influenza outbreaks help public health officials make informed decisions that may help save lives. We show that information extracted from cloud-based electronic health records databases, in combination with machine learning techniques and historical epidemiological information, has the potential to accurately and reliably provide near real-time regional estimates of flu outbreaks in the United States.
Background: Influenza outbreaks pose major challenges to public health around the world, leading to thousands of deaths a year in the United States alone. Accurate systems that track influenza activity at the city level are necessary to provide actionable information that can be used for clinical, hospital, and community outbreak preparation.
Objective: Although Internet-based real-time data sources such as Google searches and tweets have been successfully used to produce influenza activity estimates ahead of traditional health care–based systems at national and state levels, influenza tracking and forecasting at finer spatial resolutions, such as the city level, remain an open question. Our study aimed to present a precise, near real-time methodology capable of producing influenza estimates ahead of those collected and published by the Boston Public Health Commission (BPHC) for the Boston metropolitan area. This approach has great potential to be extended to other cities with access to similar data sources.
Methods: We first tested, separately, the ability of Google searches, Twitter posts, electronic health records, and a crowd-sourced influenza reporting system to detect influenza activity in the Boston metropolis. We then adapted a multivariate dynamic regression method named ARGO (autoregression with general online information), designed for tracking influenza at the national level, and showed that it effectively uses the above data sources to monitor and forecast influenza at the city level 1 week ahead of the current date. Finally, we presented an ensemble-based approach capable of combining information from models based on multiple data sources to more robustly nowcast as well as forecast influenza activity in the Boston metropolitan area. The performance of our models was evaluated in an out-of-sample fashion over 4 influenza seasons within 2012-2016, as well as a holdout validation period from 2016 to 2017.
Results: Our ensemble-based methods incorporating information from diverse models based on multiple data sources, including ARGO, produced the most robust and accurate results. The observed Pearson correlations between our out-of-sample flu activity estimates and those historically reported by the BPHC were 0.98 in nowcasting influenza and 0.94 in forecasting influenza 1 week ahead of the current date.
Conclusions: We show that information from Internet-based data sources, when combined using an informed, robust methodology, can be effectively used as early indicators of influenza activity at fine geographic resolutions.
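As a rough illustration of the evaluation described in this abstract, the sketch below combines weekly estimates from several models and scores the combination against reported values with a Pearson correlation. The abstract does not say how the ensemble combines its member models, so the per-week median used here is an assumption, and the model names and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def median_ensemble(model_estimates):
    """Combine per-model weekly estimates by taking the median for each week.

    Assumption: a per-week median stands in for the (unspecified) ensemble rule.
    """
    return [median(week) for week in zip(*model_estimates.values())]


# Hypothetical weekly influenza activity values (not taken from the study).
reported = [1.0, 1.4, 2.1, 3.0, 2.6, 1.8]
estimates = {
    "search_model": [1.1, 1.3, 2.4, 2.8, 2.7, 1.6],
    "ehr_model": [0.9, 1.5, 2.0, 3.3, 2.4, 1.9],
    "crowd_model": [1.3, 1.2, 1.9, 2.9, 3.0, 2.0],
}

ensemble = median_ensemble(estimates)
r = pearson(ensemble, reported)
print(f"ensemble vs reported: r = {r:.3f}")
```

In an out-of-sample setting, each weekly estimate would come from a model fitted only on data available before that week; the toy lists above skip that step.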
The distribution of health care payments to insurance plans has substantial consequences for social policy. Risk adjustment formulas predict spending in health insurance markets in order to provide fair benefits and health care coverage for all enrollees, regardless of their health status. Unfortunately, current risk adjustment formulas are known to underpredict spending for specific groups of enrollees, leading to undercompensated payments to health insurers. This incentivizes insurers to design their plans such that individuals in undercompensated groups will be less likely to enroll, reducing access to health care for these groups. To improve risk adjustment formulas for undercompensated groups, we expand on concepts from the statistics, computer science, and health economics literature to develop new fair regression methods for continuous outcomes by building fairness considerations directly into the objective function. We additionally propose a novel measure of fairness while asserting that a suite of metrics is necessary in order to evaluate risk adjustment formulas more fully. Our data application using the IBM MarketScan Research Databases and simulation studies demonstrate that these new fair regression methods may lead to massive improvements in group fairness (e.g., 98%) with only small reductions in overall fit (e.g., 4%).
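One way to picture "building fairness considerations directly into the objective function" is a least-squares fit with an added penalty on the mean residual of the undercompensated group. The sketch below is a minimal illustration under that assumption; the specific penalty, the one-feature model, the function names, and the toy data are all hypothetical and are not claimed to be the estimators proposed in the abstract.

```python
def fit_penalized_line(x, y, in_group, lam):
    """Fit y ~ a + b*x by minimizing MSE + lam * (mean group residual)^2.

    Assumption: the fairness term is the squared mean residual in the
    flagged group. Setting lam = 0 gives ordinary least squares.
    """
    n = len(x)
    idx = [i for i, flag in enumerate(in_group) if flag]
    n_g = len(idx)
    # Sample moments over all enrollees.
    mx = sum(x) / n
    mxx = sum(v * v for v in x) / n
    my = sum(y) / n
    mxy = sum(xi * yi for xi, yi in zip(x, y)) / n
    # Mean feature and mean outcome inside the group.
    cx = sum(x[i] for i in idx) / n_g
    cy = sum(y[i] for i in idx) / n_g
    # 2x2 normal equations of the penalized objective, solved by Cramer's rule.
    a11, a12, a22 = 1 + lam, mx + lam * cx, mxx + lam * cx * cx
    r1, r2 = my + lam * cy, mxy + lam * cx * cy
    det = a11 * a22 - a12 * a12
    a = (r1 * a22 - a12 * r2) / det
    b = (a11 * r2 - a12 * r1) / det
    return a, b


def mse(x, y, a, b):
    """Mean squared error of the fitted line over all observations."""
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / len(x)


def group_mean_residual(x, y, in_group, a, b):
    """Mean of (observed - predicted) in the group; positive means underprediction."""
    res = [yi - a - b * xi for xi, yi, flag in zip(x, y, in_group) if flag]
    return sum(res) / len(res)


# Hypothetical data: spending is 2 units higher for flagged enrollees.
x = [1, 2, 3, 4, 5, 6, 7, 8]
in_group = [0, 1, 0, 0, 0, 1, 0, 0]
y = [xi + 2 * g for xi, g in zip(x, in_group)]

ols = fit_penalized_line(x, y, in_group, lam=0.0)
fair = fit_penalized_line(x, y, in_group, lam=10.0)
ols_gap = group_mean_residual(x, y, in_group, *ols)
fair_gap = group_mean_residual(x, y, in_group, *fair)
print(f"group gap: OLS {ols_gap:.3f}, penalized {fair_gap:.3f}")
print(f"MSE: OLS {mse(x, y, *ols):.3f}, penalized {mse(x, y, *fair):.3f}")
```

Raising `lam` trades overall fit for a smaller group gap, which is the kind of trade-off the abstract reports; the magnitudes in this toy example carry no meaning beyond illustration.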
Background: Influenza causes an estimated 3000 to 50,000 deaths per year in the United States of America (US). Timely and representative data can help local, state, and national public health officials monitor and respond to outbreaks of seasonal influenza. Data from cloud-based electronic health records (EHR) and crowd-sourced influenza surveillance systems have the potential to provide complementary, near real-time estimates of influenza activity. The objectives of this paper are to compare two novel influenza-tracking systems with three traditional healthcare-based influenza surveillance systems at four spatial resolutions (national, regional, state, and city), and to determine the minimum number of participants in these systems required to produce influenza activity estimates that resemble the historical trends recorded by traditional surveillance systems.
Methods: We compared influenza activity estimates from five influenza surveillance systems: 1) patient visits for influenza-like illness (ILI) from the US Outpatient ILI Surveillance Network (ILINet), 2) virologic data from World Health Organization (WHO) Collaborating and National Respiratory and Enteric Virus Surveillance System (NREVSS) Laboratories, 3) Emergency Department (ED) syndromic surveillance from Boston, Massachusetts, 4) patient visits for ILI from EHR, and 5) reports of ILI from the crowd-sourced system, Flu Near You (FNY), by calculating correlations between these systems across four influenza seasons, 2012–16, at four different spatial resolutions in the US. For the crowd-sourced system, we also used a bootstrapping statistical approach to estimate the minimum number of reports necessary to produce a meaningful signal at a given spatial resolution.
Results: In general, as the spatial resolution increased, correlation values between all influenza surveillance systems decreased. Influenza-like illness rates in geographic areas with more than 250 crowd-sourced participants or with more than 20,000 visit counts for EHR tracked government-led estimates of influenza activity.
Conclusions: With a sufficient number of reports, data from novel influenza surveillance systems can complement traditional healthcare-based systems at multiple spatial resolutions.
Electronic supplementary material: The online version of this article (10.1186/s12879-018-3322-3) contains supplementary material, which is available to authorized users.
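The bootstrapping idea mentioned in the Methods can be sketched as follows: resample a fixed number of participants with replacement, compute the weekly ILI rate in each resample, and average its correlation with a reference series. This is a minimal sketch on simulated data; the reference curve, participant counts, replicate count, and function names are assumptions for illustration, not the procedure or data used in the study.

```python
import math
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def bootstrap_mean_correlation(participants, reference, n, replicates, rng):
    """Average correlation between the reference series and weekly ILI rates
    computed from `replicates` resamples of `n` participants each."""
    n_weeks = len(reference)
    correlations = []
    for _ in range(replicates):
        sample = rng.choices(participants, k=n)  # sampling with replacement
        weekly_rate = [sum(p[w] for p in sample) / n for w in range(n_weeks)]
        correlations.append(pearson(weekly_rate, reference))
    return mean(correlations)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical reference ILI rate over a 20-week season with a single peak.
reference = [0.02 + 0.10 * math.exp(-((w - 10) / 4) ** 2) for w in range(20)]

# Simulated participants: one 0/1 ILI report per week, drawn from the reference rate.
participants = [
    tuple(1 if rng.random() < rate else 0 for rate in reference)
    for _ in range(2000)
]

low = bootstrap_mean_correlation(participants, reference, 10, 30, rng)
high = bootstrap_mean_correlation(participants, reference, 1000, 30, rng)
print(f"mean correlation with 10 participants:   {low:.2f}")
print(f"mean correlation with 1000 participants: {high:.2f}")
```

Repeating the call over a grid of sample sizes and picking the smallest size whose mean correlation clears a chosen threshold would give a minimum-participant estimate in the spirit of the abstract; the threshold itself would be an analyst's choice.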