Text and structural data mining of web and social media (WSM) provides a novel disease surveillance resource and can identify online communities for targeted public health communications (PHC) to assure wide dissemination of pertinent information. WSM that mention influenza are harvested over a 24-week period, 5 October 2008 to 21 March 2009. Link analysis reveals communities for targeted PHC. Text mining is shown to identify trends in flu posts that correlate to real-world influenza-like illness patient report data. We also bring to bear a graph-based data mining technique to detect anomalies among flu blogs connected by publisher type, links, and user-tags.
Susceptibles-infectives-removals (SIR) and its derivatives are the classic mathematical models for the study of infectious diseases in epidemiology. In order to model and simulate epidemics of an infectious disease, we use cellular automata (CA). The simplifying assumptions of SIR and naive CA limit their applicability to the real world characteristics. A global stochastic cellular automata paradigm (GSCA) is proposed, which incorporates geographic and demographic based interactions. The interaction measure between the cells is a function of population density and Euclidean distance, and has been extended to include geographic, demographic and migratory constraints. The progression of diseases using traditional CA and classic SIR are analyzed, and similar behavior to the SIR model is exhibited by GSCA, using the geographic information systems (GIS) gravity model for interactions. The limitations of the SIR and naive CA models of homogeneous population with uniform mixing are addressed by the GSCA model. The GSCA model is oriented to heterogeneous population, and can incorporate interactions based on geography, demography, environment and migration patterns. The progression of diseases can be modeled at higher levels of fidelity using the GSCA model, and facilitates optimal deployment of public health resources for prevention, control and surveillance of infectious diseases.
Analysis of Google influenza-like-illness (ILI) search queries has shown a strongly correlated pattern with Centers for Disease Control (CDC) and Prevention seasonal ILI reporting data. Web and social media provide another resource to detect increases in ILI. This paper evaluates trends in blog posts that discuss influenza. Our key finding is that from 5th October 2008 to 31st January 2009, a high correlation exists between the frequency of posts, containing influenza keywords, per week and CDC influenza-like-illness surveillance data.
BackgroundThe primary aim of the study reported here was to determine the effectiveness of utilizing local spatial variations in environmental data to uncover the statistical relationships between West Nile Virus (WNV) risk and environmental factors. Because least squares regression methods do not account for spatial autocorrelation and non-stationarity of the type of spatial data analyzed for studies that explore the relationship between WNV and environmental determinants, we hypothesized that a geographically weighted regression model would help us better understand how environmental factors are related to WNV risk patterns without the confounding effects of spatial non-stationarity.MethodsWe examined commonly mapped environmental factors using both ordinary least squares regression (LSR) and geographically weighted regression (GWR). Both types of models were applied to examine the relationship between WNV-infected dead bird counts and various environmental factors for those locations. The goal was to determine which approach yielded a better predictive model.ResultsLSR efforts lead to identifying three environmental variables that were statistically significantly related to WNV infected dead birds (adjusted R2 = 0.61): stream density, road density, and land surface temperature. GWR efforts increased the explanatory value of these three environmental variables with better spatial precision (adjusted R2 = 0.71).ConclusionsThe spatial granularity resulting from the geographically weighted approach provides a better understanding of how environmental spatial heterogeneity is related to WNV risk as implied by WNV infected dead birds, which should allow improved planning of public health management strategies.
Background. The primary aim of the study reported here was to determine the effectiveness of utilizing local spatial variations in environmental data to uncover the statistical relationships between West Nile Virus (WNV) risk and environmental factors. Because least squares regression methods do not account for spatial autocorrelation and non-stationarity of the type of spatial data analyzed for studies that explore the relationship between WNV and environmental determinants, we hypothesized that a geographically weighted regression model would help us better understand how environmental factors are related to WNV risk patterns without the confounding effects of spatial non-stationarity.
Methods. We examined commonly mapped environmental factors using both ordinary least squares regression (LSR) and geographically weighted regression (GWR). Both types of models were applied to examine the relationship between WNV-infected dead bird counts and various environmental factors for those locations. The goal was to determine which approach yielded a better predictive model.
Results. LSR efforts lead to identifying three environmental variables that were statistically significantly related to WNV infected dead birds (adjusted R2=0.61): stream density, road density, and land surface temperature. GWR efforts increased the explanatory value of these three environmental variables with better spatial precision (adjusted R2 = 0.71).
Conclusions. The spatial granularity resulting from the geographically weighted approach provides a better understanding of how environmental spatial heterogeneity is related to WNV risk as implied by WNV infected dead birds, which should allow improved planning of public health management strategies.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.