Understanding how the extraction of different energy sources affects water resources requires assessing how water chemistry has changed relative to the background values of pristine streams. With such understanding, we can develop better water quality standards and ecological interpretations. However, determining pristine background chemistry is difficult in areas with heavy human impact. To learn how to do this, we compiled a master dataset of sulfate and barium concentrations ([SO4], [Ba]) in Pennsylvania (PA, USA) streams from publicly available sources. These elements were chosen because they can represent contamination related to coal and oil/gas, respectively. We applied changepoint analysis (i.e., a likelihood ratio test) to identify pristine streams, which we defined as streams with low variability in concentrations measured over years. From these pristine streams, we estimated the baseline concentrations for major bedrock types in PA. Overall, we found that 48,471 data values are available for [SO4] from 1904 to 2014 and 3,243 for [Ba] from 1963 to 2014. The statewide [SO4] baseline was estimated to be 15.8 ± 9.6 mg/L, with values ranging from 12.4 to 26.7 mg/L across bedrock types. The statewide [Ba] baseline is 27.7 ± 10.6 µg/L, with values ranging from 25.8 to 38.7 µg/L. Results show that most increases in [SO4] above the baseline occurred in areas with intensive coal mining, confirming previous studies. Sulfate inputs from acid rain were also documented. Slight increases in [Ba] since 2007, and higher [Ba] in areas with higher densities of gas wells than in other areas, could document impacts from shale gas development, the prevalence of basin brines, or decreases in acid rain and its coupled effects on [Ba] through barite solubility. The largest impacts on PA stream [Ba] and [SO4] are related to releases from coal mining or burning rather than oil and gas development.
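The changepoint idea above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a single shift in the mean with Gaussian noise of constant variance, and the function names, the `min_seg` parameter, and the decision threshold are all hypothetical. A stream whose record yields a small statistic would be a candidate "pristine" (low-variability) stream.

```python
import math

def changepoint_lrt(values, min_seg=3):
    """Scan every split point and return (best_index, statistic).

    Assumption: one shift in mean, Gaussian noise, common variance.
    The statistic is n * log(RSS_null / RSS_alt), maximized over splits,
    where RSS_null uses one mean and RSS_alt uses one mean per segment.
    """
    n = len(values)
    if n < 2 * min_seg:
        raise ValueError("series too short for the requested min_seg")
    mean_all = sum(values) / n
    rss_null = sum((v - mean_all) ** 2 for v in values)
    if rss_null == 0:
        return None, 0.0  # constant series: no evidence of any change
    best_k, best_stat = None, 0.0
    for k in range(min_seg, n - min_seg + 1):
        left, right = values[:k], values[k:]
        m1 = sum(left) / k
        m2 = sum(right) / (n - k)
        rss_alt = (sum((v - m1) ** 2 for v in left)
                   + sum((v - m2) ** 2 for v in right))
        stat = math.inf if rss_alt == 0 else n * math.log(rss_null / rss_alt)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

def looks_stable(values, threshold):
    """Hypothetical rule: no changepoint if the statistic stays below threshold."""
    _, stat = changepoint_lrt(values)
    return stat < threshold

# Made-up sulfate-like series (mg/L), for illustration only.
stable = [15.1, 15.9, 16.2, 15.4, 15.8, 16.0, 15.6, 15.3, 16.1, 15.7]
shifted = [15.1, 15.9, 16.2, 15.4, 15.8, 45.0, 44.2, 46.1, 45.5, 44.8]
print(changepoint_lrt(stable))
print(changepoint_lrt(shifted))
```

Because the statistic is a maximum over many candidate splits, it does not follow a simple chi-square distribution, so any threshold would need to be calibrated (for example by simulation) rather than read from a standard table.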
Water pollution is a major global environmental problem that poses a serious risk to public health and biological diversity. This work is motivated by the need to assess the potential environmental threat of coal mining through increased sulfate concentrations in river networks, where the concentrations do not follow any simple parametric distribution. However, existing network models mainly focus on binary or discrete networks and on weighted networks with known parametric weight distributions. We propose a principled nonparametric weighted network model based on exponential-family random graph models and local likelihood estimation, and study its model-based clustering with application to large-scale water pollution network analysis. The model requires no parametric distributional assumption on the network weights. The proposed method greatly extends the methodology and applicability of statistical network models. Furthermore, it scales to the large and complex networks found in large-scale environmental studies. The power of our proposed methods is demonstrated in simulation studies and in a real application to sulfate pollution network analysis in the Ohio watershed in Pennsylvania, United States.
Chemical spills in streams can impact ecosystem or human health. Typically, the public learns of spills from reports by industry, media, or government rather than from monitoring data. For example, ∼1300 spills (76 of them ≥400 gallons, or ∼1500 L) were reported from 2007 to 2014 by the regulator for natural gas wellpads in the Marcellus shale region of Pennsylvania (U.S.), a region of extensive drilling and hydraulic fracturing. Only one such incident of stream contamination in Pennsylvania has been documented with water quality data in the peer-reviewed literature. This could indicate that spills (1) were small or contained on wellpads, (2) were diluted, biodegraded, or obscured by other contaminants, (3) were not detected because of sparse monitoring, or (4) were not detected because of the difficulty of inspecting data for complex stream networks. As a first step in addressing the last problem, we developed a geospatial-analysis tool, GeoNet, that analyzes stream networks to detect statistically significant changes between background and potentially impacted sites. GeoNet was applied to data in the Water Quality Portal for the Pennsylvania Marcellus region. With the most stringent statistical tests, GeoNet detected 0.2% to 2% of the known contamination incidents (Na ± Cl) in streams. With denser sensor networks, tools like GeoNet could allow real-time detection of polluting events.
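The comparison of background and potentially impacted sites can be sketched as a two-sample test. The abstract does not say which statistical tests GeoNet uses, so the following is only an illustration with a swapped-in technique: a one-sided permutation test on the difference in mean concentration. The function name, its parameters, and the sample values are hypothetical.

```python
import random

def permutation_pvalue(background, impacted, n_perm=5000, seed=0):
    """One-sided permutation test: is the impacted mean higher than background?

    Not GeoNet's method; a generic test chosen for illustration.
    Returns an estimated p-value (with the usual +1 correction).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_bg = len(background)
    n_im = len(impacted)
    observed = sum(impacted) / n_im - sum(background) / n_bg
    pooled = list(background) + list(impacted)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of sites
        diff = sum(pooled[n_bg:]) / n_im - sum(pooled[:n_bg]) / n_bg
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up chloride-like concentrations (mg/L), for illustration only.
upstream = [8.1, 7.9, 8.4, 8.0, 8.3, 7.8]
downstream = [12.5, 13.1, 11.9, 12.8, 13.4, 12.2]
print(permutation_pvalue(upstream, downstream))
```

A permutation test makes no distributional assumption about concentrations, which suits skewed water-quality data, but it still presumes that the samples within each group are exchangeable, which sparse or seasonally biased monitoring may violate.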
Causality analysis, beyond "mere" correlations, has become increasingly important for scientific discoveries and policy decisions. Many real-world applications involve time series data. A key observation is that the causality between time series can vary significantly over time. For example, rain can cause severe traffic jams during rush hours but has little impact on traffic at midnight. However, previous studies mostly consider the whole time series when determining the causal relationship between them. Instead, we propose to detect the partial time intervals over which causality holds. Because it is time-consuming to enumerate all time intervals and test causality for each one, we further propose an efficient algorithm that avoids unnecessary computations based on bounds of the F-test in the Granger causality test. We use both synthetic and real datasets to demonstrate the efficiency of our pruning techniques and to show that our method can effectively discover interesting causal intervals in time series data.
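To make the interval idea concrete, here is a minimal sketch of the Granger F-test evaluated on one window at a time. It assumes lag order 1 and scans a few fixed-width windows by brute force; it does not implement the pruning bounds proposed in the abstract. All function names and the synthetic data are hypothetical.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations).

    Raises ZeroDivisionError if the design matrix is singular.
    """
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, t in zip(rows, targets))

def granger_f(x, y, start, end):
    """F statistic for 'x Granger-causes y' on window [start, end), lag 1."""
    ts = range(max(start, 1), end)
    targets = [y[t] for t in ts]
    restricted = [[1.0, y[t - 1]] for t in ts]              # y's own past only
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in ts]  # plus x's past
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)  # one restriction, so q = 1

def scan_windows(x, y, width, step):
    """Brute-force scan (no pruning): F statistic for each window."""
    return [(s, s + width, granger_f(x, y, s, s + width))
            for s in range(0, len(y) - width + 1, step)]

def make_example(n=200, seed=1):
    """Synthetic data: x drives y only in the second half of the series."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1)]
    for t in range(1, n):
        if t >= n // 2:
            y.append(0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1))
        else:
            y.append(rng.gauss(0, 1))
    return x, y

x, y = make_example()
for start, end, f_stat in scan_windows(x, y, width=100, step=50):
    print(start, end, round(f_stat, 2))
```

In this synthetic example the window covering the second half should yield a far larger F statistic than the window covering the first half, which is the kind of interval-specific signal that a whole-series test would dilute.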