Growing awareness of ongoing and rapid changes in Earth's carbon cycle is motivating a new era of research aimed at improving our understanding of ecosystems as both responders to, and drivers of, larger-scale biogeochemical dynamics. In the case of streams and rivers, this has often taken the form of elucidating their role as processors of organic carbon (OC), a capacity that far exceeds their meager size and significantly influences the export of continental OC to marine environments (Cole et al. 2007, Battin et al. 2009, Aufdenkampe et al. 2011). Amplified OC processing has been inferred from observations of smaller export loads relative to inputs, rates of ecosystem respiration that exceed gross primary production, and/or occurrence of supersaturated concentrations of the products of OC decomposition, namely, carbon dioxide (CO2) and methane (CH4).
Streams and rivers can substantially modify organic carbon (OC) inputs from terrestrial landscapes, and much of this processing is the result of microbial respiration. While carbon dioxide (CO2) is the major end‐product of ecosystem respiration, methane (CH4) is also present in many fluvial environments even though methanogenesis typically requires anoxic conditions that may be scarce in these systems. Given recent recognition of the pervasiveness of this greenhouse gas in streams and rivers, we synthesized existing research and data to identify patterns and drivers of CH4, knowledge gaps, and research opportunities. This included examining the history of lotic CH4 research, creating a database of concentrations and fluxes (MethDB) to generate a global‐scale estimate of fluvial CH4 efflux, and developing a conceptual framework and using this framework to consider how human activities may modify fluvial CH4 dynamics. Current understanding of CH4 in streams and rivers has been strongly influenced by goals of understanding OC processing and quantifying the contribution of CH4 to ecosystem C fluxes. Less effort has been directed towards investigating processes that dictate in situ CH4 production and loss. CH4 makes a meager contribution to watershed or landscape C budgets, but streams and rivers are often significant CH4 sources to the atmosphere across these same spatial extents. Most fluvial systems are supersaturated with CH4 and we estimate an annual global emission of 26.8 Tg CH4, equivalent to ~15 and 40% of wetland and lake effluxes, respectively. Less clear is the contribution of CH4 oxidation, methanogenesis, and total anaerobic respiration to whole ecosystem production and respiration. Controls on CH4 generation and persistence can be viewed in terms of proximate controls that influence methanogenesis (organic matter, temperature, alternative electron acceptors, nutrients) and distal geomorphic and hydrologic drivers.
Multiple controls combined with its extreme redox status and low solubility result in high spatial and temporal variance of CH4 in fluvial environments, which presents a substantial challenge for understanding its larger‐scale dynamics. Further understanding of CH4 production and consumption, anaerobic metabolism, and ecosystem energetics in streams and rivers can be achieved through more directed studies and comparison with knowledge from terrestrial, wetland, and aquatic disciplines.
The rapid growth of data in water resources has created new opportunities to accelerate knowledge discovery with the use of advanced deep learning tools. Hybrid models that integrate theory with state‐of‐the‐art empirical techniques have the potential to improve predictions while remaining true to physical laws. This paper evaluates the Process‐Guided Deep Learning (PGDL) hybrid modeling framework with a use‐case of predicting depth‐specific lake water temperatures. The PGDL model has three primary components: a deep learning model with temporal awareness (long short‐term memory recurrence), theory‐based feedback (model penalties for violating conservation of energy), and model pretraining to initialize the network with synthetic data (water temperature predictions from a process‐based model). In situ water temperatures were used to train the PGDL model, a deep learning (DL) model, and a process‐based (PB) model. Model performance was evaluated in various conditions, including when training data were sparse and when predictions were made outside of the range in the training data set. The PGDL model performance (as measured by root‐mean‐square error (RMSE)) was superior to DL and PB for two detailed study lakes, but only when pretraining data included greater variability than the training period. The PGDL model also performed well when extended to 68 lakes, with a median RMSE of 1.65 °C during the test period (DL: 1.78 °C, PB: 2.03 °C; in a small number of lakes PB or DL models were more accurate). This case‐study demonstrates that integrating scientific knowledge into deep learning tools shows promise for improving predictions of many important environmental variables.
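The theory-based feedback described above can be illustrated with a small sketch of a penalized loss. This is not the authors' implementation: the function names, the `weight` and `tolerance` parameters, and the exact form of the penalty (mean absolute energy imbalance beyond a tolerance) are assumptions made for illustration. The sketch only shows how an RMSE term and an energy-conservation penalty might be combined into one training objective.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def energy_penalty(energy_change, net_flux, tolerance=0.0):
    """Mean energy-budget imbalance beyond a tolerance (assumed penalty form).

    energy_change: change in lake thermal energy implied by the predicted
                   temperatures at each time step
    net_flux:      net energy input from the forcing data at each time step
    """
    violations = [
        max(0.0, abs(de - f) - tolerance)
        for de, f in zip(energy_change, net_flux)
    ]
    return sum(violations) / len(violations)


def process_guided_loss(predicted, observed, energy_change, net_flux,
                        weight=0.1, tolerance=0.0):
    """Data misfit plus a weighted penalty for violating energy conservation."""
    misfit = rmse(predicted, observed)
    penalty = energy_penalty(energy_change, net_flux, tolerance)
    return misfit + weight * penalty


# Hypothetical values: two temperature predictions and two energy-budget steps.
loss = process_guided_loss(
    predicted=[10.0, 12.0],
    observed=[11.0, 12.0],
    energy_change=[5.0, 3.0],
    net_flux=[4.0, 3.0],
    weight=0.1,
    tolerance=0.5,
)
```

With `weight` set to zero the objective reduces to plain RMSE, which corresponds to the purely data-driven (DL) case; larger weights push the model towards predictions that respect the energy budget.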
Although there are considerable site-based data for individual or groups of ecosystems, these datasets are widely scattered, have different data formats and conventions, and often have limited accessibility. At the broader scale, national datasets exist for a large number of geospatial features of land, water, and air that are needed to fully understand variation among these ecosystems. However, such datasets originate from different sources and have different spatial and temporal resolutions. By taking an open-science perspective and by combining site-based ecosystem datasets and national geospatial datasets, science gains the ability to ask important research questions related to grand environmental challenges that operate at broad scales. Documentation of such complicated database integration efforts, through peer-reviewed papers, is recommended to foster reproducibility and future use of the integrated database. Here, we describe the major steps, challenges, and considerations in building an integrated database of lake ecosystems, called LAGOS (LAke multi-scaled GeOSpatial and temporal database), that was developed at the sub-continental study extent of 17 US states (1,800,000 km²). LAGOS includes two modules: LAGOSGEO, with geospatial data on every lake with surface area larger than 4 ha in the study extent (~50,000 lakes), including climate, atmospheric deposition, land use/cover, hydrology, geology, and topography measured across a range of spatial and temporal extents; and LAGOSLIMNO, with lake water quality data compiled from ~100 individual datasets for a subset of lakes in the study extent (~10,000 lakes). Procedures for the integration of datasets included: creating a flexible database design; authoring and integrating metadata; documenting data provenance; quantifying spatial measures of geographic data; quality-controlling integrated and derived data; and extensively documenting the database.
Our procedures make a large, complex, and integrated database reproducible and extensible, allowing users to ask new research questions with the existing database or through the addition of new data. The largest challenge of this task was the heterogeneity of the data, formats, and metadata. Many steps of data integration need manual input from experts in diverse fields, requiring close collaboration.
Electronic supplementary material: The online version of this article (doi:10.1186/s13742-015-0067-4) contains supplementary material, which is available to authorized users.
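Two of the integration steps named in the LAGOS abstract, documenting data provenance and quality-controlling integrated data, can be sketched in a few lines. Everything in this example is hypothetical (the record layout, the `harmonize` function, the unit table, and the sample rows) and it does not reproduce the LAGOS procedures. It only illustrates converting heterogeneous source records to a common unit while keeping track of where each value came from and which rows were set aside.

```python
from dataclasses import dataclass

# Hypothetical conversion factors to micrograms per litre.
TO_UG_PER_L = {"ug/L": 1.0, "mg/L": 1000.0}


@dataclass(frozen=True)
class Observation:
    lake_id: str
    variable: str
    value_ug_per_l: float
    source: str  # provenance: which dataset supplied the value


def harmonize(rows, source):
    """Convert raw rows (dicts) from one source into a common schema.

    Returns (clean, rejected). Rows with an unknown unit, a missing value,
    or a negative value are set aside for expert review, not silently dropped.
    """
    clean, rejected = [], []
    for row in rows:
        factor = TO_UG_PER_L.get(row.get("unit"))
        value = row.get("value")
        if factor is None or value is None or value < 0:
            rejected.append((source, row))
            continue
        clean.append(
            Observation(row["lake_id"], row["variable"], value * factor, source)
        )
    return clean, rejected


# Hypothetical rows from two sources that report different units.
state_rows = [
    {"lake_id": "A1", "variable": "total_phosphorus", "value": 0.02, "unit": "mg/L"},
    {"lake_id": "A2", "variable": "total_phosphorus", "value": -1.0, "unit": "mg/L"},
]
university_rows = [
    {"lake_id": "A1", "variable": "total_phosphorus", "value": 18.0, "unit": "ug/L"},
    {"lake_id": "B7", "variable": "total_phosphorus", "value": 25.0, "unit": "ppb"},
]

integrated, needs_review = [], []
for rows, source in [(state_rows, "state_agency"), (university_rows, "university")]:
    clean, rejected = harmonize(rows, source)
    integrated.extend(clean)
    needs_review.extend(rejected)
```

Keeping rejected rows alongside their source name reflects the point made in the abstract that many integration steps need manual input from domain experts.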
Understanding the factors that affect water quality and the ecological services provided by freshwater ecosystems is an urgent global environmental issue. Predicting how water quality will respond to global changes not only requires water quality data, but also information about the ecological context of individual water bodies across broad spatial extents. Because lake water quality is usually sampled in limited geographic regions, often for limited time periods, assessing the environmental controls of water quality requires compilation of many data sets across broad regions and across time into an integrated database. LAGOS-NE accomplishes this goal for lakes in the northeastern-most 17 US states. LAGOS-NE contains data for 51 101 lakes and reservoirs larger than 4 ha in 17 lake-rich US states. The database includes 3 data modules for: lake location and physical characteristics for all lakes; ecological context (i.e., the land use, geologic, climatic, and hydrologic setting of lakes) for all lakes; and in situ measurements of lake water quality for a subset of the lakes from the past 3 decades for approximately 2600–12 000 lakes depending on the variable. The database contains approximately 150 000 measures of total phosphorus, 200 000 measures of chlorophyll, and 900 000 measures of Secchi depth. The water quality data were compiled from 87 lake water quality data sets from federal, state, tribal, and non-profit agencies, university researchers, and citizen scientists. This database is one of the largest and most comprehensive databases of its type because it includes both in situ measurements and ecological context data. Because ecological context can be used to study a variety of other questions about lakes, streams, and wetlands, this database can also be used as the foundation for other studies of freshwaters at broad spatial and ecological scales.
Lake water quality is affected by local and regional drivers, including lake physical characteristics, hydrology, landscape position, land cover, land use, geology, and climate. Here, we demonstrate the utility of hypothesis testing within the landscape limnology framework using a random forest algorithm on a national-scale, spatially explicit data set, the United States Environmental Protection Agency's 2007 National Lakes Assessment. For 1026 lakes, we tested the relative importance of water quality drivers across spatial scales, the importance of hydrologic connectivity in mediating water quality drivers, and how the importance of both spatial scale and connectivity differs across response variables for five important in-lake water quality metrics (total phosphorus, total nitrogen, dissolved organic carbon, turbidity, and conductivity). By modeling the effect of water quality predictors at different spatial scales, we found that lake-specific characteristics (e.g., depth, sediment area-to-volume ratio) were important for explaining water quality (54-60% variance explained), and that regionalization schemes were much less effective than lake-specific metrics (28-39% variance explained). Basin-scale land use and land cover explained 45-62% of variance, and forest cover and agricultural land uses were among the most important basin-scale predictors. Water quality drivers did not operate independently; in some cases, hydrologic connectivity (the presence of upstream surface water features) mediated the effect of regional-scale drivers. For example, for water quality in lakes with upstream lakes, regional classification schemes were much less effective predictors than lake-specific variables, in contrast to lakes with no upstream lakes or with no surface inflows. At the scale of the continental United States, conductivity was explained by drivers operating at larger spatial scales than for other water quality responses.
The current regulatory practice of using regionalization schemes to guide water quality criteria could be improved by consideration of lake-specific characteristics, which were the most important predictors of water quality at the scale of the continental United States. The spatial extent and high quality of contextual data available for this analysis make this work an unprecedented application of landscape limnology theory to water quality data. Further, the demonstrated importance of lake morphology over other controls on water quality is relevant to both aquatic scientists and managers.