Focal areas include data acquisition and assimilation enabled by AI; advanced methods, including experimental/network design and optimization and unsupervised learning (including deep learning); and hardware-related efforts involving AI (e.g., edge computing).
Science Challenge
The transformational science challenge that we address is the following: ensuring, in near real time, that the data collected from distributed sensor networks are accurate and contain useful information to identify, quantify, and predict watershed and ecosystem dynamic responses to short- and long-term perturbations.
Rationale
Research needs and challenges: The increased prevalence of extreme disturbance events such as droughts, floods, wildfires, rain-on-snow events, and extreme temperatures is having a profound impact on watershed hydrology and biogeochemical cycles [1] (e.g., on timescales of days to years). A watershed's hydro-biogeochemistry is also altered considerably by long-term changes in mean climate, such as rising temperatures, changes in the magnitude and frequency of precipitation, earlier snowmelt in mountainous regions, and reduced capacity to sequester carbon through the loss of wetlands [2-5], as well as by agricultural intensification (e.g., through enhanced nutrient loading [6]).

To better understand the variable ecosystem response (e.g., biogeochemical stocks and fluxes) under such a wide range of environmental conditions and ecological stressors, a variety of environmental datasets are actively being acquired. These experimental and observational datasets are commonly used in process models in a coupled modeling-experimental (ModEx) approach, with the aim of understanding watershed function and key hydro-biogeochemical processes under different environmental and climate stressors. However, there are four major challenges associated with this traditional ModEx approach.

(1) The first major challenge is the quality of the collected data. Data acquisition is frequently performed manually at the site where data are collected, an expensive and time-consuming process that requires high power output to ensure data collection devices provide a continuous and reliable data stream. Moreover, the acquired data can be of large volume if sampled in the medium- to high-frequency range (e.g., tens of Hz to kHz). Low sampling densities, gaps in datasets, sensor fouling over time, and signal drift pose another set of challenges. Reduced data quality from multiple sensors can result in poor sensor netting [7]; that is, the predictability of a watershed's response is diminished by poor overlapping coverage from two or more underperforming sensors. Because of these issues, a number of data processing techniques must be applied to fill data gaps or interpolate data across space and/or time, which introduces high levels of uncertainty (see the illustrative sketch at the end of this section). As a result, data validation and data-worth analysis can take several days after acquisition before the data are actively integrated into the process models.

(2) The second m...
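As a minimal illustration of the gap-filling and quality-screening step described under the first challenge, the sketch below flags spikes and slow drift in a single sensor time series and interpolates short gaps. It is not drawn from the source; it assumes a pandas Series indexed by timestamp, and all function names, thresholds, and window sizes are illustrative placeholders rather than a prescribed workflow.

```python
# Minimal sketch of automated QA/QC and gap filling for one sensor stream.
# Assumptions (not from the source): pandas/NumPy are available; thresholds,
# window sizes, and the function name qc_and_fill are illustrative only.
import numpy as np
import pandas as pd

def qc_and_fill(series: pd.Series,
                expected_freq: str = "15min",
                spike_z: float = 4.0,
                drift_window: str = "7D",
                max_gap: int = 8) -> pd.DataFrame:
    """Flag spikes and slow drift, then interpolate short gaps in a sensor series."""
    # Resample onto the expected sampling grid so missing records become explicit NaNs.
    s = series.resample(expected_freq).mean()

    # Spike screening: flag points far from a rolling median (robust to outliers).
    med = s.rolling(window=24, center=True, min_periods=6).median()
    mad = (s - med).abs().rolling(window=24, center=True, min_periods=6).median()
    spikes = (s - med).abs() > spike_z * 1.4826 * mad
    s_clean = s.mask(spikes)

    # Drift screening: flag periods where the rolling mean wanders far from the
    # long-term baseline (a crude proxy for sensor fouling / calibration drift).
    baseline = s_clean.expanding(min_periods=96).median()
    drift = (s_clean.rolling(drift_window).mean() - baseline).abs()
    drift_flag = drift > 2.0 * s_clean.std()

    # Interpolate at most max_gap consecutive missing samples in time;
    # the remainder of longer gaps stays NaN for downstream data-worth analysis.
    filled = s_clean.interpolate(method="time", limit=max_gap)

    return pd.DataFrame({
        "raw": s,
        "spike_flag": spikes,
        "drift_flag": drift_flag,
        "filled": filled,
    })

# Example usage with synthetic data standing in for a stream sensor record.
if __name__ == "__main__":
    idx = pd.date_range("2023-06-01", periods=4 * 24 * 14, freq="15min")
    rng = np.random.default_rng(0)
    values = 10 + np.sin(np.arange(len(idx)) / 96.0) + rng.normal(0, 0.1, len(idx))
    values[500:520] = np.nan              # simulated transmission gap
    values[1000] = 25.0                   # simulated spike
    report = qc_and_fill(pd.Series(values, index=idx))
    print(report[["spike_flag", "drift_flag"]].sum())
```

In practice, such screening would run at the edge or immediately on ingest, so that flagged or unfillable intervals are known within minutes rather than emerging days later during manual data validation.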