You-Wei Cheah scite author profile

The FLUXNET2015 dataset provides ecosystem-scale data on CO2, water, and energy exchange between the biosphere and the atmosphere, and other meteorological and biological measurements, from 212 sites around the globe (over 1500 site-years, up to and including year 2014). These sites, independently managed and operated, voluntarily contributed their data to create global datasets. Data were quality controlled and processed using uniform methods, to improve consistency and intercomparability across sites. The dataset is already being used in a number of applications, including ecophysiology studies, remote sensing studies, and development of ecosystem and Earth system models. FLUXNET2015 includes derived-data products, such as gap-filled time series, ecosystem respiration and photosynthetic uptake estimates, estimation of uncertainties, and metadata about the measurements, presented for the first time in this paper. In addition, 206 of these sites are for the first time distributed under a Creative Commons (CC-BY 4.0) license. This paper details this enhanced dataset and the processing methods, now made available as open-source codes, making the dataset more accessible, transparent, and reproducible.

FLUXNET-CH4: A global, multi-ecosystem dataset and analysis of methane seasonality from freshwater wetlands

Delwiche, Knox, Malhotra et al. 2021

Preprint

Abstract. Methane (CH4) emissions from natural landscapes constitute roughly half of global CH4 contributions to the atmosphere, yet large uncertainties remain in the absolute magnitude and the seasonality of emission quantities and drivers. Eddy covariance (EC) measurements of CH4 flux are ideal for constraining ecosystem-scale CH4 emissions, including their seasonality, due to quasi-continuous and high temporal resolution of flux measurements, coincident measurements of carbon, water, and energy fluxes, lack of ecosystem disturbance, and increased availability of datasets over the last decade. Here, we 1) describe the newly published dataset, FLUXNET-CH4 Version 1.0, the first global dataset of CH4 EC measurements (available at https://fluxnet.org/data/fluxnet-ch4-community-product/). FLUXNET-CH4 includes half-hourly and daily gap-filled and non gap-filled aggregated CH4 fluxes and meteorological data from 79 sites globally: 42 freshwater wetlands, 6 brackish and saline wetlands, 7 formerly drained ecosystems, 7 rice paddy sites, 2 lakes, and 15 uplands. Then, we 2) evaluate FLUXNET-CH4 representativeness for freshwater wetland coverage globally, because the majority of sites in FLUXNET-CH4 Version 1.0 are freshwater wetlands and because freshwater wetlands are a substantial source of total atmospheric CH4 emissions; and 3) provide the first global estimates of the seasonal variability and seasonality predictors of freshwater wetland CH4 fluxes. Our representativeness analysis suggests that the freshwater wetland sites in the dataset cover global wetland bioclimatic attributes (encompassing energy, moisture, and vegetation-related parameters) in arctic, boreal, and temperate regions, but only sparsely cover humid tropical regions. Seasonality metrics of wetland CH4 emissions vary considerably across latitudinal bands.
In freshwater wetlands (except those between 20° S to 20° N) the spring onset of elevated CH4 emissions starts three days earlier, and the CH4 emission season lasts 4 days longer, for each degree C increase in mean annual air temperature. On average, the onset of increasing CH4 emissions lags soil warming by one month, with very few sites experiencing increased CH4 emissions prior to the onset of soil warming. In contrast, roughly half of these sites experience the spring onset of rising CH4 emissions prior to the spring increase in gross primary productivity (GPP). The timing of peak summer CH4 emissions does not correlate with the timing for either peak summer temperature or peak GPP. Our results provide seasonality parameters for CH4 modeling, and highlight seasonality metrics that cannot be predicted by temperature or GPP (i.e., seasonality of CH4 peak). The FLUXNET-CH4 dataset provides an open-access resource for CH4 flux synthesis, has a range of applications, and is unique in that it includes coupled measurements of important CH4 drivers such as GPP and temperature. Although FLUXNET-CH4 could certainly be improved by adding more sites in tropical ecosystems and by increasing the number of site-years at existing sites, it is a powerful new resource for diagnosing and understanding the role of terrestrial ecosystems and climate drivers in the global CH4 cycle. All seasonality parameters are available at https://doi.org/10.5281/zenodo.4408468. Additionally, raw FLUXNET-CH4 data used to extract seasonality parameters can be downloaded from https://fluxnet.org/data/fluxnet-ch4-community-product/, and a complete list of the 79 individual site data DOIs is provided in Table 2 in the Data Availability section of this document.

Fault Tolerance and Scaling in e-Science Cloud Applications: Observations from the Continuing Development of MODISAzure

Humphrey, Cheah et al. 2010

Abstract. It can be natural to believe that many of the traditional issues of scale have been eliminated or at least greatly reduced via cloud computing. That is, if one can create a seemingly well-functioning cloud application that operates correctly on small or moderate-sized problems, then the very nature of cloud programming abstractions means that the same application will run as well on potentially significantly larger problems. In this paper, we present our experiences taking MODISAzure, our satellite data processing system built on the Windows Azure cloud computing platform, from the proof-of-concept stage to a point of being able to run on significantly larger problem sizes (e.g., from national-scale data sizes to global-scale data sizes). To our knowledge, this is the longest-running eScience application on the nascent Windows Azure platform. We found that while many infrastructure-level issues were thankfully masked from us by the cloud infrastructure, it was valuable to design additional redundancy and fault-tolerance capabilities such as transparent idempotent task retry and logging to support debugging of user code encountering unanticipated data issues. Further, we found that using a commercial cloud means anticipating inconsistent performance and black-box behavior of virtualized compute instances, as well as leveraging changing platform capabilities over time. We believe that the experiences presented in this paper can help future eScience cloud application developers on Windows Azure and other commercial cloud providers.

Hunting Data Rogues at Scale: Data Quality Control for Observational Data in Research Infrastructures

Pastorello, Gunter, Chu et al. 2017

Data quality control is one of the most time-consuming activities within Research Infrastructures (RIs), especially when involving observational data and multiple data providers. In this work we report on our ongoing development of data rogues, a scalable approach to manage data quality issues for observational data within RIs. The motivation for this work started with the creation of the FLUXNET2015 dataset, which includes carbon, water, and energy fluxes plus micrometeorological and ancillary data measured at over 200 sites around the world. To create a uniform dataset, including derived data products, extensive work on data quality control was needed. The unpredictable nature of observational data quality issues makes the automation of data quality control inherently difficult. Developed based on this experience, the data rogues methodology allows for increased automation of quality control activities by systematically identifying, cataloging, and documenting implementations of solutions to data issues. We believe this methodology can be extended and applied to other domains and types of data, making the automation of data quality control a more tractable problem.

Visualization of network data provenance

Chen, Plale, Cheah et al. 2012

Visualization facilitates the understanding of scientific data both through exploration and explanation of the visualized data. Provenance also contributes to the understanding of data by containing the contributing factors behind a result. The visualization of provenance, although supported in existing workflow management systems, generally focuses on small- to medium-sized provenance data, lacking techniques to deal with big data with high complexity. This paper discusses visualization techniques developed for exploration and explanation of provenance, including a layout algorithm, visual style, graph abstraction techniques, and a graph matching algorithm, to deal with the high complexity. We demonstrate these techniques through application to two extensively analyzed case studies that involved provenance capture and use over three-year projects, the first involving provenance of a satellite imagery ingest processing pipeline and the other involving provenance in a large-scale computer network testbed.

The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data