Deborah Agarwal scite author profile

Physical samples are foundational entities for research across biological, Earth, and environmental sciences. Data generated from sample-based analyses are not only the basis of individual studies, but can also be integrated with other data to answer new and broader-scale questions. Ecosystem studies increasingly rely on multidisciplinary team-science to study climate and environmental changes. While there are widely adopted conventions within certain domains to describe sample data, these have gaps when applied in a multidisciplinary context. In this study, we reviewed existing practices for identifying, characterizing, and linking related environmental samples. We then tested practicalities of assigning persistent identifiers to samples, with standardized metadata, in a pilot field test involving eight United States Department of Energy projects. Participants collected a variety of sample types, with analyses conducted across multiple facilities. We address terminology gaps for multidisciplinary research and make recommendations for assigning identifiers and metadata that supports sample tracking, integration, and reuse. Our goal is to provide a practical approach to sample management, geared towards ecosystem scientists who contribute and reuse sample data.

show abstract

Database Maintenance, Data Sharing Policy, Collaboration

Papale

Agarwal

Baldocchi

et al. 2011

View full text Add to dashboard Cite

National Alliance for Water Innovation (NAWI) Master Technology Roadmap

Sedlak¹,

Mauter²,

Macknick³

et al. 2021

View full text Add to dashboard Cite

Imputation of Contiguous Gaps and Extremes of Subhourly Groundwater Time Series Using Random Forests

Dwivedi

Mital

Faybishenko

et al. 2022

J Mach Learn Model Comput

View full text Add to dashboard Cite

Machine learning can provide sustainable solutions to gap-fill groundwater (GW) data needed to adequately constrain watershed models. However, imputing missing extremes is more challenging than other parts of a hydrograph. To impute missing subhourly data, including extremes, within GW time-series data collected at multiple wells in the East River watershed, located in southwestern Colorado, we consider a single-well imputation (SWI) and a multiple-well imputation (MWI) approach. SWI gap-fills missing GW entries in a well using the same well's time-series data; MWI gap-fills a specific well's missing GW entry using the time series of neighboring wells. SWI takes advantage of linear interpolation and random forest (RF) approaches, whereas MWI exploits only the RF approach. We also use an information entropy framework to develop insights into how missing data patterns impact imputation. We discovered that if gaps were at random intervals, SWI could accurately impute up to 90% of missing data over an approximately two-year period. Contiguous gaps constituted more complex scenarios for imputation and required the use of MWI. Information entropy suggested that if gaps were contiguous, up to 50% of missing GW data could be estimated accurately over an approximately two-year period. The RF-feature importance suggested that a time feature (months) and a space feature (neighboring wells) were the most important predictors in the SWI and MWI. We also noted that neither SWI nor MWI methods could capture the missing extremes of a hydrograph. To counter this, we developed a new sequential approach and demonstrated the imputation of missing extremes in a GW time series with high accuracy.

show abstract

Enabling FAIR data in Earth and environmental science with community-centric (meta)data reporting formats

et al. 2022

View full text Add to dashboard Cite

Research can be more transparent and collaborative by using Findable, Accessible, Interoperable, and Reusable (FAIR) principles to publish Earth and environmental science data. Reporting formats—instructions, templates, and tools for consistently formatting data within a discipline—can help make data more accessible and reusable. However, the immense diversity of data types across Earth science disciplines makes development and adoption challenging. Here, we describe 11 community reporting formats for a diverse set of Earth science (meta)data including cross-domain metadata (dataset metadata, location metadata, sample metadata), file-formatting guidelines (file-level metadata, CSV files, terrestrial model data archiving), and domain-specific reporting formats for some biological, geochemical, and hydrological data (amplicon abundance tables, leaf-level gas exchange, soil respiration, water and sediment chemistry, sensor-based hydrologic measurements). More broadly, we provide guidelines that communities can use to create new (meta)data formats that integrate with their scientific workflows. Such reporting formats have the potential to accelerate scientific discovery and predictions by making it easier for data contributors to provide (meta)data that are more interoperable and reusable.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Deborah Agarwal

Sample Identifiers and Metadata to Support Data Management and Reuse in Multidisciplinary Ecosystem Sciences

Database Maintenance, Data Sharing Policy, Collaboration

National Alliance for Water Innovation (NAWI) Master Technology Roadmap

Imputation of Contiguous Gaps and Extremes of Subhourly Groundwater Time Series Using Random Forests

Enabling FAIR data in Earth and environmental science with community-centric (meta)data reporting formats

Contact Info

Product

Resources

About