2022
DOI: 10.1111/2041-210x.13982

Implementing GitHub Actions continuous integration to reduce error rates in ecological data collection

Abstract: Accurate field data are essential to understanding ecological systems and forecasting their responses to global change. Yet, data collection errors are common, and data analysis often lags far enough behind its collection that many errors can no longer be corrected, nor can anomalous observations be revisited. Needed is a system in which data quality assurance and control (QA/QC), along with the production of basic data summaries, can be automated immediately following data collection. Here, we implement and t…

Cited by 9 publications (4 citation statements)
References 52 publications

“…We must also keep in mind that the patterns that define the current state of pollination are dependent on many complex variables that overviews such as ours are unable to measure, such as the size of wild insect populations near crop fields and the decisions growers make about honey bee stocking levels, and thus must be monitored closely as future conditions change. By creating a pipeline which automatically updates the main results as more data are added to the database under version control (Kim et al., 2022), we give the first step to obtain robust iterative conclusions on the long term.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Within the MESI initiative (Figure 6), we follow the FAIR (Findable, Accessible, Interoperable, Reusable) and TRUST (Transparency, Responsibility, User focus, Sustainability, Technology) principles for data stewardship and repositories (Kim et al, 2022; Lin et al, 2020; Wilkinson et al, 2016). We host the MESI database on GitHub (http://github.com/MESI-organization/mesi-db), from where versions are managed, tagged, and released to Zenodo (Van Sundert et al, 2022—https://doi.org/10.5281/zenodo.7153253), under open access license CC‐BY‐4, meaning that the database can be freely used and edited, provided that the present study and the database at Zenodo are properly cited.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…https://github.com/azavea/raster-vision; Yuan et al, 2020). Similarly, routine data transformations and quality checks can be automated using continuous integration/continuous deployment (CI/CD) tools like Github actions (Kim et al, 2022). While some of these data products may currently not be good enough to drive forecasts on their own, advances in data integration allow them to be used to constrain models in time-steps between more comprehensive data inputs (e.g.…”
Section: Creating and Feeding The Data Pipeline
Citation type: mentioning
confidence: 99%