2021 IEEE 17th International Conference on eScience (eScience) 2021
DOI: 10.1109/escience51609.2021.00025

Leveraging Machine Learning to Detect Data Curation Activities

Cited by 13 publications (10 citation statements)
References 26 publications
“…Definitions of data quality assurance and data curation also partially intersect; for example, data curation is also often tied to the idea of producing data that are fit for a specific purpose (CASRAI 2022a). Aspects of quality assurance are sometimes subsumed under data curation activities (Lafia et al 2021). However, conceptualising data quality assurance as simply an aspect of data stewardship or data curation makes it difficult to analyse and understand specific characteristics of data quality assurance.…”
Section: Data Quality Assurance
Citation type: mentioning
confidence: 99%
“…In 2018, ICPSR implemented standardized curation levels and terminology (ICPSR, 2020); we have harmonized curation level information from 2017 to the 2018 levels. We understand that higher levels of data curation at ICPSR are more extensive, demanding more effort and staff time spent on curation activities (Lafia et al, 2021). Level 1 studies receive ICPSR's base level of curation and can generally be disseminated more quickly, while Level 3 is ICPSR's most extensive level of curation.…”
Section: Data Overview
Citation type: mentioning
confidence: 99%
“…IS scholars and practitioners define data curation as "the active and on-going management of data through its life cycle of interest and usefulness to scholarship, science, and education" [Cragin et al 2010] (see also Lafia et al [2021]; Palmer et al [2013]; Yakel [2007]). Research in data curation has included substantial scholarship on data practices (the ways in which people work with, share, and reuse data; examples include Crooks and Currie [2021]; Stvilia et al [2013]; Thomer [2022]; Yan et al [2020]; Zimmerman [2008]) and on understanding the full "life cycle" of data use (e.g., Ball [2012]; ; Tenopir et al [2015]).…”
Section: Prior Work
Citation type: mentioning
confidence: 99%
“…We manually coded a randomly selected proportional sample of Jira ticket worklog entries stratified by curation level. These were coded in brat software [Stenetorp et al 2012] (Lafia et al [2021]). We trained a computational model with 0.75 accuracy to assign each worklog entry one of the eight categories of curatorial actions (summarized in Figure 3).…”
Section: Research Site
Citation type: mentioning
confidence: 99%