The application of archival concepts to a data-intensive environment: working with scientists to understand data management and preservation needs

The collection, organization, and long-term preservation of resources are the raison d'être of archives and archivists. The archival community, however, has largely neglected science data, assuming they were outside the bounds of their professional concerns. Scientists, on the other hand, increasingly recognize that they lack the skills and expertise needed to meet the demands being placed on them with regard to data curation and are seeking the help of "data archivists" and "data curators." This represents a significant opportunity for archivists and archival scholars but one that can only be realized if they better understand the scientific context.

Towards Sustainable Curation and Preservation: The SEAD Project's Data Services Approach

Myers

Hedstrom

Akmon

et al. 2015

Only with your permission: how rights holders respond (or don’t respond) to requests to display archival materials online

Akmon

2010

Arch Sci

The study found that significant time is required to contact and negotiate with rights holders and that the biggest obstacle to getting permission is non-response. Of those requests that get a response, the vast majority are to grant permission. While few of the requests were met with denial, the data suggest that commercial copyright holders are much more likely to deny permission than other types of copyright holders. The data also show that adherence to the common policy of only displaying online those documents with explicit permission will likely result in substantially incomplete online collections.

Building Tools to Support Active Curation: Lessons Learned from SEAD

Akmon

Hedstrom

Myers

et al. 2018

IJDC

SEAD, a project funded by the US National Science Foundation’s DataNet program, has spent the last five years designing, building, and deploying an integrated set of services to better connect scientists’ research workflows to data publication and preservation activities. Throughout the project, SEAD has promoted the concept and practice of “active curation,” which consists of capturing data and metadata early and refining them throughout the data life cycle. In promoting active curation, our team saw an opportunity to develop tools that would help scientists better manage data for their own use, improve team coordination around data, implement practices that would serve the data better over time, and seamlessly connect with data repositories to ease the burden of sharing and publishing. SEAD has worked with 30 projects, dozens of researchers, and hundreds of thousands of files, providing us with ample opportunities to learn about data and metadata, integrating with researchers’ workflows, and building tools and services for data. In this paper, we discuss the lessons we have learned and suggest how these might guide future data infrastructure development efforts.

Leveraging Machine Learning to Detect Data Curation Activities

Lafia

Thomer

Bleckley

et al. 2021

A Data-Driven Approach to Appraisal and Selection at a Domain Data Repository

Pienta

Akmon

Noble

et al. 2018

IJDC

Social scientists are producing an ever-expanding volume of data, leading to questions about appraisal and selection of content given finite resources to process data for reuse. We analyze users’ search activity in an established social science data repository to better understand demand for data and more effectively guide collection development. By applying a data-driven approach, we aim to ensure curation resources are applied to make the most valuable data findable, understandable, accessible, and usable. We analyze data from a domain repository for the social sciences that includes over 500,000 annual searches in 2014 and 2015 to better understand trends in user search behavior. Using a newly created search-to-study ratio technique, we identified gaps in the domain data repository’s holdings and leveraged this analysis to inform our collection and curation practices and policies. The evaluative technique we propose in this paper will serve as a baseline for future studies looking at trends in user demand over time at the domain data repository being studied with broader implications for other data repositories.

How do properties of data, their curation, and their funding relate to reuse?

Hemphill

Pienta

Lafia

et al. 2022

J. Assoc. Inf. Sci. Technol.

Despite large public investments in facilitating the secondary use of data, there is little information about the specific factors that predict data's reuse. Using data download logs from the Inter-university Consortium for Political and Social Research (ICPSR), this study examines how data properties, curation decisions, and repository funding models relate to data reuse. We find that datasets deposited by institutions, subject to many curatorial tasks, and whose access and preservation are funded externally are used more often. Our findings confirm that investments in data collection, curation, and preservation are associated with more data reuse.

The Craft and Coordination of Data Curation: Complicating Workflow Views of Data Science

Thomer

Akmon

York

et al. 2022

Proc. ACM Hum.-Comput. Interact.

Data curation is the process of making a dataset fit-for-use and archivable. It is critical to data-intensive science because it makes complex data pipelines possible, studies reproducible, and data reusable. Yet the complexities of the hands-on, technical, and intellectual work of data curation are frequently overlooked or downplayed. Obscuring the work of data curation not only renders the labor and contributions of data curators invisible but also hides the impact that curators' work has on the later usability, reliability, and reproducibility of data. To better understand the work and impact of data curation, we conducted a close examination of data curation at a large social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). We asked: What does curatorial work entail at ICPSR, and what work is more or less visible to different stakeholders and in different contexts? And how is that curatorial work coordinated across the organization? We triangulated accounts of data curation from interviews and records of curation in Jira tickets to develop a rich and detailed account of curatorial work. While we identified numerous curatorial actions performed by ICPSR curators, we also found that curators rely on a number of craft practices to perform their jobs. The reality of their work practices defies the rote sequence of events implied by many life cycle or workflow models. Further, we show that craft practices are needed to enact data curation best practices and standards. The craft that goes into data curation is often invisible to end users, but it is well recognized by ICPSR curators and their supervisors. Explicitly acknowledging and supporting data curators as craftspeople is important in creating sustainable and successful curatorial infrastructures.

Dharma Akmon

The application of archival concepts to a data-intensive environment: working with scientists to understand data management and preservation needs
