Data curation practices in institutional repositories: An exploratory study

Lee, Dong Joon; Stvilia, Besiki

doi:10.1002/meet.2014.14505101085

Cited by 6 publications

(4 citation statements)

References 12 publications

(11 reference statements)


“…The literature on research data management (RDM) is growing rapidly. Current studies focus on understanding the current situation, storing research data, the role of libraries and data warehouses in the process, opinions toward RDM, and so on (Faniel & Jacobsen, 2010; Tenopir et al., 2011; Corrall et al., 2013; Faniel et al., 2013; Calvert, 2015; Lee, 2015; Surkis and Read, 2015; Steiner, 2015; Cox et al., 2016; AL-Omar and Cox, 2016). That the full potential of this new era is being utilized is difficult to argue.…”

Section: Literature Review (mentioning)

confidence: 99%

“…Likewise, in the scientific arena, data have become so prominent that it has been given a new name in "The Fourth Paradigm: Data-Intensive Scientific Discovery" in which "all of the science literature is online, all of the science data is online, and they interoperate with each other" (Hey et al., 2009). In previous paradigms scientific activities were driven […] process, opinions toward RDM, and so on (Faniel and Jacobsen, 2010; Tenopir et al., 2011; Corrall et al., 2013; Faniel et al., 2013; Calvert, 2015; Lee, 2015; Surkis and Read, 2015; Steiner, 2015; Cox et al., 2016; AL-Omar and Cox, 2016).…”

Section: Introduction (mentioning)

confidence: 99%


Research data management in Turkey: perceptions and practices

Aydınoğlu

Doğan

2017

LHT


Introduction: With the penetration and suffusion of information and communication technology (ICT) in our lives, scientific research has evolved as well. As such, scientific research is more data intensive and derives information from massive volumes of digitized data. As of 2013, 2.5 quintillion bytes of data are being produced every day (https://www-01.ibm.com/software/data/bigdata/what-is-big-data.html), 90% of which was produced in the last two years (SINTEF, 2013). It is reasonable to assume that the amount of data being produced will continue to increase. For instance, Internet users numbered 2.8 billion in 2013, whereas today, they number more than 3.5 billion (http://www.internetlivestats.com/internet-users/). The use of social media has increased the amount of data being produced. The total amount of data in the world is expected to be 4.1 zettabytes in 2016 and is estimated to be 40 zettabytes in 2020. Therefore, data management has become an important issue. Likewise, in the scientific arena, data has become so prominent that it has been given a new name in "The Fourth Paradigm: Data-Intensive Scientific Discovery" in which "all of the science literature is online, all of the science data is online, and they interoperate with each other" (Hey et al., 2009). In previous paradigms scientific activities were driven by experimentation, theory, and computation (Hey et al., 2009). The traditional hypothesis-based scientific approach has been gradually replaced by the analyses of electronic databases that can hold large amounts of information. As papers, lab books, tapes, and photographic films have moved to digital archives, cloud storage, and data warehouses, science has gone beyond the boundaries of hypotheses. Analyses are built on the collections themselves, seeking patterns, anomalies, and diversities about which questions will be posed later.
Hence, the term "data-intensive science" has emerged, and this practice derives information from the datasets collected by various computerized modeling and simulation systems, imaging devices, sensors and sensor networks, and other data gathering and storage techniques (Hey et al., 2009; Knyazkov et al., 2012). The vision is to have "all of the science literature online, all of the science data online, and interoperate with each other" (Hey et al., 2009). These mega-scale databases consist of data captured by various novel scientific tools, sometimes on a real-time basis. With this continuous flow of electronic information, the need to collect, store, curate, integrate, and analyze data in a way that could help inter-institutional and interdisciplinary collaboration has gained importance for the advancement of science in the twenty-first century. According to Birnholtz and Bietz's study (2003, p. 339), data is evidence for the validation of scientific contribution, and it makes a social contribution to the establishment of practice. Therefore, understanding the importance of the data is vital to design, sustain and curate well-structured research data management syst...


“…Kim (2013) has studied multiple disciplines across the STEM (Science, Technology, Engineering, and Math) fields because, as he noted, without consideration of disciplinary factors, scientific data sharing behavior in general cannot be studied. Gaps in previous studies regarding data re-use include a focus on users' trust judgment (Yoon, 2015) and persistent identification (Lee, 2015); in these cases, interviews represent the main method used. With regard to data citation from the perspective of data re-use, there are relatively few studies because research has instead focused on data sharing (Helbig, Hausstein, & Toepfer, 2015), for example in the context of GIS data citation (LaBonte, 2015).…”

Section: Literature Review (mentioning)

confidence: 99%

An examination of research data sharing and re-use: implications for data citation practice

Park

Wolfram

2017

Scientometrics


This study examines characteristics of data sharing and data re-use in Genetics and Heredity, where data citation is most common. This study applies an exploratory method because data citation is a relatively new area. The Data Citation Index (DCI) on the Web of Science was selected because DCI provides a single access point to over 500 data repositories worldwide and to over two million data studies and datasets across multiple disciplines and monitors quality research data through a peer review process. We explore data citations for Genetics and Heredity as a case study, both formally, by examining citations recorded in the DCI, and informally, by sampling a selection of papers for implicit data citations within publications. Citer-based analysis is conducted in order to remedy self-citation in the data citation phenomena. We explore 148 sampled citing articles in order to identify factors that influence data sharing and data re-use, including references, main text, supplementary data/information, acknowledgments, funding information, author information, and web/author resources. This study is unique in that it relies on a citer-based analysis approach and analyzes peer-reviewed and published data, data repositories, and citing articles of highly productive authors where data sharing is most prevalent. This research is intended to provide a methodological and practical contribution to the study of data citation.


“…Analysis of the data was guided by Cultural-Historical Activity Theory (CHAT) (Engeström, 1987). CHAT is a body of theory that emerged from Soviet psychology in the 1920s and since the 1990s has been used in diverse fields such as human-computer interaction (Nardi, 1996), information systems (Ditsa and Davis, 2000), information seeking behaviour (Wilson, 2006), and digital curation (Lee, 2015). Based on the work of Vygotsky (1962, 1971, 1978), its focus is on understanding goal-oriented individual and collective social activities.…”

Section: Methods (mentioning)

confidence: 99%

Open government data (OGD): challenging the concept of a “Designated Community”

Moles

2020

RMJ


Purpose: This paper aims to explore the curation of government-produced datasets for release as open government data (OGD) from the perspective of the digital curation and preservation concept of a “Designated Community”. Specifically, it explores how digital curation functions when there is no clear Designated Community to which curation services can be targeted. Design/methodology/approach: The research was conducted through a case study of the City of Toronto’s efforts to revitalize their OGD program. Data was collected using three methods: semi-structured interviews, non-participative observation and document analysis. Findings: The curators of OGD responded to the absence of a Designated Community through two complementary methods. The first was to draw from the discourse that defines the OGD domain. The second was to take a participatory approach that incorporated members of the community surrounding OGD and various other stakeholders into the process of developing a plan for the revitalization of the program. Research limitations/implications: This study opens new directions for investigating the application of the Designated Community concept and its role in digital curation and preservation. Practical implications: The approach used by OGD curators in this case has the potential to be used in other curation situations where there is no clearly defined user group. Originality/value: The findings presented in this paper contribute empirical insights to on-going discussions on the concept of a Designated Community in digital curation and preservation.
