2014
DOI: 10.3233/isu-140746

Metadata for Big Data: A preliminary investigation of metadata quality issues in research data repositories

Abstract: Data-driven approaches to scientific research have generated new types of repositories that give scientists the means to store, share and re-use big data-sets generated at various stages of the research process. As the number and heterogeneity of research data repositories increase, it becomes critical for scientists to solve data quality problems associated with the data-sets stored in these repositories. To date, several authors have focused on the data quality issues associated with …

Cited by 23 publications (18 citation statements)
References 14 publications (12 reference statements)
“…There has been considerable discussion about including enriched metadata to make data more discoverable in the context of a data publication to provide detailed metadata and description of individual datasets [35, 36]. Some research has begun to examine the quality of metadata used in scientific data repositories [37], but more research is needed to determine what metadata would enable efficient discovery of various types of data. Analysis of current use patterns of existing repositories that accommodate disparate datasets may shed light on what types of data and descriptive metadata are most useful.…”
Section: Discussion (mentioning)
confidence: 99%
“…To address these problems, the authors recommend unique creator IDs, standardized formatting, and predetermined lists during metadata entry. 12 Another study evaluates the quality of metadata in HealthData.gov, an open government data repository in the United States. HealthData.gov's metadata requirements include the unique ID, title, description, and URL of each dataset, as well as the author-namely, the federal agency that submitted the data.…”
Section: Metadata Quality in Data Repositories (mentioning)
confidence: 99%
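The required-field list quoted above (unique ID, title, description, URL, author) lends itself to a simple completeness check. The sketch below is illustrative only: the field names, the record layout, and both helper functions are hypothetical, and it is not the method used in the cited study.

```python
# Hypothetical field names standing in for the five required elements
# named in the citation statement above; real repositories may differ.
REQUIRED_FIELDS = ("identifier", "title", "description", "url", "author")


def missing_fields(record):
    """Return the required fields that are absent or blank in one record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]


def completeness(records):
    """Return the fraction of records in which every required field is filled."""
    if not records:
        return 0.0
    complete = sum(1 for record in records if not missing_fields(record))
    return complete / len(records)
```

Calling `missing_fields` on a single record reports which elements to fix, while `completeness` gives one summary figure for a whole set of records.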
“…Many researches explore and analyze the quality of metadata stored in repositories under different perspectives (Bui and Park, 2006; Díaz et al, 2012; Rousidis et al, 2014; Balatsoukas et al, 2018). In some of them (Bui and Park, 2006; Rousidis et al, 2014; Balatsoukas et al, 2018), the harvesting of the repository metadata is done using the OAI-PMH protocol (Open Archives Initiative Protocol for Metadata Harvesting) for the following analysis. The OAI-PMH Validator & Data extractor is able to download all records from digital libraries, parallelly or individually, and analyze compliance of metadata with OAI-PMH, Dublin Core (DC) and other standards (Banos, 2011).…”
Section: Related Work on Metadata Quality and Metadata Standards, 2.1 Metadata Quality (mentioning)
confidence: 99%
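As a rough illustration of the OAI-PMH harvesting mentioned in the statement above, the following sketch builds a `ListRecords` request URL and extracts Dublin Core titles and creators from a response. It uses only the Python standard library; the function names and the base URL in the usage note are assumptions, and this is not the tool or workflow used in the cited studies.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    A resumption token is an exclusive argument in OAI-PMH, so when one is
    given it replaces metadataPrefix instead of being added to it.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, next_token) parsed from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            ),
            "title": [e.text or "" for e in rec.iterfind(".//dc:title", NS)],
            "creator": [e.text or "" for e in rec.iterfind(".//dc:creator", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)
```

A harvester would fetch `list_records_url("https://example.org/oai")` (a placeholder endpoint), pass the response body to `parse_records`, and repeat with the returned token until it is `None`.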