2010
DOI: 10.1002/meet.14504701240

Definitions of dataset in the scientific and technical literature

Abstract: The integration of heterogeneous data in varying formats and from diverse communities requires an improved understanding of the concept of a dataset, and of key related concepts, such as format, encoding, and version. Ultimately, a normative formal framework of such concepts will be needed to support the effective curation, integration, and use of shared multi-disciplinary scientific data. To prepare for the development of this framework we reviewed the definitions of dataset found in technical documentation a…

Cited by 50 publications (40 citation statements)
References 6 publications (13 reference statements)

“…As Borgman (2012b) notes, however, the dimensions of data identity are unsettled; there is not yet a clearly agreed upon set of dimensions for data identity, nor for the levels needed to read, interpret, combine, compute upon and trust data. Similarly, Wynholds (2011) states that data are currently "unruly and poorly bounded objects" within scholarly works, and Renear, Sacchi, and Wickett (2010) assert that "Although definitions of data sets do appear to fit a common pattern, with recurring phrases and semantically similar terms, it is clear that there is no single well-defined concept of data set" (p. 3).…”
Section: Identity; type: mentioning
confidence: 99%
“…For instance, a dataset may have set semantics (e.g. "Set of RDF triples") [20], i.e. following the mathematical definition of a set, or it may have collection semantics, which means that deletion and addition of data does not have any effect on the dataset's identity.…”
Section: Dataset (Terminology); type: mentioning
confidence: 99%
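
The distinction drawn in the statement above, between set semantics and collection semantics, can be illustrated with a minimal Python sketch. The class names `SetDataset` and `CollectionDataset`, the identifier `"ds-1"`, and the example triples are all hypothetical and chosen for illustration; none of them come from the cited works. Under set semantics, identity is fixed by membership, so adding a triple yields a different dataset. Under collection semantics, identity is assumed here to be carried by an external identifier, so membership can change without changing which dataset it is.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set, Tuple

# A triple is modeled as a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class SetDataset:
    """Set semantics: two datasets are the same iff they have the same members."""
    members: FrozenSet[Triple]

    def add(self, triple: Triple) -> "SetDataset":
        # Adding a member produces a new, different dataset.
        return SetDataset(self.members | {triple})


@dataclass(eq=False)
class CollectionDataset:
    """Collection semantics: identity follows the identifier, not the contents."""
    identifier: str  # hypothetical stand-in for a persistent identifier
    members: Set[Triple] = field(default_factory=set)

    def add(self, triple: Triple) -> None:
        # Membership changes in place; the dataset's identity is unaffected.
        self.members.add(triple)

    def __eq__(self, other: object) -> bool:
        return (
            isinstance(other, CollectionDataset)
            and self.identifier == other.identifier
        )

    def __hash__(self) -> int:
        return hash(self.identifier)


t1 = ("ex:a", "ex:p", "ex:b")
t2 = ("ex:b", "ex:p", "ex:c")

s_before = SetDataset(frozenset({t1}))
s_after = s_before.add(t2)
print(s_before == s_after)  # False: membership changed, so identity changed

c_before = CollectionDataset("ds-1", {t1})
c_after = CollectionDataset("ds-1", {t1})
c_after.add(t2)
print(c_before == c_after)  # True: same identifier, so same dataset
```

The sketch makes the practical consequence visible: a citation to a `SetDataset` pins down exact contents, while a citation to a `CollectionDataset` refers to something whose contents may differ from one moment to the next.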
“…This is particularly difficult and problematic for data because of the imprecise and ambiguous notion of what is a "data set". As [39] point out "the notion of 'data set' found in the literature cannot itself be provided with a precise formal definition". Consequently, the decision about an entity/identifier association is necessarily heuristic, user-driven (i.e., what do the users of the system conceptually consider to be a data set, motivated in part by that which they wish to cite), and application-specific rather than technical and algorithmic.…”
Section: Identifier Strategy; type: mentioning
confidence: 99%