Proceedings of the 2016 International Conference on Management of Data 2016
DOI: 10.1145/2882903.2899389

Constance

Abstract: As the challenge of our time, Big Data still has many research hassles, especially the variety of data. The high diversity of data sources often results in information silos, a collection of non-integrated data management systems with heterogeneous schemas, query languages, and APIs. Data Lake systems have been proposed as a solution to this problem, by providing a schema-less repository for raw data with a common access interface. However, just dumping all data into a data lake without any metadata management…

Citations: cited by 155 publications (51 citation statements)
References: 7 publications
“…The framework is also extensible as new types of data sources can be easily integrated as we have shown in the evaluation. Metadata extraction is one of the core features of our data lake implementation [27]. Based on the extracted metadata and schema information, further methods for the semantic enrichment of the data lake are currently being implemented.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…Constance [84,85] is a Data Lake (DL) system with sophisticated metadata management over raw data extracted from heterogeneous data sources. Constance discovers, extracts, and summarizes the structural metadata from the data sources, and annotates data and metadata with semantic information to avoid ambiguities.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…We identify in the literature six main functionalities that should ideally be provided by the metadata system of a data lake. Semantic enrichment (SE), also called semantic annotation [11] or semantic profiling [2], consists in generating a description of the context of data, e.g., with tags, to make them more interpretable and understandable [27]. It is done using knowledge bases such as ontologies.…”
Section: Expected Features (citation type: mentioning)
confidence: 99%
“…In this context, the concept of data lake [6] appears as a solution to big data heterogeneity problems. A data lake provides integrated data storage without predefined schema [11]. In the absence of a data schema, an effective metadata system becomes essential to make data queryable and thus prevent the lake from turning into a data swamp, i.e., an inexploitable data lake [1,11,26].…”
Section: Introduction (citation type: mentioning)
confidence: 99%