2018
DOI: 10.2200/s00878ed1v01y201810dtm052

Data Profiling

Cited by 19 publications (58 citation statements)
References 188 publications
“…Traditionally, data dependencies stem from data modeling and schema design, e.g., 3NF synthesis, or BCNF decomposition, but data profiling identifies data dependencies from the data themselves, independently of such processes. Because the discovery of data dependencies (of any type) is NP-complete [5] and sometimes even W[2]- to W[3]-complete [35], mining data dependencies is challenging. For this reason, we also provide pointers to the most recent automatic discovery and maintenance algorithms, which in practice are sufficiently fast to be useful in the context of query optimization on real-world datasets.…”
Section: Data Dependencies
Citation type: mentioning
confidence: 99%
“…For this reason, we also provide pointers to the most recent automatic discovery and maintenance algorithms, which in practice are sufficiently fast to be useful in the context of query optimization on real-world datasets. An introductory overview of data profiling techniques can be found in [5], while a comprehensive survey is given in [3].…”
Section: Data Dependencies
Citation type: mentioning
confidence: 99%
“…• Task (5) was not implemented at all, as data were stored locally in OS files. As a consequence, data were not shared and access to data was non-optimal.…”
Section: Motivation: Real Cases
Citation type: mentioning
confidence: 99%
“…These metadata are used to select a type of technology and not a specific DBMS. - Data characteristics |4|: describing the set of data characteristics that are required to automate the transformation from the logical to the physical model, obtained by traditional data profiling techniques [5]. - Rules for data cleaning and deduplication |5|: describing domain rules for data cleaning and removing duplicates.…”
Section: Challenge 2: Metadata Management
Citation type: mentioning
confidence: 99%