2019
DOI: 10.1007/s12599-019-00608-0

Discovering Data Quality Problems

Abstract: Existing methodologies for identifying data quality problems are typically user-centric, where data quality requirements are first determined in a top-down manner following well-established design guidelines, organizational structures and data governance frameworks. In the current data landscape, however, users are often confronted with new, unexplored datasets that they may not have any ownership of, but that are perceived to have relevance and potential to create value for them. Such repurposed datasets can …

Cited by 48 publications (27 citation statements)
References: 36 publications
“…For this purpose, we apply the two-stage LANG approach to check for semantic and syntactic data constraints in Sect. 4 (Zhang et al 2019).…”
Section: Methodology and Study Design (mentioning)
confidence: 99%
“…It gives a complete summary that describes the length, data type, length, variance, uniqueness, null ratio, and domain range. It shows the full view of data quality related to all data source attributes [39], [40]. Data profiling is an important step in the building process of DW and data mart.…”
Section: Data Profiling (mentioning)
confidence: 99%
“…Therefore, we define the quality dimensions according to the quality problems and the use of annotation. Inspired by some existing work [14][15][16][17][18][19], the dimensions of completeness, accuracy, and consistency are selected as the core set of the data quality dimensions. By considering the theory of cognitive perception, we redefine some elements based on annotation characteristics.…”
Section: Annotation Quality Assessment Framework (mentioning)
confidence: 99%
“…Besides, data quality is of multidimensional characteristics. By reviewing the related literature [14][15][16][17][18][19], a core set of data quality dimensions is defined, including the completeness, accuracy, and consistency. Moreover, there are a fair number of researches about annotation quality.…”
Section: Introduction (mentioning)
confidence: 99%