Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data 2020
DOI: 10.1145/3318464.3389708
|View full text |Cite
|
Sign up to set email alerts
|

Learning Over Dirty Data Without Cleaning

Abstract: Real-world datasets are dirty and contain many errors. Examples of these issues are violations of integrity constraints, duplicates, and inconsistencies in representing data values and entities. Learning over dirty databases may result in inaccurate models. Users have to spend a great deal of time and effort to repair data errors and create a clean database for learning. Moreover, as the information required to repair these errors is not often available, there may be numerous possible clean versions for a dirt… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Year Published

2020
2020
2024
2024

Publication Types

Select...
3
1
1

Relationship

1
4

Authors

Journals

citations
Cited by 6 publications
references
References 46 publications
(96 reference statements)
0
0
0
Order By: Relevance