Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073)
DOI: 10.1109/icde.2000.839429

An Extensible Framework for Data Cleaning

Cited by 47 publications (43 citation statements)
References 12 publications
“…Data cleaning and metadata enrichment should be taken before translating base schemas into mediated schema [4,5].…”
Section: Analyze Of Existing Integration Methods (mentioning)
confidence: 99%
“…However, until recently few cross-references could be found between the statistical and the computer science community. While statisticians and epidemiologists speak of record or data linkage [17], the computer science and database communities often refer to the same process as data or field matching, data scrubbing, data cleaning [18,35], data cleansing [28], preprocessing, duplicate detection [5], entity uncertainty or as the object identity problem. In commercial processing of customer databases or business mailing lists, data linkage is sometimes called merge/purge processing [23], data integration [11], list washing or ETL (extraction, transformation and loading).…”
Section: Data Linkage Techniques (mentioning)
confidence: 99%
“…Another possibility is to use an SQL like language [18] that allows approximate joins and cluster building of similar records, as well as decision functions that decide if two records represent the same entity. A generic knowledge-based framework based on rules and an expert system is presented in [26].…”
Section: Modern Approaches (mentioning)
confidence: 99%
“…[95,21,72,98,7,57,1,90,42,46,39,88]). Earlier solutions employ manually specified rules to match objects [46].…”
Section: Related Work (mentioning)
confidence: 99%