Fifth IEEE International Conference on Data Mining (ICDM'05)
DOI: 10.1109/icdm.2005.7

A Heterogeneous Field Matching Method for Record Linkage

Cited by 33 publications (24 citation statements)
References 9 publications
“…The experimental results show that trainable similarity measures are capable of learning the specific notion of similarity that is appropriate for a specific domain. While this approach focuses on homogenous string-based transformation, [15] uses heterogeneous set of models to relate complex domain specific relationships between two values. One technical issue for these learning-based approach is the selection of training datasets, especially the negative instances.…”
Section: Related (mentioning)
confidence: 99%
“…Traditional record matching techniques are then used over the standardized records. Minton et al [13] propose a machine learning based approach to using transformations, where labeled examples are used to learn which applications of transformations are most useful for matching a pair of records. All of these papers assume that transformations are provided as an explicit input.…”
Section: Related Work (mentioning)
confidence: 99%
“…While previous work [1,7,13] describes how a set of transformations can be exploited for record matching, it does not address how to identify suitable transformations in a record matching setting. For a real-world record matching task, hundreds of string transformations could be relevant and it is a challenging task for a programmer to compile the set of relevant transformations.…”
Section: Introduction (mentioning)
confidence: 99%
“…The use of supervised (training-based) approaches or learners aims at automating the process of entity matching to reduce the required manual effort. Training-based approaches, e.g., Naïve Bayes [49], logistic regression [46], Support Vector Machine (SVM) [11,43,49] or decision trees [63,29,49,53,54,56] have so far been used for some subtasks, e.g., determining suitable parameterizations for matchers or adjusting combination functions parameters (weights for matchers, offsets). However, training-based approaches require suitable training data and providing such data typically involves manual effort.…”
Section: Combination Of Matchers (mentioning)
confidence: 99%