2018
DOI: 10.48550/arxiv.1809.11084
Preprint

Reuse and Adaptation for Entity Resolution through Transfer Learning

Abstract: Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide the state-of-the-art results. Considerable human effort goes into feature engineering and training data creation. In this paper, we investigate a new problem: Given a dataset DT for ER with limited or no training data, is it possible to train a good ML classifier on DT by reusing and adapting the training data of a dataset DS from the same or a related domain? Our major contributio…

Cited by 2 publications (5 citation statements)
References 25 publications (53 reference statements)

“…• TLER [34] is a non-deep transfer learning framework that defines a standard feature space and reuses the seen data to train models for the new domain.…”
Section: Methods (mentioning)
confidence: 99%
“…A popular approach is to adapt the pre-trained model for the new task through fine-tuning [20], or by adding new functions to specific tasks such as object detection [15]. In terms of EL, TLER [34] is a non-deep method that reuses and adopts seen data from the source domain to train models for the new domain. Auto-EM [40] proposes to pre-train models for both attribute-type (i.e., schema) and attribute value matching based on word-and character-level similarity.…”
Section: Related Work (mentioning)
confidence: 99%
“…Entity matching (EM) [16], which is to identify data instances that refer to the same real-world entity, is also related. Some EM works also employ a deep learning-based approach [24], [37], [42], [49], [57], [73], [82]. Mudgal et al. [57] evaluate and compare the performance of different deep learning models applied to EM with three types of data: structured data, textual data, and dirty data (with missing values, inconsistent attributes and/or misplaced values).…”
Section: Schema/entity Matching (mentioning)
confidence: 99%
“…In the past few years, deep learning (DL) has become the most popular direction in machine learning and artificial intelligence [46], [65], and has transformed a lot of research areas, such as image recognition, computer vision, speech recognition, natural language processing, etc. In recent years, DL has been applied to database systems and applications to facilitate parameter tuning [47], [71], [76], [81], indexing [21], [43], partitioning [34], [86], cardinality estimation and query optimization [39], [44], and entity matching [24], [37], [42], [57], [73], [82]. While predictions based on deep learning cannot guarantee correctness, in the Big Data era, errors in data integration are usually tolerable as long as most of the data is correct, which is another motivation of our work.…”
Section: Introduction (mentioning)
confidence: 99%