2016
DOI: 10.3233/ida-160814

Entity resolution in disjoint graphs: An application on genealogical data

Abstract: Entity Resolution (ER) is the process of identifying references referring to the same entity from one or more data sources. In the ER process, most existing approaches exploit the content information of references, categorized as content-based ER, or additionally consider linkage information among references, categorized as context-based ER. However, in new applications of ER, such as in the genealogical domain, the very limited linkage information among references results in a disjoint graph in which the exist…

citations
Cited by 9 publications
(6 citation statements)
references
References 45 publications
(49 reference statements)
“…The use of contextual information to address the challenge of assessing the quality of links discovered has also been investigated by [10]. It uses a hybrid similarity measure that combines content-based and context-based similarities, the latter computed using steady state probability of a random walk with restart.…”
Section: Related Work (mentioning)
confidence: 99%
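
The hybrid measure described in this statement can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the cited paper: the restart probability of 0.15, the equal weighting `alpha=0.5`, the linear combination, the use of `difflib.SequenceMatcher` as the content similarity, and the toy graph. The function names `rwr_scores` and `hybrid_similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def rwr_scores(adj, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Approximate steady-state probabilities of a random walk with restart.

    adj maps each node to a list of its neighbours; the walk restarts at
    `seed` with probability `restart` at every step (assumed value).
    """
    nodes = list(adj)
    p = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        # Restart mass always returns to the seed.
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for n in nodes:
            nbrs = adj[n]
            if not nbrs:
                # A node without neighbours sends its mass back to the seed.
                nxt[seed] += (1 - restart) * p[n]
                continue
            share = (1 - restart) * p[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def hybrid_similarity(name_a, name_b, adj, node_a, node_b, alpha=0.5):
    """Weighted sum of a string similarity and an RWR score (assumed form)."""
    content = SequenceMatcher(None, name_a, name_b).ratio()
    context = rwr_scores(adj, node_a)[node_b]
    return alpha * content + (1 - alpha) * context


# Toy graph: a-b-c form one component, d is isolated.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
scores = rwr_scores(graph, "a")
```

In this toy graph the isolated node `d` receives a context score of zero from seed `a`, so only the content term could contribute to its hybrid similarity.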
“…Clearly, the inclusion of just one false positive node in a perfect iln certainly results in a wrong iln. Approaches such as [2,10] compensate for the lack of sufficient identity criteria in the data at hand by combining a number of potentially weak atomic attribute values in the quest of matching with the ultimate goal to disambiguate matched entities. However, even by doing so, many ilns are still corrupted with false positives.…”
Section: Introduction (mentioning)
confidence: 99%
“…Buccafurry et al [13] also used the common neighbors of two references to predict location of the me links that connect two references to the same entity. Rahmani et al [43] used random-walk agents to count the number of paths in the network between two similar references which pass similar neighbor nodes. Existence of such paths corresponds to a high probability that two references refer to the same entity.…”
Section: Identity Resolution (mentioning)
confidence: 99%
“…(a) HiDER allows for identity resolution across different data sources; (b) the changes in input data and identity resolution algorithm can be incorporated in real time; (c) by using inverted indexing, both structured and unstructured data are handled, and fuzzy search allows for compensating missing data and spelling variations and (d) it visualizes complex family networks in an interpretable way which can not be visualized with traditional methods. According to evaluations by experts of the BHIC center, using HiDER for conducting identity resolution over MiSS data generates precise results (e.g., precision above 90% for identity resolution in civil registers as reported by Rahmani et al (2016)). Also the extraction of named entities and relation extraction from unstructured data has a high precision above 70% as discussed in the previous chapter.…”
Section: Discussion (mentioning)
confidence: 99%
“…For instance, two references are considered to refer to the same entity if their names have edit distance less than or equal to 2 and they have at least one similar household member. Also, the year of document issue and other details are used to avoid any mismatch (for more details please see the work of Rahmani et al (2016)).…”
Section: Real-time Processing (mentioning)
confidence: 99%
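
The matching rule quoted in this statement (name edit distance of at most 2, plus at least one similar household member) can be sketched as follows. This is an illustrative reading, not the cited system's implementation: the record layout with `name` and `household` keys, the function names, and the choice to treat household members as "similar" when their names are also within the same edit distance are all assumptions. The document-year check mentioned in the statement is left out because its details are not given.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def same_entity(ref_a, ref_b, max_dist=2):
    """Apply the quoted rule to two hypothetical reference records."""
    if edit_distance(ref_a["name"], ref_b["name"]) > max_dist:
        return False
    # Assumption: a household member is "similar" if the names are close.
    return any(
        edit_distance(m, n) <= max_dist
        for m in ref_a["household"]
        for n in ref_b["household"]
    )


# Hypothetical records for illustration only.
r1 = {"name": "johannes jansen", "household": ["maria de vries"]}
r2 = {"name": "johanes janssen", "household": ["maria de vries", "piet jansen"]}
r3 = {"name": "johannes jansen", "household": ["cornelis bakker"]}

match_12 = same_entity(r1, r2)
match_13 = same_entity(r1, r3)
```

Here `r1` and `r2` differ only by spelling variations and share a household member, while `r1` and `r3` carry the same name but have no similar household member, so the household condition is what separates them.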