2012
DOI: 10.1016/j.inffus.2011.04.004

Quantifying the correctness, computational complexity, and security of privacy-preserving string comparators for record linkage

Abstract: Record linkage is the task of identifying records from disparate data sources that refer to the same entity. It is an integral component of data processing in distributed settings, where the integration of information from multiple sources can prevent duplication and enrich overall data quality, thus enabling more detailed and correct analysis. Privacy-preserving record linkage (PPRL) is a variant of the task in which data owners wish to perform linkage without revealing identifiers associated with the records…

Citations: cited by 51 publications (35 citation statements)
References: 37 publications
“…Records matched by one rule were removed from the pool of records to be matched with subsequent rules in both datasets. The Jaro-Winkler string comparator (JW) [30], which is particularly well-suited for personal names [31], returning values between 0 (complete disagreement) and 1 (exact agreement) as a measure of similarity between two strings [30,32], was used to accommodate typographical errors on surnames. We set a cut-off for designating pairs of surnames as matches to a JW score ≥ 0.85, which is higher than in previous studies [30,33].…”
Section: Record Linkage Procedures (mentioning)
confidence: 99%
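
The Jaro-Winkler comparison described in the statement above can be sketched as follows. This is a minimal illustration, not the citing study's implementation: the function names, the prefix scaling factor of 0.1, the four-character prefix limit, and the case folding are assumptions based on the commonly described form of the comparator. Only the 0 to 1 score range and the 0.85 surname cut-off come from the quoted text.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: 0.0 (no agreement) to 1.0 (identical strings)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4) -> float:
    """Jaro score boosted for a shared prefix (assumed scale 0.1, max 4 chars).

    Some implementations apply the boost only above a Jaro threshold;
    that refinement is omitted here.
    """
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1.0 - j)


def surnames_match(a: str, b: str, cutoff: float = 0.85) -> bool:
    """Hypothetical helper: apply the 0.85 cut-off quoted above."""
    return jaro_winkler(a.upper(), b.upper()) >= cutoff
```

For example, `surnames_match("Martha", "Marhta")` returns `True` under these assumptions, because a single transposition leaves the score well above 0.85, while two surnames with no characters in common score 0.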
“…Unfortunately, such secure multi-party computation based techniques incur substantial computational resources. According to a recent survey [11], secure edit distance computations [10] require over two years to compute similarity between two datasets of 1000 strings each, on a commodity server. It is apparent that we need efficient methods to perform similarity search over large amount of encrypted data.…”
Section: Introduction (mentioning)
confidence: 99%
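
To put the figure quoted above in perspective, the following back-of-the-envelope sketch converts "two years for two datasets of 1000 strings each" into a per-comparison cost. It assumes that every string in one dataset is compared with every string in the other (1000 × 1000 pairs) and that a year has 365 days. Neither assumption is stated in the quoted text, and since the statement says "over two years", the result is a lower bound.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year


def seconds_per_pair(total_years: float, n_left: int, n_right: int) -> float:
    """Average time per comparison, assuming all n_left * n_right pairs are compared."""
    pairs = n_left * n_right
    return total_years * SECONDS_PER_YEAR / pairs


# Two years spread over 1000 x 1000 pairwise comparisons.
per_pair = seconds_per_pair(2, 1000, 1000)
print(f"{per_pair:.3f} seconds per string pair")
```

Under these assumptions each secure comparison takes roughly a minute or more.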
“…Record linkage is highly used to identify data that being linked, so all datasets under consideration should ideally undergo a matching process prior to the record linkage [13]. Even though there are studies have been carried out in various issues such as software [14,15], data privacy [16,17] and security [13], we suggest it is important to develop a standard set of approach to show the relationship between data, attributes and organizational goals. Even though the term is defined as record linkage but we use the term data linkage as an effort to identify the relationship between data, attributes and organizational goals because we attempt to identify the linkage for each data and attributes that relates to the organizational goals.…”
Section: B Dependency Relationship Of Organizational Data To the Org (mentioning)
confidence: 99%