2017
DOI: 10.1186/s12874-017-0370-0

Estimating parameters for probabilistic linkage of privacy-preserved datasets

Abstract: Background: Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the pr…

Cited by 11 publications (18 citation statements)
References 21 publications

“…Numerous techniques exist for estimating m and u probabilities for a particular dataset and for estimating the designated threshold [28]. These include Jaro's method for estimating u-probabilities, the expectation-maximisation estimation algorithm [29] and the iterative refinement procedure first described by Newcombe [30].…”
Section: Probabilistic Linkage Methods (mentioning)
confidence: 99%
“…From the basic probabilistic model, it is possible to iterate through all possible combination of field state comparisons for a pair of records [28]. We will consider a simplified model, whereby a field comparison can either agree or disagree.…”
Section: Methods (mentioning)
confidence: 99%
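
The simplified agree/disagree model quoted above can be illustrated with a short Python sketch. It enumerates every agreement pattern for a pair of records and scores each one with the standard log-ratio weights built from m and u probabilities. The field names and the m and u values are hypothetical, chosen only for illustration; they are not taken from the paper or from any citing publication.

```python
import math
from itertools import product

# Hypothetical per-field probabilities (illustrative values, not from the paper):
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
M = {"surname": 0.95, "dob": 0.97, "postcode": 0.90}
U = {"surname": 0.01, "dob": 0.003, "postcode": 0.05}


def pair_weight(pattern):
    """Sum the log2 agreement or disagreement weight of each field."""
    total = 0.0
    for field, agrees in pattern.items():
        m, u = M[field], U[field]
        if agrees:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def all_patterns():
    """Yield every agree/disagree combination across the fields."""
    fields = list(M)
    for states in product([True, False], repeat=len(fields)):
        yield dict(zip(fields, states))


# Rank the 2**3 = 8 patterns from most to least match-like.
patterns = sorted(all_patterns(), key=pair_weight, reverse=True)
for p in patterns:
    print(f"{pair_weight(p):8.2f}  {p}")
```

With three binary fields there are eight patterns. A pair whose summed weight exceeds a chosen threshold would be treated as a match, and estimating m, u and that threshold is the problem the quoted statements describe.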
“…Further methods of interest use Bloom filter pairs. The method of Brown et al, exhibits good error tolerance [91], while that of Ranbaduge and Christen [92] includes the temporal information in records in its hashing process; this Australian contribution is all the more interesting in the light of extensive national data linking guidelines by the government [93].…”
Section: Privacy: Deidentification Distributed Computation Blockchain (mentioning)
confidence: 99%
“…With growing demand for linked data, it has been critical for record linkage centres to implement methods which protect privacy, yet maximise the benefits that can be derived from data assets. As a result, research around privacy-preserving record linkage (PPRL) methods has become a pressing area of inquiry, with much focus on the use of Bloom filters [1][2][3][4][5][6][7]. Much research has focussed on the security aspect of the Bloom filters, such as cryptographic analyses of encoding methods, modifications, and hashing variations [3,[7][8][9][10][11][12].…”
Section: Introduction (mentioning)
confidence: 99%
“…There is little mention in the literature of Bloom filters being used in the context of probabilistic record linkage where the field similarity score is converted into a partial agreement weight during the calculation of a pair-wise score [1,15,28]. Several issues remain unclear: What is the effect of approximate matching on the linkage quality using Bloom filters?…”
Section: Introduction (mentioning)
confidence: 99%
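
As a rough illustration of converting a Bloom filter similarity score into a partial agreement weight, the Python sketch below encodes two field values as bigram Bloom filters, compares them with the Dice coefficient, and scales the weight between the full disagreement and full agreement values. Several parts are assumptions made for the example: the filter size, the number of hash functions, the unkeyed SHA-256 hashing (real PPRL deployments typically use keyed hashes), the linear interpolation rule, and the m and u values. None of these is confirmed by the quoted statements.

```python
import hashlib
import math


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=256, num_hashes=4):
    """Return the set of bit positions set by hashing each bigram.

    Assumption: unkeyed SHA-256 with a seed prefix stands in for the
    keyed hash functions a real privacy-preserving encoding would use.
    """
    bits = set()
    for gram in bigrams(value):
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % size)
    return bits


def dice(a, b):
    """Dice coefficient of two bit-position sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def partial_weight(similarity, m, u):
    """Interpolate linearly between disagreement and agreement weights.

    Assumption: linear scaling is one simple choice among several.
    """
    agree = math.log2(m / u)
    disagree = math.log2((1 - m) / (1 - u))
    return disagree + similarity * (agree - disagree)


sim = dice(bloom_encode("smith"), bloom_encode("smyth"))
print(sim, partial_weight(sim, m=0.95, u=0.01))
```

A similarity of 1 reproduces the full agreement weight and a similarity of 0 the full disagreement weight, so near-matches such as "smith" and "smyth" receive an intermediate score rather than being counted as outright disagreement.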