2012
DOI: 10.1136/amiajnl-2011-000461

Missing values in deduplication of electronic patient data

Abstract: The results support the ad-hoc solution for missing values 'replace NA by the value of inequality'. This conclusion is based on a limited amount of data and on a specific deduplication method. Nevertheless, the authors are confident that their results should be confirmed by other empirical analyses and applications.

Cited by 22 publications (11 citation statements)
References 18 publications

“…In the literature, researchers often treat missing data as disagreements, i.e., $\gamma_k(i, j) = 0$ if $\delta_k(i, j) = 1$ (e.g., Goldstein and Harron 2015; Ong et al. 2014; Sariyar, Borg, and Pommerening 2012). This procedure is problematic because a true match can contain missing values.…”
Section: The Proposed Methodology (mentioning)
confidence: 99%
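
The convention quoted above, coding a field comparison as a disagreement whenever either value is missing, can be illustrated with a minimal sketch. The function name `compare_records`, the use of `None` for a missing value, and the exact-match comparison are assumptions made for illustration; they are not taken from the cited papers, which may use other comparison functions.

```python
from typing import List, Optional, Sequence


def compare_records(
    rec_a: Sequence[Optional[str]],
    rec_b: Sequence[Optional[str]],
) -> List[int]:
    """Return a binary agreement vector gamma for one record pair.

    gamma[k] is 1 if field k agrees and 0 if it disagrees. A comparison
    that involves a missing value (None) is coded as 0, i.e. missing is
    treated as a disagreement.
    """
    gamma = []
    for a, b in zip(rec_a, rec_b):
        missing = a is None or b is None  # plays the role of delta_k(i, j) = 1
        gamma.append(0 if missing else int(a == b))
    return gamma


# Hypothetical record pair: the third field is missing in the first record.
pair_gamma = compare_records(
    ["Anna", "Meyer", None, "1970-01-01"],
    ["Anna", "Meier", "Mainz", "1970-01-01"],
)
```

In this sketch the missing third field contributes a 0 in exactly the same way as the genuinely differing second field, which is why the quoted statement calls the procedure problematic for true matches that contain missing values.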
“…For example, although Goldstein and Harron (2015) suggest the possibility of treating a comparison that involves a missing value as a separate agreement value, Sariyar, Borg, and Pommerening (2012) find that this approach does not outperform the standard method of treating missing values as disagreements.…”
mentioning
confidence: 99%
“…Past work has primarily focused on identifying relevant features in existing data to infer missing data using classification tree models (Prather et al. [1997], Sariyar et al. [2011]). This representation of a hierarchical structure is achieved by inducing a classification tree on labeled training data, i.e., typically a manually selected subset of the existing data.…”
Section: Data Cleansing (mentioning)
confidence: 99%
“…There are two alternative approaches to deal with missing values: Impute and Ignore. Imputation treatments [2] [3] fill in attributes in the instance vector using statistical techniques and the complete vector is fed to the predictor. Ignore treatments (also called reduced model or ensemble classifier treatments) [4] [5] overlook the missing attributes, produce a vector based on the available attributes, and feed that vector to a predictor trained on those particular attributes.…”
Section: Introduction (mentioning)
confidence: 99%
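
The contrast between the Impute and Ignore treatments described in the statement above can be sketched as follows. This is only an illustration under stated assumptions: mean imputation stands in for the unspecified "statistical techniques", and the helper names `impute_with_means` and `reduce_to_observed` are hypothetical rather than drawn from the cited work.

```python
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def impute_with_means(
    x: Sequence[Optional[float]], column_means: Sequence[float]
) -> List[float]:
    """Impute treatment: fill each missing attribute with its column mean."""
    return [m if v is None else v for v, m in zip(x, column_means)]


def reduce_to_observed(
    x: Sequence[Optional[float]],
) -> Tuple[Tuple[int, ...], List[float]]:
    """Ignore treatment: keep only the observed attributes.

    The returned index tuple identifies which attributes are present, so a
    caller could select a predictor trained on exactly those attributes.
    """
    observed = [(k, v) for k, v in enumerate(x) if v is not None]
    indices = tuple(k for k, _ in observed)
    values = [v for _, v in observed]
    return indices, values


# Hypothetical complete training rows, used here only to derive column means.
training = [[1.0, 10.0, 100.0], [3.0, 30.0, 300.0]]
column_means = [mean(col) for col in zip(*training)]

instance = [5.0, None, 250.0]  # the second attribute is missing
imputed = impute_with_means(instance, column_means)
indices, reduced = reduce_to_observed(instance)
```

Here the imputed vector keeps its full length and goes to a single predictor, while the reduced vector is shorter and needs a predictor trained on the attribute subset given by `indices`.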