2017
DOI: 10.1016/j.jbi.2016.11.001

Anonymizing datasets with demographics and diagnosis codes in the presence of utility constraints

Abstract: Publishing data about patients that contain both demographics and diagnosis codes is essential to perform large-scale, low-cost medical studies. However, preserving the privacy and utility of such data is challenging, because it requires: (i) guarding against identity disclosure (re-identification) attacks based on both demographics and diagnosis codes, (ii) ensuring that the anonymized data remain useful in intended analysis tasks, and (iii) minimizing the information loss, incurred by anonymization, to prese…

Cited by 29 publications (22 citation statements)
References 56 publications
“…A final limitation is the potential risk to patients from re-identifying a de-identified dataset by linking it to a dataset containing PHI. Often the data use agreements for research datasets forbid any attempts at doing record linkage for this reason, but the use of formal privacy models could help mitigate that risk 29,30 .…”
Section: Discussion (mentioning)
confidence: 99%
“…The authors in [5], [11] chose to avoid alignment by selecting trajectories with the highest similarity as representatives of clusters. Poulis et al [12] investigated applying restriction on the amount of generalization that can be applied by proposing a user-defined utility metric. Takahashi et al [13] proposed an approach termed as CMAO to anonymize the real-time publication of spatiotemporal trajectories.…”
Section: Generalization Technique (mentioning)
confidence: 99%
“…Many publicly available datasets have already aggregated individual addresses to a census tract. When hospitals collaborate with researchers they are required to aggregate data to a geographic level that protects patient confidentiality (Comer, Grannis, Dixon, Bodenhamer, & Wiehe, ; Poulis, Loukides, Skiadopoulos, & Gkoulalas‐Divanis, ). When primary geospatial data are collected, several steps can be taken to ensure confidentiality for participants.…”
Section: Unique Considerations For a Geospatial Approach (mentioning)
confidence: 99%