2014
DOI: 10.1016/j.artmed.2014.03.006

De-identification of health records using Anonym: Effectiveness and robustness across datasets

Abstract: Results: Anonym identifies and removes up to 96.6% of personal health identifiers (recall) with a precision of up to 98.2% on the i2b2 dataset, outperforming the best system proposed in the i2b2 challenge. The effectiveness of Anonym across datasets is found to depend on the amount of information available for training. Conclusion: Findings show that Anonym compares to the best approach from the 2006 i2b2 shared task. It is easy to retrain Anonym with new datasets; if retrained, the system is robust to variatio…

Cited by 22 publications
(10 citation statements)
References 14 publications
“…Some are more generalizable than others, and certain methods perform better with some types of PHI than others [71,72]. Recent examples such as MIST [73], BoB [74], Anonym [75], and several systems developed for the i2b2 NLP challenges [76,77], allow for good accuracy and very limited impact on clinical information. [78] Replacing PHI with realistic surrogates [79] and adding biomedical scientific literature text [80] allowed for improved performance.…”
Section: Definitions (mentioning)
confidence: 99%
“…However, most existing studies on de-identification of clinical text were conducted in a single-institute setting, where the training data and test data were from the same institution. Up until now, there is limited study to explore automated de-identification of clinical notes under cross-institute settings [11–13].…”
Section: Introduction (mentioning)
confidence: 99%
“…Even more striking differences may be noticed with respect to definitions of de-identification provided by NIST [7] and Zuccona et al [36]. Similar issues exist for definitions of pseudonymisation in the GDPR [33] and by NIST [7].…”
Section: Terminology and Concepts (mentioning)
confidence: 97%