2010
DOI: 10.1016/j.ijmedinf.2010.09.007

The MITRE Identification Scrubber Toolkit: Design, training, and assessment

Cited by 110 publications (110 citation statements)
References 9 publications
“…PMS_DN uses a combination of two independent anonymizers—the MITRE MIST anonymizer (Aberdeen et al, 2010) and the Scrubber toolkit (McMurry et al, 2013) in the Apache cTAKES NLP engine—to remove PHI elements from the clinical notes. In a study (McMurry et al, 2013), the Scrubber toolkit in Apache cTAKES identified and removed approximately 98% of the PHI elements (Recall = 98%) from a test corpus of clinical notes selected from the i2b2 De‐Identification Challenge dataset (Uzuner, Luo, & Szolovits, 2007).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Then, the MITRE MIST tool (Aberdeen et al, 2010) and the Scrubber toolkit (McMurry, Fitch, Savova, Kohane, & Reis, 2013) in the Apache cTAKES NLP engine were used to erase Protected Health Information (PHI) elements from the text. Following de‐identification, the Apache cTAKES NLP engine (Savova et al, 2010) was deployed to extract knowledge by identifying occurrences of concepts defined in the Unified Medical Language System (UMLS) (Bodenreider, 2004) in the text.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Some are more generalizable than others, and certain methods perform better with some types of PHI than others [71,72]. Recent examples such as MIST [73], BoB [74], Anonym [75], and several systems developed for the i2b2 NLP challenges [76,77], allow for good accuracy and very limited impact on clinical information. [78] Replacing PHI with realistic surrogates [79] and adding biomedical scientific literature text [80] allowed for improved performance.…”
Section: Impact On Reuse
Citation type: mentioning
confidence: 99%
“…For example, Stanford NER (Finkel et al, 2005), ABNER (Settles, 2005), the MITRE Identification Scrubber Toolkit (MIST) (Aberdeen et al, 2010), (Boag et al, 2015), BANNER (Leaman et al, 2008) and NERsuite (Cho et al, 2010) rely on CRFs. GAPSCORE uses SVMs (Chang et al, 2004).…”
Section: Related Work
Citation type: mentioning
confidence: 99%