2017
DOI: 10.1016/j.jbi.2017.06.005

Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes

Abstract: De-identification of clinical narratives is one of the main obstacles to making healthcare free text available for research. In this paper we describe our experience in expanding and tailoring two existing tools as part of the 2016 CEGS N-GRID Shared Tasks Track 1, which evaluated de-identification methods on a set of psychiatric evaluation notes for up to 25 different types of Protected Health Information (PHI). The methods we used rely on machine learning on either a large or small feature space, with additi…

Cited by 9 publications (6 citation statements)
References 19 publications (18 reference statements)
“…While automated methods for stripping text of identifiers (of both patients and third parties) exist, they are not perfect, performing at 81%-99% sensitivity (recall) and 43%-99% precision, 23 24 and consequently many data custodians refuse to share text outside of the clinical environment. In contrast, the few UK research groups that are situated within healthcare trusts and can access medical text which remains within the clinical environment, have established good track records in terms of technology development, 25 protecting patient privacy 26 and generating clinical insights. [27][28][29]…”
Section: Why Is Medical Free Text Important For Research?
Citation type: mentioning
confidence: 99%
“…Psychiatric notes were used mainly in an NLP community challenge to extract protected health information and symptom severity [23,27,42,53,58,65,78,83,92]. These narratives are key enablers of mental health informatics as the fine-grained context of actionable information does not readily lend itself to predefined coding schemes.…”
Section: Types Of Narratives
Citation type: mentioning
confidence: 99%
“…Similarly, as a subtask of IE, NER can be used to support structuring text into predefined templates, whose slots need to be filled with named entities of relevant types. The majority of NER studies were related to NLP community challenges such as those described in studies by Uzuner et al [123], Suominen et al [126], and Stubbs et al [131] [20,49,67,96,104]; disorders [54,57,88,98,114]; and protected health information [27,58,65]. Unlike NER, the more complex task of IE found a wider variety of clinical applications, the most prominent of which include prognosis and care improvement.…”
Section: Clinical Applications
Citation type: mentioning
confidence: 99%
“…In the 2016 i2b2 shared task, ensemble with rule-based models became more popular. Lee et al [12], Dehghan et al [13], Bui et al [14], and Liu et al [15] all employed rule-based models as a component of their hybrid systems. However, despite the wide use of rules, all the works did not investigate the effect of rule-based models in hybrid architecture.…”
Section: Prior Work
Citation type: mentioning
confidence: 99%