2014
DOI: 10.1016/j.jbi.2014.01.011

Text de-identification for privacy protection: A study of its impact on clinical text information content

Abstract: As more and more electronic clinical information is becoming easier to access for secondary uses such as clinical research, approaches that enable faster and more collaborative research while protecting patient privacy and confidentiality are becoming more important. Clinical text de-identification offers such advantages but is typically a tedious manual process. Automated Natural Language Processing (NLP) methods can alleviate this process, but their impact on subsequent uses of the automatically de-identifie…

Cited by 48 publications (30 citation statements)
References 13 publications

“…The majority of the ratings in our study had low impact score (1 1), indicating that PHI redaction had a minor impact on readability in general. This finding is supported by other studies that demonstrated text de-identification minimally reduce informativeness of clinical texts [28,29].…”
Section: Discussion | supporting
confidence: 84%
“…However, creating an annotated clinical corpus that can be used for research, whether inside or outside a medical center, raises issues of privacy. This is why automatic de-identification [3][4][5][6][7] remains a very active area of research. While de-identification aims at masking identifying information, while keeping a high sensitivity in its detection, the risk that it also removes information useful for subsequent research must be assessed [5].…”
Section: Foundational Methods In Clinical NLP | mentioning
confidence: 99%
“…Most of the works around text de-identification are based on pattern matching or machine learning, or even a combination of both. Whereas pattern matching does not account for the context of a word and is unaware of typographical errors, machine learning techniques require a large corpus of annotated text (17). Since our radiology reports were mostly free text with sensible data outside headers, we opted for annotating our own corpus and developing a Named Entity Recognition (NER) based de-identification method.…”
Section: R a F T | mentioning
confidence: 99%
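
The limitation of pattern matching described in the last statement can be illustrated with a minimal sketch. It is not the method of the cited paper or of any citing work: the patterns, the `redact` function, and the sample sentences are all hypothetical and chosen only to show that a rule blind to context and typographical errors can miss an identifier.

```python
import re

# Hypothetical patterns for illustration only; real de-identification
# systems rely on far larger rule sets, dictionaries, or trained models.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "NAME": re.compile(r"\bDr\. [A-Z][a-z]+\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed category label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# A well-formed sentence: all three identifiers are caught.
clean = redact("Seen by Dr. Smith on 03/14/2013, call 555-123-4567.")

# A typo ("Dr," instead of "Dr.") defeats the NAME rule, so the
# surname stays in the output even though the date is still redacted.
typo = redact("Seen by Dr, Smith on 03/14/2013.")

print(clean)
print(typo)
```

In this sketch the second sentence keeps the surname because the rule requires the exact string "Dr. ". That is the kind of miss the statement attributes to pattern matching, and one reason the citing authors give for annotating a corpus and training a NER-based method instead.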