2022
DOI: 10.3233/shti220153

Protected Health Information Recognition of Unstructured Code-Mixed Electronic Health Records in Taiwan

Abstract: Electronic health records (EHRs) at medical institutions are valuable sources for research in both the clinical and biomedical domains. However, before such records can be used for research purposes, protected health information (PHI) mentioned in the unstructured text must be removed. In Taiwan's EHR systems, unstructured text is usually written in a mix of English and Chinese, which poses challenges for de-identification. This paper presents the first study, to the best of our k…

Cited by 5 publications (6 citation statements). References 0 publications.
“…Neural networks are advantageous because they can be initialized with PLMs acquired from extensive unlabeled data, resulting in faster optimization and superior performance. BERT (Bidirectional Encoder Representations from Transformers) pretrained on English corpora (EN-BERT) [28] is one such example of a monolingual transformer model pretrained on the BookCorpus [29] and English Wikipedia in a self-supervised fashion, which has achieved exceptional precision in various NLP tasks including the deidentification task [8,30].…”
Section: Deidentification Methods and Approaches for Tackling Code-Mi…
confidence: 99%
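The statement above describes fine-tuning a monolingual English BERT for deidentification, which is a token-classification (NER-style) task. A minimal sketch using the Hugging Face transformers API is shown below; the bert-base-uncased checkpoint, the PHI label set, and the example sentence are illustrative assumptions, not the cited authors' actual configuration.

```python
# Minimal sketch: NER-style PHI tagging with a pretrained English BERT.
# The label set and example text are hypothetical, not the authors' schema.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-NAME", "I-NAME", "B-DATE", "I-DATE"]  # hypothetical PHI tags

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels)
)

text = "Patient John Smith was admitted on 2021-03-05."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Before fine-tuning, these predictions are meaningless; in practice the
# classification head would be trained on annotated discharge summaries.
predictions = logits.argmax(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"].squeeze(0))
for token, pred in zip(tokens, predictions):
    print(f"{token}\t{labels[int(pred)]}")
```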
“…1. Unique code-mixed deidentification data set: We significantly extended our original corpus compiled in our previous work [8] by incorporating an additional 900 discharge summaries. Furthermore, we created a manipulated subset of the resynthesized data set available for research purposes.…”
Section: Goal of This Study
confidence: 99%
“…Supervised machine learning methods can learn complex structures from a training data set and apply that knowledge to predict outcomes for situations that have not been observed [14]. Artificial neural networks are a form of supervised machine learning [15]. A network simulates the structure and function of the nervous system, acquiring knowledge by detecting patterns and relationships in data and learning from experience [11].…”
Section: Introduction
confidence: 99%
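The statement above summarizes supervised learning with artificial neural networks: a model learns patterns from labeled training data and applies them to predict labels for unseen inputs. A minimal sketch with scikit-learn is given below; the synthetic data set and network size are illustrative assumptions.

```python
# Minimal sketch of supervised learning with a small neural network:
# learn patterns from labeled training data, then predict on unseen data.
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic labeled data standing in for a real training set.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A feed-forward ANN learns relationships between features and labels.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

# Apply the learned knowledge to predict outcomes for unobserved examples.
print("Held-out accuracy:", clf.score(X_test, y_test))
```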