2020
DOI: 10.1055/s-0040-1716403

Designing an openEHR-Based Pipeline for Extracting and Standardizing Unstructured Clinical Data Using Natural Language Processing

Abstract: Background: Merging disparate and heterogeneous datasets from clinical routine in a standardized and semantically enriched format to enable a multiple use of data also means incorporating unstructured data such as medical free texts. Although the extraction of structured data from texts, known as natural language processing (NLP), has been researched at least for the English language extensively, it is not enough to get a structured output in any format. NLP techniques need to be used together with clinical inf…

Citations: cited by 15 publications (14 citation statements)
References: 41 publications
“…In the direct future work, we will focus on integrating these text blocks into standardized pattern so that they can be assessed properly. Here, natural language processing (NLP) algorithms can be integrated [40]. In fact, this will be crucial for the outbreak detection as additional relevant information for cluster detection is documented here.…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…Researchers have solved the task of extracting relevant text from medical text using a variety of different methods. Various ML and NLP techniques such as Naïve Bayes classifier, support vector machine, convolutional neural network (CNN) (Tran & Kavuluru, 2017), recurrent neural network (Liu, Tang, Wang, & Chen, 2017), attention model (Gao et al, 2018), topic modeling (Rumshisky et al, 2016), rule based model (Weissman et al, 2016; Wulff et al, 2020), hybrid model with a combination of rule-based and ML methods (Byrd, Steinhubl, Sun, Ebadollahi, & Stewart, 2014; Chen, Song, Shao, Li, & Ding, 2019) and transfer learning (Giorgi & Bader, 2018) were used heavily in previous approaches. Of all the ML techniques, DL models have provided better results.…”
Section: Related Work | Citation type: mentioning
confidence: 99%
“…Indeed, there is a significant number of existing corpora, datasets and resources available in English. Yet, we observe an increasing number of publications dedicated to other languages and a greater variety of languages: Arabic [20], Chinese [21–26], Croatian [27], Finnish [28, 29], French [30, 31], German [32–34], Hebrew [35], Italian [36–38], Japanese [39, 40], Korean [41, 42], Norwegian [43], Portuguese [44], Spanish [45–48], Swedish [49], and Turkish [28]. Overall, we believe that the trend observed in previous years is continuing.…”
Section: Current Trends in Biomedical NLP | Citation type: mentioning
confidence: 99%