2016
DOI: 10.1093/jamia/ocw109

A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD)

Abstract: CARD detected 27 317 and 107 303 distinct abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache's clinical Text Analysis Knowledge Extraction System (cTAKE…

Citations: cited by 55 publications (47 citation statements)
References: 21 publications
“…al. [33]), many of which are made freely available and open source, have been intensively investigated in mining free-text medical records [10, 34, 35, 36]. To provide guidance in the efficient reuse of pre-trained NLP models, we have here proposed an approach that can automatically (i) identify easy cases in a new task for the reused model, on which it can achieve good performance with high confidence; (ii) classify the remainder of the cases so that the validation or retraining on them can be conducted much more efficiently, compared to adapting the model on all cases.…”
Section: Principal Results (mentioning)
confidence: 99%
“…Type of abbreviation; Search rule; Search scope
[14] Acronym; LCS of initial; Any subsequence in ten words before the acronym
[15] Acronym; LCS of initial, heuristics rules; Any subsequence in the sentence
[16] Acronym; SVM; Any sequences in the sentence exceed the number of characters in the acronym in length
[17] Acronym; LCS and the same initials; min(|A| + 5, |A| * 2)
[18] Acronym; CRF; Any subsequence in the sentence
[19] Acronym; LNCRF; All the subsequence in the document
[20] Abbreviation; Semantic distance of context; All the words in the text
… the performance of the normalizer is directly tied to the performance of a downstream dependency parser. In [28], Wu Y et al presented a framework called CARD (clinical abbreviation recognition and disambiguation) to handle the abbreviation in clinical data that leverages previously developed methods, including:…”
Section: References (mentioning)
confidence: 99%
“…For this experiment, we leveraged the ShARe (Shared Annotated Resources) corpus, a subset of de-identified discharge summary, electrocardiogram, echocardiogram, and radiology reports from about 30,000 ICU (Intensive Care Unit) patients provided by the MIMIC (Multiparameter Intelligent Monitoring in Intensive Care) [44]. In this experiment, a data set with 80 clinical texts and 3024 abbreviations is selected from the corpus and three recent works (CARD [28], UTHealthCCB [29], LIMSI [30]) are used to normalize the text. The same measurements as Section 5.3 are used to evaluate the result.…”
Section: Text Normalization Improvement (mentioning)
confidence: 99%
“…The biomedical abbreviations are extracted through the link-topic model algorithm (statistical model) [4]. In addition, Yonghui et al were organized open-source framework for clinical abbreviation recognition and disambiguation (including ML, clustering, and KE) [14]. De-identification is the process of removing private health information from medical discharge records.…”
Section: Applications Of Clinical NER (mentioning)
confidence: 99%