2011
DOI: 10.1136/amiajnl-2011-000302

A knowledge discovery and reuse pipeline for information extraction in clinical notes

Abstract: A complete pipeline system for constructing language processing models that can be used to process multiple practical detection tasks of language structures of clinical records is presented.
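The abstract describes the pipeline only at a high level. As a rough illustration of the kind of detection task such a pipeline targets, here is a minimal sketch of a clinical-text classifier; the snippets, labels, feature set, and scikit-learn model choice are assumptions for demonstration, not details taken from the paper.

```python
# Minimal sketch of a clinical-text detection step (illustrative only).
# The data and model choice are assumptions, not the paper's configuration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy training data: clinical note snippets labelled for one detection task
# (here, whether a finding is asserted as present or absent).
snippets = [
    "patient denies chest pain",
    "chest pain radiating to left arm",
    "no evidence of pneumonia on chest x-ray",
    "right lower lobe pneumonia confirmed",
]
labels = ["absent", "present", "absent", "present"]

model = Pipeline([
    ("features", TfidfVectorizer(ngram_range=(1, 2))),
    ("classifier", LogisticRegression(max_iter=1000)),
])
model.fit(snippets, labels)

print(model.predict(["denies any shortness of breath"]))  # expected: ['absent']
```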

Cited by 53 publications (31 citation statements); references 13 publications (9 reference statements). Citing publications span 2014 to 2024.
“…More recent projects and contests have consistently shown that ML is a promising methodology in the context of text mining of clinical narratives [7]. In the cancer domain, this is especially true both for the detection of reportable cases [8,9] and for the actual extraction of relevant information [10,11]. Those studies also seem to confirm that ML techniques are best deployed alongside rule-based methodologies.…”
Section: Introduction (mentioning)
confidence: 85%
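The point that ML works best alongside rule-based methods can be illustrated with a hybrid setup in which a hand-written rule contributes a feature to a statistical classifier. The negation rule, task framing, and data below are hypothetical, not from the cited studies.

```python
# Hedged sketch of a hybrid rule-based + ML classifier: a hand-written
# negation rule is exposed as one feature alongside bag-of-words features.
import re
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

NEGATION_CUES = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)

def rule_features(texts):
    """Return 1 if a negation cue fires, else 0 -- a rule as an ML feature."""
    return csr_matrix([[1.0 if NEGATION_CUES.search(t) else 0.0] for t in texts])

texts = [
    "patient denies fever",
    "fever and chills for two days",
    "no history of diabetes",
    "diabetes mellitus type 2",
]
labels = ["absent", "present", "absent", "present"]

vectorizer = CountVectorizer()
X = hstack([vectorizer.fit_transform(texts), rule_features(texts)])
clf = LogisticRegression(max_iter=1000).fit(X, labels)

test = ["negative for chest pain"]
X_test = hstack([vectorizer.transform(test), rule_features(test)])
print(clf.predict(X_test))  # expected: ['absent']
```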
“…One computational linguist was trained to annotate the corpus, and a senior computational linguist then reviewed each annotation as a validator for the development of the gold standards. Although intra-annotator agreement cannot be evaluated with this method, it is assumed that a high level of consistency has been attained, as the overall F-score on self-validation (a 100% train-and-test strategy: the computed model with the baseline feature set is applied to validate the training data until no further improvement in performance can be made [17]) is about 99.9%. This is probably because computational linguists can reliably achieve higher consistency than pathologists in annotating a large corpus of pathology reports, as indicated in Patrick et al.'s study [19].…”
Section: Methods (mentioning)
confidence: 99%
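The "100% train and test" self-validation described here (fit on the full annotated corpus, score on the same corpus, stop when the score no longer improves) can be sketched roughly as follows. The candidate feature configurations, data, and classifier are simplified assumptions, not the cited system.

```python
# Rough sketch of self-validation: train and evaluate on the same corpus,
# accepting changes to the feature set only while the F-score keeps improving.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

texts = ["margins are clear", "margins involved by tumour",
         "no lymph nodes identified", "three of five nodes positive"]
labels = [0, 1, 0, 1]

candidate_configs = [
    {"ngram_range": (1, 1)},                    # baseline feature set
    {"ngram_range": (1, 2)},                    # add bigrams
    {"ngram_range": (1, 2), "lowercase": False},
]

best_f1, best_config = -1.0, None
for config in candidate_configs:
    X = CountVectorizer(**config).fit_transform(texts)
    preds = LogisticRegression(max_iter=1000).fit(X, labels).predict(X)
    f1 = f1_score(labels, preds)
    if f1 > best_f1:
        best_f1, best_config = f1, config
    else:
        break  # stop once a change no longer improves the self-validation score

print(best_config, round(best_f1, 3))
```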
“…It can be seen that most of them are adapted from similar ideas in the two rule-based approaches. This is motivated by Patrick et al.'s work, which converted a baseline rule-based method into a statistical approach built on the same ideas for assertion classification and produced better performance [17].…”
Section: Machine-learning-based Approach (mentioning)
confidence: 98%
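One common way to convert a rule-based method into a statistical one, which may or may not match what Patrick et al. did, is to re-express each rule's output as a feature and let a classifier learn how much weight to give it. The rules, cue lists, and data below are illustrative assumptions.

```python
# Hedged sketch of turning rule outputs into features for a statistical
# assertion classifier: each rule becomes a boolean feature whose weight
# is learned from the annotated data.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def rule_feature_map(sentence: str) -> dict:
    s = sentence.lower()
    return {
        "has_negation_cue": any(c in s for c in ("no ", "denies", "without")),
        "has_uncertainty_cue": any(c in s for c in ("possible", "may be", "cannot rule out")),
        "has_history_cue": "history of" in s,
    }

sentences = [
    "denies shortness of breath",
    "possible early pneumonia",
    "acute appendicitis",
    "no acute distress",
]
assertions = ["absent", "possible", "present", "absent"]

vectorizer = DictVectorizer()
X = vectorizer.fit_transform([rule_feature_map(s) for s in sentences])
clf = LogisticRegression(max_iter=1000).fit(X, assertions)

# Likely predicted as 'possible' because only the uncertainty rule fires.
print(clf.predict(vectorizer.transform([rule_feature_map("may be early sepsis")])))
```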
“…Surprisingly, Markov blanket (MB) attribute selection was discovered only three times in the search for relevant publications, although it is a very useful hybrid approach for attribute selection and classification and is used in this dissertation to study attribute selection in connection with early DRG classification of inpatients. [6, 7, 14, 17, 28, 39, 48, 50, 58, 72, 90, 97, 100, 109, 124, 129, 133, 139, 150, 164, 178, 187, 189-192, 202, 208, 210, 211, 215, 217, 220, 230, 237, 240, 245]

Attribute selection techniques and the publications applying them, as tabulated in the citing work:
- Correlation-based: [5, 66, 71, 73, 88, 148]
- Information gain: [5, 9, 16, 66, 68, 91, 94, 99, 104, 128, 148]
- Markov blanket: [16, 88, 148]
- Other attribute selection or evaluation techniques: [5, 9, 11, 12, 15, 19, 25, 37, 38, 40, 41, 47, 49, 52, 56, 59-61, 66-68, 71, 73-75, 85, 88, 89, 94, 96, 98, 108, 110, 118, 122, 123, 128, 130, 132, 141-143, 146, 155-158, 168-170, 179, 180, 182, 183, 188, 209, 212, 221, 223, 225, 226, 232, 234-236, 241, 244, 246]
- Principal component: [11, 119-121, 143, 213, 219]
- Relief algorithms: [40, 41, 66, 148]
- Wrapper: [40, 41, 43, 66, 71, 88, 91, 126, 127, 148, 153, 159, …”
Section: Selection Criteria and Search for Relevant Literature (mentioning)
confidence: 99%
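Of the attribute-selection families listed in this citing survey, information gain is the simplest to demonstrate. The sketch below uses scikit-learn's mutual-information scorer as a close stand-in for information gain on synthetic data; it is an illustration of the technique family, not of any system from the cited works.

```python
# Hedged sketch of information-gain-style attribute selection, one of the
# families tabulated in the quoted survey.  mutual_info_classif serves as a
# stand-in for information gain; the data are synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# 200 samples, 10 attributes, only 3 of which are informative for the class.
X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

selector = SelectKBest(score_func=mutual_info_classif, k=3).fit(X, y)
print("selected attribute indices:", np.flatnonzero(selector.get_support()))
print("attribute scores:", np.round(selector.scores_, 3))
```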