2021
DOI: 10.1038/s41467-021-22328-4

Ontology-driven weak supervision for clinical entity classification in electronic health records

Abstract: In the electronic health record, using clinical notes to identify entities such as disorders and their temporality (e.g. the order of an event relative to a time index) can inform many important analyses. However, creating training data for clinical entity tasks is time consuming and sharing labeled data is challenging due to privacy concerns. The information needs of the COVID-19 pandemic highlight the need for agile methods of training machine learning models for clinical notes. We present Trove, a framework…

Cited by 59 publications (46 citation statements)
References 48 publications (29 reference statements)

“…Because correctly labeling every medical record is laborious, and takes in our experience on the order of minutes per case, it is hard to train rich models like deep learning approaches 35 with little tagged data. Weak supervision 36 offers an appealing approach, and our optimized rules (Fig 4) can serve as a set of labeling functions to introduce into such a model. The 4,259 patients MonoMiner flags, covering 560 different monogenic diseases and 534 different causal genes, are a treasure trove.…”
Section: Discussion (mentioning)
confidence: 99%
“…To search clinical notes content, the TEXT command allows users to specify a word or phrase to search for as well as modifiers (including the kind of note, whether the word/phrase is negated, and whether the word/phrase occurs in the context of family history). The set of possible words and phrases that can be searched depends on the text processing system used to generate the processed data that ACE ingests during ETL, such as Trove 35 (a system we developed for concept and relation extraction that is OMOP CDM compatible), cTAKES, 36 and MedLEE, 37 among many others. ACE’s feature-specific commands can be modified for different needs since they exist separately from the language algebra.…”
Section: Methods (mentioning)
confidence: 99%
“…Weakly supervised learning can build desired labels with only partial participation of domain experts, and may potentially preserve resource use. One example was provided by performing weakly supervised classification tasks using medical ontologies and expert-driven rules on patients visiting the emergency department with COVID-19 related symptoms [43]. When ontology-based weak supervision was coupled with pretrained language models, the engineering cost of creating classifiers was reduced more than for simple weakly supervised learning, showing an improved performance compared to a majority vote classifier.…”
Section: Future Tasks of AI in Critical Care (mentioning)
confidence: 99%