2021
DOI: 10.1093/jamiaopen/ooab028
Comparative effectiveness of medical concept embedding for feature engineering in phenotyping

Abstract: Objective: Feature engineering is a major bottleneck in phenotyping. Properly learned medical concept embeddings (MCEs) capture the semantics of medical concepts, thus are useful for retrieving relevant medical features in phenotyping tasks. We compared the effectiveness of MCEs learned from knowledge graphs and electronic healthcare records (EHR) data in retrieving relevant medical features for phenotyping tasks. Materials and Methods: …

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1

Citation Types

0
3
0

Year Published

2021
2021
2024
2024

Publication Types

Select...
6

Relationship

0
6

Authors

Journals

citations
Cited by 7 publications
(3 citation statements)
references
References 29 publications
0
3
0
Order By: Relevance
“…8 Previous work in this domain used supervised and unsupervised machine learning to derive phenotypes for several diseases, with different strengths and limitations (see literature review in Note S1). [9][10][11][12][13][14][15][16][17][18][19][20][21][22][23] Supervised models rely on classifiers based on manually labeled gold standards for each specific disease, which is time-consuming and not scalable. Unsupervised approaches discover phenotypes purely from the data, trying to aggregate medical concepts commonly appearing together in the patient records.…”
Section: Introduction (mentioning)
confidence: 99%
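To make the unsupervised route described in this statement concrete, here is a minimal sketch (not taken from any of the cited studies) of how medical concepts that commonly appear together in patient records can be aggregated into candidate phenotypes. The patient records, concept codes, and the choice of non-negative matrix factorization are illustrative assumptions, not details from the cited work.

```python
# Minimal sketch: unsupervised phenotype discovery by aggregating medical
# concepts that co-occur in patient records. All records and codes below
# are synthetic placeholders.
import numpy as np
from sklearn.decomposition import NMF

# Toy patient records: each patient is a set of medical concept codes.
records = [
    {"E11.9", "I10", "N18.3"},        # diabetes, hypertension, CKD
    {"E11.9", "I10", "E78.5"},        # diabetes, hypertension, hyperlipidemia
    {"J45.909", "J30.9"},             # asthma, allergic rhinitis
    {"J45.909", "J30.9", "L20.9"},    # asthma, rhinitis, atopic dermatitis
]

concepts = sorted({c for r in records for c in r})
idx = {c: i for i, c in enumerate(concepts)}

# Patient-by-concept binary matrix.
X = np.zeros((len(records), len(concepts)))
for p, r in enumerate(records):
    for c in r:
        X[p, idx[c]] = 1.0

# Factorize into a small number of "phenotype" components; concepts with
# large weights in the same component tend to co-occur across patients.
model = NMF(n_components=2, init="nndsvda", max_iter=500)
model.fit(X)
for k, comp in enumerate(model.components_):
    top = [concepts[i] for i in comp.argsort()[::-1][:3]]
    print(f"candidate phenotype {k}: {top}")
```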
“…Although symbolic representation enables quantitative reasoning based on statistical probability, its inclusion in machine learning models that execute numerical operations is challenging. KRL aims to convert objects of interest (entities and relations in KG) into a continuous low-dimensional vector space [75,134] to efficiently measure the semantic correlations between entities and relations and to significantly improve knowledge acquisition, fusion, and inference performance. The KRL model of TransE has demonstrated remarkable outcomes in KG reasoning research.…”
Section: Reasoning Based on KRL (mentioning)
confidence: 99%
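The TransE model mentioned above scores a knowledge-graph triple (head, relation, tail) by how closely head + relation approximates tail in the shared embedding space. The sketch below illustrates only that scoring idea with randomly initialized placeholder vectors; the entity names, dimensionality, and margin value are assumptions for illustration, not details from the cited work.

```python
# Minimal sketch of the TransE idea: entities and relations live in the same
# low-dimensional space, and a triple (h, r, t) is scored by how well
# h + r approximates t. Embeddings here are random placeholders, not trained.
import numpy as np

rng = np.random.default_rng(0)
dim = 16

entities = ["type_2_diabetes", "metformin", "hypertension"]
relations = ["treated_by"]

E = {e: rng.normal(size=dim) for e in entities}
R = {r: rng.normal(size=dim) for r in relations}

def transe_score(h: str, r: str, t: str) -> float:
    """Lower is better: L2 distance between (head + relation) and tail."""
    return float(np.linalg.norm(E[h] + R[r] - E[t]))

# During training, a margin ranking loss pushes scores of observed triples
# below scores of corrupted triples (e.g., the tail replaced at random).
pos = transe_score("type_2_diabetes", "treated_by", "metformin")
neg = transe_score("type_2_diabetes", "treated_by", "hypertension")
margin = 1.0
loss = max(0.0, margin + pos - neg)
print(f"positive={pos:.3f} negative={neg:.3f} loss={loss:.3f}")
```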
“…Some of the studies evaluated the effectiveness of word embedding models on multiple tasks. Lee et al [135] evaluated Node2Vec, singular value decomposition, LINE (Large-scale Information Network Embedding), Word2Vec, and global vectors for word representation (GloVe) in retrieving relevant medical features for phenotyping tasks. The authors demonstrated that GloVe, when trained on EHR data, outperforms other embedding methods.…”
Section: Language Modeling (mentioning)
confidence: 99%
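The retrieval task this statement refers to can be pictured as nearest-neighbour search in embedding space: given a query concept, rank other medical concepts by cosine similarity of their vectors. The sketch below uses a random placeholder embedding matrix and made-up concept names; in the evaluated setting the vectors would instead come from methods such as GloVe or Word2Vec trained on EHR data, or Node2Vec/LINE trained on a knowledge graph.

```python
# Minimal sketch: rank medical concepts by cosine similarity to a query concept.
# The embedding matrix is random here; real evaluations would load trained
# medical concept embeddings instead.
import numpy as np

rng = np.random.default_rng(42)
concepts = ["heart_failure", "furosemide", "bnp_elevated", "asthma", "albuterol"]
emb = {c: rng.normal(size=32) for c in concepts}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query: str, k: int = 3) -> list[tuple[str, float]]:
    """Return the k concepts most similar to the query concept."""
    scores = [(c, cosine(emb[query], emb[c])) for c in concepts if c != query]
    return sorted(scores, key=lambda x: x[1], reverse=True)[:k]

print(retrieve("heart_failure"))
```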