2015
DOI: 10.1016/j.jbi.2015.10.001

Learning probabilistic phenotypes from heterogeneous EHR data

Abstract: We present the Unsupervised Phenome Model (UPhenome), a probabilistic graphical model for large-scale discovery of computational models of disease, or phenotypes. We tackle this challenge through the joint modeling of a large set of diseases and a large set of clinical observations. The observations are drawn directly from heterogeneous patient record data (notes, laboratory tests, medications, and diagnosis codes), and the diseases are modeled in an unsupervised fashion. We apply UPhenome to two qualitatively…

Cited by 125 publications (108 citation statements)
References 27 publications
“…Increasing availability of genetic data led to the growing interest in personalizing medical treatment using precision medicine approaches [8]. To a large degree, these advances have been influenced by the extensive body of work on clinical decision-making and the role of data in improving medical reasoning and judgment [9–12] coupled with novel informatics and data science methods for analyzing clinical data [13,14]. …”
Section: Background and Significance · mentioning
confidence: 99%
“…One of the points of the PopKLD algorithm to create a reasonable, interpretable, stable, automatable, ordinal summary of laboratory data that can be used in high-throughput situations where machine learning [17,16,15] is used to categorize humans. Based on the results from the clinical evaluation, we believe that PopKLD will be very useful in phenotyping studies.…”
Section: Discussion · mentioning
confidence: 99%
“…modeling phenotypes from anchor variables [44,45] and silver-standard training data [46]. We also note that from the perspective of learning shared representations of diseases (such as the abstraction feature representation in this study), contemporary phenotyping effort has led to a growing body of work that learns phenotypes from population-scale clinical data using the methodology of representation learning [47,48] including i) spectral learning such as non-negative tensor factorization [5], ii) probabilistic mixture models [8], and additionally, when temporal phenotypic patterns are considered, iii) unsupervised feature learning using autoencoders [32] and latent medical concepts [49], etc., and iv) deep learning [6]. …”
Section: Discussion · mentioning
confidence: 99%
“…In the general landscape of computational phenotyping research, many endeavors have been made to progressively replace the predominant use of rule-based phenotyping algorithms, comprising the majority in eMERGE network [4], by the predictive analytical approach based on machine learning and natural language processing (NLP) [58]. While this data-driven analytic methodology alleviates the tedious process of manually selecting features and their logical combinations that match phenotype definitions in an ad hoc fashion, predictive analytics is not without its own outstanding challenges.…”
Section: Introduction · mentioning
confidence: 99%