2019
DOI: 10.1186/s12911-019-0985-7
Representation learning for clinical time series prediction tasks in electronic health records

Abstract: Background: Electronic health records (EHRs) provide opportunities to improve patient care and facilitate clinical research. However, applications of EHRs face many challenges, such as temporality, high dimensionality, sparseness, noise, random error and systematic bias. In particular, temporal information is difficult for traditional machine learning methods to use effectively, even though the sequential information in EHRs is very useful. Method: In this paper, we propose a general-purpose patient repre…

Citations: Cited by 36 publications (30 citation statements)
References: 30 publications
“…The representation of how a patient's condition evolves over time using ML has been predominantly studied from the perspective of temporal disease trajectories through consecutive clinical encounters [19,33,34]. These representations are derived mainly from clinical diagnosis codes, which can be merged with other categorical information such as medications, demographics or non-numeric descriptions of laboratory test results [8,18]. In this work we focus on a narrower time-window, hospital admissions in a general inpatient population, and use laboratory blood tests and vital signs to form a numeric representation of patients' physiological status and illness acuity over time.…”
Section: Discussion (mentioning)
confidence: 99%
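The excerpt above describes deriving a numeric representation of physiological status from laboratory blood tests and vital signs over a hospital admission. As an illustration only (not the cited authors' code), the sketch below shows one common way to turn timestamped measurements into a regular per-admission time-series matrix; the column names, the 4-hour bin width, and the use of pandas are assumptions.

```python
# Minimal sketch: timestamped lab/vital-sign events for one admission
# -> regular numeric time-series matrix. Illustrative assumptions only.
import pandas as pd

def admission_to_matrix(events: pd.DataFrame, bin_hours: int = 4) -> pd.DataFrame:
    """events: columns ['charttime', 'variable', 'value'] for a single admission."""
    events = events.copy()
    events["charttime"] = pd.to_datetime(events["charttime"])
    # Pivot to one column per lab test / vital sign, one row per measurement time.
    wide = events.pivot_table(index="charttime", columns="variable",
                              values="value", aggfunc="mean")
    # Resample onto a fixed time grid and forward-fill, so every time step
    # carries a numeric snapshot of physiological status.
    return wide.resample(f"{bin_hours}h").mean().ffill()
```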
“…We considered as benchmarks five supervised methods using the labeled set alone: (i) LASSO-penalized logistic regression [16,17,34,37-39], (ii) random forest (RF) [40,41], (iii) linear discriminant analysis (LDA) [42], and (iv) an LSTM-gated recurrent neural network (RNN) [24,39,43,44] trained with raw feature counts C_{i,t}, as well as (v) LDA trained with patient-timepoint embeddings generated without weights, which we refer to as LDA_embed. In addition, we considered a semi-supervised benchmark: a hidden Markov model (HMM) [26-29,45,46] with a multivariate Gaussian emission trained with the weight-free embeddings.…”
Section: Methods (mentioning)
confidence: 99%
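As a rough, non-authoritative sketch of the benchmark setup described in this excerpt, the snippet below fits two of the supervised baselines on raw feature counts and a Gaussian-emission HMM on label-free embeddings. The data shapes, hyperparameters, and use of scikit-learn and hmmlearn are assumptions, not the cited study's implementation.

```python
# Illustrative benchmark sketch with synthetic data; not the cited study's code.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)
X_counts = rng.poisson(1.0, size=(200, 50))   # raw feature counts C_{i,t}, flattened (assumed)
y = rng.integers(0, 2, size=200)              # phenotype labels (assumed)
X_embed = rng.normal(size=(200, 16))          # weight-free patient-timepoint embeddings (assumed)

# (i) LASSO-penalized logistic regression and (ii) random forest on the labeled set.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X_counts, y)
rf = RandomForestClassifier(n_estimators=200).fit(X_counts, y)

# Semi-supervised benchmark: HMM with multivariate Gaussian emissions,
# fit on the embeddings without using any labels.
hmm = GaussianHMM(n_components=2, covariance_type="full").fit(X_embed)
states = hmm.predict(X_embed)
```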
“…Recurrent neural networks (RNNs), which are designed for sequence data and well-conditioned to high feature dimensions, have enjoyed particularly widespread application to prediction using EHR data [21-25]. However, these models require large numbers of training labels to achieve stable performance, which is not feasible for phenotypes necessitating manual labeling. Consequently, existing applications of RNNs to EHR-based prediction all use readily available outcome measures such as discharge billing codes, limiting application to phenotypes with reliable codified proxies.…”
Section: Introduction (mentioning)
confidence: 99%
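For readers unfamiliar with the RNN models referenced above, the following is a minimal, illustrative PyTorch sketch of an LSTM classifier over per-visit EHR feature vectors; the dimensions and architecture are assumptions and do not reproduce any cited paper's model.

```python
# Minimal sketch of an LSTM classifier over EHR time series; assumed architecture.
import torch
import torch.nn as nn

class EHRSequenceClassifier(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, n_classes: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time steps, features); the final hidden state summarizes
        # the patient sequence before classification.
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])

model = EHRSequenceClassifier(n_features=50)
logits = model(torch.randn(8, 20, 50))  # 8 patients, 20 time steps, 50 features
```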
“…The importance of diagnosis codes should be taken seriously. A recurrent neural network-based denoising autoencoder, proposed in [29], was employed to encode the in-hospital records of each patient into a low-dimensional dense vector. The patient representation they learned is used for the prediction of clinical events.…”
Section: Representation Learning in EHRs (mentioning)
confidence: 99%
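The excerpt above describes a recurrent denoising autoencoder that compresses a patient's in-hospital records into a low-dimensional dense vector. The sketch below illustrates that general idea under assumed architectural choices (GRU encoder/decoder, Gaussian input corruption); it is not the model from [29].

```python
# Hedged sketch of a recurrent denoising autoencoder; all details are assumptions.
import torch
import torch.nn as nn

class RNNDenoisingAutoencoder(nn.Module):
    def __init__(self, n_features: int, code_dim: int = 32):
        super().__init__()
        self.encoder = nn.GRU(n_features, code_dim, batch_first=True)
        self.decoder = nn.GRU(code_dim, n_features, batch_first=True)

    def forward(self, x: torch.Tensor):
        # Corrupt the input, encode to a low-dimensional dense patient vector,
        # then reconstruct the clean sequence from that vector.
        noisy = x + 0.1 * torch.randn_like(x)
        _, h = self.encoder(noisy)                       # h: (1, batch, code_dim)
        code = h[-1]                                     # dense patient representation
        repeated = code.unsqueeze(1).expand(-1, x.size(1), -1)
        recon, _ = self.decoder(repeated)
        return recon, code

model = RNNDenoisingAutoencoder(n_features=40)
recon, patient_vec = model(torch.randn(4, 30, 40))       # reconstruction loss taken vs. clean x
```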