2019
DOI: 10.1080/02664763.2019.1582614

Propensity score prediction for electronic healthcare databases using super learner and high-dimensional propensity score methods

Abstract: The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a "library" of candidate prediction models. The SL is not restricted to a single prediction model, but uses the strengths of a variety of learning algorithms to adapt to different databases. While the SL has been shown to perform well in a number of settings, it has not been thoroughly evaluated in lar…

Citations: cited by 43 publications (49 citation statements)
References: 33 publications
“…Super Learner has also been considered in the context of longitudinal data, where it has been found useful in the presence of model misspecification. Overall, Super Learner is now being widely adopted in the causal inference literature and in applications…”
Section: Introduction (mentioning)
confidence: 99%
“…[14] Overall, Super Learner is now being widely adopted in the causal inference literature and in applications. [15][16][17][18] In much of this literature, and certainly in most applications, adjustment via the propensity score is achieved in a singly robust fashion, that is, designed to give consistent inference on the causal estimand under an assumption of correct specification (at least within a class of parametric, flexible, or ensemble procedures). Doubly robust procedures that provide consistent estimation if either the propensity score or a proposed conditional outcome mean model (as would be utilized in standard regression) are correctly specified are also well established in the statistical literature, but such procedures are not as widely adopted in practice.…”
Section: Introduction (mentioning)
confidence: 99%
“…This is especially true when computationally intensive learners, such as bagged CART or boosted CART, are included in the candidate library. In general, the computation time for SL is at least twice the sum of all the candidate learners' computation time, considering fitting on the training sets, computing the corresponding weights from the validation sets, and fitting the entire data eventually. Similar to other MSCM simulation studies, we computed robust sandwich standard error in this paper.…”
Section: Discussion (mentioning)
confidence: 99%
“…Therefore, the weights of the super learner are calculated by minimizing the single-split cross-validated loss as suggested in [9]. Ju et al. [34] show the success of the single-split super learner on three large healthcare databases.…”
Section: Cross-validation (mentioning)
confidence: 99%
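
The weighting step mentioned in the statement above can be illustrated with a minimal sketch. It is not the procedure of the cited papers. It assumes that each candidate learner has already been fitted on a training split and has produced predicted propensity scores on one held-out validation split, that the loss is the Bernoulli log-loss, and that the weights are constrained to be non-negative and sum to one. The optimizer (exponentiated gradient descent), the function names, and the toy numbers are all hypothetical choices made for illustration.

```python
import math

EPS = 1e-12  # clipping bound to keep logarithms finite (assumption)


def log_loss(y, p):
    """Mean negative Bernoulli log-likelihood of predictions p for labels y."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, EPS), 1.0 - EPS)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total / len(y)


def combine(preds, weights):
    """Weighted combination; preds[k][i] is learner k's prediction for unit i."""
    n = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(n)]


def single_split_weights(val_preds, y_val, steps=2000, lr=0.5):
    """Minimize the validation log-loss over convex weights.

    Uses exponentiated gradient descent, which keeps every weight
    positive and renormalizes so the weights sum to one.
    """
    k, n = len(val_preds), len(y_val)
    w = [1.0 / k] * k  # start from equal weights
    for _ in range(steps):
        p = [min(max(pi, EPS), 1.0 - EPS) for pi in combine(val_preds, w)]
        # Derivative of the log-loss with respect to each ensemble prediction.
        dloss = [-(yi / pi - (1 - yi) / (1.0 - pi)) for yi, pi in zip(y_val, p)]
        # Chain rule: gradient with respect to the weight of learner k.
        grad = [sum(d * pk_i for d, pk_i in zip(dloss, pk)) / n for pk in val_preds]
        unnorm = [wi * math.exp(-lr * gi) for wi, gi in zip(w, grad)]
        total = sum(unnorm)
        w = [u / total for u in unnorm]
    return w


# Toy validation split (made-up numbers): treatment labels and the
# predicted propensity scores of three hypothetical candidate learners.
y_val = [1, 0, 1, 1, 0, 0, 1, 0]
val_preds = [
    [0.9, 0.2, 0.8, 0.7, 0.3, 0.1, 0.6, 0.4],  # informative learner
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # constant learner
    [0.4, 0.6, 0.5, 0.3, 0.5, 0.7, 0.4, 0.6],  # poorly performing learner
]

weights = single_split_weights(val_preds, y_val)
print("weights:", [round(w, 4) for w in weights])
print("ensemble validation loss:", log_loss(y_val, combine(val_preds, weights)))
```

In this toy setup the weights are estimated from a single validation split, so each candidate is fitted only once before the weighting step, rather than once per fold as in V-fold cross-validation. A full super learner would then refit the candidates on all the data and apply the estimated weights to those fits.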