2014
DOI: 10.1093/aje/kwu253

Improving Propensity Score Estimators' Robustness to Model Misspecification Using Super Learner

Abstract: The consistency of propensity score (PS) estimators relies on correct specification of the PS model. The PS is frequently estimated using main-effects logistic regression. However, the underlying model assumptions may not hold. Machine learning methods provide an alternative nonparametric approach to PS estimation. In this simulation study, we evaluated the benefit of using Super Learner (SL) for PS estimation. We created 1,000 simulated data sets (n = 500) under 4 different scenarios characterized by various …
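The abstract contrasts main-effects logistic regression with Super Learner, a cross-validated ensemble that weights a library of candidate algorithms by out-of-fold performance. As a rough analogue only (not the authors' implementation), a stacked ensemble in scikit-learn estimates propensity scores in a similar spirit; the simulated data, candidate library, and library choice below are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier

rng = np.random.default_rng(0)

# Simulated confounders and a treatment whose true model is nonlinear,
# so a main-effects logistic regression is misspecified (mirroring the
# kind of scenario the simulation study describes).
n = 500
X = rng.normal(size=(n, 4))
logit = 0.5 * X[:, 0] - 0.8 * X[:, 1] ** 2 + 0.6 * X[:, 0] * X[:, 2]
A = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Stacked ensemble over a small candidate library; the meta-learner is
# fit on cross-validated candidate predictions, loosely mirroring Super
# Learner's cross-validated weighting of candidate algorithms.
sl = StackingClassifier(
    estimators=[
        ("main_effects_logit", LogisticRegression(max_iter=1000)),
        ("gbm", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
sl.fit(X, A)
ps = sl.predict_proba(X)[:, 1]  # estimated propensity scores
```

The point of the ensemble is robustness: if the main-effects model is misspecified, the cross-validated meta-learner shifts weight toward the candidate that predicts treatment better.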

Cited by 128 publications (120 citation statements)
References 57 publications
“…For many of the available approaches, it may be possible to improve performance if more care is taken in how the methods are fine-tuned to select a solution. For example, use of the super learning method which simultaneously runs multiple machine learning methods (including GBM) to estimate propensity scores is currently fine-tuned to select as optimal the combination of machine learners that yields the best prediction (van der Laan 2014, Pirracchio, Petersen, and van der Laan 2015). It may be feasible to improve the already high performance of this method by fine-tuning it so it selects as optimal the combination of machine-learners that yields the best balance.…”
Section: Discussion
confidence: 99%
“…In turn, the popularity of propensity scores has given rise to great methodological interest on how best to estimate them. Methods considered have included parametric methods such as logistic regression with or without explicit controls for covariate balance, machine learning methods such as generalized boosted models (GBM), random forests (RF), Bayesian additive regression trees (BART), super learning, high-dimensional propensity score (hd-PS) methodology, and entropy balancing (van der Laan 2014, Breiman 2001, Hill 2011, Imai and Ratkovic 2014, Liaw and Wiener 2002, McCaffrey, Ridgeway, and Morral 2004b, Pirracchio, Petersen, and van der Laan 2015, Hainmueller 2012).…”
Section: Introduction
confidence: 99%
“…This was indeed the case in the present sample. Finally, PS model specification is of paramount importance both in terms of the variables included in the model [11] and the functional form of the relationship between treatment allocation and the explanatory variables [22]. After matching, the best PS model is then the one that offers the best balance across groups.…”
Section: Discussion
confidence: 99%
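The statement above judges a PS model by the covariate balance it produces. The standard diagnostic is the standardized mean difference (SMD) per covariate between treated and control groups; this numpy-only sketch is an illustrative version of that diagnostic (the function name and simulated data are assumptions, not code from the cited papers):

```python
import numpy as np

def std_mean_diff(X, A, w=None):
    """Weighted standardized mean difference per covariate between
    treated (A == 1) and control (A == 0) groups; values near 0
    indicate balance. Illustrative diagnostic only."""
    if w is None:
        w = np.ones(len(A))
    t, c = (A == 1), (A == 0)
    mt = np.average(X[t], weights=w[t], axis=0)
    mc = np.average(X[c], weights=w[c], axis=0)
    # Pool the unweighted group variances for the denominator, a
    # common convention for weighted SMDs.
    pooled_sd = np.sqrt((X[t].var(axis=0) + X[c].var(axis=0)) / 2)
    return (mt - mc) / pooled_sd

rng = np.random.default_rng(2)
n = 2000
X = rng.normal(size=(n, 3))
ps = 1 / (1 + np.exp(-X[:, 0]))   # treatment depends on the first covariate
A = rng.binomial(1, ps)
w = A / ps + (1 - A) / (1 - ps)   # IPW weights from the true scores

smd_raw = std_mean_diff(X, A)     # imbalanced on covariate 0
smd_wtd = std_mean_diff(X, A, w)  # weighting should shrink the imbalance
```

Comparing `smd_raw` and `smd_wtd` across candidate PS models is one way to operationalize "the model that offers the best balance."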
“…These methods eliminate reliance on a simple parametric logistic regression model and do not require the researcher to determine which pre-treatment covariates and their respective interactions should be included in the model. It has been shown that the resulting weights from these approaches yield more precise treatment effect estimates and lower mean squared error than traditional logistic regression methods (Harder et al, 2010; Lee et al, 2010; Pirracchio et al, 2015). …”
Section: Introduction
confidence: 99%
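The "resulting weights" in the statement above are inverse-probability-of-treatment weights built from the estimated propensity scores. A minimal sketch of the weighting step and the resulting ATE estimate, assuming a normalized (Hajek) estimator and truncated scores (the helper name and simulated data are illustrative, not from the cited papers):

```python
import numpy as np

def ipw_ate(A, Y, ps, eps=1e-6):
    """Inverse-probability-weighted ATE estimate from fitted propensity
    scores; `eps` truncation guards against weights exploding when the
    scores approach 0 or 1. Illustrative helper only."""
    ps = np.clip(ps, eps, 1 - eps)
    w1 = A / ps              # weights for the treated
    w0 = (1 - A) / (1 - ps)  # weights for the controls
    # Hajek (normalized) estimator: weighted outcome mean per arm.
    return np.sum(w1 * Y) / np.sum(w1) - np.sum(w0 * Y) / np.sum(w0)

# Worked check: with the true propensity scores and a constant
# treatment effect of 2, the estimate should land near 2.
rng = np.random.default_rng(1)
n = 5000
X = rng.normal(size=n)
ps_true = 1 / (1 + np.exp(-X))
A = rng.binomial(1, ps_true)
Y = X + 2 * A + rng.normal(scale=0.5, size=n)
ate_hat = ipw_ate(A, Y, ps_true)
```

The precision claims in the quoted statement concern exactly this pipeline: a better-estimated `ps` yields more stable weights and hence a lower-variance `ate_hat`.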