2010
DOI: 10.1016/j.jclinepi.2009.11.020
Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression

Abstract: Objective: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and …

Cited by 418 publications (309 citation statements)
References 36 publications
“…Thus, if we knew e(x), we would have access to a simple unbiased estimator for τ(x); this observation lies at the heart of methods based on propensity weighting [e.g., Hirano et al, 2003]. Many early applications of machine learning to causal inference effectively reduce to estimating e(x) using, e.g., boosting, a neural network, or even random forests, and then transforming this into an estimate for τ(x) using (3) [e.g., McCaffrey et al, 2004, Westreich et al, 2010]. In this paper, we take a more indirect approach: We show that, under regularity assumptions, causal forests can use the unconfoundedness assumption (2) to achieve consistency without needing to explicitly estimate the propensity e(x).…”
Section: Treatment Estimation With Unconfoundedness
confidence: 99%
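The pipeline this excerpt describes — estimate e(x) with a flexible learner, then plug it into an inverse-probability-weighted estimate of the treatment effect — can be sketched as follows. The simulated data, the choice of scikit-learn's GradientBoostingClassifier, and the out-of-fold fitting via cross_val_predict are all illustrative assumptions, not the cited papers' implementations.

```python
# Minimal sketch of propensity weighting with a machine-learned e(x).
# Hypothetical simulated data: true treatment effect is 2.0, and the
# covariate x0 confounds both treatment assignment and the outcome.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 3))
p_true = 1.0 / (1.0 + np.exp(-x[:, 0]))     # true propensity, logistic in x0
w = rng.binomial(1, p_true)                  # treatment indicator
y = 2.0 * w + x[:, 0] + rng.normal(size=n)   # outcome with confounding

# Estimate e(x) out-of-fold (to avoid overfit propensities), then form
# the IPW estimate: tau_hat = mean( w*y/e(x) - (1-w)*y/(1-e(x)) ).
e_hat = cross_val_predict(
    GradientBoostingClassifier(), x, w, cv=5, method="predict_proba"
)[:, 1]
e_hat = np.clip(e_hat, 0.01, 0.99)           # guard against extreme weights
tau_hat = np.mean(w * y / e_hat - (1 - w) * y / (1 - e_hat))
print(tau_hat)                               # close to the true effect of 2.0
```

A naive difference in means on these data would be biased upward by the confounding through x0; weighting by the estimated propensity removes most of that bias.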
“…31 Indeed, some researchers have been advocating for the use of data-adaptive methods, including the Super Learner, for effect estimation via singly-robust methods, depending on estimation of either the conditional mean outcome or the propensity score. [21][22][23][24][25][26][27] While flexible algorithms can reduce the risk of bias due to regression model misspecification, a serious concern is that the use of data-adaptive algorithms in this context can result in invalid statistical inference (i.e., misleading confidence intervals).…”
Section: Discussion
confidence: 99%
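The Super Learner mentioned in the excerpt combines a library of candidate learners through a cross-validated meta-learner. A rough stand-in can be sketched with scikit-learn's StackingClassifier; this is only an illustration of the ensembling idea on hypothetical data, not the cited Super Learner implementation, and (as the excerpt warns) using such fits for effect estimation does not by itself yield valid confidence intervals.

```python
# Hypothetical sketch: a stacked ensemble as a propensity score model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
x = rng.normal(size=(2000, 4))
# Treatment probability includes an interaction a plain logit would miss.
w = rng.binomial(1, 1.0 / (1.0 + np.exp(-x[:, 0] - 0.5 * x[:, 1] * x[:, 2])))

# Candidate learners are combined by a cross-validated meta-learner,
# which is the core idea behind Super Learner-style ensembling.
stack = StackingClassifier(
    estimators=[
        ("logit", LogisticRegression()),
        ("tree", DecisionTreeClassifier(max_depth=4)),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=1)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
    stack_method="predict_proba",
)
e_hat = stack.fit(x, w).predict_proba(x)[:, 1]  # estimated propensity scores
```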
“…Further, trees can handle a variety of input variable types as well as missing values. Finally, regression models require the assessment of the linearity assumption, which is typically overlooked (Westreich et al, 2010), whereas trees do not make this assumption.…”
Section: A Classification and Regression Tree Approach
confidence: 99%
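The point about the linearity assumption can be made concrete with a small sketch on hypothetical simulated data: when treatment probability is a step function of a covariate, a shallow classification tree recovers it without any linearity assumption, whereas a logit linear in x would be misspecified.

```python
# Hypothetical illustration: a tree-based propensity model captures a
# nonlinear (step) treatment-assignment rule with no linearity assumption.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, size=(3000, 1))
# Treatment probability is a step function of x: 0.8 above 0.5, else 0.2.
p = np.where(x[:, 0] > 0.5, 0.8, 0.2)
w = rng.binomial(1, p)

tree = DecisionTreeClassifier(max_depth=2).fit(x, w)
e_hat = tree.predict_proba(x)[:, 1]
# Average estimated propensity on each side of the step tracks 0.8 / 0.2.
print(e_hat[x[:, 0] > 0.5].mean(), e_hat[x[:, 0] <= 0.5].mean())
```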
“…Scholars have discussed the potential of this approach in epidemiological (Little and Rubin, 2000), sociological (Winship and Sobel, 2004), and econometric (Dehejia and Wahba, 2002; Heckman et al, 1997, 1998) literature, and this method has also found promising applications in management and information systems research to assess causal effects at the individual and firm levels (e.g., Rubin and Waterman, 2006; Mithas and Almirall, 2006; Mithas et al, 2005; Mithas and Lucas, 2010). In almost all these applications, researchers use a logistic or a probit model to compute propensity scores, with a small emerging body of research (Lee et al, 2010; Westreich et al, 2010) on the use of classification trees and their variants for computing propensity scores instead of logistic regression.…”
Section: Introduction
confidence: 99%