IN LINEAR REGRESSION FOR TREATMENT EFFECT ESTIMATION
This paper investigates the use of regularization priors in the context of treatment effect estimation using observational data where the number of control variables is large relative to the number of observations. First, the phenomenon of "regularization-induced confounding" is introduced, which refers to the tendency of regularization priors to adversely bias treatment effect estimates by over-shrinking control variable regression coefficients. Then, a simultaneous regression model is presented which permits regularization priors to be specified in a way that avoids this unintentional "re-confounding". The new model is illustrated on synthetic and empirical data.
1. Introduction. This paper considers the use of Bayesian regularized linear regression models for the purpose of estimating a treatment effect from observational data. Treatment effects (the amount some response variable would change if the value of the treatment variable were changed by a given amount) can only be properly estimated from observational data by taking into account all of the various explanatory factors that may otherwise account for the observed correlation between the treatment and response variables. In the case of a linear regression model (assuming it to be correct) this "adjustment for confounding" means that the model includes a sufficient set of control variables as regressors in addition to the treatment
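The "regularization-induced confounding" described in the abstract above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the paper's model: a ridge penalty on the control coefficients stands in for a regularization prior, and the data-generating values (`n`, `p`, `alpha`, `gamma`, `beta`) and the helper `fit_alpha` are hypothetical choices made for illustration.

```python
import numpy as np


def fit_alpha(y, z, X, lam):
    """Treatment coefficient from least squares with a ridge penalty `lam`
    applied to the control coefficients only (hypothetical helper)."""
    p = X.shape[1]
    D = np.column_stack([z, X])      # treatment first, then controls
    penalty = lam * np.eye(p + 1)
    penalty[0, 0] = 0.0              # the treatment coefficient is not shrunk
    coef = np.linalg.solve(D.T @ D + penalty, D.T @ y)
    return coef[0]


# Assumed synthetic setup: the same controls drive treatment and response.
rng = np.random.default_rng(0)
n, p, alpha = 100, 30, 1.0
X = rng.normal(size=(n, p))
gamma = np.ones(p)                   # effect of controls on the treatment
beta = np.ones(p)                    # effect of controls on the response
z = X @ gamma + rng.normal(size=n)
y = alpha * z + X @ beta + rng.normal(size=n)

alpha_ols = fit_alpha(y, z, X, lam=0.0)      # no shrinkage of controls
alpha_shrunk = fit_alpha(y, z, X, lam=1e6)   # controls shrunk almost to zero
```

In this setup, shrinking the control coefficients heavily leaves the confounding unadjusted, so `alpha_shrunk` should sit well away from the true `alpha`, while `alpha_ols` stays close to it.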
Interference exists when a unit's outcome depends on another unit's treatment assignment. For example, intensive policing on one street could have a spillover effect on neighbouring streets. Classical randomization tests typically break down in this setting because many null hypotheses of interest are no longer sharp under interference. A promising alternative is to instead construct a conditional randomization test on a subset of units and assignments for which a given null hypothesis is sharp. Finding these subsets is challenging, however, and existing methods are limited to special cases or have limited power. In this paper, we propose valid and easy-to-implement randomization tests for a general class of null hypotheses under arbitrary interference between units. Our key idea is to represent the hypothesis of interest as a bipartite graph between units and assignments, and to find an appropriate biclique of this graph. Importantly, the null hypothesis is sharp within this biclique, enabling conditional randomization-based tests. We also connect the size of the biclique to statistical power. Moreover, we can apply off-the-shelf graph clustering methods to find such bicliques efficiently and at scale. We illustrate our approach in settings with clustered interference and show advantages over methods designed specifically for that setting. We then apply our method to a large-scale policing experiment
This paper considers linear model selection when the response is vector-valued and the predictors are randomly observed. We propose a new approach that decouples statistical inference from the selection step in a "post-inference model summarization" strategy. We study the impact of predictor uncertainty on the model selection procedure. The method is demonstrated through an application to asset pricing.
This paper considers passive fund selection from an individual investor's perspective. The growth of the passive fund market over the past decade is staggering. Individual investors who wish to buy these funds for their retirement and brokerage accounts have many options and are faced with a difficult selection problem. Which funds do they invest in, and in what proportions? We develop a novel statistical methodology to address this problem by adapting recent advances in posterior summarization. A Bayesian decision‐theoretic approach is presented to construct optimal sparse portfolios for individual investors over time.
Interference exists when a unit's outcome depends on another unit's treatment assignment. For example, intensive policing on one street could have a spillover effect on neighboring streets. Classical randomization tests typically break down in this setting because many null hypotheses of interest are no longer sharp under interference. A promising alternative is to instead construct a conditional randomization test on a subset of units and assignments for which a given null hypothesis is sharp. Finding these subsets is challenging, however, and existing methods either have low power or are limited to special cases. In this paper, we propose valid, powerful, and easy-to-implement randomization tests for a general class of null hypotheses under arbitrary interference between units. Our key idea is to represent the hypothesis of interest as a bipartite graph between units and assignments, and to find a clique of this graph. Importantly, the null hypothesis is sharp for the units and assignments in this clique, enabling randomization-based tests conditional on the clique. We can apply off-the-shelf graph clustering methods to find such cliques efficiently and at scale. We illustrate this approach in settings with clustered interference and show advantages over methods designed specifically for that setting. We then apply our method to a large-scale policing experiment in Medellín, Colombia, where interference has a spatial structure.
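The idea of testing within a clique of the unit-by-assignment graph can be sketched in a toy setting. Everything below is an assumption made for illustration rather than the authors' procedure: units sit on a ring, treatment is independent Bernoulli, the null hypothesis is "no spillover onto untreated units", and the clique is obtained from a fixed set of focal units instead of the graph-clustering step mentioned in the abstract. The function names are hypothetical.

```python
import numpy as np


def spillover_exposure(z):
    """Ring network: -1 if treated, 1 if untreated with a treated
    neighbour, 0 if untreated with no treated neighbour."""
    z = np.asarray(z)
    treated_nb = (np.roll(z, 1) + np.roll(z, -1)) > 0
    return np.where(z == 1, -1, treated_nb.astype(int))


def diff_in_means(y, expo, units):
    """Mean outcome of exposed minus unexposed units (0.0 if a group is empty)."""
    e, yy = expo[units], y[units]
    if (e == 1).sum() == 0 or (e == 0).sum() == 0:
        return 0.0
    return yy[e == 1].mean() - yy[e == 0].mean()


def clique_randomization_test(y_obs, z_obs, focal, p_treat, n_draws=2000, seed=0):
    """Conditional randomization test inside one clique.

    Clique units: focal units that are untreated under z_obs.
    Clique assignments: all z that agree with z_obs on the focal units,
    so every clique unit is untreated (edge present) under every such z.
    """
    rng = np.random.default_rng(seed)
    units = focal[z_obs[focal] == 0]
    t_obs = diff_in_means(y_obs, spillover_exposure(z_obs), units)
    t_null = np.empty(n_draws)
    for b in range(n_draws):
        z = rng.binomial(1, p_treat, size=len(z_obs))
        z[focal] = z_obs[focal]          # stay inside the clique
        # Under the null, outcomes of clique units do not change with z.
        t_null[b] = diff_in_means(y_obs, spillover_exposure(z), units)
    p_value = (1 + np.sum(np.abs(t_null) >= abs(t_obs))) / (1 + n_draws)
    return p_value, units


# Assumed toy data with a strong spillover effect on untreated units.
rng = np.random.default_rng(1)
n, p_treat = 300, 0.3
z_obs = rng.binomial(1, p_treat, size=n)
y_obs = rng.normal(size=n) + 3.0 * (spillover_exposure(z_obs) == 1)
focal = np.arange(0, n, 3)               # spaced so neighbours are non-focal
p_value, units = clique_randomization_test(y_obs, z_obs, focal, p_treat)
```

The focal set is fixed before looking at the data, so the cells "assignments agreeing on the focal units" partition the assignment space and conditioning on the observed cell keeps the design's relative probabilities. A clustering-based search, as the abstract describes, would aim for larger cliques than this simple partition provides.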