This supplement provides details of the implementation of the test statistic described in Section 3.1 of "A Test of Exogeneity Without Instrumental Variables in Models With Bunching." It also develops the theorems that describe the test statistic's asymptotic behavior. Finally, it presents a Monte Carlo study of the small-sample behavior of the test statistic using real data (the same data set used in Section 3 of the paper).
S1. A TEST STATISTIC BASED ON θ
As described in Section 3.2 of the paper, the empirical application is based on the parameter θ. To estimate θ, a simple two-step process is suggested: first estimate the term E[Y | X = 0, Z], and then estimate the outer limit as x ↓ 0. The requirements of the approach are the following.
ASSUMPTION S1: Suppose that […] is finite and uniformly bounded.
The first requirement allows the sample to be divided between observations with X = 0 and observations with X > 0. This requirement can be relaxed so that the test can still be applied to cases in which there is no bunching (see Remark S1.1). The other requirements guarantee that E[Y | X = 0, Z] is estimable at the √n rate, but they have an effect on the null hypothesis (see Remark S1.5).
If Assumption S1(2) holds, then
θ = lim_{x ↓ 0} E[Z′γ − Y | X = x]. (S1)
The estimation of γ is done with an OLS regression of Y onto Z using only observations such that X = 0. The outer limit in θ is a boundary quantity, and so the limit estimator needs to take this into account. The issues with nonparametric boundary estimation are well known and addressed extensively in the Regression Discontinuity Design (RDD) literature (e.g., Hahn, Todd, and Van der Klaauw (2001), Porter (2003), and Imbens and Lemieux (2008)). The quantity in (S1) is of the same nature as that in the RDD, with the only difference being that the dependent variable in the regression, Z′γ − Y, has to be estimated. However, Assumption S1 guarantees that γ is estimated at the rate of √n, whereas
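The two-step process described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the supplement's implementation: the function name `two_step_theta`, the `bandwidth` argument, the triangular kernel, the added constant in Z, and the choice of a local linear fit for the boundary step are all assumptions made here (local linear regression is a common boundary estimator in the RDD literature, but the excerpt does not say which estimator is used). It also assumes X ≥ 0 with bunching at X = 0.

```python
import numpy as np


def two_step_theta(y, x, z, bandwidth):
    """Illustrative two-step estimate of theta (hypothetical helper).

    Step 1: OLS of Y on Z using only observations with X = 0.
    Step 2: local linear fit of Z'gamma_hat - Y on X for 0 < X <= bandwidth;
            the intercept estimates the limit as x decreases to 0.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    # Assumption of this sketch: the controls include a constant.
    Z = np.column_stack([np.ones(len(y)), z])

    # Step 1: estimate gamma on the bunched subsample (X = 0).
    at_zero = x == 0
    gamma, *_ = np.linalg.lstsq(Z[at_zero], y[at_zero], rcond=None)

    # Step 2: build the estimated dependent variable near the boundary.
    near = (x > 0) & (x <= bandwidth)
    if near.sum() < 2:
        raise ValueError("need at least two observations with 0 < X <= bandwidth")
    d = Z[near] @ gamma - y[near]

    # Triangular kernel weights (an assumed choice): more weight close to zero.
    w = 1.0 - x[near] / bandwidth
    sw = np.sqrt(w)

    # Weighted least squares of d on (1, x), done by rescaling rows.
    R = np.column_stack([np.ones(near.sum()), x[near]])
    coef, *_ = np.linalg.lstsq(R * sw[:, None], d * sw, rcond=None)
    return float(coef[0]), gamma
```

Called as `theta_hat, gamma_hat = two_step_theta(y, x, z, bandwidth=0.5)`, the function returns the boundary intercept and the first-step coefficients. The sketch gives a point estimate only; it does not compute the standard error or the test statistic, and the bandwidth is left to the user.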
This paper proposes a new strategy for the identification of the marginal effects of an endogenous multivalued variable (which can be continuous, or a vector) in a model with an Instrumental Variable (IV) of lower dimension, which may even be a single binary variable, and multiple controls. Despite the failure of the classical order condition, we show that identification may be achieved by exploiting heterogeneity of the “first stage” in the controls through a new rank condition that we term covariance completeness. The identification strategy justifies the use of interactions between instruments and controls as additional exogenous variables and can be straightforwardly implemented by parametric, semiparametric, and nonparametric two-stage least squares estimators, following the same generic algorithm. Monte Carlo simulations show that the estimators have excellent performance in moderate sample sizes. Finally, we apply our methods to the problem of estimating the effect of air quality on house prices, based on Chay and Greenstone (2005, Journal of Political Economy 113, 376–424). All methods are implemented in a companion Stata software package.
This paper considers identification and estimation of causal effect parameters from participating in a binary treatment in a difference in differences (DID) setup when the parallel trends assumption holds after conditioning on observed covariates. Relative to existing work in the econometrics literature, we consider the case where the value of covariates can change over time and, potentially, where participating in the treatment can affect the covariates themselves. We propose new empirical strategies in both cases. We also consider two-way fixed effects (TWFE) regressions that include time-varying regressors, which is the most common way that DID identification strategies are implemented under conditional parallel trends. We show that, even in the case with only two time periods, these TWFE regressions are not generally robust to (i) time-varying covariates being affected by the treatment, (ii) treatment effects and/or paths of untreated potential outcomes depending on the level of time-varying covariates in addition to only the change in the covariates over time, (iii) treatment effects and/or paths of untreated potential outcomes depending on time-invariant covariates, (iv) treatment effect heterogeneity with respect to observed covariates, and (v) violations of strong functional form assumptions, both for outcomes over time and the propensity score, that are unlikely to be plausible in most DID applications. Thus, TWFE regressions can deliver misleading estimates of causal effect parameters in a number of empirically relevant cases. We propose both doubly robust estimands and regression adjustment/imputation strategies that are robust to these issues while not being substantially more challenging to implement.
This paper proposes a new strategy for the identification of all the marginal effects of an endogenous multi-valued variable (which can be continuous, or a vector) in a regression model with one binary instrumental variable. The unobservables must be separable from the endogenous variable of interest in the model. Identification is achieved by exploiting heterogeneity of the "first stage" in covariates. The covariates themselves may be endogenous, and their endogeneity does not need to be modeled. With some modifications, the identification strategy is extended to the Regression Discontinuity Design (RDD) with multi-valued endogenous variables, thereby showing that adding covariates in RDD may improve identification. This paper also provides parametric, semiparametric, and nonparametric estimators based on the identification strategy, discusses their asymptotic properties, and shows that the estimators have satisfactory performance in moderate sample sizes. All the proposed estimators can be implemented as Two-Stage Least Squares (TSLS). Finally, we apply our methods to the problem of estimating the effect of air quality on house prices, based on Chay and Greenstone (2005).
We study the effects of enrichment activities such as reading, homework, and extracurricular lessons on children's cognitive and non-cognitive skills. We take into consideration that children forgo alternative activities, such as play and socializing, in order to spend time on enrichment. Our study controls for selection on unobservables using a novel approach which leverages the fact that many children spend zero hours per week on enrichment activities. At zero enrichment, confounders vary but enrichment does not, which gives us direct information about the effect of confounders on skills. Using time diary data available in the Panel Study of Income Dynamics (PSID), we find that the net effect of enrichment is zero for cognitive skills and negative for non-cognitive skills, which suggests that enrichment may be crowding out more productive activities on the margin. The negative effects on non-cognitive skills are concentrated in higher-income students in high school, consistent with elevated academic competition related to college admissions.