The Two-Sample Two-Stage Least Squares (TS2SLS) data combination estimator is a popular estimator for the parameters in linear models when not all variables are observed jointly in one single data set. Although the limiting normal distribution has been established, the asymptotic variance formula has only been stated explicitly in the literature for the case of conditional homoskedasticity. By using the fact that the TS2SLS estimator is a function of reduced form and first-stage OLS estimators, we derive the variance of the limiting normal distribution under conditional heteroskedasticity. A robust variance estimator is obtained, which generalises to cases with more general patterns of variable (non-)availability. Stata code and some Monte Carlo results are provided in an Appendix. Stata code for a nonlinear GMM estimator that is identical to the TS2SLS estimator in just identified models and asymptotically equivalent to the TS2SLS estimator in overidentified models is also provided there.
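The statement that the TS2SLS estimator is a function of the reduced-form and first-stage OLS estimators can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the paper's Stata code. The function name `ts2sls`, the simulated design, and all parameter values are assumptions made for this example. The sketch computes point estimates only and does not implement the robust variance estimator.

```python
import numpy as np

def ts2sls(y1, Z1, X2, Z2):
    """Two-sample 2SLS point estimate (illustrative sketch).

    Sample 1 holds (y1, Z1); sample 2 holds (X2, Z2).
    """
    # First stage, sample 2: OLS of each column of X2 on Z2.
    Pi_hat = np.linalg.lstsq(Z2, X2, rcond=None)[0]
    # Reduced form, sample 1: OLS of y1 on Z1.
    pi_hat = np.linalg.lstsq(Z1, y1, rcond=None)[0]
    # Second stage: OLS of y1 on the cross-sample fitted values Z1 @ Pi_hat.
    X1_hat = Z1 @ Pi_hat
    beta_hat = np.linalg.lstsq(X1_hat, y1, rcond=None)[0]
    return beta_hat, Pi_hat, pi_hat

# Hypothetical simulated design: one endogenous regressor, two instruments.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0])  # intercept and slope (assumed values)

def draw(n):
    z = rng.normal(size=(n, 2))
    u = rng.normal(size=n)
    x = 0.8 * z[:, 0] + 0.5 * z[:, 1] + 0.5 * u + rng.normal(size=n)
    y = beta_true[0] + beta_true[1] * x + u
    Z = np.column_stack([np.ones(n), z])
    X = np.column_stack([np.ones(n), x])
    return y, X, Z

y1, _, Z1 = draw(5000)   # sample 1: x is treated as unobserved
_, X2, Z2 = draw(5000)   # sample 2: y is treated as unobserved
beta_hat, Pi_hat, pi_hat = ts2sls(y1, Z1, X2, Z2)

# The same estimate written as a function of the two OLS estimators.
A = Pi_hat.T @ (Z1.T @ Z1)
beta_from_ols = np.linalg.solve(A @ Pi_hat, A @ pi_hat)
print(beta_hat, beta_from_ols)
```

The two printed vectors coincide because the second-stage cross product `X1_hat.T @ y1` equals `Pi_hat.T @ Z1.T @ Z1 @ pi_hat`, which is what allows the limiting distribution to be derived from the joint behaviour of the two OLS estimators.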
This paper investigates the problem of making inference about the coefficients in the linear projection of an outcome variable y on covariates (x, z) when data are available from two independent random samples; the first sample contains information on only the variables (y, z), while the second sample contains information on only the covariates. In this context, the validity of existing inference procedures depends crucially on the assumptions imposed on the joint distribution of (y, z, x). This paper introduces a novel characterization of the identified set of the coefficients of interest when no assumption (except for the existence of second moments) on this joint distribution is imposed. One finding is that inference is necessarily nonstandard because the function characterizing the identified set is a nondifferentiable (yet directionally differentiable) function of the data. The paper then introduces an estimator and a confidence interval based on the directional differential of the function characterizing the identified set. Monte Carlo experiments explore the numerical performance of the proposed estimator and confidence interval.
This note studies the criterion for identifiability in parametric models based on the minimization of the Hellinger distance and exhibits its relationship to the identifiability criterion based on the Fisher matrix. It shows that the Hellinger distance criterion serves to establish identifiability of parameters of interest, or lack of it, in situations where the criterion based on the Fisher matrix does not apply, like in models where the support of the observed variables depends on the parameter of interest or in models with irregular points of the Fisher matrix. Several examples illustrating this result are provided.
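As a hedged illustration of the first kind of case (chosen for this summary and not necessarily one of the note's own examples), take the uniform family U(0, θ), whose support depends on θ. Under the convention H² = 1 − ∫√(fg), the squared Hellinger distance between the densities at θ₀ and θ is:

```latex
H^2(\theta_0,\theta)
  = 1 - \int_0^{\min(\theta_0,\theta)} \frac{1}{\sqrt{\theta_0\,\theta}}\,dx
  = 1 - \frac{\min(\theta_0,\theta)}{\sqrt{\theta_0\,\theta}}
  = 1 - \sqrt{\frac{\min(\theta_0,\theta)}{\max(\theta_0,\theta)}}
```

This distance is zero only at θ = θ₀, so θ₀ is identified in the Hellinger sense even though the usual regularity conditions behind the Fisher matrix fail for this family.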
There are surveys that gather precise information on an outcome of interest, but measure continuous covariates by a discrete number of intervals, in which case the covariates are interval censored. For applications with a second independent dataset precisely measuring the covariates, but not the outcome, this paper introduces a semiparametrically efficient estimator for the coefficients in a linear regression model. The second sample serves to establish point identification. An empirical application investigating the relationship between income and body mass index illustrates the use of the estimator.
1 For the advantages see, for example, Juster and Smith (1997). For the disadvantages see, for example, Hsiao (1983) and Rigobon and Stoker (2009).
2 See, for example, Hsiao (1983) for a linear regression model.
This paper introduces an estimator, called the two-step, two-sample augmented generalized instrumental variable (2S-AGIV) estimator, drawing on the comparative advantages of the point- and set-identifying approaches. The paper has three results. The first result states that, when there is a second independent sample with continuous measurements of the covariates, the linear regression model with interval-censored covariates in the first sample point-identifies the coefficients of interest. The model implies an identifying moment restriction using indicator variables for the intervals as instrumental variables observed in both samples. Neither parametric, support, nor monotonicity restrictions on the interval-censored covariates are needed to obtain this result. The second result shows that the existing two-sample instrumental variable estimators, including the two-stage least squares (2SLS; Klevmarken, 1982) and two-sample instrumental variable (2S-GIV; Ridder & Moffitt, 2007) estimators, are consistent and asymptotically normal; however, they are not semiparametrically efficient.
The 2SLS estimator is equivalent to imputing in the censored sample the truncated mean of the covariate of interest within the interval calculated from the uncensored sample. The 2S-GIV estimator is equivalent to a weighted least squares estimator on the truncated mean outcome and covariate of interest within the interval. The paper shows that the 2S-AGIV estimator is consistent, asymptotically normal, and semiparametrically efficient, which is the third result. This last property means that, in large samples, the 2S-AGIV estimator can realize in the best possible way the precision gains offered by the point-identifying approach without parametrizing the distribution of the covariates. A simulation study bears out these theoretical properties of the 2S-AGIV estimator. An empirical exercise using data from the HSE and the FRS illustrates and supports the use of the 2S-AGIV estimator. The exercise tests the unearned income effect (UIE) hypothesis, which postulates an inverted U-shaped relationship between income and body mass index (BMI; see Lakdawalla & Philipson, 2009). This hypothesis is relevant, for instance, in assessing the eff...
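The imputation reading of the two-sample 2SLS estimator described in this excerpt can be sketched as follows. This is a minimal Python illustration under assumed conditions: the cut points in `edges`, the sample sizes, and the data-generating process are hypothetical, and the sketch covers only the within-interval mean imputation, not the 2S-GIV or 2S-AGIV estimators.

```python
import numpy as np

rng = np.random.default_rng(1)
edges = np.array([-1.0, 0.0, 1.0])   # hypothetical cut points, giving 4 intervals
beta_true = np.array([0.5, 1.5])     # assumed intercept and slope

# Sample 1 (censored): outcome plus the interval containing x, not x itself.
n1 = 20000
x1_latent = rng.normal(size=n1)      # used only to simulate; treated as unobserved
y1 = beta_true[0] + beta_true[1] * x1_latent + rng.normal(size=n1)
bin1 = np.digitize(x1_latent, edges) # interval index in {0, 1, 2, 3}

# Sample 2 (uncensored): x measured exactly, outcome not observed.
n2 = 20000
x2 = rng.normal(size=n2)
bin2 = np.digitize(x2, edges)

# Step 1: mean of x within each interval, computed from the uncensored sample.
n_bins = len(edges) + 1
cell_means = np.array([x2[bin2 == b].mean() for b in range(n_bins)])

# Step 2: impute those means into the censored sample and run OLS of y on them.
x1_imputed = cell_means[bin1]
X1 = np.column_stack([np.ones(n1), x1_imputed])
beta_hat = np.linalg.lstsq(X1, y1, rcond=None)[0]
print(beta_hat)
```

With a full set of interval indicators as instruments, the first-stage fitted value for each observation is its interval mean, which is why the imputation and the two-sample 2SLS computation give the same estimate in a design like this one.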