Linear regression is arguably the most fundamental statistical model; however, the validity of its use in randomized clinical trials, despite being common practice, has never been crystal clear, particularly when stratified or covariate-adaptive randomization is used. In this article, we investigate several of the most intuitive and commonly used regression models for estimating and inferring the treatment effect in randomized clinical trials. By allowing the regression model to be arbitrarily misspecified, we demonstrate that all these regression-based estimators robustly estimate the treatment effect, albeit with possibly different efficiency. We also propose consistent non-parametric variance estimators and compare their performance to that of the model-based variance estimators that are readily available in standard statistical software. Based on the results and taking into account both theoretical efficiency and practical feasibility, we make recommendations for the effective use of regression under various scenarios. For equal allocation, it suffices to use the regression adjustment for the stratum covariates and additional baseline covariates, if available, with the usual ordinary-least-squares variance estimator. For unequal allocation, regression with treatment-by-covariate interactions should be used, together with our proposed variance estimators. These recommendations apply to simple and stratified randomization, and minimization, among others. We hope this work helps to clarify and promote the usage of regression in randomized clinical trials.
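To make the two recommended working models concrete, the following is a minimal sketch, not the authors' implementation. It assumes simple (Bernoulli) randomization, a single continuous baseline covariate, and a simulated outcome that is deliberately nonlinear in that covariate, so both working models are misspecified. The function names (`solve`, `ols`, `ancova_estimate`, `interaction_estimate`, `simulate`) and the data-generating model are illustrative assumptions. Only point estimates are shown; the non-parametric variance estimators proposed in the article are not reproduced here.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Return OLS coefficients from the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def ancova_estimate(y, a, x):
    """Treatment coefficient in the working model y ~ 1 + a + x (no interaction)."""
    rows = [[1.0, ai, xi] for ai, xi in zip(a, x)]
    return ols(rows, y)[1]


def interaction_estimate(y, a, x):
    """Treatment coefficient in y ~ 1 + a + xc + a*xc, where xc = x - mean(x)."""
    xbar = sum(x) / len(x)
    rows = [[1.0, ai, xi - xbar, ai * (xi - xbar)] for ai, xi in zip(a, x)]
    return ols(rows, y)[1]


def simulate(n, p_treat, effect, seed):
    """Hypothetical trial: Bernoulli(p_treat) assignment, outcome nonlinear in x.

    The terms a*x and (x^2 - 1) have mean zero, so the average treatment
    effect equals `effect` even though neither working model is correct.
    """
    rng = random.Random(seed)
    a = [1.0 if rng.random() < p_treat else 0.0 for _ in range(n)]
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [
        effect * ai + xi + 0.8 * ai * xi + 0.5 * (xi * xi - 1.0) + rng.gauss(0.0, 1.0)
        for ai, xi in zip(a, x)
    ]
    return y, a, x


if __name__ == "__main__":
    # Unequal 2:1 allocation, the case where the interaction model is recommended.
    y, a, x = simulate(n=4000, p_treat=2 / 3, effect=2.0, seed=0)
    print("ANCOVA estimate:     ", round(ancova_estimate(y, a, x), 3))
    print("Interaction estimate:", round(interaction_estimate(y, a, x), 3))
```

Centering the covariate before forming the interaction term is what lets the treatment coefficient be read as an average effect rather than the effect at x = 0. In practice one would fit these models with a standard statistics package; the hand-rolled solver is used here only to keep the sketch dependency-free.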
We consider the problem of estimating and inferring treatment effects in randomized experiments. In practice, stratified randomization, or more generally, covariate-adaptive randomization, is routinely used in the design stage to balance treatment allocations with respect to a few variables that are most relevant to the outcomes. Then, regression is performed in the analysis stage to adjust for the remaining imbalances and yield more efficient treatment effect estimators. Building upon and unifying the recent results obtained for ordinary least squares adjusted estimators under covariate-adaptive randomization, this paper presents a general theory of regression adjustment that allows for model misspecification and the presence of a large number of baseline covariates. We exemplify the theory on two lasso-adjusted treatment effect estimators, both of which are optimal in their respective classes. In addition, nonparametric consistent variance estimators are proposed to facilitate valid inferences; these work irrespective of the specific randomization method used. The robustness and improved efficiency of the proposed estimators are demonstrated through numerical studies.