To Impute or not to Impute? Missing Data in Treatment Effect Estimation

Berrevoets, Jeroen; Imrie, Fergus; Kyono, Trent; Jordon, J.B.; Schaar, Mihaela van der

doi:10.48550/arxiv.2202.02096

Cited by 2 publications

(3 citation statements)

References 21 publications

“…Depending on which imputation strategy we use, we may introduce additional parametric assumptions into our pipeline. Furthermore, depending on which mechanism governs the missingness patterns, we may have to make additional structural assumptions before imputation can even begin [14,16,77]. Making such structural assumptions before learning a graph using Peters et al. [73] would make our solution for the practitioner's problem invalid once again.…”

Section: Definition 2 (Transition); citation type: mentioning

confidence: 99%

“…CDL can help define interaction effects of missingness and covariates to improve the accuracy of imputation methods. For example, CDL-based approaches allow deep learning to accurately impute missing values, respecting the causal interaction between missingness indicators and treatment selection [14][15][16]. CDL-based methods outperform existing imputation techniques in terms of both imputation accuracy and unbiased estimation from data with missing values.…”

Section: Real-world Applications; citation type: mentioning

confidence: 99%

Causal Deep Learning

Berrevoets, Kacprzyk, Qian, et al. 2023 (preprint)

Causality has the potential to truly transform the way we solve a large number of real-world problems. Yet its potential remains largely unlocked, since most work so far requires strict assumptions which do not hold true in practice. To address this challenge and make progress in solving real-world problems, we propose a new way of thinking about causality, which we call causal deep learning. The framework which we propose for causal deep learning spans three dimensions: (1) a structural dimension, which allows incomplete causal knowledge rather than assuming either full or no causal knowledge; (2) a parametric dimension, which encompasses parametric forms which are typically ignored; and finally, (3) a temporal dimension, which explicitly allows for situations which capture exposure times or temporal structure. Together, these dimensions allow us to make progress on a variety of real-world problems by leveraging (sometimes incomplete) causal knowledge and/or combining diverse causal deep learning methods. This new framework also enables researchers to compare systematically across existing works as well as identify promising research areas which can lead to real-world impact.

“…However, little literature has investigated these strategies in the estimation of medical treatment effects. Berrevoets et al. found that selective imputation performed better than naively imputing all data in a causal context [11], but it is difficult in practice to identify whether missingness causes treatment or is caused by treatment. Thus our study aimed to systematically compare various MI strategies with complete case analysis in treatment effect estimation, to evaluate whether imputing data works better and which imputation strategy should be adopted in RCT emulation using RWD.…”

Section: Introduction; citation type: mentioning

confidence: 99%

Data Imputation for Clinical Trial Emulation: A Case Study on Impact of Intracranial Pressure Monitoring for Traumatic Brain Injury

Zhao, Liu, Groner, et al. 2023 (preprint)

Randomized clinical trial emulation using real-world data is significant for treatment effect evaluation. Missing values are common in observational data. Handling missing data improperly would cause biased estimates and invalid conclusions. However, discussions on how to address this issue in causal analysis using observational data are still limited. Multiple imputation by chained equations (MICE) is a popular approach to fill in missing data. In this study, we combined multiple imputation with a propensity score weighted model to estimate the average treatment effect (ATE). We compared various multiple imputation (MI) strategies and a complete data analysis on two benchmark datasets. The experiments showed that data imputation performed better than completely ignoring the missing data, and using different imputation models for different covariates gave a high precision of estimation. Furthermore, we applied the optimal strategy to a medical records dataset to evaluate the impact of ICP monitoring on inpatient mortality of traumatic brain injury (TBI). The experiment details and code are available at https://github.com/Zhizhen-Zhao/IPTW-TBI.
