Abstract: We compare two recently proposed methods that combine ideas from conformal inference and quantile regression to produce locally adaptive and marginally valid prediction intervals under sample exchangeability (Romano et al., 2019 [1]; Kivaranovic et al., 2019 [2]). First, we prove that these two approaches are asymptotically efficient in large samples, under some additional assumptions. Then we compare them empirically on simulated and real data. Our results demonstrate that the method in Romano et al. (2019) t…
“…For all conditional quantile estimators, we set . Lastly, we use data as the training fold, as suggested by Sesia and Candès (2020). Our method is implemented in R cfcausal package, available at https://github.com/lihualei71/cfcausal.…”
Section: From Observables To Counterfactuals
Evaluating treatment effect heterogeneity widely informs treatment decision making. At the moment, much emphasis is placed on the estimation of the conditional average treatment effect via flexible machine learning algorithms. While these methods enjoy some theoretical appeal in terms of consistency and convergence rates, they generally perform poorly in terms of uncertainty quantification. This is troubling since assessing risk is crucial for reliable decision‐making in sensitive and uncertain environments. In this work, we propose a conformal inference‐based approach that can produce reliable interval estimates for counterfactuals and individual treatment effects under the potential outcome framework. For completely randomized or stratified randomized experiments with perfect compliance, the intervals have guaranteed average coverage in finite samples regardless of the unknown data generating mechanism. For randomized experiments with ignorable compliance and general observational studies obeying the strong ignorability assumption, the intervals satisfy a doubly robust property which states the following: the average coverage is approximately controlled if either the propensity score or the conditional quantiles of potential outcomes can be estimated accurately. Numerical studies on both synthetic and real data sets empirically demonstrate that existing methods suffer from a significant coverage deficit even in simple models. In contrast, our methods achieve the desired coverage with reasonably short intervals.
“…Secondly, we extend ideas from conformal prediction to the multidimensional case and propose a calibration procedure that guarantees the coverage requirement (1) in the finite-sample case for any distribution. Conformal inference (Vovk et al, 2005) is a framework commonly used in the one dimensional case (d = 1) (Chernozhukov et al, 2021; Guan, 2019; Gupta et al, 2021; Izbicki et al, 2020, 2021; Kivaranovic et al, 2020; Romano et al, 2019; Sesia and Candès, 2020) that provides a generic methodology for building prediction intervals that provably attain valid marginal coverage (1). See (Angelopoulos and Bates, 2021) for a recent overview of this subject.…”
We develop a method to generate predictive regions that cover a multivariate response variable with a user-specified probability. Our work is composed of two components. First, we use a deep generative model to learn a representation of the response that has a unimodal distribution. Existing multiple-output quantile regression approaches are effective in such cases, so we apply them on the learned representation, and then transform the solution to the original space of the response. This process results in a flexible and informative region that can have an arbitrary shape, a property that existing methods lack. Second, we propose an extension of conformal prediction to the multivariate response setting that modifies any method to return sets with a pre-specified coverage level. The desired coverage is theoretically guaranteed in the finite-sample case for any distribution. Experiments conducted on both real and synthetic data show that our method constructs regions that are significantly smaller (sometimes by a factor of 100) compared to existing techniques.
“…Note that if $Q_{lo}$ and $Q_{hi}$ are trained well, then the resulting confidence interval will be approximately $[Q_{lo}(X_{n+1}), Q_{hi}(X_{n+1})]$. These are the scores used to create the predictive intervals seen in [16] and [17].…”
Section: General Conditional Quantile Inference
“…The choice of $n_1$ and $n_2$ represents a balance between model training and interval tightness; increasing $n_1$ increases the amount of data for $f_{lo}$ and $f_{hi}$, and increasing $n_2$ results in a better quantile for the predictive interval. [17] contains more information on the effect of the conformity score on the size of predictive intervals as well as the impact of the ratio $n_1/n$ on interval width and coverage. We also simulate the impact of different scores on conditional quantile intervals in Section 4.…”
Section: General Conditional Quantile Inference
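The trade-off between the training fold and the calibration fold described in the statement above can be illustrated with a minimal split-conformal sketch. It is not the code of any of the cited papers. It assumes the conformity score max(lower − y, y − upper) from conformalized quantile regression (Romano et al., 2019), target quantile levels α/2 and 1 − α/2, and a crude binned quantile estimator as a stand-in for the fitted lower and upper models. The function names `fit_binned_quantile` and `cqr_interval`, the synthetic data model, and the random seed are all hypothetical.

```python
import numpy as np


def fit_binned_quantile(x, y, q, n_bins=10):
    """Stand-in quantile model (an assumption): empirical q-quantile of y per bin of x."""
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    values = np.array([np.quantile(y[idx == b], q) for b in range(n_bins)])

    def predict(x_new):
        j = np.clip(np.searchsorted(edges, x_new, side="right") - 1, 0, n_bins - 1)
        return values[j]

    return predict


def cqr_interval(f_lo, f_hi, x_cal, y_cal, x_new, alpha=0.1):
    """Widen [f_lo(x), f_hi(x)] by a margin calibrated on the second fold."""
    # Conformity score: how far y falls outside the estimated band
    # (negative when y lies strictly inside it).
    scores = np.maximum(f_lo(x_cal) - y_cal, y_cal - f_hi(x_cal))
    n2 = len(y_cal)
    # Rank of the finite-sample-corrected (1 - alpha) empirical quantile.
    k = int(np.ceil((1.0 - alpha) * (n2 + 1)))
    margin = np.inf if k > n2 else np.sort(scores)[k - 1]
    return f_lo(x_new) - margin, f_hi(x_new) + margin


# Synthetic heteroscedastic data (hypothetical model, for illustration only).
rng = np.random.default_rng(0)
n, n_test, alpha = 5000, 5000, 0.1
n1 = n2 = n // 2
x = rng.uniform(-2.0, 2.0, size=n + n_test)
y = x + (0.5 + np.abs(x)) * rng.standard_normal(n + n_test)

x_tr, y_tr = x[:n1], y[:n1]          # training fold: fits the quantile models
x_cal, y_cal = x[n1:n], y[n1:n]      # calibration fold: sets the margin
x_te, y_te = x[n:], y[n:]            # held-out points: checks coverage

f_lo = fit_binned_quantile(x_tr, y_tr, alpha / 2)
f_hi = fit_binned_quantile(x_tr, y_tr, 1 - alpha / 2)
lower, upper = cqr_interval(f_lo, f_hi, x_cal, y_cal, x_te, alpha)

coverage = np.mean((y_te >= lower) & (y_te <= upper))
mean_width = np.mean(upper - lower)
print(f"empirical coverage: {coverage:.3f}, mean width: {mean_width:.3f}")
```

In this sketch a larger training fold gives the stand-in quantile models more data, while a larger calibration fold makes the empirical quantile of the scores less variable. Changing `n1` and `n2` while holding `n` fixed is one way to see that trade-off.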
“…We run 500 trials; in each trial, we set $n = 5{,}000$ with $n_1 = n_2 = n/2$ and $\alpha = 0.1$ with $r = s = \alpha/2$. For a separate study on the impact of $n$ on the confidence interval width and coverage, we refer the reader to [17]. We test coverage on 5,000 datapoints for each trial.…”
We consider the problem of constructing confidence intervals for the median of a response Y ∈ R conditional on features X ∈ R^d in a situation where we are not willing to make any assumption whatsoever on the underlying distribution of the data (X, Y). We propose a method based upon ideas from conformal prediction and establish a theoretical guarantee of coverage while also going over particular distributions where its performance is sharp. Additionally, we prove an equivalence between confidence intervals for the conditional median and confidence intervals for the response variable, resulting in a lower bound on the length of any possible conditional median confidence interval. This lower bound is independent of sample size and holds for all distributions with no point masses.