Distribution-Free Predictive Inference for Regression

Lei, Jing; G’Sell, Max; Rinaldo, Alessandro; Tibshirani, Ryan J.; Wasserman, Larry

doi:10.1080/01621459.2017.1307116

Cited by 480 publications

(554 citation statements)

References 30 publications

Supporting

Mentioning

549

Contrasting


“…As an immediate corollary of Theorem 1, note that it also follows that conformal quantile regression bands have asymptotic conditional coverage, which we define as in [13].…”

Section: Theoretical Analysis (mentioning)

confidence: 97%

A comparison of some conformal quantile regression methods

Sesia

Candès

2020

Stat


We compare two recently proposed methods that combine ideas from conformal inference and quantile regression to produce locally adaptive and marginally valid prediction intervals under sample exchangeability (Romano et al., 2019 [1]; Kivaranovic et al., 2019 [2]). First, we prove that these two approaches are asymptotically efficient in large samples, under some additional assumptions. Then we compare them empirically on simulated and real data. Our results demonstrate that the method in Romano et al. (2019) typically yields tighter prediction intervals in finite samples. Finally, we discuss how to tune these procedures by fixing the relative proportions of observations used for training and conformalization. arXiv:1909.05433v1 [stat.ME]
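The split-conformal recipe this abstract refers to can be sketched as follows. This is a minimal illustration in the style of the conformalized quantile regression of Romano et al. (2019), not the authors' code. The quantile estimator here (a least-squares line plus empirical residual quantiles) is a deliberately simple stand-in assumed for the example, and the function names `fit_quantile_band`, `conformal_margin`, and `cqr_interval`, the simulated data, and the `train_frac` parameter are all hypothetical.

```python
import numpy as np

def fit_quantile_band(x_train, y_train, alpha):
    """Stand-in quantile estimator (an assumption, not from the paper):
    least-squares line plus empirical residual quantiles."""
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    resid = y_train - (slope * x_train + intercept)
    lo_off, hi_off = np.quantile(resid, [alpha / 2, 1 - alpha / 2])

    def band(x):
        center = slope * x + intercept
        return center + lo_off, center + hi_off

    return band

def conformal_margin(band, x_cal, y_cal, alpha):
    """Conformity score: how far y falls outside the estimated band
    (negative when y lies inside it)."""
    lo, hi = band(x_cal)
    scores = np.maximum(lo - y_cal, y_cal - hi)
    n = len(scores)
    # Finite-sample corrected rank: ceil((1 - alpha) * (n + 1)).
    k = int(np.ceil((1 - alpha) * (n + 1)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]

def cqr_interval(band, margin, x_new):
    """Widen (or shrink) the estimated band by the calibrated margin."""
    lo, hi = band(x_new)
    return lo - margin, hi + margin

# Simulated heteroscedastic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 5.0, size=3000)
y = 2.0 * x + rng.normal(scale=0.5 + 0.3 * x)

alpha = 0.1
train_frac = 0.5  # share of the non-test data used for training
x_fit, y_fit = x[:1000], y[:1000]
x_test, y_test = x[1000:], y[1000:]
n_train = int(train_frac * len(x_fit))

band = fit_quantile_band(x_fit[:n_train], y_fit[:n_train], alpha)
margin = conformal_margin(band, x_fit[n_train:], y_fit[n_train:], alpha)
lower, upper = cqr_interval(band, margin, x_test)

# Empirical marginal coverage on held-out points.
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"margin={margin:.3f}, empirical coverage={coverage:.3f}")
```

Changing `train_frac` shifts observations between fitting the quantile band and calibrating the margin, which is the tuning question the abstract raises. The marginal coverage guarantee holds for any quantile estimator under exchangeability; a better estimator only tightens the intervals.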



“…Example 2. Figure 2 shows a toy regression problem, where 40 training samples drawn from a sine function have feature x in [0, 5], and 10 training samples have feature x in (10,15]. However, the testing samples have feature x in (5,10].…”

Section: Motivation (mentioning)

confidence: 99%

“…However, the testing samples have feature x in (5,10]. We use a neural network (NN) regressor to fit the data, and as shown, NN does a better job in fitting the sine function in [0, 5] than in (10,15]. Meanwhile, it does a terrible job in extrapolating outside of the training space, i.e., (5, 10].…”

Section: Motivation (mentioning)

confidence: 99%

Towards safe machine learning for CPS

Easwaran

2019

Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems


Machine learning (ML) techniques are increasingly applied to decision-making and control problems in Cyber-Physical Systems, many of which are safety-critical, e.g., chemical plants, robotics, autonomous vehicles. Despite the significant benefits brought by ML techniques, they also raise additional safety issues because 1) the most expressive and powerful ML models are not transparent and behave as a black box and 2) the training data, which plays a crucial role in ML safety, is usually incomplete. An important technique to achieve safety for ML models is "Safe Fail", i.e., a model selects a reject option and applies the backup solution, a traditional controller or a human operator for example, when it has low confidence in a prediction. Data-driven models produced by ML algorithms learn from training data, and hence they are only as good as the examples they have learnt. As pointed out in [17], ML models work well in the "training space" (i.e., feature space with sufficient training data), but they cannot extrapolate beyond the training space. As observed in many previous studies, a feature space that lacks training data generally has a much higher error rate than one that contains sufficient training samples [31]. Therefore, it is essential to identify the training space and avoid extrapolating beyond it. In this paper, we propose an efficient Feature Space Partitioning Tree (FSPT) to address this problem. Using experiments, we also show that a strong relationship exists between model performance and FSPT score.


“…They mainly make changes to feature value and test the chain effect to performance loss of predictions. The loss is then taken as the measure of local importance of feature [11]. This method only relies on the output evaluation and provides an unified way to check feature contribution for black-box models.…”

Section: Related Work (mentioning)

confidence: 99%

Unpack Local Model Interpretation for GBDT

Fang

Zhou

et al.

2018

Database Systems for Advanced Applications


A gradient boosting decision tree (GBDT), which aggregates a collection of single weak learners (i.e., decision trees), is widely used for data mining tasks. Because GBDT inherits good performance from its ensemble nature, much attention has been devoted to optimizing this model. As it has grown in popularity, the need for model interpretation has increased. Besides the commonly used feature importance as a global interpretation, feature contribution is a local measure that reveals the relationship between a specific instance and the related output. This work focuses on local interpretation and proposes a unified computation mechanism to obtain instance-level feature contributions for any version of GBDT. The practicality of this mechanism is validated by the listed experiments as well as by applications in real industry scenarios.

