Comparing Sequential Forecasters

Choe, Yo Joong; Ramdas, Aaditya

doi:10.48550/arxiv.2110.00115

2021


Preprint

Abstract: Consider two or more forecasters, each making a sequence of predictions for different events over time. We ask a relatively basic question: how might we compare these forecasters, either online or post-hoc, while avoiding unverifiable assumptions on how the forecasts or outcomes were generated? This work presents a novel and rigorous answer to this question. We design a sequential inference procedure for estimating the time-varying difference in forecast quality as measured by a relatively large class of prope…

Cited by 1 publication

(1 citation statement)

References 33 publications

(96 reference statements)

“…While these methods test the same null hypothesis, the types of misspecification are often different from the ones in forecast evaluation. Henzi and Ziegel (2021) and Choe and Ramdas (2021) give a first application of e-values and related concepts to testing probability forecast superiority. Their articles are concerned with comparing probability predictions $p_t, q_t \in [0, 1]$ for a binary event $Y_{t+h} \in \{0, 1\}$ with respect to so-called proper scoring rules $S$, such as the squared error $S(p, y) = (p - y)^2$.…”

mentioning

confidence: 99%

Sequentially valid tests for forecast calibration

Arnold, Henzi, Ziegel

2023

Ann. Appl. Stat.
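The squared-error scoring rule in the citation statement above can be illustrated with a small sketch. It computes the running average score difference between two forecasters on binary outcomes. This is only a descriptive summary, not the sequential inference procedure of Choe and Ramdas (2021); the function names and the toy data are hypothetical, and for simplicity the outcomes are assumed to be already aligned with the forecasts (the lead time h is ignored).

```python
from typing import List, Sequence


def squared_error(p: float, y: int) -> float:
    """Squared error S(p, y) = (p - y)^2 for a probability p and outcome y in {0, 1}.

    Lower scores are better under this convention.
    """
    return (p - y) ** 2


def running_score_difference(
    p_forecasts: Sequence[float],
    q_forecasts: Sequence[float],
    outcomes: Sequence[int],
) -> List[float]:
    """Running average of S(q_t, y_t) - S(p_t, y_t).

    A positive value means forecaster p has had the lower (better)
    average squared error so far.
    """
    averages = []
    total = 0.0
    for t, (p, q, y) in enumerate(zip(p_forecasts, q_forecasts, outcomes), start=1):
        total += squared_error(q, y) - squared_error(p, y)
        averages.append(total / t)
    return averages


# Hypothetical toy data: p is a sharp forecaster, q always predicts 0.5.
p = [0.9, 0.2, 0.7, 0.1]
q = [0.5, 0.5, 0.5, 0.5]
y = [1, 0, 1, 0]
deltas = running_score_difference(p, q, y)
```

On this toy data every entry of `deltas` is positive, so forecaster p has the better running average throughout. A running average alone carries no uncertainty quantification, which is the gap the cited methods are designed to address.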


Forecasting and forecast evaluation are inherently sequential tasks. Predictions are often issued on a regular basis, such as every hour, day, or month, and their quality is monitored continuously. However, the classical statistical tools for forecast evaluation are static, in the sense that statistical tests for forecast calibration are only valid if the evaluation period is fixed in advance. Recently, e-values have been introduced as a new, dynamic method for assessing statistical significance. An e-value is a non-negative random variable with expected value at most one under a null hypothesis. Large e-values give evidence against the null hypothesis, and the multiplicative inverse of an e-value is a conservative p-value. E-values are particularly suitable for sequential forecast evaluation, since they naturally lead to statistical tests which are valid under optional stopping. This article proposes e-values for testing probabilistic calibration of forecasts, which is one of the most important notions of calibration. The proposed methods are also more generally applicable for sequential goodness-of-fit testing. We demonstrate in a simulation study that the e-values are competitive in terms of power when compared to extant methods, which do not allow for sequential testing. In this context, we introduce test power heat matrices, a graphical tool to compactly visualize results of simulation studies on test power. In a case study, we show that the e-values provide important new insights into the evaluation of probabilistic weather forecasts.
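The e-value definition in the abstract above can be made concrete with a minimal sketch. It uses a generic likelihood-ratio construction for a simple null on binary outcomes, which is an assumption made here for illustration and not the calibration e-values proposed by Arnold, Henzi and Ziegel. Each factor has expectation one under the null, so the running product is a non-negative martingale under the null, and rejecting the first time it reaches 1/alpha keeps the type I error at most alpha under optional stopping (Ville's inequality). The function names, the alternative probability 0.7 and the data are hypothetical.

```python
from typing import List, Optional, Sequence


def likelihood_ratio_e_process(
    outcomes: Sequence[int],
    alt_prob: float = 0.7,   # hypothetical alternative: P(Y = 1) = 0.7
    null_prob: float = 0.5,  # null hypothesis: P(Y = 1) = 0.5
) -> List[float]:
    """Running product of likelihood ratios (alternative over null).

    Under the null each factor has expected value exactly one, so every
    element of the returned path is an e-value.
    """
    e_value = 1.0
    path = []
    for y in outcomes:
        numerator = alt_prob if y == 1 else 1.0 - alt_prob
        denominator = null_prob if y == 1 else 1.0 - null_prob
        e_value *= numerator / denominator
        path.append(e_value)
    return path


def conservative_p_value(e_value: float) -> float:
    """Multiplicative inverse of an e-value, capped at one."""
    if e_value <= 0.0:
        return 1.0
    return min(1.0, 1.0 / e_value)


def first_rejection_time(path: Sequence[float], alpha: float = 0.05) -> Optional[int]:
    """First time (1-indexed) the e-process reaches 1 / alpha, or None."""
    for t, e_value in enumerate(path, start=1):
        if e_value >= 1.0 / alpha:
            return t
    return None


# Hypothetical data: ten successes in a row, strong evidence against a fair coin.
path = likelihood_ratio_e_process([1] * 10)
stop = first_rejection_time(path, alpha=0.05)
p_at_stop = conservative_p_value(path[stop - 1]) if stop is not None else None
```

Because the threshold 1/alpha is checked at every step, monitoring can stop as soon as it is crossed, which is the practical meaning of validity under optional stopping.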
