2010
DOI: 10.1053/j.seminoncol.2009.12.004

Traditional Statistical Methods for Evaluating Prediction Models Are Uninformative as to Clinical Value: Towards a Decision Analytic Framework

Abstract: Cancer prediction models are becoming ubiquitous, yet we generally have no idea whether they do more good than harm. This is because current statistical methods for evaluating prediction models are uninformative as to their clinical value. Prediction models are typically evaluated in terms of discrimination or calibration. However, it is generally unclear how high discrimination needs to be before it is considered "high enough"; similarly, there are no rational guidelines as to the degree of miscalibration th…

Cited by 98 publications (116 citation statements), published 2011–2023. References 33 publications.
“…Discrimination should not be reported in isolation because a poorly calibrated model can have the same discriminative capacity as a perfectly calibrated model. 29 One limitation of calibration is that assessment techniques do not allow for comparisons between models. In the validation cohorts, both the SHFM and the HFSS showed inadequate calibration attributable to the model overestimating survival in some groups of patients, including low-risk patients, blacks, and patients with ICD/CRT therapy.…”
Section: Discussion
confidence: 99%
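The claim that a poorly calibrated model can match the discrimination of a well-calibrated one follows from the C-statistic being rank-based: any monotone distortion of the predicted risks leaves it unchanged. A minimal Python sketch with simulated, purely illustrative data makes this concrete:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)                                    # simulated binary outcomes
p = np.clip(0.2 + 0.6 * y + rng.normal(0, 0.1, 1000), 0.01, 0.99)    # roughly calibrated risks

p_shrunk = p / 4   # monotone rescaling: ranks unchanged, calibration ruined

# Identical C-statistics, yet p_shrunk badly under-predicts risk.
print(roc_auc_score(y, p), roc_auc_score(y, p_shrunk))
print(p.mean(), p_shrunk.mean(), y.mean())
```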
“…Vickers et al 29 have proposed the use of simple decision analytic techniques to compare prediction models in terms of their consequences. These techniques weight true and false-positive errors differently, to reflect the impact of decision consequences (ie, risks associated with heart transplantation or ventricular assist device versus risks associated with continuing medical therapy).…”
Section: Discussion
confidence: 99%
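The decision analytic technique referred to here is net benefit, which counts true positives in full and penalizes false positives by the odds of the risk threshold at which a patient or clinician would opt for intervention. A minimal sketch, assuming binary outcomes and predicted risks as arrays; the function and variable names are illustrative, not taken from the cited paper:

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold.

    False positives are down-weighted by the odds of the threshold
    (pt / (1 - pt)), so the harm of an unnecessary intervention is
    traded against the benefit of a correctly treated case.
    """
    y_true = np.asarray(y_true)
    treat = np.asarray(y_prob) >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - (fp / n) * (threshold / (1 - threshold))

# Decision curve: compare the model against "treat all" over a range of
# thresholds (y and p_model would be observed outcomes and model risks).
# for t in np.arange(0.05, 0.50, 0.05):
#     print(t, net_benefit(y, p_model, t), net_benefit(y, np.ones_like(y), t))
```

Because the false-positive penalty rises with the threshold, a model can be superior to "treat all" at one threshold and inferior at another, which is why the comparison is drawn across a range of clinically plausible thresholds.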
“…The goodness of fit or E/O ratio is commonly applied to measure how close the predicted and the observed values are [31,32]. The C-statistic is usually applied to measure how well the model will assign a higher probability of having an event to a case group and a lower probability to a non-case group [33].…”
Section: Discussion
confidence: 99%
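Both measures mentioned in this statement are simple to compute; a hedged sketch follows, using scikit-learn for the C-statistic (function names are illustrative):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def expected_observed_ratio(y_true, y_prob):
    """E/O ratio: total expected events (sum of predicted risks) over
    observed events. Values near 1 suggest good overall calibration;
    >1 means the model over-predicts risk, <1 that it under-predicts."""
    return float(np.sum(y_prob)) / float(np.sum(y_true))

def c_statistic(y_true, y_prob):
    """C-statistic (ROC AUC): probability that a randomly chosen case
    is assigned a higher predicted risk than a randomly chosen non-case."""
    return roc_auc_score(y_true, y_prob)
```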
“…20,21 We performed 10-fold cross-validation techniques to estimate how the model will generalize to an independent population and correct for this optimism bias. 22 We assessed calibration performance by calculating the Hosmer-Lemeshow goodness-of-fit statistic, which measures whether the predicted probability of infection corresponds with the observed probability. A well-calibrated model gives a corresponding P value greater than 0.05.…”
Section: Performance Measures
confidence: 99%
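The Hosmer-Lemeshow statistic described here bins subjects by predicted risk and compares observed with expected event counts in each bin. A minimal sketch follows; the decile binning and (groups − 2) degrees of freedom are the common convention and may differ from the quoted study's implementation:

```python
import numpy as np
from scipy import stats

def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow goodness-of-fit test.

    Sorts subjects into risk deciles, compares observed event counts
    with those expected from the predicted probabilities, and refers
    the statistic to a chi-square distribution with (groups - 2)
    degrees of freedom. p > 0.05 is conventionally read as adequate
    calibration.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    order = np.argsort(y_prob)
    chi2 = 0.0
    for idx in np.array_split(order, groups):
        n = len(idx)
        observed = y_true[idx].sum()
        expected = y_prob[idx].sum()
        mean_p = expected / n
        chi2 += (observed - expected) ** 2 / (n * mean_p * (1 - mean_p) + 1e-12)
    p_value = 1 - stats.chi2.cdf(chi2, df=groups - 2)
    return chi2, p_value
```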