We read with some interest the recent papers in Statistics in Medicine, and associated commentaries, on Net Reclassification Improvement (NRI) and Integrated Discrimination Improvement (IDI). 1,2 It first struck us as somewhat incongruous to devote so much journal space to measures that have clearly been shown to have a host of undesirable properties, such as being grossly anticonservative and favoring overfit models. 3,4 The new version of the NRI, the NRI(p), is said to have good properties because, at a threshold equal to the event rate, it is equivalent to net benefit. So why not just use net benefit? Pencina et al state that in order to use net benefit, there need to be "well-established thresholds," and in their "vast and varied experience," this is "rare." 5 If the authors are implying that there is often no single threshold, this is true, but irrelevant, because net benefit is traditionally estimated across a range of thresholds. If the authors are implying that the suitable range of thresholds is unknown, this is obviously false. It would suggest, for example, that for most models, users have no idea how to interpret the resulting predicted probabilities. We note, for example, that in the paper that introduced the NRI and IDI, Pencina et al themselves used thresholds of 6% and 20% for cardiovascular risk. 6 That paper made no reference to the motivating example being one of the "rare" situations in which well-established thresholds are available. NRI(p) is proposed as a summary measure, but net benefit at a threshold chosen with respect to clinical consequences would be preferable. As Kerr et al show, 7 it is trivial to come up with examples in which NRI(p) inappropriately selects between two markers. Take, for instance, a potentially fatal and highly prevalent disease for which there is a safe and inexpensive drug therapy. A highly specific marker would have superior NRI(p) (i.e., net benefit at the event rate, which is high), but in practice, we would prefer a sensitive marker because there is a premium on finding disease. This scenario would be avoided if a clinically relevant threshold were used in place of the event rate.

The requirement of good calibration is also highly problematic ("need to ascertain good calibration before examining the predictive performance of a new model" 1 ). What level of calibration counts as "good"? The Hosmer-Lemeshow test is not going to help, because it may give a lower p value to a very large study demonstrating a small amount of miscalibration than to a small study with findings of important miscalibration.

So we have a statistical approach with some obvious drawbacks, but which might be useful in the space (size undefined) where there is no information on relevant thresholds (no examples given) and models have to be well calibrated (no criteria given). We challenge the proponents of IDI and NRI to provide nontrivial examples where an NRI or IDI statistic provides useful information over and above net benefit and standard metrics such as the concordance index. Continued promotion of and ac...
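For concreteness, the threshold point can be illustrated with a minimal numerical sketch. The prevalence, sensitivities, specificities, and the 5% "clinical" threshold below are invented purely for illustration and are not taken from any of the cited papers; the net benefit formula is the standard one, TP/n − (FP/n) × t/(1 − t) at threshold t.

```python
# Hypothetical sketch: net benefit for two invented markers, comparing a
# threshold set at the event rate (as NRI(p) implies) with a lower,
# clinically motivated threshold. All numbers are illustrative assumptions.

def net_benefit(tp, fp, n, threshold):
    """Net benefit per patient: (TP - FP * odds(threshold)) / n."""
    odds = threshold / (1.0 - threshold)
    return (tp - fp * odds) / n

n = 1000
prevalence = 0.30             # highly prevalent disease (assumed)
diseased = int(n * prevalence)
healthy = n - diseased

# Hypothetical marker A: highly specific, modest sensitivity
tp_a = int(diseased * 0.50)   # sensitivity 0.50
fp_a = int(healthy * 0.01)    # specificity 0.99

# Hypothetical marker B: highly sensitive, modest specificity
tp_b = int(diseased * 0.95)   # sensitivity 0.95
fp_b = int(healthy * 0.60)    # specificity 0.40

for label, t in [("event rate", prevalence), ("clinical threshold", 0.05)]:
    nb_a = net_benefit(tp_a, fp_a, n, t)
    nb_b = net_benefit(tp_b, fp_b, n, t)
    print(f"{label} (t={t:.2f}): specific marker {nb_a:.3f}, "
          f"sensitive marker {nb_b:.3f}")

# With these numbers, the specific marker has higher net benefit at
# t = 0.30 (the event rate, hence higher NRI(p)), whereas at the low
# threshold implied by a safe, inexpensive therapy the sensitive marker
# is clearly preferred.
```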