The predictive capacity of a marker in a population can be described using the population distribution of risk (Huang et al. 2007; Pepe et al. 2008a; Stern 2008). Virtually all standard statistical summaries of predictability and discrimination can be derived from it (Gail and Pfeiffer 2005). The goal of this paper is to develop methods for making inference about risk prediction markers using summary measures derived from the risk distribution. We describe some new clinically motivated summary measures and give new interpretations to some existing statistical measures. Methods for estimating these summary measures are described along with distribution theory that facilitates construction of confidence intervals from data. We show how markers and, more generally, risk prediction models can be compared using clinically relevant measures of predictability. The methods are illustrated by application to markers of lung function and nutritional status for predicting subsequent onset of major pulmonary infection in children suffering from cystic fibrosis. Simulation studies show that the methods for inference are valid for use in practice.
KEYWORDS: discrimination, risk, classification, decision making
Author Notes: This work is supported in part by grants from the National Institutes of Health (R01 GM054438 and U01 CA086368).
Background
Let D denote a binary outcome variable, such as presence of disease or occurrence of an event within a specified time period, and let Y denote a set of predictive markers used to predict a bad outcome, D = 1, or a good outcome, D = 0. For example, elements of the Framingham risk score (age, gender, total and high-density lipoprotein cholesterol, systolic blood pressure, treatment for hypertension and smoking) are used to predict occurrence of a cardiovascular event within 10 years (http://hp2010.nhlbihin.net/...