Over the past few decades, interest in biomarkers to enhance predictive modeling has soared. Methodology for evaluating these has also been an active area of research. There are now several performance measures available for quantifying the added value of biomarkers. This commentary provides an overview of methods currently used to evaluate new biomarkers, describes their strengths and limitations, and offers some suggestions on their use.

Keywords: Biomarkers, Model fit, Calibration, Reclassification, Clinical utility

During the past few decades, there has been an explosion of work on the use of biomarkers in predictive modeling and whether it is useful to include these when evaluating risk of clinical events. As new biologic mechanisms have been discovered, genetic markers evolved, and new assays developed, questions about the usefulness of new markers for clinical prediction have been debated. In cardiology, several strong risk factors for cardiovascular disease, namely cholesterol levels, blood pressure, smoking, and diabetes, have been well-known for decades [1] and have been incorporated into clinical practice. They have also been included in predictive models for cardiovascular disease, primarily developed in the Framingham Heart Study [2]. Since then, many new markers with more modest effects have been discovered as new biologic pathways have been unearthed. In fields which have less powerful predictors to date, development and addition of predictive markers may be even more important.

As interest in biomarkers has soared, so has the methodology used to evaluate their utility. There are now several performance measures available for quantifying the added value of biomarkers (Table 1), several of which have been proposed in the last decade. This commentary provides an overview of methods currently used to evaluate new biomarkers, describes their strengths and limitations, and offers some suggestions on their use.
Likelihood functions

A fundamental construct for much of statistical modeling is the likelihood function. This reflects the probability, or "likelihood," of obtaining the observed data under the assumed model, including the selected variables and their associated parameters [3]. As more variables are added and the model fits the data better, the probability of obtaining the data that are actually observed improves. Much of statistical theory is based on this function. Thus, the primordial criterion of whether new variables, including biomarkers, can add to or improve a model is whether and by how much the likelihood increases. When the models are nested, we can test improvement with a likelihood ratio test, though other related tests, such as a Wald test, are sometimes used. For nonparametric models or machine learning tools, other loss functions are often used, such as cross-entropy or deviance, which are functions of the log likelihood for binary outcomes [4].

Other likelihood-based measures do not directly perform a test of significance, but apply a penalty for added variables, such as the ...