The development and implementation of quantitative imaging biomarkers have been hampered by the inconsistent and often incorrect use of terminology related to these markers. Sponsored by the Radiological Society of North America, an interdisciplinary group of radiologists, statisticians, physicists, and other researchers worked to develop a comprehensive terminology to serve as a foundation for quantitative imaging biomarker claims. Where possible, this working group adapted existing definitions derived from national or international standards bodies rather than inventing new definitions for these terms. This terminology also serves as a foundation for the design of studies that evaluate the technical performance of quantitative imaging biomarkers and for studies of algorithms that generate the quantitative imaging biomarkers from clinical scans. This paper provides examples of research studies and quantitative imaging biomarker claims that use terminology consistent with these definitions, as well as examples of the rampant confusion in this emerging field. We provide recommendations for the appropriate use of quantitative imaging biomarker terminology. It is hoped that this document will assist researchers and regulatory reviewers who examine quantitative imaging biomarkers and will also inform regulatory guidance. More consistent and correct use of terminology could advance regulatory science, improve clinical research, and provide better care for patients who undergo imaging studies.
Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.
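A design with the true value, such as a phantom study, allows direct estimation of each algorithm's bias against known ground truth. The sketch below illustrates one common aggregate metric, percent bias, for two algorithms measuring phantom nodules of known volume; all values, array names, and the two "algorithms" are hypothetical and invented purely for illustration.

```python
import numpy as np

# Hypothetical phantom study: synthetic nodules with known (true) volumes,
# each measured once by two different segmentation algorithms.
true_vol = np.array([0.5, 1.0, 2.0, 4.0, 8.0])   # mL, ground truth
alg_a    = np.array([0.54, 1.03, 2.10, 4.05, 8.30])
alg_b    = np.array([0.47, 0.96, 1.92, 3.85, 7.60])

def pct_bias(measured):
    """Aggregate percent bias: mean relative error against the true values."""
    return 100 * np.mean((measured - true_vol) / true_vol)

print(f"algorithm A bias: {pct_bias(alg_a):+.2f}%")   # overestimates
print(f"algorithm B bias: {pct_bias(alg_b):+.2f}%")   # underestimates
```

A disaggregate analysis would instead inspect the per-nodule relative errors before averaging, which can reveal size-dependent bias that the aggregate figure hides.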
Medical imaging has seen substantial and rapid technical advances during the past decade, including advances in image acquisition devices, processing and analysis software, and agents to enhance specificity. Traditionally, medical imaging has defined anatomy, but newer, more advanced imaging technologies increasingly provide biochemical and physiologic information based on both static and dynamic modalities. These advanced technologies are important not only for detecting disease but also for characterizing disease and assessing its change with time or therapy. Because of the rapidity of these advances, research to determine the utility of quantitative imaging in either clinical research or clinical practice has not had time to mature. Methods to appropriately develop, assess, regulate, and reimburse these advanced technologies must be established. Efficient and methodical processes that meet the needs of stakeholders in the biomedical research community, therapeutics developers, and health care delivery enterprises will ultimately benefit individual patients. To help address this, the authors formed a collaborative program, the Quantitative Imaging Biomarkers Alliance. This program draws from the very successful precedent set by the Integrating the Healthcare Enterprise effort but is adapted to the needs of imaging science. Strategic guidance supporting the development, qualification, and deployment of quantitative imaging biomarkers will lead to improved standardization of imaging tests, proof of imaging test performance, and greater use of imaging to predict the biologic behavior of tissue and monitor therapy response. These, in turn, confer value to corporate stakeholders, providing incentives to bring new and innovative products to market.
Our findings indicate that measurement of changes in tumor volumes is adequately reproducible. Using tumor volumes as the basis for response assessments could have a positive impact on both patient management and clinical trials. More authoritative work to qualify or discard changes in volume as the basis for response assessments should proceed.
Quantitative imaging biomarkers (QIBs) are being used increasingly in medicine to diagnose and monitor patients' disease. The computer algorithms that measure QIBs have different technical performance characteristics. In this paper we illustrate the appropriate statistical methods for assessing and comparing the bias, precision, and agreement of computer algorithms. We use data from three studies of pulmonary nodules. The first study is a small phantom study used to illustrate metrics for assessing repeatability. The second study is a large phantom study allowing assessment of four algorithms' bias and reproducibility for measuring tumor volume and the change in tumor volume. The third study is a small clinical study of patients whose tumors were measured on two occasions. This study allows a direct assessment of six algorithms' performance for measuring tumor change. With these three examples we compare and contrast study designs and performance metrics, and we illustrate the advantages and limitations of various common statistical methods for QIB studies.
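The repeatability assessment mentioned above can be sketched for a simple test-retest design. A standard summary is the repeatability coefficient, conventionally 2.77 times the within-subject standard deviation, within which 95% of replicate differences are expected to fall. The measurement values below are invented for illustration and are not from the pulmonary-nodule studies.

```python
import numpy as np

# Hypothetical test-retest data: the same nodules (mL) measured twice by one
# algorithm under identical conditions.
scan1 = np.array([1.10, 2.35, 0.87, 3.40, 1.95])
scan2 = np.array([1.18, 2.21, 0.91, 3.52, 1.88])

# Within-subject SD from paired replicates: wSD^2 = mean(d^2) / 2,
# where d is the difference between the two replicate measurements.
d = scan1 - scan2
wsd = np.sqrt(np.mean(d**2) / 2)

# Repeatability coefficient: 95% of replicate differences should lie in ±RC.
rc = 2.77 * wsd
print(f"within-subject SD = {wsd:.3f} mL, RC = {rc:.3f} mL")
```

Reproducibility (different scanners, readers, or sites) is summarized analogously, but with the variance components estimated across the varying conditions rather than under identical ones.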
The weight of the evidence indicates there are circumstances in which volumetric image analysis adds value to clinical trial science and the practice of medicine.
Purpose To (a) evaluate whether plaque tissue characteristics determined with conventional computed tomographic (CT) angiography can be quantified more accurately by using image processing algorithms that account for both the characteristics of the image formation process and biologic insights on tissue distributions, with in vivo results compared against ex vivo histologic findings, and (b) assess reader variability. Materials and Methods Thirty-one consecutive patients aged 43-85 years (average age, 64 years) known to have or suspected of having atherosclerosis who underwent CT angiography and were referred for endarterectomy were enrolled. Surgical specimens were evaluated with histopathologic examination to serve as the standard of reference. Two readers used the lumen boundary to determine scanner blur and then optimized component densities and subvoxel boundaries to best fit the observed image by using semiautomatic software. The accuracy of the resulting in vivo quantitation of calcification, lipid-rich necrotic core (LRNC), and matrix was assessed with statistical estimates of bias and linearity relative to ex vivo histologic findings. Reader variability was assessed with statistical estimates of repeatability and reproducibility. Results A total of 239 cross sections obtained with CT angiography and histologic examination were matched. Performance on held-out data showed low levels of bias and high Pearson correlation coefficients for calcification (-0.096 mm and 0.973, respectively), LRNC (1.26 mm and 0.856), and matrix (-2.44 mm and 0.885). Intrareader variability was low (repeatability coefficient ranged from 1.50 mm to 1.83 mm among tissue characteristics), as was interreader variability (reproducibility coefficient ranged from 2.09 mm to 4.43 mm).
Conclusion There was high correlation and low bias between the in vivo software image analysis and ex vivo histopathologic quantitative measures of atherosclerotic plaque tissue characteristics, as well as low reader variability. Software algorithms can mitigate the blurring and partial volume effects of routine CT angiography acquisitions to produce accurate quantification that enhances current clinical practice. Clinical trial registration no. NCT02143102. RSNA, 2017. Online supplemental material is available for this article. An earlier incorrect version of this article appeared online; this article was corrected on September 15, 2017.
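The kind of accuracy analysis described above, bias and Pearson correlation of an in vivo measurement against a reference standard, plus Bland-Altman limits of agreement, can be sketched as follows. The paired values are illustrative only and do not come from this study.

```python
import numpy as np

# Hypothetical paired measurements of one tissue characteristic:
# in vivo CT-based quantitation vs. ex vivo histology (reference standard).
ct        = np.array([12.1, 5.4, 20.3, 8.8, 15.0, 3.2])
histology = np.array([10.8, 4.9, 19.1, 8.0, 13.7, 2.5])

diff = ct - histology
bias = diff.mean()                           # mean difference vs. reference
pearson_r = np.corrcoef(ct, histology)[0, 1]

# 95% Bland-Altman limits of agreement around the bias.
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)

print(f"bias = {bias:.2f}, r = {pearson_r:.3f}, "
      f"LoA = ({loa[0]:.2f}, {loa[1]:.2f})")
```

Note that a high Pearson correlation alone does not demonstrate accuracy, since correlation is insensitive to systematic offsets; reporting the bias and limits of agreement alongside it, as in the abstract above, addresses that gap.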