Rationale and Objectives-The accuracy of diagnostic tests and imaging segmentation is important in clinical practice because it has a direct impact on therapeutic planning. Statistical validation of classification accuracy was conducted based on parametric receiver operating characteristic analysis, illustrated with three radiologic examples.
Materials and Methods-Two parametric models were developed for diagnostic or imaging data. Example 1: A semiautomated fractional segmentation algorithm was applied to magnetic resonance imaging of nine cases of brain tumors. The tumor and background pixel data were assumed to have bi-beta distributions. Fractional segmentation was validated against an estimated composite pixelwise gold standard based on multi-reader manual segmentations. Example 2: The predictive value of ureteral stone size, measured by spiral computed tomography in 100 cases and distributed as bi-normal after a nonlinear transformation, was assessed under the two treatment options received. Example 3: One hundred eighty cases had prostate-specific antigen levels measured in a prospective clinical trial. Radical prostatectomy was performed in all cases to provide a binary gold standard of local and advanced cancer stages. Prostate-specific antigen level was transformed and modeled by bi-normal distributions. In all examples, areas under the receiver operating characteristic curves were computed.
Conclusion-All clinical examples yielded fair to excellent accuracy. The validation metric, the area under the receiver operating characteristic curve, may be generalized to evaluate the performance of several continuous classifiers related to imaging.
Results-The
Keywords: Brain segmentation; magnetic resonance; prostate specific antigen (PSA); genitourinary system; computed tomography; receiver operating characteristic (ROC) analysis
The accuracy of diagnostic tests and imaging segmentation is important in clinical practice because it has a direct impact on therapeutic planning. Recently, continuous classification tools ... In contrast, traditional diagnostic tests were often based on an ordinal rating scale. For example, a five-point scale might be adopted for observer performance evaluations, where 1 = definitely normal, 2 = probably normal, 3 = possibly abnormal, 4 = probably abnormal, and 5 = definitely abnormal. A discrete subjective rating method was used in a multi-modal (magnetic resonance [MR], computed tomography [CT], and ultrasound) comparative ovarian cancer technology assessment study (5,6), one of a series of prospective multicenter Radiologic Diagnostic Oncology Group clinical trials funded by the National Institutes of Health in the 1990s. The advantages of a continuous diagnostic scale over an ordinal one are that it preserves detailed information, is more natural given advances in measurement tools and computing methods, and enables more objective interpretation. Ordinal rating data will not be the focus of this article. Instead, we will evaluate the performance of continuous classifiers only.
To conduct a...