Diagnostic Mammography: Identifying Minimally Acceptable Interpretive Performance Criteria

Carney, Patricia A.; Parikh, Jay R.; Sickles, Edward A.; Feig, Stephen A.; Monsees, Barbara; Bassett, Lawrence W.; Smith, Robert A.; Rosenberg, Robert D.; Ichikawa, Laura; Wallace, James A.; Tran, Khai; Miglioretti, Diana L.

doi:10.1148/radiol.12121216

Cited by 38 publications (31 citation statements)

References 17 publications (22 reference statements)


“…Despite variability in performance measures, on average, radiologists who worked up fewer recalled mammograms had consistently lower sensitivity, CDRs, and FPRs at any given total volume. Current U.S. Food and Drug Administration regulations require U.S. physicians to have interpreted 960 …”

[Table fragment interleaved with the quotation by extraction; column headers not recovered]
(row label and first count not recovered): … [11], 43 [12], 47 [10], 129 [11], 32 [18], 86 [17], 101 [8]
Progesterone receptor status*
Negative: 432 (25), 83 (27), 81 (20), 268 (26), 38 (27), 109 (26), 285 (24)
Positive: 1297 (75), 220 (73), 332 (80), 745 (74), 104 (73), 306 (74), 887 (76)
Unknown: 232 [12], 45 [13], 47 [10], 140 [12], 36 [20], 86 [17], 110 [9]
Note: Numbers in parentheses are column percentages. Percentages in brackets (for unknown variables) were not included in total column percentages.

Section: Discussion (mentioning)

confidence: 99%

“…However, this is not always the case. It is possible to improve both measures to the point where improvements in one measure reach a threshold beyond which the other is diminished (20). Thus, increases in FPRs associated with the improvement in sensitivity and CDRs potentially could …”

[Figure 1 (continued), d-f: Graphs show multivariable adjusted screening performance measures according to work-up of any recalled screening mammograms.]

Section: Discussion (mentioning)

confidence: 99%

“…be reduced with use of other strategies to improve interpretative performance, such as interventions for radiologists to improve interpretative performance (21-23), application of performance thresholds (20), providing additional audit feedback by reviewing the lesion that was sampled for biopsy, or providing additional feedback related to improving specificity (24-27). Some women who undergo screening mammography may regard the small increase in the FPR as an acceptable trade-off for improved sensitivity (28-31).…”

Section: Discussion (mentioning)

confidence: 99%


Effect of Radiologists’ Diagnostic Work-up Volume on Interpretive Performance

Buist, Anderson, Smith, et al. 2014. Radiology.

Self Cite


Purpose: To examine radiologists' screening performance in relation to the number of diagnostic work-ups performed after abnormal findings are discovered at screening mammography by the same radiologist or by different radiologists. Materials and Methods: In an institutional review board-approved HIPAA-compliant study, the authors linked 651,671 screening mammograms interpreted from 2002 to 2006 by 96 radiologists in the Breast Cancer Surveillance Consortium to cancer registries (standard of reference) to evaluate the performance of screening mammography (sensitivity, false-positive rate [FPR], and cancer detection rate [CDR]). Logistic regression was used to assess the association between the volume of recalled screening mammograms ("own" mammograms, where the radiologist who interpreted the diagnostic image was the same radiologist who had interpreted the screening image, and "any" mammograms, where the radiologist who interpreted the diagnostic image may or may not have been the radiologist who interpreted the screening image) and screening performance, and whether the association between total annual volume and performance differed according to the volume of diagnostic work-up. Results: Annually, 38% of radiologists performed the diagnostic work-up for 25 or fewer of their own recalled screening mammograms, 24% performed the work-up for 26-50, and 39% performed the work-up for more than 50. For the work-up of recalled screening mammograms from any radiologist, 24% of radiologists performed the work-up for 0-50 mammograms, 32% performed the work-up for 51-125, and 44% performed the work-up for more than 125. With increasing numbers of radiologist work-ups for their own recalled mammograms, the sensitivity (P = .039), FPR (P = .004), and CDR (P < .001) of screening mammography increased, yielding a stepped increase in women recalled per cancer detected from 17.4 for 25 or fewer mammograms to 24.6 for more than 50 mammograms. Increases in work-ups for any radiologist yielded significant increases in FPR (P = .011) and CDR (P = .001) and a nonsignificant increase in sensitivity (P = .15). Radiologists with a lower annual volume of any work-ups had consistently lower FPR, sensitivity, and CDR at all annual interpretive volumes. Conclusion: These findings support the hypothesis that radiologists may improve their screening performance by performing the diagnostic work-up for their own recalled screening mammograms and directly receiving feedback afforded by means of the outcomes associated with their initial decision to recall. Arranging for radiologists to work up a minimum number of their own recalled cases could improve screening performance but would need systems to facilitate this workflow. © RSNA, 2014
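The performance measures named in this abstract (sensitivity, FPR, CDR, and women recalled per cancer detected) can be illustrated with a minimal Python sketch. It assumes the conventional definitions, with CDR expressed per 1000 screening examinations; the abstract itself does not spell out the formulas, and the `ScreeningCounts` class, the function name, and the example counts are hypothetical, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Hypothetical per-radiologist counts after linkage to a cancer registry."""
    true_pos: int   # recalled, cancer confirmed
    false_pos: int  # recalled, no cancer
    false_neg: int  # not recalled, cancer found during follow-up
    true_neg: int   # not recalled, no cancer


def screening_measures(c: ScreeningCounts) -> dict:
    """Compute conventional screening measures (assumed definitions)."""
    total = c.true_pos + c.false_pos + c.false_neg + c.true_neg
    recalled = c.true_pos + c.false_pos
    return {
        # Share of cancers that were recalled at screening.
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        # Share of cancer-free examinations that were recalled.
        "fpr": c.false_pos / (c.false_pos + c.true_neg),
        # Cancers detected per 1000 screening examinations.
        "cdr_per_1000": 1000 * c.true_pos / total,
        # Women recalled for each cancer detected.
        "recalls_per_cancer": recalled / c.true_pos,
    }


# Illustrative counts only (10,000 hypothetical examinations).
example = screening_measures(
    ScreeningCounts(true_pos=40, false_pos=960, false_neg=10, true_neg=8990)
)
print(example)
```

With these made-up counts, raising the number of recalls increases both the cancers detected and the false positives, which is the trade-off the abstract describes.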


“…We previously established criteria for acceptable interpretive performance of screening mammography, evaluating multiple performance measures separately: specifically sensitivity, specificity, cancer detection rate (CDR), recall rate, and positive predictive value of recall (PPV1) [4, 5]. We suggested that a radiologist not meeting the criteria for at least one measure should be advised to examine their data in the context of their specific clinical practice and consider additional, focused continuing medical education to improve performance if appropriate [5, 6]. However, considering these performance measures together is more clinically meaningful, because they are inter-related, resulting in trade-offs between recalling patients for further work-up and detecting cancer [7].…”

Section: Introduction (mentioning)

confidence: 99%

“…Expert panels have been used for decades to establish benchmarks and guidelines for breast imaging [4–6, 8]. …”

Section: Introduction (mentioning)

confidence: 99%

Criteria for Identifying Radiologists With Acceptable Screening Mammography Interpretive Performance on Basis of Multiple Performance Measures

Miglioretti, Ichikawa, Smith, et al. 2015. American Journal of Roentgenology.

Self Cite


Objective: Using a combination of performance measures, we updated previously proposed criteria for identifying physicians whose performance interpreting screening mammograms may indicate suboptimal interpretation skills. Materials and Methods: In this Institutional Review Board-approved, HIPAA-compliant study, six expert breast imagers used a method based on the Angoff approach to update criteria for acceptable mammography performance on the basis of combined performance measures: (Group 1) sensitivity and specificity, for facilities with complete capture of false-negative cancers; and (Group 2) cancer detection rate (CDR), recall rate, and positive predictive value of a recall (PPV1), for facilities that cannot capture false negatives, but have reliable cancer follow-up information for positive mammograms. Decisions were informed by normative data from the Breast Cancer Surveillance Consortium (BCSC). Results: Updated, combined ranges for acceptable sensitivity and specificity of screening mammography are: (1) sensitivity ≥80% and specificity ≥85% or (2) sensitivity 75–79% and specificity 88–97%. Updated ranges for CDR, recall rate, and PPV1 are: (1) CDR ≥6/1000, recall rate 3–20%, and any PPV1; (2) CDR 4–6/1000, recall rate 3–15%, and PPV1 ≥3%; or (3) CDR 2.5–4/1000, recall rate 5–12%, and PPV1 3–8%. Using the original criteria, 51% of BCSC radiologists had acceptable sensitivity and specificity; 40% had acceptable CDR, recall rate, and PPV1. Using the combined criteria, 69% had acceptable sensitivity and specificity and 62% had acceptable CDR, recall rate, and PPV1. Conclusion: The combined criteria improve previous criteria by considering the inter-relationships of multiple performance measures and broaden the acceptable performance ranges compared to previous criteria based on individual measures.
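The combined ranges in this abstract read naturally as a pair of rule checks, sketched below in Python. The function names are hypothetical, and the handling of values that fall exactly on a shared boundary (for example, a CDR of exactly 6/1000, or a sensitivity between 79% and 80%) is an assumption, since the abstract does not say how ties or fractional values are assigned.

```python
def acceptable_sens_spec(sensitivity_pct: float, specificity_pct: float) -> bool:
    """Group 1 criteria: sensitivity and specificity, both in percent."""
    # Range (1): sensitivity >= 80% and specificity >= 85%.
    if sensitivity_pct >= 80 and specificity_pct >= 85:
        return True
    # Range (2): sensitivity 75-79% and specificity 88-97%.
    # Assumption: "75-79%" is treated as 75 <= sensitivity < 80.
    if 75 <= sensitivity_pct < 80 and 88 <= specificity_pct <= 97:
        return True
    return False


def acceptable_cdr_recall_ppv1(
    cdr_per_1000: float, recall_pct: float, ppv1_pct: float
) -> bool:
    """Group 2 criteria: CDR per 1000, recall rate and PPV1 in percent."""
    # Range (1): CDR >= 6/1000, recall rate 3-20%, any PPV1.
    if cdr_per_1000 >= 6 and 3 <= recall_pct <= 20:
        return True
    # Range (2): CDR 4-6/1000, recall rate 3-15%, PPV1 >= 3%.
    # Assumption: a CDR of exactly 6 belongs to range (1).
    if 4 <= cdr_per_1000 < 6 and 3 <= recall_pct <= 15 and ppv1_pct >= 3:
        return True
    # Range (3): CDR 2.5-4/1000, recall rate 5-12%, PPV1 3-8%.
    # Assumption: a CDR of exactly 4 belongs to range (2).
    if 2.5 <= cdr_per_1000 < 4 and 5 <= recall_pct <= 12 and 3 <= ppv1_pct <= 8:
        return True
    return False


# Illustrative values only, not BCSC data.
print(acceptable_sens_spec(77, 90))
print(acceptable_cdr_recall_ppv1(5, 18, 4))
```

The second call shows why the measures have to be read together: a recall rate of 18% is acceptable when CDR is at least 6/1000, but not when CDR is 5/1000.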


Modeled residual current cancer risk after clinical investigation of a positive multicancer early detection test result

Hudnut, Hubbell, Venn, et al. 2023. Cancer.


Background: Positive results of a multi‐cancer early detection (MCED) test require confirmatory diagnostic workup. Here, residual current cancer risk (RR) during the process of diagnostic resolution, including situations where the initial confirmatory test does not provide resolution, was modeled. Methods: A decision‐tree framework was used to model conditional risk in a patient’s journey through confirmatory diagnostic options and outcomes. The diagnostic journey assumed that cancer signal detection (a positive MCED test result) had already led to a transition from screening to diagnosis and began with an initial positive predictive value (PPV) from the positive result. Evaluation of a most probable (top) predicted cancer signal origin (CSO) and then a second–most probable predicted CSO followed. Under the assumption that the top‐ and second‐predicted CSOs were each followed by a targeted confirmatory test, the RR was estimated for each subsequent scenario. Results: For an initial MCED test result with typical performance characteristics modeled (PPV, 40%; top‐predicted CSO accuracy, 90%), after a negative initial confirmatory test (sensitivity, 70%, 90%, or 100%) the RR ranged from 6% to 20%. A second‐predicted CSO (accuracy, 50%), after a negative second confirmatory test, still provided a significant RR (3%–18%) in comparison with the National Institute for Health and Care Excellence–recommended cancer risk threshold warranting investigation in symptomatic individuals (3%). With a 40% PPV for an MCED test and 90% specificity for a confirmatory test, the risk of incidental findings after one or two confirmatory tests was 6% and 12%, respectively. Conclusions: These results may illustrate the impact of a positive MCED test on follow‐up decision‐making.
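The residual-risk figures in this abstract can be approximated with a short Bayes-rule calculation, sketched below in Python. This is a reconstruction under stated assumptions, not the authors' decision-tree model: it assumes the confirmatory test has perfect specificity, that a test can only find a cancer located at the site it targets, and that the 50% accuracy of the second-predicted CSO applies when the top prediction is wrong. The function name and parameters are hypothetical.

```python
def residual_risk(ppv: float, p_site_tested: float, sensitivity: float) -> float:
    """Probability of cancer after negative targeted confirmatory testing.

    ppv:            prior probability of cancer after the positive MCED result
    p_site_tested:  probability the cancer lies at a site that has been tested
    sensitivity:    sensitivity of each confirmatory test (assumed equal)
    Assumes perfect specificity, so every cancer-free patient tests negative.
    """
    # A cancer is missed unless it sits at a tested site AND the test detects it.
    cancer_and_negative = ppv * (1 - p_site_tested * sensitivity)
    no_cancer_and_negative = 1 - ppv
    return cancer_and_negative / (cancer_and_negative + no_cancer_and_negative)


PPV = 0.40
TOP_CSO_ACCURACY = 0.90
# Assumption: the second CSO is correct in 50% of cases where the top one is wrong.
AFTER_SECOND = TOP_CSO_ACCURACY + (1 - TOP_CSO_ACCURACY) * 0.50

for sens in (0.70, 0.90, 1.00):
    first = residual_risk(PPV, TOP_CSO_ACCURACY, sens)
    second = residual_risk(PPV, AFTER_SECOND, sens)
    print(f"sensitivity {sens:.0%}: after 1st test {first:.1%}, after 2nd {second:.1%}")
```

Under these assumptions the results fall close to the reported ranges (roughly 6% to 20% after the first negative test and 3% to 18% after the second), which suggests the simplification captures the main structure of the model, although the published analysis may differ in detail.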
