2022
DOI: 10.1038/s41379-022-01147-y

Recommendations on compiling test datasets for evaluating artificial intelligence solutions in pathology

Abstract: Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recommendations are missing. A committee of various stakeholders, including commercial AI developers, pathologists, and …

Cited by 35 publications (42 citation statements)
References 136 publications (168 reference statements)

“…Like other ensemble methods ( 6 , 31 ), our NoisyEnsembles also have the potential to increase the overall performance of the deep learning application ( Figures 4A,B vs. Supplementary Figure S11 ). For testing, it is recommended to sample the possible image space as well as possible ( 11 ). However, during the application of CNNs in a routine setting, we recommend paying attention to the best possible quality, as the performance was directly linked to the tissue quality ( Figure 2 , Supplementary Figure S1 ).…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…It was not until 2021 that the first AI algorithm in computational pathology was approved by the Food and Drug Administration (FDA) ( 9 ). A major problem in overcoming this “translational valley of death” is to ensure the reproducibility and generalizability of the developed products ( 10 ) by defining appropriate test datasets ( 11 ). All of these applications and algorithms depend, of course, on the digitalized histomorphological whole slide image (WSI) used in training and during application.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The data are frequently labelled, either by labelling a WSI with the diagnosis or smaller areas within the WSI, giving information about the region of interest 3 . The algorithm's performance is compared to a defined reference (usually a pathologist's diagnosis) in terms of performance 15 …”
Section: Why Do Errors Arise in AI Diagnostic Tools? (citation type: mentioning)
confidence: 99%
“…This creates bias and an increased risk of errors in these subgroups 9,31–33 . This can also cause the problem of hidden stratification; when an algorithm appears to perform well across the whole population, but is actually performing poorly in subsets not identified during training or testing, and this niche of poor performance goes undetected 11,12,15 . For example, an algorithm may generally be effective at lung cancer detection, but consistently miss a rare subtype 12 .…”
Section: Why Do Errors Arise in AI Diagnostic Tools? (citation type: mentioning)
confidence: 99%
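
The hidden stratification described in the statement above can be illustrated with a small sketch. It is not taken from any of the cited papers: the function name `sensitivity_by_subgroup`, the subgroup labels, and the case counts are all hypothetical, synthetic values chosen only to show how a good overall figure can mask poor performance on a rare subtype.

```python
from collections import defaultdict


def sensitivity_by_subgroup(records):
    """Compute sensitivity overall and per subgroup.

    records: iterable of (subgroup, true_label, predicted_label),
    with labels encoded as 0 (negative) or 1 (positive).
    Returns (overall_sensitivity, {subgroup: sensitivity}).
    """
    true_pos = defaultdict(int)   # correctly detected positives per subgroup
    positives = defaultdict(int)  # all positives per subgroup
    for subgroup, truth, pred in records:
        if truth == 1:
            positives[subgroup] += 1
            if pred == 1:
                true_pos[subgroup] += 1
    overall = sum(true_pos.values()) / sum(positives.values())
    per_group = {g: true_pos[g] / positives[g] for g in positives}
    return overall, per_group


# Synthetic, hypothetical data: 95 positive cases of a common subtype
# (93 detected) and 5 positive cases of a rare subtype (1 detected).
records = (
    [("common", 1, 1)] * 93
    + [("common", 1, 0)] * 2
    + [("rare", 1, 1)] * 1
    + [("rare", 1, 0)] * 4
)

overall, per_group = sensitivity_by_subgroup(records)
print(f"overall sensitivity: {overall:.2f}")
for group, value in per_group.items():
    print(f"{group} subtype sensitivity: {value:.2f}")
```

In this made-up example the overall sensitivity looks high while the rare subtype is mostly missed, which is why reporting the metric per subgroup, and not only in aggregate, may help surface such gaps in a test dataset.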
“…Clinical validation is necessary for any SaMD ( Fraggetta et al, 2021 ) as determined by the manufacturer before (pre-market) and after (post-market) distribution to establish a relationship between verification and validation results of an algorithm and the clinical conditions of interest ( Carolan et al, 2022 ). Prior to routine use, it is important to evaluate solutions that automatically extract information from digital histology images, and their predictive performance ( Homeyer et al, 2022 ). Various other technical and business challenges must similarly be overcome to commercialize digital pathology solutions ( Kearney et al, 2021 ; Lujan et al, 2021 ).…”
Section: Regulatory Standards (citation type: mentioning)
confidence: 99%