2023
DOI: 10.1111/jdv.18814

Determining the clinical applicability of machine learning models through assessment of reporting across skin phototypes and rarer skin cancer types: A systematic review

Abstract: Machine learning (ML) models for skin cancer recognition may have variable performance across different skin phototypes and skin cancer types. Overall performance metrics alone are insufficient to detect poor subgroup performance. We aimed (1) to assess whether studies of ML models reported results separately for different skin phototypes and rarer skin cancers, and (2) to graphically represent the skin cancer training datasets used by current ML models. In this systematic review, we searched PubMed, Embase an…

Cited by 6 publications (2 citation statements)
References 56 publications
“…Groh et al further exemplified that models are most accurate for skin types they were trained on, although some studies report no differences in model performance by skin tone 15 . The effect of skin tone on model performance is largely underreported—only 10% (7/70) of deep learning algorithms include information about skin tone 16 and few report performance by skin tone categories 17 . Further, there is no gold standard for skin tone labeling, and commonly used practices like estimated Fitzpatrick Skin Type are limited by uncertainty 18 and lack of inclusiveness.…”
Section: Introduction (mentioning)
confidence: 99%
“…Adamson and Smith have a word of advice about usage of ML methods in diagnosis of skin diseases that inclusivity must be kept in mind for classification results to be accurate [3]. Steele et al searched PubMed, Embase, and CENTRAL, and found that the performance of ML methods was variable, and overall accuracy measure was not a good measure for sub-group accuracy [4].…”
Section: Introduction (mentioning)
confidence: 99%