2021
DOI: 10.1371/journal.pone.0256152

Radiomics machine learning study with a small sample size: Single random training-test set split may lead to unreliable results

Abstract: This study aims to determine how randomly splitting a dataset into training and test sets affects the estimated performance of a machine learning model and its gap from the test performance under different conditions, using real-world brain tumor radiomics data. We conducted two classification tasks of different difficulty levels with magnetic resonance imaging (MRI) radiomics features: (1) “simple” task, glioblastomas [n = 109] vs. brain metastases [n = 58] and (2) “difficult” task, low- [n = 163] vs. high-gr…
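The paper's central point, that a single random split can make the same model look very different, is easy to reproduce. The sketch below is illustrative only, not the authors' code: synthetic features stand in for MRI radiomics, and the sample size loosely mirrors the “simple” task (109 + 58 cases). It repeats the split many times and reports the spread of the estimated test AUC.

```python
# Minimal sketch (not the authors' code): repeat random train-test splits
# of the SAME dataset and observe the spread of the estimated test AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for radiomics features; 167 cases mirrors the
# "simple" task size (109 + 58), but the data are otherwise arbitrary.
X, y = make_classification(n_samples=167, n_features=50, n_informative=10,
                           weights=[0.65, 0.35], random_state=0)

aucs = []
for seed in range(100):  # 100 different random splits of the same data
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    aucs.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

print(f"Test AUC over 100 splits: mean={np.mean(aucs):.3f}, "
      f"min={np.min(aucs):.3f}, max={np.max(aucs):.3f}")
```

With a sample this small, the min-max range of AUC across splits is typically wide, which is exactly the unreliability the title warns about.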

Cited by 40 publications (29 citation statements)
References 26 publications
“…The authors have not utilized a validation set and as such the proposed ML models are likely overfitted to the test set and report a misleading higher area under the curve performance. 5…”
Section: Sir (mentioning)
confidence: 99%
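The criticism above turns on the distinction between a validation set, queried repeatedly during model selection, and a test set, touched exactly once for the final report. A minimal sketch of that protocol follows; the synthetic data, split proportions, and hyperparameter grid are illustrative assumptions, not anything taken from the criticized study.

```python
# Minimal sketch (illustrative assumptions): tune on a validation split,
# then report performance ONCE on an untouched test set.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=50, n_informative=10,
                           random_state=0)

# 60/20/20 train / validation / test split.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, stratify=y_tmp, random_state=0)

# Model selection uses ONLY the validation set...
best_auc, best_C = -1.0, None
for C in (0.01, 0.1, 1.0, 10.0):
    m = LogisticRegression(C=C, max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_val, m.predict_proba(X_val)[:, 1])
    if auc > best_auc:
        best_auc, best_C = auc, C

# ...and the test set is consulted exactly once, for the final report.
final = LogisticRegression(C=best_C, max_iter=1000).fit(X_tr, y_tr)
print("test AUC:", roc_auc_score(y_test, final.predict_proba(X_test)[:, 1]))
```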
“…Recent work has shown that using ML methods after a single random training-validation dataset split may yield unreliable results. 17 Thus, to assess the method's out-of-sample performance and to minimize the risk of overfitting, cross-validation offers a simple way to test the accuracy of the method for new (independent) data. For instance, k-fold cross-validation procedures can provide unbiased estimates of the true generalization performance for feature selection, model comparison, or classification accuracy.…”
Section: Consider the Use of Independent Datasets or Cross-validation (mentioning)
confidence: 99%
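For concreteness, here is a minimal stratified k-fold cross-validation sketch in scikit-learn, which averages performance over k held-out folds instead of relying on one random split. The synthetic data and logistic-regression model are illustrative assumptions, not the cited methodology.

```python
# Minimal sketch (illustrative only): stratified 5-fold cross-validation
# pools the estimate over five held-out folds rather than one split.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=167, n_features=50, n_informative=10,
                           random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="roc_auc")
print(f"5-fold CV AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```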
“…[7] In addition, a study demonstrated that even with cross-validation, studies with a small sample size are still unreliable. [38] Therefore, multi-center research is necessary to obtain a sufficient sample size and avoid unstable and suboptimal results.…”
Section: Challenges and Potential Solutions (mentioning)
confidence: 99%
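To see why even cross-validated estimates wobble at small n, one can rerun k-fold cross-validation with different fold assignments and compare the spread at a small versus a larger sample size. The sketch below uses synthetic data and arbitrary illustrative sizes; it is not drawn from the cited studies.

```python
# Minimal sketch (illustrative): repeat 5-fold CV with different fold
# assignments; with few samples the CV estimate itself still fluctuates.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

for n in (60, 600):  # small single-center vs. larger multi-center scale
    X, y = make_classification(n_samples=n, n_features=50, n_informative=10,
                               random_state=0)
    means = [cross_val_score(LogisticRegression(max_iter=1000), X, y,
                             cv=StratifiedKFold(5, shuffle=True,
                                                random_state=s),
                             scoring="roc_auc").mean()
             for s in range(20)]
    print(f"n={n}: CV AUC estimate ranges "
          f"{min(means):.3f}-{max(means):.3f} over 20 fold assignments")
```

At the smaller sample size the range of cross-validated AUC estimates is visibly wider, which motivates the multi-center recommendation in the statement above.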