2022
DOI: 10.1101/2022.12.01.518728
Preprint

MLcps: Machine Learning Cumulative Performance Score for classification problems

Abstract: A performance metric is a tool for measuring the correctness of a trained Machine Learning (ML) model. Numerous performance metrics have been developed for classification problems, making it overwhelming to select the appropriate one, since each captures a particular aspect of the model's behavior. Selecting a metric becomes even harder for problems with imbalanced and/or small datasets. Therefore, in clinical studies, where datasets are frequently imbalanced, and in situations when the prevale…
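The metric-selection problem the abstract describes is easy to demonstrate. Below is a minimal sketch in Python (illustrative only, not code from the paper; the dataset and classifier are hypothetical), scoring a majority-class predictor on an imbalanced toy dataset with scikit-learn: plain accuracy looks strong, while balanced accuracy, F1, and the Matthews correlation coefficient (MCC) all expose a model that has learned nothing.

import numpy as np
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    f1_score,
    matthews_corrcoef,
)

# Hypothetical imbalanced binary labels: 95 negatives, 5 positives.
y_true = np.array([0] * 95 + [1] * 5)

# A trivial "classifier" that always predicts the majority class.
y_pred = np.zeros_like(y_true)

print("accuracy:           ", accuracy_score(y_true, y_pred))           # 0.95
print("balanced accuracy:  ", balanced_accuracy_score(y_true, y_pred))  # 0.5
print("F1 (positive class):", f1_score(y_true, y_pred, zero_division=0))  # 0.0
print("MCC:                ", matthews_corrcoef(y_true, y_pred))        # 0.0

A single cumulative score aggregating several such complementary metrics, which is what the title indicates MLcps provides, would reduce this comparison to one number; the truncated abstract does not specify the aggregation, so none is attempted here.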



Cited by 4 publications (6 citation statements)
References 13 publications
“…All supporting data, which includes images used for training, validation, and testing [22], as well as the trained model weights [23], is available at zenodo.…”
Section: Availability of Supporting Source Code and Requirements (mentioning, confidence: 99%)
“…An archival copy of the code and supporting data is available via the GigaScience repository, GigaDB [35]. DOME-ML (Data, Optimisation, Model, and Evaluation in Machine Learning) annotations, supporting the current study, are available via the supporting data in GigaDB.…”
Section: Data Availability (mentioning, confidence: 99%)
“…TPOT (26,27,28), FEDOT (19), Auto-Sklearn (22), GAMA (29), RECIPE (30), and ML-Plan (31)); (4) accessibility and ease of use; some AutoMLs are designed to be broadly accessible, requiring little to no coding experience to implement and run (e.g. ALIRO (24), MLme (32), MLJAR-supervised (33), H2O-3 (34), STREAMLINE (35), and Auto-WEKA (36)), while others are designed primarily as a code library to facilitate building a customizable pipeline with automated elements (e.g. LAMA (37), FLAML (38), Hyperopt-sklearn (39), TransmogrifAI (40), MLBox (41), Xcessiv (42)); (5) output focus; the aims of AutoML vary, with focus on either a single best optimized model/pipeline (e.g.…”
Section: Introduction (mentioning, confidence: 99%)
“…TPOT (28)), or a direct comparison of model performance across algorithms (e.g. STREAMLINE (35), MLme (32), and PYCARET (23)); (6) inclusion and automation of different possible elements of a complete end-to-end ML pipeline, with algorithm selection and hyperparameter optimization being most common; and (7) transparency in the documentation, i.e. to what degree the available elements, options, and automations are defined and validated.…”
Section: Introduction (mentioning, confidence: 99%)