2021
DOI: 10.1038/s41592-021-01205-4

DOME: recommendations for supervised machine learning validation in biology

Abstract: Modern biology frequently relies on machine learning to provide predictions and improve decision processes. There have been recent calls for more scrutiny on machine learning performance and possible limitations. Here we present a set of community-wide recommendations aiming to help establish standards of machine learning validation in biology. Adopting a structured methods description for machine learning based on DOME (data, optimization, model, evaluation) will allow both reviewers and readers to better und…

Cited by 139 publications (147 citation statements)
References 45 publications
“…It is straightforward to develop ML algorithms incorrectly, resulting in models that do not perform accurately on new data. A complete set of community-wide recommendations that aim to establish requirements for ML validation in biology was published recently by Walsh et al. [59]. The recommendations are split into four core areas of ML: data, optimization, the model, and evaluation of the final model. Topics relevant to QbD ML modeling include splitting datasets correctly, avoiding overfitting when optimizing, and how to evaluate the performance of the ML algorithm using appropriate metrics.…”
Section: The Art Of Developing a Successful ML Model To Advance QbD (mentioning)
confidence: 99%
“…Machine learning has been recently combined with GSMMs, expanding the potential of both approaches [140]. An inherent difficulty of machine learning is that this approach requires relatively large amounts of data to parameterize the algorithm and estimate its performance [141], thus reducing the applicability of machine learning in this field.…”
Section: Other Approaches For Community Modeling (mentioning)
confidence: 99%
“…For the processing of HRI, the point-based PSRI index was adapted as described in [27] and transformed to the Hyperspectral image PSRI (HPSRI) as follows:…”
Section: Reflectance Indices Calculation (mentioning)
confidence: 99%
“…Extracting information from HRIs that is not just pertinent to the plant phenotyping but also readily interpretable is of a special concern. Currently, machine learning-based methods of advanced image analysis as well as other mathematical tools are becoming widespread [25][26][27][28][29][30]. Although quite efficient in many cases, these methods normally do not take physiologically relevant information as an input.…”
Section: Introduction (mentioning)
confidence: 99%