2017
DOI: 10.1021/acs.jcim.6b00753
Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets

Abstract: The ability to interpret the predictions made by quantitative structure–activity relationships (QSARs) offers a number of advantages. Whilst QSARs built using non-linear modelling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modelling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting non-linear QSAR models in general and Random Fores…
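The comparison the abstract describes can be illustrated with a small scikit-learn sketch. Everything below is illustrative only: the synthetic descriptor matrix, model settings, and metric choice are assumptions, not the paper's benchmark data sets or pipeline.

```python
# A minimal sketch (not the paper's actual pipeline) contrasting a Random
# Forest with a linear model on QSAR-style classification data. A synthetic
# descriptor matrix stands in for real molecular fingerprints.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a fingerprint/descriptor matrix.
X, y = make_classification(n_samples=500, n_features=50, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
lin = LogisticRegression(max_iter=1000).fit(X_train, y_train)

for name, model in [("Random Forest", rf), ("Linear (logistic)", lin)]:
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: test ROC AUC = {auc:.3f}")

# The interpretability handles differ: the linear model exposes signed
# per-descriptor coefficients, the forest only unsigned importances.
print("Top RF feature importance:", rf.feature_importances_.max())
print("Largest |linear coefficient|:", abs(lin.coef_).max())
```

The last two lines hint at the interpretability contrast the paper examines: linear coefficients carry a sign and magnitude per descriptor, whereas Random Forest importances are unsigned and require additional techniques to attribute direction.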

Cited by 98 publications (64 citation statements)
References 69 publications (239 reference statements)
“…Furthermore, they noticed that their data sets were SBVS compliant and compared advantageously to the biased DUD sets, leading to a potential broader use of their sets. MUV sets were applied to the evaluation of VS tools (Tiikkainen et al, 2009 ; Abdo et al, 2010 ), the training of new QSAR models (Marchese Robinson et al, 2017 ) or molecular graph convolutions (Kearnes et al, 2016 ).…”
Section: Discussion and Recommendations
confidence: 99%
“…While making machine learning models and predictions more accessible is important to demonstrate impact, efforts to increase the interpretability of these models beyond the "black box" are critical 63 . We and others have taken different routes to improve this aspect including tools to highlight contributions of models to test molecules 64 , identifying training compounds in the same neighborhood as test molecules and scores of model applicability or overlap 46,63 .…”
Section: Making Models More Accessible and Interpretable
confidence: 99%
“…Our previously published studies reported a better performance while using RF for classification tasks (46,47). Furthermore, RF is a commonly used algorithm which reported higher predictive ability compared to other ML-based algorithms (48,49). Thus, RF classifier was selected in our proposed study to risk stratify the patients.…”
Section: Supplementary
confidence: 99%