2009
DOI: 10.1021/ci900203n

Application of Random Forest Approach to QSAR Prediction of Aquatic Toxicity

Abstract: This work is devoted to the application of the random forest (RF) approach to QSAR analysis of the aquatic toxicity of chemical compounds tested on Tetrahymena pyriformis. The simplex representation of molecular structure approach implemented in HiT QSAR Software was used for descriptor generation on a two-dimensional level. Adequate models based on simplex descriptors and the RF statistical approach were obtained on a modeling set of 644 compounds. Model predictivity was validated on two external test sets of 339…

Citations: cited by 146 publications (95 citation statements)
References: 40 publications
“…[40,45] The tree was built in the space of the predictions of 50 single models using Kruskal's algorithm. [46] Then average distance (d_av) and its root-mean-square deviation (s) among all tree edges were calculated.…”
Section: Modeling Techniques and Applicability Domain (mentioning)
confidence: 99%
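
The tree-based step quoted above can be illustrated with a minimal sketch. It assumes Euclidean distance between prediction vectors and a population-style root-mean-square deviation; neither choice is stated in the quoted text. The helper names `mst_edge_lengths` and `edge_statistics` and the toy data are hypothetical, and each point here has only 2 coordinates rather than the 50 single-model predictions mentioned in the statement.

```python
import math
from itertools import combinations


def mst_edge_lengths(points):
    """Return the edge lengths of a minimum spanning tree built with Kruskal's algorithm."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (assumed metric).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    lengths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two separate components, so keep it
            parent[ri] = rj
            lengths.append(d)
    return lengths


def edge_statistics(lengths):
    """Average edge length d_av and its root-mean-square deviation s."""
    d_av = sum(lengths) / len(lengths)
    s = math.sqrt(sum((d - d_av) ** 2 for d in lengths) / len(lengths))
    return d_av, s


# Toy data: 4 compounds, each described by predictions from 2 hypothetical models.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 1.0)]
lengths = mst_edge_lengths(points)
d_av, s = edge_statistics(lengths)
```

How d_av and s are then turned into an applicability-domain threshold is not given in the quoted statement, so the sketch stops at computing the two statistics.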
“…In contrast to similarity-based methods, feature-based methods either select input features (chemical descriptors) or weight them by a score or a model parameter. Feature-based approaches include (generalized) linear models (e.g., Luco and Ferretti, 1997; Sagardia et al., 2013), random forests (e.g., Svetnik et al., 2003; Polishchuk et al., 2009), and scoring schemes based on naive Bayes (Bender et al., 2004; Xia et al., 2004). Choosing informative features for the task at hand is key in feature-based methods and requires deep insights into chemical and biological properties and processes (Verbist et al., 2015), such as interactions between molecules (e.g., ligand-target), reactions and enzymes involved, and metabolic modifications of the molecules.…”
Section: Introduction (mentioning)
confidence: 99%
“…[21] The UNC group used the sphere exclusion rational design method [14] and the random forest (Breiman, 2001) and k Nearest Neighbors (kNN) (Zheng and Tropsha, 2000) QSAR methods. Additionally, the minimal test set dissimilarity [22] and random forest approach [23] were used by the UNC team in collaboration with A.V. Bogatsky Physical Chemical Institute NASU (UNC2).…”
Section: Introduction (mentioning)
confidence: 99%