2016
DOI: 10.1016/j.chemolab.2016.01.002

Extending proteochemometric modeling for unraveling the sorption behavior of compound–soil interaction

Cited by 5 publications
(7 citation statements)
references
References 42 publications
“…This study exploited six popular and convenient ML algorithms, namely k-NN, rpart, glm, RF, XGB, and SVM, for discriminating AVPs from Non-AVPs. Previously, these ML algorithms have been extensively utilized in various domains [84,85,91,92,93,94,95,96,97,98,99]. In this study, the six ML algorithms were implemented using the caret package in the R software [100].…”
Section: Methods (mentioning)
confidence: 99%
“…As noticed in Table 7, the top seven important PCPs are QIAN880137, AURR980102, ROBB760113, PRAM820101, GRAR740101, PALJ810111, and Figure 3, the superior performance of our proposed model iQSP over 10-fold CV and independent validation test might mainly be due to the following reasons: (i) Performing with multiple random sampling procedure to protect against the risk of having good predictive result by chance [39,40,41,42,43,49,50]; (ii) using an efficient feature selection method (GA-SAR) to identify m informative features from 531 PCPs. Using eighteen informative PCPs could provide faster and more cost-effective models, while model developers could gain an insight into the underlying prediction processes [58,62,63,64]; (iii) selecting a powerful method for QSP prediction. Although, iQSP displayed a superior performance over the existing methods assessed by the rigorous cross-validation methods, there is still room for further improvements, including increasing the size of QSPs by gathering peptide sequences from various data sources, utilizing an interpretable learning algorithm, such as scoring card method [44,53], improving the interpretation of important features responsible for the biological activity [50,64] and exploring different ML algorithms, such as extreme gradient boosting [65] or deep learning [66].…”
Section: Feature Contribution Analysis (mentioning)
confidence: 99%
“…In order to establish a robust and interpretable sequence-based tool for modeling the investigated PVPs, we followed the six prime keys as mentioned in a series of recent publications [17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33] and summarized in several comprehensive review papers [29,34,35,36]: (i) establishing a reliable dataset that contains experimentally validated sequences for training and validating the model; (ii) representing protein sequences with interpretable features; (iii) developing interpretable learning algorithms so as to allow the interpretation of important features responsible for the biological activity; (iv) assessing the prediction model using standard cross-validation tests; (v) constructing a user-friendly web-server for obtaining the prediction without the need to understand complex mathematical and statistical details; and (vi) analyzing and characterizing the important features derived from the developed model to provide a better understanding of the biophysical and biochemical properties of proteins. Figure 2 shows the workflow of PVPred-SCM, which works in predicting and analyzing PVPs.…”
Section: Methods (mentioning)
confidence: 99%
“…Although, experiment #9 was not in the three-top ranked experiments over 10-fold CV, it provides a promising result in terms of ACC, MCC, and auROC with 92.52%, 0.846, and 0.948, respectively, which was not significantly different from the result of experiment #3 (95.11%, 0.894, and 0.966). Moreover, due to the fact that the independent test was the most rigorous cross-validation method to demonstrate the robustness and reliability of the model in real-world applications [17,18,19,20,28,29,31,33,39,40,41], it could be noted that experiment #9 provided an important contribution to PVP prediction. For convenience, the best PVP predictor based on the SCM method in conjunction with the propensity scores of dipeptides from experiment #9 would be referred to as PVPred-SCM.…”
Section: Prediction Performance (mentioning)
confidence: 99%