2021
DOI: 10.1021/acs.jcim.0c01489
XGraphBoost: Extracting Graph Neural Network-Based Features for a Better Prediction of Molecular Properties

Abstract: Determining the properties of chemical molecules is essential for screening candidates similar to a specific drug. These candidate molecules are further evaluated for their target binding affinities, side effects, target missing probabilities, etc. Conventional machine learning algorithms have demonstrated satisfactory prediction accuracy for molecular properties. However, a molecule cannot be loaded directly into a machine learning model; a set of engineered features first needs to be designed and calculated from the molecule. …
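The workflow the abstract describes, engineering a fixed-length feature vector from a molecule before a conventional model can consume it, can be sketched as follows. This is a minimal illustration assuming Morgan fingerprints computed with RDKit and an XGBoost classifier on toy data; the featurization choice, hyperparameters, and labels are assumptions for illustration, not the paper's exact pipeline.

# Minimal sketch: engineered features (Morgan fingerprints via RDKit) feed a
# conventional model (XGBoost). Featurization choice, parameters, and toy data
# are illustrative assumptions, not the paper's exact setup.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from xgboost import XGBClassifier

def featurize(smiles: str, radius: int = 2, n_bits: int = 2048) -> np.ndarray:
    """Convert a SMILES string into a fixed-length engineered feature vector."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    return np.array(list(fp), dtype=np.float32)

# Hypothetical toy data: SMILES strings with binary property labels.
smiles_list = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
labels = [0, 1, 0, 1]

X = np.stack([featurize(s) for s in smiles_list])
model = XGBClassifier(n_estimators=100, max_depth=4)
model.fit(X, labels)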

Cited by 55 publications (43 citation statements)
References 58 publications
“…The training-validation loss ratio could serve as a heuristic for indicating overfitting in some instances, but what constitutes a suitable threshold may differ according to the model type and the dataset. Various machine-learning models, especially intricate architectures such as deep learning, have been found to remain practical even when the ratio between training loss and validation loss is high [47][48][49]. A well-established phenomenon in deep learning, as well as in some classical machine learning, addresses this aspect of the bias-variance tradeoff: the double descent risk curve [50].…”
Section: Model Results and Validation (citation type: mentioning; confidence: 99%)
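The loss-ratio heuristic discussed in this excerpt can be stated in a few lines. This is a minimal sketch; the threshold value is an assumption and, as the excerpt notes, would need to be chosen per model type and dataset.

# Minimal sketch of the training/validation loss-ratio heuristic. The 1.5
# threshold is an assumed placeholder; a suitable value depends on the model
# type and dataset, and a high ratio alone is not conclusive (see the double
# descent literature cited above).
def loss_ratio(train_loss: float, val_loss: float) -> float:
    return val_loss / train_loss

def looks_overfit(train_loss: float, val_loss: float, threshold: float = 1.5) -> bool:
    return loss_ratio(train_loss, val_loss) > threshold

print(looks_overfit(train_loss=0.12, val_loss=0.31))  # True: ratio ~2.6 exceeds the assumed threshold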
“…Extensive literature is available benchmarking algorithmic models on the aforementioned databases for VS-related tasks such as molecular property prediction, fingerprint generation, or the evaluation of structural protein-ligand docking parameters. These include the following: Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), Random Forest (RF), and Deep Neural Networks (DNN) ( Jiang et al, 2021 ) as representatives of descriptor-based models, and many graph-based algorithm variants, such as MPNN—Message Passing Neural Networks ( Yang et al, 2019 ; Deng et al, 2021 ; Jiang et al, 2021 ); networks implementing spatial graph convolution, like GCN—Graph Convolutional Network ( Li et al, 2017 ; Xiong et al, 2020 ; Menke and Koch, 2020 ; Deng et al, 2021 ; Hsieh et al, 2020 ) or GC—Graph Convolution ( Wu et al, 2018 ); spectral graph convolution, such as AGCN—Adaptive Graph Convolution ( Li et al, 2018 ); and graph-based networks with attention mechanisms over neighboring nodes or edges, i.e., AFP—Attentive Fingerprint ( Xiong et al, 2020 ; Jiang et al, 2021 ), PAGTN—Path-Augmented Graph Transformer Network ( Chen et al, 2019 ), and EAGCN—Edge Attention GCN ( Shang et al, 2018 ), among others ( Wu et al, 2018 ; Lim et al, 2019 ).…”
Section: Introduction (citation type: mentioning; confidence: 99%)
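Several of the graph-based variants listed above build on spatial graph convolution. As a point of reference, a single GCN-style propagation step can be sketched in plain NumPy as below; this is an illustrative formulation, not the exact implementation of any of the cited networks.

# One GCN-style propagation step, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W),
# written in plain NumPy for illustration only.
import numpy as np

def gcn_layer(adj: np.ndarray, feats: np.ndarray, weight: np.ndarray) -> np.ndarray:
    a_hat = adj + np.eye(adj.shape[0])               # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt         # symmetric normalization
    return np.maximum(a_norm @ feats @ weight, 0.0)  # ReLU activation

# Toy 3-node path graph, 4-dimensional node features, 8-unit output.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
print(gcn_layer(adj, np.random.rand(3, 4), np.random.rand(4, 8)).shape)  # (3, 8)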
“…The Directed Message Passing Neural Network (D-MPNN), a graph-based model combined with Extreme Gradient Boosting (XGBoost) as a descriptor-based output layer, achieved the best results on several of the presented datasets ( Deng et al, 2021 ). Furthermore, concatenating molecular fingerprint vectors generated by conventional models with descriptors generated by graph models has been reported to provide the best prediction results when the combined vectors are submitted to the final parameter-generation layers ( Wang et al, 2019 ).…”
Section: Introduction (citation type: mentioning; confidence: 99%)
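The hybrid scheme this excerpt describes, graph-derived descriptors combined with conventional fingerprints and passed to a boosted output layer, can be sketched as follows. The embedding and fingerprint arrays here are random placeholders standing in for precomputed features; the shapes, hyperparameters, and regression task are assumptions for illustration.

# Sketch of the hybrid approach: concatenate graph-model embeddings with
# conventional fingerprint vectors, then fit XGBoost as the output layer.
# All arrays below are random placeholders for precomputed features.
import numpy as np
from xgboost import XGBRegressor

n_molecules = 100
gnn_embeddings = np.random.rand(n_molecules, 300)   # e.g. graph-model readout vectors (assumed precomputed)
fingerprints = np.random.randint(0, 2, (n_molecules, 2048)).astype(np.float32)
targets = np.random.rand(n_molecules)               # hypothetical property values

features = np.hstack([gnn_embeddings, fingerprints])  # learned + engineered features
booster = XGBRegressor(n_estimators=200, max_depth=6)
booster.fit(features, targets)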
“…Another aspect of their attractiveness for molecular property prediction is the ease with which a molecule can be described as an undirected graph, transforming atoms into nodes and bonds into edges while encoding both atom and bond properties. GNNs have proven to be useful and powerful tools in the machine-learning molecular modeling toolbox [19,20].…”
Section: Introduction (citation type: mentioning; confidence: 99%)
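The atoms-to-nodes, bonds-to-edges encoding this excerpt refers to can be sketched with RDKit. The particular atom and bond features chosen here (atomic number, degree, aromaticity, bond order) are illustrative assumptions rather than the descriptor set of any cited model.

# Sketch: describe a molecule as an undirected graph with per-node atom
# features and per-edge bond features. Feature choices are illustrative.
from rdkit import Chem

def mol_to_graph(smiles: str):
    mol = Chem.MolFromSmiles(smiles)
    node_features = [
        (atom.GetAtomicNum(), atom.GetDegree(), int(atom.GetIsAromatic()))
        for atom in mol.GetAtoms()
    ]
    edges, edge_features = [], []
    for bond in mol.GetBonds():
        edges.append((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))  # undirected edge, stored once
        edge_features.append(bond.GetBondTypeAsDouble())              # bond order as a simple edge feature
    return node_features, edges, edge_features

nodes, edges, bond_orders = mol_to_graph("c1ccccc1O")  # phenol: 7 atoms, 7 bonds
print(len(nodes), len(edges))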