2019
DOI: 10.1063/1.5078640

Prediction of atomization energy using graph kernel and active learning

Abstract: Data-driven prediction of molecular properties presents unique challenges to the design of machine learning methods concerning data structure/dimensionality, symmetry adaption, and confidence management. In this paper, we present a kernel-based pipeline that can learn and predict the atomization energy of molecules with high accuracy. The framework employs Gaussian process regression to perform predictions based on the similarity between molecules, which is computed using the marginalized graph kernel. To appl…
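The core of the pipeline described in the abstract is standard Gaussian process regression over a molecular similarity kernel. A minimal sketch of the GP posterior equations is below; note that the paper's marginalized graph kernel is replaced here by a toy RBF kernel on scalar inputs purely so the example is self-contained, and the data are invented:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Toy stand-in for the molecular similarity kernel (NOT the
    marginalized graph kernel used in the paper)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gpr_predict(x_train, y_train, x_test, noise=1e-6):
    """Standard GP regression posterior mean and variance."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_test, x_train)
    Kss = rbf_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)      # K^{-1} y
    mean = Ks @ alpha                        # posterior mean
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)  # posterior covariance
    return mean, np.diag(cov)

# Toy data standing in for (molecule, atomization energy) pairs.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.sin(x)
mean, var = gpr_predict(x, y, np.array([1.0, 1.5]))
```

The posterior variance returned alongside the mean is what makes GPR natural for the "confidence management" the abstract mentions: it quantifies how far a query molecule is from the training set under the kernel.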

Cited by 43 publications (60 citation statements) · References 30 publications
“…28 More details can be found in references. 15,[25][26][27]…”
Section: Normalized Marginalized Graph Kernel Methods
confidence: 99%
“…25 Using GraphDot, Tang and de Jong introduced an MGK for molecular atomization energy prediction using the QM7 data sets. 26 Xiang et al developed normalized marginalized graph kernels (nMGK) for molecules and constructed accurate prediction models for various thermodynamic and transport properties of pure substances. 27 In this paper, we aim to benchmark the marginalized graph kernels using the direct message passing neural network (D-MPNN) 13 as a strong baseline.…”
Section: Introduction
confidence: 99%
“…Also, the data sets have to be designed with enough diversity to capture meaningful underlying structures from input data. Active learning protocols [43,44] are known to help in assuring this diversity. After the training procedure, and assessment of the predictive power of models by cross-validation [45] or other model validation techniques, their parameters can be stored to perform inference in unknown data.…”
Section: Supervised Learning
confidence: 99%
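The excerpt above mentions assessing predictive power by cross-validation. A generic k-fold sketch is shown below; this is an illustration of the technique, not the cited authors' exact validation setup, and the toy model and data are invented:

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k disjoint validation folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    return [idx[i::k] for i in range(k)]

def cross_val_mse(model_fit_predict, X, y, k=5):
    """Average validation MSE over k folds.

    model_fit_predict(X_train, y_train, X_val) -> predictions on X_val.
    """
    folds = kfold_indices(len(X), k)
    errs = []
    for val in folds:
        train = np.setdiff1d(np.arange(len(X)), val)
        pred = model_fit_predict(X[train], y[train], X[val])
        errs.append(np.mean((pred - y[val]) ** 2))
    return float(np.mean(errs))

# Toy model: always predict the training-set mean.
X = np.arange(20.0)
y = np.ones(20)
mse = cross_val_mse(lambda Xt, yt, Xv: np.full(len(Xv), yt.mean()), X, y)
```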
“…This ultimately allows for building a relatively small but high-quality training set (relative to the design space) to save resources on data acquisition [11,12,13,14]. For molecular property prediction, active learning has been applied to achieve better accuracy than random selection with the same training set size [15,16].…”
Section: Introduction
confidence: 99%