2023
DOI: 10.1073/pnas.2220789120

Machine learning of spectra-property relationship for imperfect and small chemistry data

Abstract: Machine learning (ML) is causing profound changes to chemical research through its powerful statistical and mathematical methodological capabilities. However, the nature of chemistry experiments often sets very high hurdles to collect high-quality data that are deficiency free, contradicting the need of ML to learn from big data. Even worse, the black-box nature of most ML methods requires more abundant data to ensure good transferability. Herein, we combine physics-based spectral descriptors with a symbolic r…

Cited by 10 publications (9 citation statements)
References 33 publications (36 reference statements)
“…Some previous studies have indicated that the performance of a trained model in a single-solvent system including gas or water phase degrades significantly once it is directly transferred to unseen systems or scenarios. Therefore, to make sure that HMNN is omniscient to various solvents, we design a subsequent learning process (tier II) to extract solvent difference knowledge on a few-sample and multisolvent data set. During tier II, for each individual task, we select the top two important experts that have already learned more than 90% of the required task knowledge in total from both gas and water phases and freeze the four selected experts (2 from gas phase and 2 from water phase) for subsequent network assembling (Figure A).…”
Section: Model Framework
Citation type: mentioning
confidence: 99%
“…[9] Additionally, using a regression-based method facilitates learning an approximate formulaic mapping of QSPR, hence addressing the obstacle of imperfect and limited data. [10] However, regarding the learned end-to-end relationships, their nature of low dimension and lack of chemical explanatory factors determine the inevitably limited generalization abilities of DL-based models.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…physically meaningful quantity, the spectra have been used directly as descriptors for predicting catalytic properties. [22,23] However, the implementation of dynamic structural inversion and continuous catalytic generation based on spectroscopic descriptors is still to be developed.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…For mining the implied spectral information, machine learning (ML) has given a glimpse of achieving breakthroughs. ML can be used to extract useful features from spectra, to establish correlations between spectra and catalyst properties or structures, and to optimize catalyst design based on spectral feedback. As a physically meaningful quantity, the spectra have been used directly as descriptors for predicting catalytic properties. However, the implementation of dynamic structural inversion and continuous catalytic generation based on spectroscopic descriptors is still to be developed.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…ML techniques have significantly advanced the establishment of high-throughput screening (HTS) pipelines for discovering a broad spectrum of materials. This progress is evident in the polymer domain, primarily attributed to the evolution of polymer informatics, which has introduced a diverse set of descriptors for numerically characterizing polymers, including fingerprint descriptors, physiochemical descriptors, and graph descriptors. Given the complex nature of polymers, in terms of their structure and composition, continuous efforts are being made to enhance these descriptors to better represent and predict polymer properties. For the purpose of predicting a range of polymer properties, Ramprasad et al. developed a hierarchical fingerprint encompassing three distinct scales: atomic, quantitative structure–property relationship, and morphological levels.…”
Section: Introduction
Citation type: mentioning
confidence: 99%