2022
DOI: 10.21203/rs.3.rs-2116998/v1
Preprint

polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics

Abstract: Polymers are a vital part of everyday life. Their chemical universe is so large that it presents unprecedented opportunities as well as significant challenges to identify suitable application-specific candidates. We present a complete end-to-end machine-driven polymer informatics pipeline that can search this space for suitable candidates at unprecedented speed and accuracy. This pipeline includes a polymer chemical fingerprinting capability called polyBERT (inspired by Natural Language Processing concepts), a…

citations
Cited by 6 publications
(6 citation statements)
references
References 35 publications
“…The immensity of the potential benefits of ML systems for polymer science has resulted in intense development of models for a variety of use cases. These range from general inverse design of materials with given properties [8][9][10][11][12][13] to specific applications including gas separation 14, thermal conductivity 15, mechanical toughness 16, MRI contrast agents 17, cloud point engineering 18, and polymer electrolytes 19. In several instances, the developed ML model was able to offer actionable material designs or predictions, leading to experimental validation of the model itself 12,17,18.…”
mentioning
confidence: 99%
“…However, these methods are limited in their ability to address a broad spectrum of chemical tasks, focusing instead on specific categories or a narrow range of challenges within the field. Subsequently, there have been some efforts to apply small language models in the field of chemistry (Kuenneth & Ramprasad, 2023; Flam-Shepherd et al, 2022; Fabian et al, 2020; Edwards et al, 2021; Liu et al, 2021). Similarly, however, these methods are also only capable of addressing a subset of chemical tasks.…”
Section: Related Work
mentioning
confidence: 99%
“…16 Other helpful resources include the software QSARINS, which focuses on multiple linear regression modeling and includes tools for data preprocessing, validation, outlier detection, and visualization 60 and polyBERT, an end-to-end machine learning pipeline for polymer informatics and optimization. 61 Different model types necessitate different data set sizes. Neural nets are powerful nonlinear models consisting of "neurons" organized and interconnected in complex, versatile architectures.…”
Section: Modeling and Leveraging Screening Outputs: How Can This Libr...
mentioning
confidence: 99%