2022
DOI: 10.48550/arxiv.2204.11953
Preprint

Crystal Transformer: Self-learning neural language model for Generative and Tinkering Design of Materials

Abstract: Self-supervised neural language models have recently achieved unprecedented success, from natural language processing to learning the languages of biological sequences and organic molecules. These models have demonstrated superior performance in the generation, structure classification, and functional predictions for proteins and molecules with learned representations. However, most of the masking-based pre-trained language models are not designed for generative design, and their black-box nature makes it diff…

Cited by 4 publications (8 citation statements) · References 51 publications (71 reference statements)
“…While the Hybrid-mix dataset may contain a certain amount of materials that are not charge neutral or do not have balanced electronegativity (EB), the Hybrid-pure dataset has samples selected from the Hybrid-mix dataset that are charge neutral and have EB. The Hybrid-strict dataset is obtained with a similar method to the Hybrid-pure dataset, but we use the strict ICSD oxidation states of the elements to calculate the charge neutrality (CN) and EB, which imposes stricter constraints than the Hybrid-pure dataset and the dataset used in our previous study [31].…”
Section: Dataset (mentioning)
Confidence: 99%
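
A minimal sketch of the charge-neutrality (CN) and electronegativity-balance (EB) screening described above, assuming a composition is represented as an element-to-count mapping. The oxidation states, electronegativities, and the helper passes_cn_and_eb are illustrative stand-ins, not the actual filtering code or the ICSD tables used for the Hybrid-pure and Hybrid-strict datasets:

from itertools import product

# Illustrative data only (assumption): a few common oxidation states and
# Pauling electronegativities, standing in for the full ICSD-derived tables.
OXIDATION_STATES = {"Sr": [2], "Ti": [2, 3, 4], "O": [-2]}
ELECTRONEGATIVITY = {"Sr": 0.95, "Ti": 1.54, "O": 3.44}

def passes_cn_and_eb(composition):
    """Return True if some oxidation-state assignment of the composition
    (element -> count) is charge neutral and every cation is less
    electronegative than every anion."""
    elements = list(composition)
    for states in product(*(OXIDATION_STATES[e] for e in elements)):
        total_charge = sum(q * composition[e] for e, q in zip(elements, states))
        if total_charge != 0:
            continue  # not charge neutral under this assignment
        cations = [e for e, q in zip(elements, states) if q > 0]
        anions = [e for e, q in zip(elements, states) if q < 0]
        if all(ELECTRONEGATIVITY[c] < ELECTRONEGATIVITY[a]
               for c in cations for a in anions):
            return True  # charge neutral and electronegativity balanced
    return False

print(passes_cn_and_eb({"Sr": 1, "Ti": 1, "O": 3}))  # SrTiO3 -> True
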
“…The group includes four GPT-series LMs, BART [17], and RoBERTa [46]. In addition, we also use our previous work, BLMM [31], in our experiments to show its performance.…”
Section: Pretrain Transformer LMs for Materials Composition Generation (mentioning)
Confidence: 99%
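
For context, a hedged sketch of how a GPT-style causal LM can extend a partial composition written as space-separated element tokens, using the Hugging Face transformers API. The stock "gpt2" checkpoint and the prompt format are assumptions for illustration only; the cited works pre-train their own LMs on materials corpora:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical setup (assumption): stock GPT-2 rather than a materials-specific LM.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Seed with a partial composition written as space-separated element tokens.
prompt = "Sr Ti"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=8,                     # extend the composition by a few tokens
    do_sample=True,                       # sample instead of greedy decoding
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad-token warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
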
“…[27] proposed a blank language model (BLM) which could generate sequences by dynamically creating and filling in blanks. Our BLMM composition generator [28] is developed based on the BLM blank-filling model. All material formulas can be rewritten as sequences (e.g., SiTiO3 to Si Ti O O O) composed of a vocabulary with 118 or fewer elements.…”
Section: BLMM: Transformer-Based 2D Materials Composition Generation (mentioning)
Confidence: 99%
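
A minimal sketch of the formula-to-sequence rewriting mentioned in this citation (e.g., SiTiO3 to Si Ti O O O), assuming simple unnested formulas; the regex parser and the formula_to_sequence helper are illustrative, not the paper's tokenizer:

import re

# Element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer count; no nested parentheses.
FORMULA_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def formula_to_sequence(formula):
    """Expand a formula like 'SiTiO3' into the element sequence 'Si Ti O O O'."""
    tokens = []
    for element, count in FORMULA_TOKEN.findall(formula):
        tokens.extend([element] * int(count or "1"))
    return " ".join(tokens)

print(formula_to_sequence("SiTiO3"))  # Si Ti O O O
print(formula_to_sequence("SrTiO3"))  # Sr Ti O O O
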