2022
DOI: 10.26434/chemrxiv-2022-41t70
Preprint

DigiMOF: A Database of MOF Synthesis Information Generated via Text Mining

Abstract: The vastness of materials space, particularly that which is concerned with metal-organic frameworks (MOFs), creates the critical problem of performing efficient identification of promising materials for specific applications. Although high-throughput computational approaches, including the use of machine learning, have been useful in rapid screening and rational design of MOFs, they tend to neglect descriptors related to their synthesis. One way to improve the efficiency of MOF discovery is to data mine publis…

Cited by 3 publications (5 citation statements)
References 36 publications
“…"ZIF-8nano," 79 confounding attempts for automated identification of reaction products using regular expressions. 33 While this issue will undoubtably be resolved by the adoption of transformer-based language models such as BERT 80 and GPT-4, 81 such models became available only recently and the scientific community, including our group, is in the process of probing their extension to scientific data mining. In fact, the current study highlighted a number of issues with the current structure and completeness of reported synthetic protocols, understanding of which will be very helpful in engineering and fine-tuning GPT-based models.…”
Section: Harnessing Synthesis Information For Accelerated Methodology... (mentioning)
confidence: 99%
“…To produce a corpus of ZIF-8 synthesis protocols, we initially followed established methods to download collections of papers and identify synthesis protocols within them. 32,33 Synthesis papers were identified by searching the SCOPUS database using Elsevier's elsapy software (https://github.com/ElsevierDev/elsapy). Papers were identified using the search term "ZIF OR zeolitic imidazol* AND synthesis," returning 4198 results.…”
Section: Text Collection Paragraph Identification and Grammar Parsing (mentioning)
confidence: 99%
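
As a rough illustration of the regular-expression problem described in the first citation statement, the sketch below shows how a nonstandard name such as "ZIF-8nano" can slip past a strict product-name pattern. Both patterns, the helper function, and the sample sentence are hypothetical and are not taken from the cited studies.

```python
import re

# Hypothetical strict pattern: matches names such as "ZIF-8" or "ZIF-67"
# only when the number ends at a word boundary.
STRICT_ZIF = re.compile(r"\bZIF-\d+\b")

# Hypothetical looser pattern: also accepts a trailing alphabetic suffix
# such as "nano" directly after the number.
LOOSE_ZIF = re.compile(r"\bZIF-\d+[A-Za-z]*")


def find_products(text, pattern):
    """Return every substring of `text` that matches `pattern`."""
    return pattern.findall(text)


# Illustrative sentence, not a quotation from any paper.
sample = (
    "The product, ZIF-8nano, was washed with methanol; "
    "ZIF-8 was used as a reference."
)

strict_hits = find_products(sample, STRICT_ZIF)
loose_hits = find_products(sample, LOOSE_ZIF)

print(strict_hits)
print(loose_hits)
```

The strict pattern silently skips "ZIF-8nano" because there is no word boundary between "8" and "nano", while the looser pattern picks it up but leaves open whether it should be treated as the same product as "ZIF-8".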