2022
DOI: 10.1039/d2sc04322j

BatteryDataExtractor: battery-aware text-mining software embedded with BERT models

Abstract: BatteryDataExtractor is the first property-specific text-mining tool for auto-generating databases of materials and their property, device, and associated characteristics. The software has been constructed by embedding the BatteryBERT model.

Cited by 14 publications (20 citation statements)
References 55 publications
“…“ ZIF-8 nano ”, 79 confounding attempts for automated identification of reaction products using regular expressions. 33 While this issue will undoubtably be resolved by the adoption of transformer-based language models such as BERT 80 and GPT-4, 81 such models became available only recently and the scientific community, 82 including our group, is in the process of probing their extension to scientific data mining. In fact, the current study highlighted a number of issues with the current structure and completeness of reported synthetic protocols, understanding of which will be very helpful in engineering and fine-tuning GPT-based models.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…ChemDataWriter uses the same web scrapers that are embedded within BatteryDataExtractor 25 to download papers from three publishers (the Royal Society of Chemistry, Elsevier, and Springer), as well as the same document processors to pre-process the HTML/XML files into plain text. Web scrapers allow users to download multiple papers on a specific topic, over a specific date range, or from a particular set of journals.…”
Section: Implementation Details
Citation type: mentioning
confidence: 99%
“…6 In chemistry and materials science, researchers have used NLP to perform data extraction, 7–9 create databases, 10–17 and make predictions out of the extracted data. 18–22 To enhance the text-mining performance, many NLP toolkits and models have been created over the past few years, such as ChemDataExtractor, 23,24 BatteryDataExtractor, 25 MatBERT, 26 MatSciBERT, 27 and BatteryBERT. 28…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To address this problem, Swain and Cole developed ChemDataExtractor (CDE) to automate the extraction of chemical data from research articles and patents via text mining. To date, CDE has been deployed to automatically assemble databases of magnetic materials, battery materials, UV/vis absorption spectra, hydrogen storage and synthesis applications, and nanomaterial synthesis. While CDE has been used to text mine both organic and inorganic chemistry literatures, it has yet to be applied to MOFs, possibly due to challenges presented by the diverse nature of their building blocks and complex synthesis techniques. To the best of our knowledge, Park et al.’s text mining software was the first work which enlisted text mining to scrape MOF-related data such as pore volume and surface area.…”
Section: Introduction
Citation type: mentioning
confidence: 99%