“…The increase in the number of model parameters and different training strategies have improved the performance of these models on natural language tasks such as question answering,[2,3] text summarization,[4,5] sentiment analysis,[1,3] machine translation,[6] conversational abilities,[7-9] and code generation.[10] In the materials science domain, existing datasets are mainly related to tasks like named entity recognition (NER),[11,12] text classification,[13-15] synthesis-process and relation classification,[16] and composition extraction from tables,[17] which researchers use to benchmark the performance of materials-domain language models such as MatSciBERT[14] (the first materials-domain language model), MatBERT,[18] MaterialsBERT,[19] OpticalBERT,[20] and BatteryBERT.[15] Recently, Song et al. (2023) reported better performance of materials-science domain-specific language models compared to BERT and SciBERT on seven materials-domain datasets related to named entity recognition, relation classification, and text classification.…”
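As context for the benchmarking described above, the sketch below shows one common way such a domain-specific encoder is loaded and fine-tuned for a token-classification (NER) task via the Hugging Face transformers API. This is a minimal illustration, not the cited authors' setup; the model id "m3rg-iitd/matscibert" and the five-label scheme are assumptions for illustration only.

```python
# Minimal sketch: loading a materials-domain BERT encoder with an
# untrained token-classification (NER) head for later fine-tuning.
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_ID = "m3rg-iitd/matscibert"  # assumed Hugging Face id for MatSciBERT

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_ID,
    num_labels=5,  # hypothetical label count; depends on the NER tag scheme
)

# Tokenize a materials-science sentence and run a forward pass.
text = "LiFePO4 cathodes were synthesized by a sol-gel method."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Per-token label ids; meaningless until the head is fine-tuned on an
# annotated NER dataset such as those referenced above.
predictions = outputs.logits.argmax(dim=-1)
```

In practice, the randomly initialized classification head would be fine-tuned on one of the annotated materials NER datasets before the predictions carry any meaning.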