2020
DOI: 10.1016/j.commatsci.2020.109544

The Materials Simulation Toolkit for Machine learning (MAST-ML): An automated open source toolkit to accelerate data-driven materials research

Abstract: As data science and machine learning methods are taking on an increasingly important role in the materials research community, there is a need for the development of machine learning software tools that are easy to use (even for nonexperts with no programming ability), provide flexible access to the most important algorithms, and codify best practices of machine learning model development and evaluation. Here, we introduce the Materials Simulation Toolkit for Machine Learning (MAST-ML), an open source Python-b…

Cited by 44 publications (32 citation statements)
References 53 publications
“…The data contain 408 activation energies for 15 different hosts and are described in detail in Reference 35 (see the sidebar titled Online Availability of Data in This Review for data availability on Figshare). All the models were evaluated using the routines available in the scikit-learn package (155), and the model fits and analysis were automated using the Materials Simulation Toolkit for Machine Learning (MAST-ML) (https://github.com/uw-cmg/MAST-ML) (156).…”
Section: Example Of Assessing Model Errors and Domain Of Applicability Using Gaussian Process Regression And Random Forest Decision Tree
Citation type: mentioning
confidence: 99%
“…All of the models were evaluated using the routines available in the scikit-learn package [156], and the model fits and analysis were automated using the Materials Simulation Toolkit for Machine Learning (MAST-ML) [157,158]. To help assess the model domain of applicability, we explore a chemistry test where we consider Pd-X systems, where Pd is the host element and X is a dilute impurity taken from three sets (set 1 = 3d and 4d transition metals, set 2 = Col VIA elements except O, set 3 = elements from the first 2 rows on the periodic table). In this test we train the model with no Pd host data and then predict the errors for the 3 sets.…”
Section: Model Domain Of Applicability and Assessing Uncertainties In...
Citation type: mentioning
confidence: 99%
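The host-exclusion test quoted above (train with no Pd host data, then evaluate on Pd-X systems set by set) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, the host labels other than Pd, the features, and the set assignments are hypothetical, and a scikit-learn random forest stands in for whatever model the citing authors used. Because the synthetic target does not depend on the host, the numbers say nothing about real domain-of-applicability behavior; only the train/test split logic is the point.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 300

# Hypothetical labels: four hosts (only "Pd" comes from the quoted text)
# and three impurity sets, assigned deterministically so no group is empty.
hosts = np.tile(np.array(["Pd", "Ni", "Cu", "Al"]), n // 4)
impurity_set = np.repeat(np.array([1, 2, 3]), n // 3)

# Synthetic features and target, standing in for real descriptors
# and activation energies.
X = rng.normal(size=(n, 4))
y = X[:, 0] - 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=n)

# Train only on rows whose host is not Pd.
train = hosts != "Pd"
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train], y[train])

# Predict the held-out Pd rows and report the RMSE for each impurity set.
rmse_by_set = {}
for s in (1, 2, 3):
    held_out = (hosts == "Pd") & (impurity_set == s)
    residual = model.predict(X[held_out]) - y[held_out]
    rmse_by_set[s] = float(np.sqrt(np.mean(residual ** 2)))

print(rmse_by_set)
```

With real data, comparing these per-set errors against the ordinary cross-validation error is one way to see how far outside the training domain each set lies.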
“…The model analysis and exploration were primarily performed with the MAterials Simulation Toolkit for Machine Learning (MAST-ML, version 3.x, University of Wisconsin-Madison Computational Materials Group, Madison, WI, USA) [20], an open-source Python package with scikit-learn [19] library to automate machine learning workflows and model assessments. The hyperparameters (α, γ) of the GKRR model were optimized using a genetic algorithm (GA) with the five-fold cross validation (CV) root-mean-square error (RMSE) as the scoring metric.…”
Section: Methods
Citation type: mentioning
confidence: 99%
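The hyperparameter search described in the statement above (α and γ of a Gaussian kernel ridge regression model, scored by five-fold CV RMSE and tuned with a genetic algorithm) might look something like the sketch below. It is not the citing authors' code and does not use MAST-ML's own interface: the tiny GA (`ga_search`), its population size, mutation scale, search bounds, and the synthetic data are all assumptions made for illustration. Only the scikit-learn calls (`KernelRidge`, `KFold`, `cross_val_score`) are real library API.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import KFold, cross_val_score


def cv_rmse(log_params, X, y):
    """Five-fold CV RMSE of Gaussian (RBF) kernel ridge regression.

    log_params holds (log10 alpha, log10 gamma).
    """
    alpha, gamma = 10.0 ** np.asarray(log_params)
    model = KernelRidge(kernel="rbf", alpha=alpha, gamma=gamma)
    folds = KFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(
        model, X, y, cv=folds, scoring="neg_root_mean_squared_error"
    )
    return float(-scores.mean())


def ga_search(X, y, pop_size=12, generations=8, bounds=(-4.0, 1.0), seed=0):
    """Hypothetical minimal GA over (log10 alpha, log10 gamma)."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(bounds[0], bounds[1], size=(pop_size, 2))
    history = []  # best CV RMSE found in each generation
    for _ in range(generations):
        fitness = np.array([cv_rmse(ind, X, y) for ind in pop])
        order = np.argsort(fitness)
        history.append(float(fitness[order[0]]))
        # Selection with elitism: the better half survives unchanged.
        parents = pop[order[: pop_size // 2]]
        # Uniform crossover: each gene comes from one of two random parents.
        pairs = rng.integers(0, len(parents), size=(pop_size - len(parents), 2))
        take_first = rng.random((len(pairs), 2)) < 0.5
        children = np.where(take_first, parents[pairs[:, 0]], parents[pairs[:, 1]])
        # Gaussian mutation in log space, clipped to the search bounds.
        children = np.clip(children + rng.normal(0.0, 0.3, children.shape), *bounds)
        pop = np.vstack([parents, children])
    fitness = np.array([cv_rmse(ind, X, y) for ind in pop])
    best = int(np.argmin(fitness))
    history.append(float(fitness[best]))
    return 10.0 ** pop[best], float(fitness[best]), history


# Synthetic demonstration data (not from any cited study).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(60, 3))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=60)

best_params, best_rmse, history = ga_search(X, y)
print("alpha, gamma:", best_params, "CV RMSE:", best_rmse)
```

Searching in log space is a common choice because α and γ typically vary over several orders of magnitude. Fixing the fold split (`random_state=0`) keeps the fitness of a given individual identical across generations, so with elitism the best score never gets worse.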
“…The machine learning models were built and validated with the MAterials Simulation Toolkit for Machine Learning MAST-ML utility [79], which uses numerical procedures as implemented in scikit-learn [80].…”
Section: Vib
Citation type: mentioning
confidence: 99%