As materials data sets grow in size and scope, data mining and statistical learning methods are becoming more important for analyzing these data sets and building predictive models. This manuscript introduces matminer, an open-source, Python-based software platform to facilitate data-driven methods of analyzing and predicting materials properties. Matminer provides modules for retrieving large data sets from external databases such as the Materials Project, Citrination, Materials Data Facility, and Materials Platform for Data Science. It also provides implementations for an extensive library of feature extraction routines developed by the materials community, with 44 featurization classes that can generate thousands of individual descriptors and combine them into mathematical functions. Finally, matminer provides a visualization module for producing interactive, shareable plots. These functions are designed in a way that integrates closely with machine learning and data analysis packages already developed and in use by the Python data science community. We explain the structure and logic of matminer, provide a description of its various modules, and showcase several examples of how matminer can be used to collect data, reproduce data mining studies reported in the literature, and test new methodologies.
Nucleation and crystal growth are important in material synthesis, climate modeling, biomineralization, and pharmaceutical formulation. Despite tremendous efforts, the mechanisms and kinetics of nucleation remain elusive to both theory and experiment. Here we investigate sodium chloride (NaCl) nucleation from supersaturated brines using seeded atomistic simulations, polymorph-specific order parameters, and elements of classical nucleation theory. We find that NaCl nucleates via the common rock salt structure. Ion desolvation, not diffusion, is identified as the limiting resistance to attachment. Two different analyses give approximately consistent attachment kinetics: diffusion along the nucleus size coordinate and reaction-diffusion analysis of approach-to-coexistence simulation data from Aragones et al. (J. Chem. Phys. 2012, 136, 244508). Our simulations were performed at realistic supersaturations to enable the first direct comparison to experimental nucleation rates for this system. The computed and measured rates converge to a common upper limit at extremely high supersaturation. However, our rate predictions are between 15 and 30 orders of magnitude too fast. We comment on possible origins of the large discrepancy.
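The classical nucleation theory ingredients named in this abstract can be illustrated with a minimal sketch. It assumes the textbook free-energy form ΔG(n) = -n·Δμ + σ·n^(2/3), with all energies in units of kT; the function names, the lumped surface coefficient `sigma`, and every numerical input are hypothetical illustrations, not values or code from the paper.

```python
import math


def delta_g(n, dmu, sigma):
    """CNT free energy (in kT) of a nucleus containing n formula units.

    dmu   : driving force per formula unit in kT (positive when supersaturated)
    sigma : lumped surface coefficient in kT, so the surface term is sigma * n**(2/3)
    """
    return -n * dmu + sigma * n ** (2.0 / 3.0)


def cnt_summary(dmu, sigma, attach_freq, number_density):
    """Critical size, barrier, Zeldovich factor, and rate from textbook CNT.

    attach_freq    : attachment frequency at the critical nucleus (1/s), assumed input
    number_density : number density of nucleation sites (1/m^3), assumed input
    """
    # Setting d(delta_g)/dn = 0 gives the critical nucleus size.
    n_star = (2.0 * sigma / (3.0 * dmu)) ** 3
    # Barrier height delta_g(n_star), written in closed form.
    barrier = 4.0 * sigma ** 3 / (27.0 * dmu ** 2)
    # Zeldovich factor from the curvature of delta_g at n_star.
    zeldovich = math.sqrt(dmu / (6.0 * math.pi * n_star))
    # Rate = density * attachment frequency * Zeldovich factor * Boltzmann factor.
    rate = number_density * attach_freq * zeldovich * math.exp(-barrier)
    return {"n_star": n_star, "barrier": barrier,
            "zeldovich": zeldovich, "rate": rate}


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the formulas.
    print(cnt_summary(dmu=0.5, sigma=5.0, attach_freq=1e9, number_density=1e27))
```

Because the barrier scales as σ³/Δμ², the exponential makes the rate extremely sensitive to small errors in either quantity, which is one generic reason CNT rate predictions can differ from measurements by many orders of magnitude.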
Point defects have a strong impact on the performance of semiconductor and insulator materials used in technological applications, spanning microelectronics to energy conversion and storage. The nature of the dominant defect types, how they vary with processing conditions, and their impact on materials properties are central aspects that determine the performance of a material in a certain application. This information is, however, difficult to access directly from experimental measurements. Consequently, computational methods, based on electronic density functional theory (DFT), have found widespread use in the calculation of point-defect properties. Here we have developed the Python Charged Defect Toolkit (PyCDT) to expedite the setup and post-processing of defect calculations with widely used DFT software. PyCDT has a user-friendly command-line interface and provides a direct interface with the Materials Project database. This allows for setting up many charged defect calculations for any material of interest, as well as post-processing and applying state-of-the-art electrostatic correction terms. Our paper serves as documentation for PyCDT and demonstrates its use in an application to the well-studied GaAs compound semiconductor. We anticipate that the PyCDT code will be useful as a framework for undertaking readily reproducible calculations of charged point-defect properties, and that it will provide a foundation for automated, high-throughput calculations.
Increased interest in natural gas hydrate formation and decomposition, coupled with experimental difficulties in diffusion measurements, makes estimating transport properties in hydrates an important technological challenge. This research uses an equilibrium path sampling method for free energy calculations [Radhakrishnan, R.; Schlick, T. J. Chem. Phys. 2004, 121, 2436] with reactive flux and kinetic Monte Carlo simulations to estimate the methane diffusivity within a structure I gas hydrate crystal. The calculations support a water-vacancy assisted diffusion mechanism where methane hops from an occupied "donor" cage to an adjacent "acceptor" cage. For pathways between cages that are separated by five-membered water rings, the free energy landscape has a high barrier with a shallow well at the top. For pathways between cages that are separated by six-membered water rings, the free energy calculations show a lower barrier with no stable intermediate. Reactive flux simulations confirm that many reactive trajectories become trapped in the shallow intermediate at the top of the barrier leading to a small transmission coefficient for these paths. Stable intermediate configurations are identified as doubly occupied off-pathway cages and methane occupying the position of a water vacancy. Rate constants are computed and used to simulate self-diffusion with a kinetic Monte Carlo algorithm. Self-diffusion rates were much slower than the Einstein estimate because of lattice connectivity and methane's preference for large cages over small cages. Specifically, the fastest pathways for methane hopping are arranged in parallel (nonintersecting) channels, so methane must hop via a slow pathway to escape the channel. From a computational perspective, this paper demonstrates that equilibrium path sampling can compute free energies for a broader class of coordinates than umbrella sampling with molecular dynamics. 
From a technological perspective, this paper provides one estimate for an important transport property that has been difficult to measure. In a hydrate I crystal at 250 K with nearly all cages occupied by methane, we estimate D ≈ 7 × 10⁻¹⁵ X m²/s, where X is the fraction of unoccupied cages.
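The step from hop rate constants to a self-diffusion coefficient by kinetic Monte Carlo can be sketched as follows. This is a minimal illustration on a uniform simple cubic lattice with a single hop rate, which is an assumption made for clarity: it is not the hydrate cage network with distinct fast and slow pathways described above, and the function name and all parameters are hypothetical.

```python
import random


def kmc_self_diffusion(hop_rate, lattice_spacing, t_max, n_walkers, seed=0):
    """Estimate D from the mean-squared displacement of independent KMC walkers.

    hop_rate        : rate constant for a hop to one specific neighbor site
    lattice_spacing : distance between neighboring sites
    t_max           : simulated time per walker
    """
    rng = random.Random(seed)
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    total_rate = len(moves) * hop_rate  # total escape rate from a site
    msd_sum = 0.0
    for _ in range(n_walkers):
        x = y = z = 0
        # Waiting times between hops are exponentially distributed.
        t = rng.expovariate(total_rate)
        while t <= t_max:
            dx, dy, dz = rng.choice(moves)  # all hops are equally likely here
            x, y, z = x + dx, y + dy, z + dz
            t += rng.expovariate(total_rate)
        msd_sum += (x * x + y * y + z * z) * lattice_spacing ** 2
    msd = msd_sum / n_walkers
    # Einstein relation in three dimensions: MSD = 6 * D * t.
    return msd / (6.0 * t_max)


if __name__ == "__main__":
    d_kmc = kmc_self_diffusion(hop_rate=1.0, lattice_spacing=1.0,
                               t_max=20.0, n_walkers=2000, seed=1)
    d_uniform = 1.0 * 1.0 ** 2  # hop_rate * lattice_spacing**2 for this lattice
    print(d_kmc, d_uniform)
```

On this uniform lattice the KMC result agrees with the simple estimate D = k·a² within sampling noise. A network in which fast hops are confined to non-intersecting channels would need site-dependent rates and neighbor lists, and that is where KMC and the simple estimate can diverge.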
This work reexamines seeded simulation results for NaCl nucleation from a supersaturated aqueous solution at 298.15 K and 1 bar pressure. We present a linear regression approach for analyzing seeded simulation data that provides both nucleation rates and uncertainty estimates. Our results show that rates obtained from seeded simulations rely critically on a precise driving force for the model system. The driving force vs. solute concentration curve need not exactly reproduce that of the real system, but it should accurately describe the thermodynamic properties of the model system. We also show that rate estimates depend strongly on the nucleus size metric. We show that the rate estimates systematically increase as more stringent local order parameters are used to count members of a cluster and provide tentative suggestions for appropriate clustering criteria.
Structure-property relationships form the basis of many design rules in materials science, including synthesizability and long-term stability of catalysts, control of electrical and optoelectronic behavior in semiconductors, as well as the capacity of and transport properties in cathode materials for rechargeable batteries. The immediate atomic environments (i.e., the first coordination shells) of a few atomic sites are often a key factor in achieving a desired property. Some of the most frequently encountered coordination patterns are tetrahedra, octahedra, body- and face-centered cubic as well as hexagonal close-packed-like environments. Here, we showcase the usefulness of local order parameters to identify these basic structural motifs in inorganic solid materials by developing classification criteria. We introduce a systematic testing framework, the Einstein crystal test rig, that probes the response of order parameters to distortions in perfect motifs to validate our approach. Subsequently, we highlight three important application cases. First, we map basic crystal structure information of a large materials database in an intuitive manner by screening the Materials Project (MP) database (61,422 compounds) for element-specific motif distributions. Second, we use the structure-motif recognition capabilities to automatically find interstitials in metal, semiconductor, and insulator materials. Our Interstitialcy Finding Tool (InFiT) facilitates high-throughput screenings of defect properties. Third, the order parameters are reliable and compact quantitative structure descriptors for characterizing diffusion hops of intercalants, as our example of magnesium in MnO2-spinel indicates. Finally, the tools developed in our work are readily and freely available as software implementations in the pymatgen library, and we expect them to be further applied to machine-learning approaches for emerging applications in materials science.
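To make the idea of a local order parameter concrete, here is a minimal sketch of a tetrahedral order parameter computed from the angles between bonds to four neighbors. It uses the widely cited form q = 1 - (3/8) Σ (cos ψ + 1/3)², summed over the six neighbor pairs, which is a swapped-in standard definition and not necessarily the one implemented in pymatgen or used in the paper; the function name and inputs are hypothetical.

```python
import math
from itertools import combinations


def tetrahedral_order(center, neighbors):
    """Return q in which q = 1 for a perfect tetrahedron around `center`.

    center    : (x, y, z) of the central site
    neighbors : four (x, y, z) positions of the coordinating sites
    """
    if len(neighbors) != 4:
        raise ValueError("exactly four neighbors are required")
    # Unit vectors from the central site to each neighbor.
    bonds = []
    for p in neighbors:
        v = [p[i] - center[i] for i in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        bonds.append([c / norm for c in v])
    # Penalize deviations of each bond-bond angle from the ideal cos(psi) = -1/3.
    total = 0.0
    for a, b in combinations(bonds, 2):
        cos_psi = sum(x * y for x, y in zip(a, b))
        total += (cos_psi + 1.0 / 3.0) ** 2
    return 1.0 - 0.375 * total


if __name__ == "__main__":
    tetra = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
    square = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
    print(tetrahedral_order((0, 0, 0), tetra))   # perfect tetrahedron
    print(tetrahedral_order((0, 0, 0), square))  # square-planar environment
```

A classification criterion of the kind described above would then amount to thresholding such a value, and a distortion test would track how q decays as the neighbor positions are displaced from the perfect motif.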
Molecular-dynamics simulations are performed to understand the role of host-framework flexibility on the diffusion of methane molecules in the one-dimensional pores of AFI-, LTL-, and MTW-type zeolites. In particular, the impact of the choice of the host model is studied. Dynamically corrected Transition State Theory is used to provide insights into the diffusion mechanism on a molecular level. Free-energy barriers and dynamical correction factors can change significantly by introducing lattice flexibility. In order to understand the reduction of free-energy barriers, we investigate the motion of the window atoms. The influence that host-framework flexibility exerts on gas diffusion in zeolites is, generally, a complex function of material, host model, and loading, such that transferability of conclusions from one zeolite to another is not guaranteed.
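For orientation, dynamically corrected transition state theory is commonly written in the following textbook (Bennett-Chandler) form for hopping along a one-dimensional channel. The symbols are assumptions introduced here, not notation taken from the paper: q is the reaction coordinate along the pore, q* the barrier position, F(q) the free-energy profile, m the mass associated with the reaction coordinate, κ the dynamical correction factor, and λ the distance between adjacent cages.

```latex
\begin{align}
  k^{\mathrm{TST}} &= \sqrt{\frac{k_{\mathrm{B}} T}{2\pi m}}\;
      \frac{e^{-\beta F(q^{*})}}{\int_{\mathrm{cage}} e^{-\beta F(q)}\,\mathrm{d}q},
      \qquad \beta = \frac{1}{k_{\mathrm{B}} T}, \\
  k &= \kappa\, k^{\mathrm{TST}}, \qquad 0 < \kappa \le 1, \\
  D &= k\,\lambda^{2} \quad \text{(uncorrelated hops on a one-dimensional lattice)}.
\end{align}
```

In this form, framework flexibility can change D through either factor: by lowering or raising the free-energy barrier in k^TST, or by changing the fraction of barrier crossings that succeed, which is what κ measures.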