X-ray absorption spectroscopy is a premier, element-specific technique for materials characterization. Specifically, the x-ray absorption near-edge structure (XANES) encodes important information about the local chemical environment of an absorbing atom, including coordination number, symmetry, and oxidation state. Interpreting XANES spectra is a key step towards understanding the structural and electronic properties of materials, but extracting structural and electronic descriptors from XANES spectra amounts to solving a challenging inverse problem. Existing methods rely on empirical fingerprints, which are often qualitative or semi-quantitative and not transferable. In this study, we present a machine learning-based approach that classifies the local coordination environment of the absorbing atom from simulated K-edge XANES spectra. The machine learning classifiers can learn important spectral features in a broad energy range without human bias and, once trained, can make predictions on the fly. The robustness and fidelity of the machine learning method are demonstrated by an average accuracy of 86% across the wide chemical space of oxides in eight 3d transition metal families. We found that spectral features beyond the pre-edge region play an important role in the local structure classification problem, especially for the late 3d transition metal elements.
Simulations of excited-state properties, such as spectral functions, are often computationally expensive and therefore not suitable for high-throughput modeling. As a proof of principle, we demonstrate that graph-based neural networks can be used to predict the x-ray absorption near-edge structure spectra of molecules to quantitative accuracy. Specifically, the predicted spectra reproduce nearly all prominent peaks, with 90% of the predicted peak locations within 1 eV of the ground truth. Beyond its own utility in spectral analysis and structure inference, our method can be combined with structure search algorithms to enable high-throughput spectrum sampling of the vast material configuration space, which opens up new pathways to material design and discovery.
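The peak-location figure quoted above (the fraction of predicted peaks within 1 eV of the ground truth) can be illustrated with a small sketch. This is not the authors' evaluation code: the peak finder (strict local maxima above a height threshold), the nearest-peak matching rule, and the function names `peak_energies` and `fraction_within_tolerance` are all assumptions made for illustration.

```python
import numpy as np

def peak_energies(energy, intensity, min_height=0.0):
    """Return energies of strict local maxima (hypothetical, simplistic peak finder)."""
    energy = np.asarray(energy, dtype=float)
    y = np.asarray(intensity, dtype=float)
    # An interior point is a peak if it exceeds both neighbors and the height threshold.
    is_peak = (y[1:-1] > y[:-2]) & (y[1:-1] > y[2:]) & (y[1:-1] >= min_height)
    return energy[1:-1][is_peak]

def fraction_within_tolerance(true_peaks, predicted_peaks, tol_ev=1.0):
    """Fraction of predicted peaks whose nearest true peak lies within tol_ev.

    Assumption: each predicted peak is matched to its nearest true peak;
    the matching used in the original study may differ.
    """
    true_peaks = np.asarray(true_peaks, dtype=float)
    predicted_peaks = np.asarray(predicted_peaks, dtype=float)
    if true_peaks.size == 0 or predicted_peaks.size == 0:
        return 0.0
    # Pairwise distances, then the minimum over true peaks for each prediction.
    nearest = np.abs(predicted_peaks[:, None] - true_peaks[None, :]).min(axis=1)
    return float(np.mean(nearest <= tol_ev))
```

In use, one would extract peaks from a reference spectrum and a predicted spectrum on the same energy grid with `peak_energies`, then pass both arrays to `fraction_within_tolerance`.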
X-ray absorption spectroscopy (XAS) produces a wealth of information about the local structure of materials, but interpretation of spectra often relies on easily accessible trends and prior assumptions about the structure. Recently, researchers have demonstrated that machine learning models can automate this process to predict the coordinating environments of absorbing atoms from their XAS spectra. However, machine learning models are often difficult to interpret, making it challenging to determine when they are valid and whether they are consistent with physical theories. In this work, we present three main advances to the data-driven analysis of XAS spectra: we demonstrate the efficacy of random forests in solving two new property determination tasks (predicting Bader charge and mean nearest neighbor distance), we address how choices in data representation affect model interpretability and accuracy, and we show that multiscale featurization can elucidate the regions and trends in spectra that encode various local properties. The multiscale featurization transforms the spectrum into a vector of polynomial-fit features, and is contrasted with the commonly used “pointwise” featurization that directly uses the entire spectrum as input. We find that across thousands of transition metal oxide spectra, the relative importance of features describing the curvature of the spectrum can be localized to individual energy ranges, and we can separate the importance of constant, linear, quadratic, and cubic trends, as well as the white line energy. This work has the potential to assist rigorous theoretical interpretations, expedite experimental data collection, and automate analysis of XAS spectra, thus accelerating the discovery of new functional materials.
X-ray absorption spectroscopy (XAS) produces a wealth of information about the local structure of materials, but interpretation of spectra often relies on easily accessible trends and prior assumptions about the structure. Recently, researchers have demonstrated that machine learning models can automate this process to predict the coordinating environments of absorbing atoms from their XAS spectra. However, machine learning models are often difficult to interpret, making it challenging to determine when they are valid and whether they are consistent with physical theories. In this work, we present three main advances to the data-driven analysis of XAS spectra: we demonstrate the efficacy of random forests in solving two new property determination tasks (predicting Bader charge and mean nearest neighbor distance), we show that multiscale featurization can elucidate the regions and trends in spectra that encode various local properties, and we address the effect of normalization on model interpretability. The multiscale featurization transforms the spectrum into a vector of polynomial-fit features, and is contrasted with the commonly used "pointwise" featurization that directly uses the entire spectrum as input. We find that across thousands of transition metal oxide spectra, the relative importance of features describing the curvature of the spectrum can be localized to individual energy ranges, and we can separate the importance of constant, linear, quadratic, and cubic trends, as well as the white line energy. This work has the potential to assist rigorous theoretical interpretations, expedite experimental data collection, and automate analysis of XAS spectra, thus accelerating the discovery of new functional materials.
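The multiscale polynomial featurization described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the window scheme (`n_splits`), the rescaling of each window to [-1, 1], the use of argmax as a stand-in for the white line energy, and the function names are all assumptions.

```python
import numpy as np

def multiscale_polynomial_features(energy, mu, n_splits=(1, 2, 4), degree=3):
    """Fit a polynomial in each energy window at each scale; concatenate coefficients.

    n_splits is a hypothetical choice: at each scale the spectrum is cut into
    that many contiguous windows of near-equal size.
    """
    energy = np.asarray(energy, dtype=float)
    mu = np.asarray(mu, dtype=float)
    features = []
    for n in n_splits:
        for idx in np.array_split(np.arange(energy.size), n):
            e = energy[idx]
            # Rescale the window to [-1, 1] so coefficients are comparable across windows.
            x = 2.0 * (e - e[0]) / (e[-1] - e[0]) - 1.0
            # Coefficients come back lowest order first:
            # constant, linear, quadratic, cubic (for degree=3).
            coeffs = np.polynomial.polynomial.polyfit(x, mu[idx], degree)
            features.append(coeffs)
    return np.concatenate(features)

def white_line_energy(energy, mu):
    """Energy of the absorption maximum (assumed proxy for the white line)."""
    return float(np.asarray(energy, dtype=float)[np.argmax(mu)])
```

With the default settings this yields (1 + 2 + 4) windows of 4 coefficients each, so 28 features per spectrum, which could then be passed to a random forest alongside the white line energy. A "pointwise" featurization, by contrast, would simply use the `mu` array itself as the input vector.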