X-ray spectroscopy delivers strong impact across the physical and biological sciences by providing end-users with highly detailed information about the electronic and geometric structure of matter. To decode this information in challenging cases, e.g. in operando catalysts, batteries, and temporally-evolving systems, advanced theoretical calculations are necessary. The complexity and resource requirements of these calculations often put them out of reach for end-users, and therefore data are often not interpreted exhaustively, leaving a wealth of valuable information unexploited. In this paper, we introduce supervised machine learning of X-ray absorption spectra, by developing a deep neural network (DNN) that is able to estimate Fe K-edge X-ray absorption near-edge structure spectra in less than a second with no input beyond geometric information about the local environment of the absorption site. We predict peak positions with sub-eV accuracy and peak intensities with errors over an order of magnitude smaller than the spectral variations that the model is engineered to capture. The performance of the DNN is promising, as illustrated by its application to the structural refinement of iron(II)tris(bipyridine) and nitrosylmyoglobin, but also highlights areas on which future developments should focus.
An important consideration when developing a deep neural network (DNN) for the prediction of molecular properties is the representation of the chemical space. Herein we explore the effect of the representation on the performance of our DNN engineered to predict Fe K-edge X-ray absorption near-edge structure (XANES) spectra, and address the question: How important is the choice of representation for the local environment around an arbitrary Fe absorption site? Using two popular representations of chemical space—the Coulomb matrix (CM) and pair-distribution/radial distribution curve (RDC)—we investigate the effect that the choice of representation has on the performance of our DNN. While CM and RDC featurisation are demonstrably robust descriptors, it is possible to obtain a smaller mean squared error (MSE) between the target and estimated XANES spectra when using RDC featurisation, and converge to this state a) faster and b) using fewer data samples. This is advantageous for future extension of our DNN to other X-ray absorption edges, and for reoptimisation of our DNN to reproduce results from higher levels of theory. In the latter case, dataset sizes will be limited more strongly by the resource-intensive nature of the underlying theoretical calculations.
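The two representations compared above can be illustrated with a minimal sketch. It is not the authors' implementation: the Coulomb matrix follows the commonly used definition (diagonal 0.5·Z^2.4, off-diagonal Z_i·Z_j/r_ij), while the radial distribution curve is assumed here to be a sum of Gaussians centred on each pair distance and weighted by Z_i·Z_j. The function names, the broadening parameter `alpha`, the grid `r_grid`, the use of ångström coordinates, and the toy Fe/N geometry are all hypothetical choices made for illustration.

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix: 0.5*Z_i**2.4 on the diagonal, Z_i*Z_j/r_ij elsewhere."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    # Pairwise distance matrix between all atoms.
    D = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(D, 1.0)  # placeholder to avoid division by zero
    M = np.outer(Z, Z) / D
    np.fill_diagonal(M, 0.5 * Z ** 2.4)
    return M

def radial_distribution_curve(Z, R, r_grid, alpha=10.0):
    """Assumed RDC form: Gaussian at each pair distance, weighted by Z_i*Z_j."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    r_grid = np.asarray(r_grid, dtype=float)
    i, j = np.triu_indices(len(Z), k=1)  # unique atom pairs
    r_ij = np.linalg.norm(R[i] - R[j], axis=1)
    weights = Z[i] * Z[j]
    gaussians = np.exp(-alpha * (r_grid[None, :] - r_ij[:, None]) ** 2)
    return (weights[:, None] * gaussians).sum(axis=0)

def mse(target, estimate):
    """Mean squared error between a target and an estimated spectrum."""
    target = np.asarray(target, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean((target - estimate) ** 2))

# Toy local environment (hypothetical): Fe at the origin, two N atoms 2.0 away.
Z = [26, 7, 7]
R = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
r_grid = np.linspace(0.0, 6.0, 61)

cm = coulomb_matrix(Z, R)
rdc = radial_distribution_curve(Z, R, r_grid)
```

One difference the sketch makes visible: the Coulomb matrix grows quadratically with the number of atoms and depends on their ordering, whereas the curve has a fixed length set by `r_grid` whatever the size of the local environment.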
Many chemical and biological reactions, including ligand exchange processes, require thermal energy for the reactants to overcome a transition barrier and reach the product state. Temperature-jump (T-jump) spectroscopy uses a...