X-ray absorption spectroscopy (XAS) produces a wealth of information about the local structure of materials, but interpretation of spectra often relies on easily accessible trends and prior assumptions about the structure. Recently, researchers have demonstrated that machine learning models can automate this process to predict the coordinating environments of absorbing atoms from their XAS spectra. However, machine learning models are often difficult to interpret, making it challenging to determine when they are valid and whether they are consistent with physical theories. In this work, we present three main advances to the data-driven analysis of XAS spectra: we demonstrate the efficacy of random forests in solving two new property determination tasks (predicting Bader charge and mean nearest neighbor distance), we address how choices in data representation affect model interpretability and accuracy, and we show that multiscale featurization can elucidate the regions and trends in spectra that encode various local properties. The multiscale featurization transforms the spectrum into a vector of polynomial-fit features, and is contrasted with the commonly used “pointwise” featurization that directly uses the entire spectrum as input. We find that across thousands of transition metal oxide spectra, the relative importance of features describing the curvature of the spectrum can be localized to individual energy ranges, and we can separate the importance of constant, linear, quadratic, and cubic trends, as well as the white line energy. This work has the potential to assist rigorous theoretical interpretations, expedite experimental data collection, and automate analysis of XAS spectra, thus accelerating the discovery of new functional materials.
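The contrast between the pointwise and multiscale featurizations described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of 1, 2, and 4 windows per scale, and the rescaling of each window to [-1, 1] are all assumptions made for illustration. The abstract states only that the multiscale features come from polynomial fits and that constant, linear, quadratic, and cubic trends can be separated. The white line energy feature is not included here.

```python
import numpy as np


def pointwise_features(mu):
    """Pointwise featurization: the sampled spectrum itself is the feature vector."""
    return np.asarray(mu, dtype=float)


def multiscale_polynomial_features(energy, mu, n_windows_per_scale=(1, 2, 4), degree=3):
    """Fit a polynomial to each energy window at each scale (hypothetical layout).

    Returns the fitted coefficients, ordered constant, linear, quadratic, cubic
    within each window, concatenated over all windows and scales.
    """
    energy = np.asarray(energy, dtype=float)
    mu = np.asarray(mu, dtype=float)
    features = []
    for n_windows in n_windows_per_scale:
        # Split the spectrum into contiguous, nearly equal-sized windows.
        for idx in np.array_split(np.arange(energy.size), n_windows):
            e = energy[idx]
            # Rescale each window to [-1, 1] so the fit is well conditioned
            # and coefficients are comparable between windows (an assumption).
            x = 2.0 * (e - e[0]) / (e[-1] - e[0]) - 1.0
            coeffs = np.polyfit(x, mu[idx], degree)  # highest power first
            features.extend(coeffs[::-1])  # reorder to constant -> cubic
    return np.array(features)
```

With these defaults a spectrum maps to (1 + 2 + 4) × 4 = 28 features, each tied to one energy window and one polynomial order. A feature-importance ranking from a model such as a random forest could then be read in terms of energy range and trend type, rather than individual grid points.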
Assessing the synthesizability of inorganic materials is a grand challenge for accelerating their discovery using computations. Synthesis of a material is a complex process that depends not only on its thermodynamic stability with respect to others, but also on factors from kinetics, to advances in synthesis techniques, to the availability of precursors. This complexity makes the development of a general theory or first-principles approach to synthesizability currently impractical. Here we show how an alternative pathway to predicting synthesizability emerges from the dynamics of the materials stability network: a scale-free network constructed by combining the convex free-energy surface of inorganic materials computed by high-throughput density functional theory and their experimental discovery timelines extracted from citations. The time-evolution of the underlying network properties allows us to use machine-learning to predict the likelihood that hypothetical, computer-generated materials will be amenable to successful experimental synthesis.
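As a rough sketch of what "time-evolution of the underlying network properties" could look like in practice, the snippet below tracks each material's degree in a network that grows as materials are discovered. Everything here is hypothetical: the abstract does not specify how edges are defined (the sketch assumes an edge is a link between two stable materials on the convex free-energy surface), which network properties are used, or how they are fed to the machine-learning model. The function name and the toy labels are placeholders.

```python
from collections import defaultdict


def degree_history(edges, discovery_year, years):
    """Degree of every already-discovered material at each snapshot year.

    edges: iterable of (material_a, material_b) pairs (assumed network links).
    discovery_year: dict mapping material -> year of first experimental report.
    years: iterable of snapshot years.
    """
    history = {}
    for year in years:
        degree = defaultdict(int)
        for a, b in edges:
            # An edge exists in a snapshot only once both endpoints are discovered.
            if discovery_year[a] <= year and discovery_year[b] <= year:
                degree[a] += 1
                degree[b] += 1
        # Include discovered materials that have no edges yet (degree 0).
        history[year] = {m: degree[m] for m, y in discovery_year.items() if y <= year}
    return history


# Toy, made-up network used only to exercise the function.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_years = {"A": 1950, "B": 1960, "C": 1970, "D": 1990}
snapshots = degree_history(toy_edges, toy_years, [1960, 1990])
```

Per-material time series of this kind are one plausible form of input for a classifier that estimates the likelihood of synthesis for a hypothetical material.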
We present first-principles calculations of the vertical ionization potentials (IPs), electron affinities (EAs), and singlet excitation energies for an aromatic-molecule test set (benzene, thiophene, 1,2,5-thiadiazole, naphthalene, benzothiazole, and tetrathiafulvalene) within the GW and Bethe-Salpeter equation (BSE) formalisms. Our computational framework, which employs a real-space basis for ground-state and a transition-space basis for excited-state calculations, is well-suited for high-accuracy calculations on molecules, as we show by comparing against G0W0 calculations within a plane-wave-basis formalism. We then generalize our framework to test variants of the GW approximation that include a local-density approximation (LDA)-derived vertex function (ΓLDA) and quasiparticle-self-consistent (QS) iterations. We find that ΓLDA and quasiparticle self-consistency shift IPs and EAs by roughly the same magnitude, but with opposite sign for IPs and same sign for EAs. G0W0 and QSGWΓLDA are more accurate for IPs, while G0W0ΓLDA and QSGW are best for EAs. For optical excitations, we find that perturbative GW-BSE underestimates the singlet excitation energy, while self-consistent GW-BSE results in good agreement with previous best-estimate values for both valence and Rydberg excitations. Finally, our work suggests that a hybrid approach, where G0W0 energies are used for occupied orbitals and G0W0ΓLDA for unoccupied orbitals, also yields optical excitation energies in good agreement with experiment but at a smaller computational cost.