Zeolites are versatile catalysts and molecular sieves with large topological diversity, but managing phase competition in zeolite synthesis is an empirical, labor-intensive task. Here, we controlled phase selectivity in templated zeolite synthesis from first principles by combining high-throughput atomistic simulations, literature mining, human-computer interaction, synthesis, and characterization. Proposed binding metrics distilled from over 586,000 zeolite-molecule simulations reproduced the extracted literature and rationalized framework competition in the design of organic structure-directing agents. Energetic, geometric, and electrostatic descriptors of template molecules were found to regulate synthetic accessibility windows and aluminum distributions in pure-phase zeolites. Furthermore, these parameters enabled the realization of an intergrowth zeolite through a single bi-selective template. The computation-first approach enabled control of both zeolite synthesis and structure composition using a priori theoretical descriptors.
Computer simulations can provide mechanistic insight into ionic liquids (ILs) and predict the properties of experimentally unrealized ion combinations. However, ILs suffer from a particularly large disparity in the time scales of atomistic and ensemble motion. Coarse-grained models are therefore used in place of costly all-atom simulations, accessing longer time scales and larger systems. Nevertheless, constructing the many-body potential of mean force that defines the structure and dynamics of a coarse-grained system can be complicated and computationally intensive. Machine learning shows great promise for the linked challenges of dimensionality reduction and learning the potential of mean force. To improve the coarse-graining of ILs, we present a neural network model trained on all-atom classical molecular dynamics simulations. The potential of mean force is expressed as two jointly trained neural network interatomic potentials that learn the coupled short-range and many-body long-range molecular interactions. These interatomic potentials treat temperature as an explicit input variable to capture its influence on the potential of mean force. The model reproduces structural quantities with high fidelity, outperforms the temperature-independent baseline at capturing dynamics, generalizes to unseen temperatures, and incurs low simulation cost.
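As a brief illustration of why temperature enters the potential of mean force explicitly, the standard statistical-mechanics definition can be written out. This is the textbook form, not necessarily the exact expression used by the authors; the symbols $M$ (mapping from all-atom coordinates $\mathbf{r}$ to coarse-grained coordinates $\mathbf{R}$) and $u$ (all-atom potential energy) are notation assumed here.

```latex
U_{\mathrm{CG}}(\mathbf{R};\,T)
  = -k_{\mathrm{B}} T \,
    \ln \int \mathrm{d}\mathbf{r}\;
      \delta\!\big(M(\mathbf{r}) - \mathbf{R}\big)\,
      \exp\!\left(-\frac{u(\mathbf{r})}{k_{\mathrm{B}} T}\right)
  + \text{const.}
```

Because $T$ appears both as a prefactor and inside the Boltzmann weight, $U_{\mathrm{CG}}$ is a free energy that generally changes with temperature, which is consistent with passing temperature to the learned potentials as an input.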
Organic structure-directing agents (OSDAs) play a crucial role in the synthesis of micro- and mesoporous materials, especially in the case of zeolites. Despite the wide use of OSDAs, their interaction with zeolite frameworks is poorly understood, with researchers relying on synthesis heuristics or computationally expensive techniques to predict whether an organic molecule can act as an OSDA for a certain zeolite. In this paper, we undertake a data-driven approach to unearth generalized OSDA–zeolite relationships using a comprehensive database comprising 5,663 synthesis routes for porous materials. To generate this database, we use natural language processing and text mining techniques to extract OSDAs, zeolite phases, and gel chemistry from the scientific literature published between 1966 and 2020. Through structural featurization of the OSDAs using weighted holistic invariant molecular (WHIM) descriptors, we relate OSDAs described in the literature to different types of cage-based, small-pore zeolites. Lastly, we adapt a generative neural network capable of suggesting new molecules as potential OSDAs for a given zeolite structure and gel chemistry. We apply this model to CHA and SFW zeolites, generating several alternative OSDA candidates to those currently used in practice. These molecules are further vetted with molecular mechanics simulations to show that the model generates physically meaningful predictions. Our model can automatically explore the OSDA space, reducing the amount of simulation or experimentation needed to find new OSDA candidates.
Neural network (NN) interatomic potentials provide fast prediction of potential energy surfaces, closely matching the accuracy of the electronic structure methods used to produce the training data. However, NN predictions are only reliable within well-learned training domains, and show volatile behavior when extrapolating. Uncertainty quantification methods can flag atomic configurations for which prediction confidence is low, but arriving at such uncertain regions requires expensive sampling of the NN phase space, often using atomistic simulations. Here, we exploit automatic differentiation to drive atomistic systems towards high-likelihood, high-uncertainty configurations without the need for molecular dynamics simulations. By performing adversarial attacks on an uncertainty metric, informative geometries that expand the training domain of NNs are sampled. When combined with an active learning loop, this approach bootstraps and improves NN potentials while decreasing the number of calls to the ground truth method. This efficiency is demonstrated on sampling of kinetic barriers, collective variables in molecules, and supramolecular chemistry in zeolite-molecule interactions, and can be extended to any NN potential architecture and materials system.
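The idea of steering a configuration toward high-likelihood, high-uncertainty regions can be sketched on a one-dimensional toy problem. Everything below is an illustrative assumption rather than the authors' implementation: the "ensemble" is a set of hypothetical cubic models that agree near x = 0, uncertainty is taken as the ensemble variance, likelihood as a Boltzmann factor of the mean energy, and the gradient is obtained by central finite differences as a stand-in for automatic differentiation. The product of uncertainty and likelihood is maximized through its logarithm, which has the same maximizer.

```python
import math

# Hypothetical ensemble of surrogate potentials E_k(x) = x^2 + c_k * x^3.
# The models agree near x = 0 (the "training domain") and diverge away from it.
COEFFS = [-0.10, -0.05, 0.0, 0.05, 0.10]
KT = 1.0  # thermal energy in arbitrary units (assumed)


def energies(x):
    """Energy predicted by each ensemble member at position x."""
    return [x ** 2 + c * x ** 3 for c in COEFFS]


def mean_energy(x):
    e = energies(x)
    return sum(e) / len(e)


def variance(x):
    """Ensemble variance, used here as the uncertainty metric."""
    e = energies(x)
    m = sum(e) / len(e)
    return sum((v - m) ** 2 for v in e) / len(e)


def log_objective(x):
    """log[ variance(x) * exp(-mean_energy(x) / kT) ].

    High values mean the ensemble disagrees (high uncertainty) while the
    configuration remains thermally accessible (high likelihood).
    """
    return math.log(variance(x)) - mean_energy(x) / KT


def numerical_grad(f, x, h=1e-5):
    """Central finite difference; a stand-in for automatic differentiation."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def adversarial_attack(x0, steps=200, lr=0.05):
    """Gradient ascent on a displacement delta applied to the seed x0."""
    delta = 0.0
    for _ in range(steps):
        delta += lr * numerical_grad(log_objective, x0 + delta)
    return x0 + delta


if __name__ == "__main__":
    seed = 0.5  # a configuration inside the well-sampled region (nonzero)
    x_adv = adversarial_attack(seed)
    print(f"seed: x = {seed:.3f}, variance = {variance(seed):.2e}")
    print(f"adversarial: x = {x_adv:.3f}, variance = {variance(x_adv):.2e}")
```

In an active learning loop, the configuration returned by `adversarial_attack` would be evaluated with the ground-truth method and added to the training set; in this toy setting the Boltzmann term keeps the attack from running off to arbitrarily high energies where the variance grows without bound.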
CONSPECTUS: Designing new materials is vital for addressing pressing societal challenges in health, energy, and sustainability. The combination of physicochemical laws and empirical trial and error has long guided material design, but this approach is limited by the cost of experiments and the difficulty of deriving complex guiding principles. The space of hypothetical materials to be considered is incredibly large, and only a small fraction of possible compounds can ever be tested experimentally. The computational techniques of atomistic simulation and machine learning (ML) offer an avenue to rapidly invent new materials and navigate this enormous space. Together, they can be used to infer complex design principles and identify high-quality candidates more rapidly than trial-and-error experimentation. In this Account, we review our group's recent contributions to simulation and ML for materials design. We begin by discussing the numerical representation of materials for use in ML. Representations can be produced through deterministic algorithms, learnable encodings, or physics-based methods and lead to vector, graph, and matrix outputs. We describe how these different approaches offer distinct material- and application-specific advantages. We provide demonstrations from our own work on small-molecule drugs, macromolecules, dyes, electrolytes, and zeolites. In several cases, we show how the appropriate representation led to guiding principles that facilitated experimental materials design. Next, we highlight the development of ML methods for enhancing atomistic simulation. These advances help to improve simulation accuracy and expand the time and length scales that can be explored. They include differentiable atomistic simulations in which ensemble-averaged quantities are differentiated with respect to system parameters, and novel autoregressive methods for enhanced sampling of challenging physical distributions.
Other developments include learnable coarse-grained models, which can accelerate molecular dynamics while minimizing the loss of all-atom information, and ML interatomic potentials, which can be trained on maximally informative quantum chemistry data through active learning and adversarial uncertainty attacks. Next, we show how these combined computational advances have enabled high-throughput virtual screening. This has led to the discovery of low-cost organic structure-directing agents for zeolite synthesis, polymer electrolytes, and efficient photoswitches for targeted medicine. We conclude by discussing the limitations of ML and simulation. These include the large data requirements and limited chemical transferability of the former and the speed-accuracy trade-offs of the latter. We predict that advancements in quantum chemistry will further accelerate simulations, while the incorporation of physical principles will improve the reliability of ML.