Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine-learning models. We present the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments and flow chemistry. The data, schema, supporting code, and web-based user interfaces are all publicly available on GitHub. Our vision is that a consistent data representation and infrastructure to support data sharing will enable downstream applications that will greatly improve the state of the art with respect to computer-aided synthesis planning, reaction prediction, and other predictive chemistry tasks.
Machine-learned ranking models have been developed for the prediction of substrate-specific cross-coupling reaction conditions. Datasets of published reactions were curated for Suzuki, Negishi, and C-N couplings, as well as Pauson-Khand reactions. String, descriptor, and graph encodings were tested as input representations, and models were trained to predict the set of conditions used in a reaction as a binary vector. Unique reagent dictionaries categorized by expert-crafted reaction roles were constructed for each dataset, leading to context-aware predictions. We find that relational graph convolutional networks and gradient-boosting machines are very effective for this learning task, and we disclose a novel reaction-level graph-attention operation in the top-performing model.
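The binary-vector target described in the abstract above can be illustrated with a minimal sketch. It assumes a multi-hot encoding in which each position corresponds to one (role, reagent) pair from a role-categorized dictionary. The dictionary entries, role names, and function names below are hypothetical and chosen only for illustration; they are not taken from the authors' datasets or code.

```python
from typing import Dict, List, Tuple

# Hypothetical role-categorized reagent dictionary (illustrative entries only,
# not the dictionaries constructed in the paper).
REAGENT_DICTIONARY: Dict[str, List[str]] = {
    "metal": ["Pd(PPh3)4", "Pd(OAc)2"],
    "ligand": ["PPh3", "XPhos"],
    "base": ["K2CO3", "Cs2CO3"],
    "solvent": ["THF", "dioxane", "water"],
}


def build_index(dictionary: Dict[str, List[str]]) -> Dict[Tuple[str, str], int]:
    """Assign one vector position to every (role, reagent) pair."""
    index: Dict[Tuple[str, str], int] = {}
    for role, reagents in dictionary.items():
        for reagent in reagents:
            index[(role, reagent)] = len(index)
    return index


def encode_conditions(
    conditions: Dict[str, List[str]], index: Dict[Tuple[str, str], int]
) -> List[int]:
    """Encode a set of reaction conditions as a multi-hot binary vector."""
    vector = [0] * len(index)
    for role, reagents in conditions.items():
        for reagent in reagents:
            vector[index[(role, reagent)]] = 1
    return vector


def decode_conditions(
    vector: List[int], index: Dict[Tuple[str, str], int]
) -> Dict[str, List[str]]:
    """Recover the role-labelled reagents whose positions are set to 1."""
    decoded: Dict[str, List[str]] = {}
    for (role, reagent), position in index.items():
        if vector[position] == 1:
            decoded.setdefault(role, []).append(reagent)
    return decoded


index = build_index(REAGENT_DICTIONARY)
example = {
    "metal": ["Pd(OAc)2"],
    "ligand": ["XPhos"],
    "base": ["K2CO3"],
    "solvent": ["dioxane", "water"],
}
label = encode_conditions(example, index)
print(label)
```

Because positions are keyed by role as well as by reagent name, the same compound appearing in two roles would occupy two distinct positions, which is one simple way to keep the label role-aware.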
A mild method for the synthesis of highly functionalized [3]-[6]dendralenes is reported, representing a general strategy to diversely substituted higher homologues of the dendralenes. The methodology utilizes allenoates bearing various substitution patterns, along with a wide range of boron and alkenyl nucleophiles that couple under palladium catalysis leading to sp-, sp²-, and sp³-substituted arrays. Regioselective transformations of the newly formed unsymmetrical dendralene derivatives are demonstrated. The use of micellar catalysis, where water is the global reaction medium, and room temperature reaction conditions, highlights the green nature of this technology.
The first total synthesis of the cytotoxic alkaloid ritterazine B is reported. The synthesis features a unified approach to both steroid subunits, employing a titanium-mediated propargylation reaction to achieve divergence from a common precursor. Other key steps include gold-catalyzed cycloisomerizations that install both spiroketals and late stage C–H oxidation to incorporate the C7′ alcohol.
An environmentally responsible, mild method for the synthesis of functionalized 1,3-butadienes is presented. It utilizes allenic esters of varying substitution patterns, as well as a wide range of boron-based nucleophiles under palladium catalysis, generating sp-sp², sp²-sp², and sp²-sp³ bonds. Functional group tolerance measured via robustness screening, along with room temperature and aqueous reaction conditions, highlights the methodology's breadth and potential utility in synthesis.