The Pd-catalyzed cross-coupling of thiols with aromatic electrophiles is a reliable method for the synthesis of aryl thioethers, which are important compounds for pharmaceutical and agricultural applications. Since thiols and thiolates bind strongly to late transition metals, previous research has focused on catalysts supported by chelating bisphosphine ligands, which were considered less likely to be displaced during the course of the reaction. We show that more effective catalysis can be achieved by using monophosphine ligands instead. Notably, compared to previous methods, this increased reactivity allows for the use of much lower reaction temperatures, soluble bases, and base-sensitive substrates. In contrast to conventional wisdom, our mechanistic data suggest that the extent of displacement of phosphine ligands by thiols is, firstly, not correlated with the ligand bulk or thiol nucleophilicity, and secondly, not predictive of the effectiveness of a given ligand in combination with palladium.
The merger of high-throughput experimentation (HTE) and data science presents an opportunity to both accelerate and inspire innovations in synthetic chemistry. Similarly, developments in machine learning (ML) have enabled the distillation of large and complex data sets into predictive models capable of generalizing patterns in the data. However, efforts to merge HTE with ML remain constrained by the small number of reported data sets, which have limited structural diversity, and by the corresponding trained models, which do not extrapolate well to substrates beyond the training set. Herein, we detail the first ML models for Pd-catalyzed C–N couplings using large, pharmaceutically relevant, structurally diverse data sets (~5000 unique products) generated using chemistry compatible with the nanomole scale. Careful consideration is given to both the diversity of the data set and the accuracy of model predictions for substrates bearing features beyond those present in the training set. The structural diversity in the data set is enabled by leveraging the Merck & Co., Inc. Building Block Collection, with an initial focus on C–N coupling using secondary amines. The large data set enables the systematic evaluation of model performance using five different data-splitting strategies, each carefully designed to evaluate the model’s ability to extrapolate beyond the substrates in the training set. The accuracy of classification models built with a view toward application in medicinal chemistry campaigns exceeded the baseline precision-recall by 25–67%, depending on the splitting strategy. These results would manifest as significant enrichment of successful C–N couplings using the hits recommended by the models. In addition, the accuracy of the best models for each of the five splits ranges from 70 to 87%, suggesting excellent overall predictivity of the models even for completely unseen substrates.
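The abstract does not specify its five splitting strategies, so the following is only a minimal sketch of one generic way to test extrapolation: holding out every reaction that involves a given substrate, so that test-set substrates never appear in training. The record fields (`amine`, `aryl_halide`, `success`), the helper names, and the toy data are all hypothetical and are not taken from the paper. The sketch also computes a prevalence baseline, i.e. the precision obtained by labelling every reaction a success, which is one common reference point for precision-recall comparisons.

```python
import random


def split_by_group(records, group_key, test_fraction=0.2, seed=0):
    """Hold out whole groups so that no test-set group value appears in training."""
    groups = sorted({r[group_key] for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return train, test


def baseline_precision(records):
    """Precision of a trivial classifier that predicts success for every reaction."""
    return sum(1 for r in records if r["success"]) / len(records)


# Hypothetical toy data: 5 amines x 7 aryl halides, with arbitrary outcomes.
records = [
    {"amine": f"amine_{i % 5}", "aryl_halide": f"ArX_{i % 7}", "success": i % 3 == 0}
    for i in range(35)
]

# Every reaction of the held-out amine goes to the test set.
train, test = split_by_group(records, "amine")
print(len(train), len(test), round(baseline_precision(test), 2))
```

A random row-wise split would instead let the same amine appear on both sides, which tends to overstate how well a model handles unseen substrates.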
We report a series of palladium(II)-catalyzed, intramolecular alkene hydrofunctionalization reactions with carbon, nitrogen, and oxygen nucleophiles to form five- and six-membered carbo- and heterocycles. In these reactions, the presence of a proximal bidentate directing group controls the cyclization pathway, dictating the ring size that is generated, even in cases that are disfavored based on Baldwin’s rules and in cases where there is an inherent preference for an alternative pathway. DFT studies shed light on the origins of pathway selectivity in these processes.