Catalyst design in asymmetric reaction development has traditionally been driven by empiricism, wherein experimentalists attempt to qualitatively recognize structural patterns to improve selectivity. Machine learning algorithms and chemoinformatics can potentially accelerate this process by recognizing otherwise inscrutable patterns in large datasets. Herein we report a computationally guided workflow for chiral catalyst selection using chemoinformatics at every stage of development. Robust molecular descriptors that are agnostic to the catalyst scaffold allow for selection of a universal training set on the basis of steric and electronic properties. This set can be used to train machine learning methods to make highly accurate predictive models over a broad range of selectivity space. Using support vector machines and deep feed-forward neural networks, we demonstrate accurate predictive modeling in the chiral phosphoric acid–catalyzed thiol addition to N-acylimines.
Modern, enantioselective catalyst development is driven largely by empiricism. Although this approach has fostered the introduction of most of the existing synthetic methods, it is inherently limited by the skill, creativity, and chemical intuition of the practitioner. Herein, we present a complementary approach to catalyst optimization in which statistical methods are used at each stage to streamline development. To construct the optimization informatics workflow, a number of critical components had to be subjected to rigorous validation. First, the critically important molecular descriptors were validated in two case studies to establish the importance of conformation-dependent molecular representations. Next, with a large data set available, it was possible to investigate the amount of data necessary to make predictive models with different modeling methods. Given the commercial availability of many catalyst structures, it was possible to compare models generated with algorithmically selected training sets and commercially available training sets. Finally, the augmentation of limited data sets is demonstrated in a method informed by unsupervised learning to restore the accuracy of the generated models.
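The question of how much data a model needs is usually examined with a learning curve: the same test set is scored while the training set grows. The following is a minimal sketch of that idea, not the authors' protocol. The data are synthetic, the two-column descriptor vectors and the `response` function are hypothetical, and a one-nearest-neighbor regressor stands in for the modeling methods the abstract refers to.

```python
import math
import random


def predict_1nn(train_x, train_y, query):
    """Return the response of the training point closest to `query`."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def mean_absolute_error(train_x, train_y, test_x, test_y):
    """Average absolute prediction error over the test set."""
    errors = [abs(predict_1nn(train_x, train_y, x) - y) for x, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def learning_curve(x, y, test_x, test_y, sizes):
    """Score a fixed test set while the training set grows to each size in `sizes`."""
    return [(n, mean_absolute_error(x[:n], y[:n], test_x, test_y)) for n in sizes]


def response(p):
    """Hypothetical smooth 'selectivity' as a function of two descriptors."""
    return 2.0 * p[0] - p[1]


# Synthetic stand-in data (assumption: not taken from the paper).
rng = random.Random(0)
pool = [[rng.random(), rng.random()] for _ in range(400)]
pool_y = [response(p) for p in pool]
test = [[rng.random(), rng.random()] for _ in range(100)]
test_y = [response(p) for p in test]

curve = learning_curve(pool, pool_y, test, test_y, sizes=[5, 50, 400])
for n, mae in curve:
    print(f"{n:4d} training points -> MAE {mae:.3f}")
```

The training size at which the error stops improving gives a rough estimate of how much data a given method needs. Repeating the loop with a different regressor allows the methods to be compared on the same footing.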
Different subset selection methods are examined to guide catalyst selection in optimization campaigns. Error assessment methods are used to quantitatively inform selection of new catalyst candidates from in silico libraries of catalyst structures.
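One common way to pick a structurally diverse subset from an in silico library is max-min selection, in the style of Kennard and Stone: repeatedly add the candidate farthest from everything already chosen. The passage does not name the selection methods it examines, so the sketch below is only an illustration under that assumption. The descriptor table is hypothetical, with one steric-like and one electronic-like column.

```python
import math


def max_min_subset(descriptors, k):
    """Pick k mutually distant rows from a descriptor table (max-min selection)."""
    if not 0 < k <= len(descriptors):
        raise ValueError("k must be between 1 and the number of candidates")

    # Seed with the candidate farthest from the library centroid.
    dim = len(descriptors[0])
    centroid = [sum(row[j] for row in descriptors) / len(descriptors) for j in range(dim)]
    first = max(range(len(descriptors)), key=lambda i: math.dist(descriptors[i], centroid))
    chosen = [first]

    # Distance from every candidate to its nearest already-chosen candidate.
    nearest = [math.dist(row, descriptors[first]) for row in descriptors]

    while len(chosen) < k:
        # The next pick is the candidate least well covered by the current subset.
        nxt = max(range(len(descriptors)), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i, row in enumerate(descriptors):
            nearest[i] = min(nearest[i], math.dist(row, descriptors[nxt]))
    return chosen


# Hypothetical library: four corners, one near-duplicate of a corner, one central point.
library = [
    [0.0, 0.0],
    [0.1, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.5],
]
training_idx = max_min_subset(library, 4)
print(sorted(training_idx))
```

On this toy library the four corners are selected and the near-duplicate at `[0.1, 0.0]` is skipped, which is the behavior wanted from a training set meant to span descriptor space. Descriptor columns would normally be scaled to comparable ranges first, since the selection depends on Euclidean distance.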
The macrocyclic cavities in carbaporphyrins are well suited for the formation of metalated derivatives. A carbaporphyrin diester and a naphthocarbaporphyrin reacted with [Rh(CO)₂Cl]₂ to give good-to-excellent yields of rhodium(I) complexes, and these were fully characterized by X-ray crystallography. Both rhodium(I) derivatives were converted into rhodium(III) complexes in refluxing pyridine, albeit in moderate yields. Carbachlorins also formed rhodium(I) complexes, but these could not be further transformed into rhodium(III) products. The rhodium(III) complexes incorporate two axial pyridine ligands, which exhibit strongly shielded resonances in their ¹H NMR spectra, and the rhodium(III) carbaporphyrin diester was further characterized by X-ray crystallography. adj-Dicarbaporphyrins also formed rhodium(I) complexes, but these reactions involved the relocation of a proton to generate an internal methylene unit. The environments associated with the two faces of the resulting macrocycles are very different from one another, and this results in the ¹H NMR chemical shifts for the two internal methylene protons being separated by well over 3 ppm. Although the diatropicities of rhodium(I) complexes for monocarbaporphyrins and carbachlorins are comparable to those of the parent ligands, the chemical shifts for rhodium(I) dicarbaporphyrins are consistent with a significant reduction in the porphyrinoid aromaticity. A dicarbachlorin also gave a rhodium(I) complex, but this species fully retained the diatropic characteristics of the parent ligand. Nevertheless, the internal CH₂ unit still gave two widely separated doublets indicative of radically differing environments for the two faces of the macrocycle. Rhodium(I) dicarbaporphyrin and dicarbachlorin complexes were further characterized by X-ray crystallography.