The application of machine learning (ML) to problems in homogeneous catalysis has
emerged as a promising avenue for catalyst optimization. An important aspect of such optimization
campaigns is determining which reactions to run at the outset of experimentation and which future
predictions are the most reliable. Herein, we explore methods for these two tasks in the context of
our previously developed chemoinformatics workflow. First, different methods for training set
selection are compared, including algorithmic selection and selection informed by unsupervised
learning methods. Next, an array of metrics for assessing prediction confidence is
examined across multiple catalyst manifolds. These approaches will inform future computer-guided
studies to accelerate catalyst selection and reaction optimization. Finally, this work demonstrates
the generality of the Average Steric Occupancy (ASO) and Average Electronic Indicator Field
(AEIF) descriptors in their application to transition metal catalysts for the first time.