Human cardiomyocytes (CMs) have potential for use in cell therapy and high-throughput drug screening. Because adult CMs cannot be expanded, their large-scale production from human pluripotent stem cells (hPSCs) has been suggested. Significant improvements have been made in understanding the directed differentiation of hPSCs into CMs and their suspension culture-based production under chemically defined conditions. However, optimization experiments are costly, time-consuming, and highly variable, which makes it challenging to develop reliable and consistent protocols for generating large numbers of CMs at high purity. This study examined the ability of data-driven modeling with machine learning to identify key experimental conditions and to predict final CM content using data collected during hPSC cardiac differentiation in advanced stirred tank bioreactors (STBRs). Through feature selection, we identified the process conditions, features, and patterns that are most influential on and predictive of the CM content at the process endpoint, differentiation day 10 (dd10). Process-related features were extracted by feature engineering from data collected in 58 differentiation experiments. These features included data collected continuously online by the bioreactor system, such as dissolved oxygen concentration and pH patterns, as well as data determined offline, including cell density, cell aggregate size, and nutrient concentrations. The selected features were used as inputs to construct models that classify the resulting CM content as "sufficient" or "insufficient" relative to pre-defined thresholds. The models built using random forests and Gaussian process modeling predicted insufficient CM content for a differentiation process with 90% accuracy and precision on dd7 of the protocol, and with 85% accuracy and 82% precision at a substantially earlier stage, dd5.
These models provide insight into potential key factors affecting hPSC cardiac differentiation to aid in selecting future experimental conditions and can predict the final CM content at earlier
This study employed machine learning (ML) models to predict the cardiomyocyte (CM) content following differentiation of human induced pluripotent stem cells (hiPSCs) encapsulated in hydrogel microspheroids and to identify the main experimental variables affecting the CM yield. Understanding how to enhance CM generation from hiPSCs is critical in moving toward large-scale production and implementing their use in developing therapeutic drugs and regenerative treatments. Cardiomyocyte production has entered a new era with improvements in the differentiation process. However, existing processes are not sufficiently robust for reliable CM manufacturing. Using ML techniques to relate the initial, experimentally specified stem cell microenvironment to cardiac differentiation outcomes could identify important process features. The initial tunable (controlled) input features for training ML models were extracted from 85 individual experiments. Subsets of the controlled input features were chosen by feature selection and used for model construction. Random forests, Gaussian processes, and support vector machines were employed as the ML models. The models were built to classify CM content on differentiation day 10 as either sufficient or insufficient. The best model predicted the sufficient class with an accuracy of 75% and a precision of 71%. The identified key features, including post-freeze passage number, media type, PF fibrinogen concentration, CHIR/S/V, axial ratio, and cell concentration, provided insight into the significant experimental conditions. This study showed that ML techniques can be used to extract information from the experiments and build predictive models that could enhance the cell production process.
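Both abstracts report model quality as accuracy and precision for a two-class ("sufficient" vs. "insufficient") outcome. The following is a minimal sketch of how those two metrics are computed, not the authors' code. The function names, the 80% threshold, and every number in the example are hypothetical and chosen only for illustration; the abstracts do not state the thresholds used.

```python
from typing import Sequence


def label_cm_content(cm_percent: float, threshold: float) -> str:
    """Map a measured CM content (%) to a class label.

    The threshold is an assumption; the abstracts only say that
    pre-defined thresholds were used.
    """
    return "sufficient" if cm_percent >= threshold else "insufficient"


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of experiments whose predicted class matches the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Of the experiments predicted as `positive`, the fraction that truly are."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be equal in length")
    truths_for_predicted_positive = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_for_predicted_positive:
        return 0.0  # convention chosen here: no positive predictions -> 0.0
    hits = sum(t == positive for t in truths_for_predicted_positive)
    return hits / len(truths_for_predicted_positive)


# Made-up endpoint CM contents (%) for eight hypothetical experiments.
cm_percent = [92.0, 85.0, 40.0, 78.0, 30.0, 95.0, 60.0, 88.0]
y_true = [label_cm_content(v, threshold=80.0) for v in cm_percent]

# Made-up model predictions for the same eight experiments.
y_pred = [
    "sufficient", "sufficient", "sufficient", "insufficient",
    "insufficient", "sufficient", "insufficient", "insufficient",
]

print("accuracy:", accuracy(y_true, y_pred))
print("precision (sufficient):", precision(y_true, y_pred, "sufficient"))
print("precision (insufficient):", precision(y_true, y_pred, "insufficient"))
```

Precision depends on which class is treated as positive: the first abstract reports it for the "insufficient" class, the second for the "sufficient" class, which is why the sketch takes the positive class as an explicit argument.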