Enzyme engineering is an important biotechnological process capable of generating tailored biocatalysts for applications in industrial chemical conversion and biopharma. Typical enhancements sought in enzyme engineering and in vitro evolution...
Understanding the complex relationships between enzyme sequence, folding stability and catalytic activity is crucial for applications in industry and biomedicine. However, current enzyme assay technologies are limited by an inability to simultaneously resolve both stability and activity phenotypes and to couple these to gene sequences at large scale. Here, we developed Enzyme Proximity-Seq (EP-Seq), a deep mutational scanning method that leverages peroxidase-mediated radical labeling with single-cell fidelity to dissect the effects of thousands of mutations on the stability and catalytic activity of oxidoreductase enzymes in a single experiment. We used EP-Seq to analyze how 6,387 missense mutations influence folding stability and catalytic activity in a D-amino acid oxidase (DAOx) from R. gracilis. The resulting datasets demonstrate activity-based constraints that limit folding stability during natural evolution, and identify hotspots distant from the active site as candidates for mutations that improve catalytic activity without sacrificing stability. EP-Seq can be extended to other enzyme classes and provides valuable insights into the biophysical principles governing enzyme structure and function.
We report the application of machine learning techniques to accelerate classification and analysis of protein unfolding trajectories from force spectroscopy data. Using kernel methods, logistic regression and triplet loss, we developed a workflow called Forced Unfolding and Supervised Iterative Online (FUSION), in which a user classifies a small number of repeatable unfolding patterns encoded as image data and a machine is tasked with identifying similar images to classify the remaining data. We tested the workflow using two case studies on a multi-domain XMod-Dockerin/Cohesin complex, validating the approach first using synthetic data generated with a Monte Carlo algorithm, and then deploying the method on experimental atomic force spectroscopy data. FUSION efficiently separated traces that passed quality filters from unusable ones, classified curves with high accuracy, and identified unfolding pathways undetected by the user. This study demonstrates the potential of machine learning to accelerate data analysis and generate new insights in protein biophysics.
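The iterative idea described above, in which a user labels a few example traces and a classifier handles the rest, can be illustrated with a minimal sketch. This is not the FUSION implementation: it uses only a hand-written logistic regression (no kernel methods or triplet loss), it assumes that the traces queried in each round are the ones the model is least certain about, and it replaces real force-extension images with synthetic 8x8 arrays. The names `train_logreg`, `iterative_labeling` and `ask_user` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logreg(X, y, lr=0.05, epochs=500, l2=1e-3):
    """Fit binary logistic regression with plain batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        err = sigmoid(X @ w + b) - y  # prediction error per sample
        w -= lr * (X.T @ err / len(y) + l2 * w)
        b -= lr * err.mean()
    return w, b


def iterative_labeling(X, ask_user, seed_idx, n_rounds=5, batch=5):
    """Start from user-chosen examples, then query the least certain traces.

    ask_user(i) stands in for the human and returns 0 or 1 for trace i.
    Returns boolean predictions for all traces and the list of labeled indices.
    """
    labeled = [int(i) for i in seed_idx]
    for _ in range(n_rounds):
        y = np.array([ask_user(i) for i in labeled], dtype=float)
        w, b = train_logreg(X[labeled], y)
        # Distance from the decision boundary; small means uncertain.
        margin = np.abs(sigmoid(X @ w + b) - 0.5)
        margin[labeled] = np.inf  # never ask about a labeled trace again
        labeled.extend(int(i) for i in np.argsort(margin)[:batch])
    # Final fit on everything the user has labeled.
    y = np.array([ask_user(i) for i in labeled], dtype=float)
    w, b = train_logreg(X[labeled], y)
    return sigmoid(X @ w + b) >= 0.5, labeled


# Synthetic stand-in for image-encoded traces (assumption, not real data):
# class 1 carries a bright 4x4 patch, class 0 is pure noise.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 100)
images = rng.normal(0.0, 1.0, size=(200, 8, 8))
images[truth == 1, 2:6, 2:6] += 2.0
X = images.reshape(200, -1)

# The "user" hand-picks five examples of each pattern to start.
seed_idx = [0, 1, 2, 3, 4, 100, 101, 102, 103, 104]
pred, labeled = iterative_labeling(X, lambda i: truth[i], seed_idx)
accuracy = float(np.mean(pred == truth))
print(f"labeled {len(labeled)} of {len(X)} traces, accuracy {accuracy:.2f}")
```

In this sketch the user labels 35 of 200 traces in total. Querying near the decision boundary is one common way to choose which traces to show next; the abstract does not specify how FUSION makes that choice.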