To understand how proteins function at the cellular level, it is essential to understand their structures and dynamics, including the conformational changes they undergo to carry out their functions. For this reason, large conformational changes in proteins have been of interest to researchers for years. However, because some proteins undergo rapid, transient conformational changes, the intermediate structures are hard to capture experimentally. Brute-force computational methods are also intractable, because finding these pathways requires a search in a high-dimensional, complex space. In our previous work, we implemented a hybrid algorithm that combines Monte Carlo (MC) sampling with RRT*, a variant of the robotics-based Rapidly Exploring Random Trees (RRT) method, to make conformational exploration more accurate and efficient and to produce smooth conformational pathways. In this work, we integrate rigidity analysis of proteins into our algorithm to guide the search toward flexible regions. We demonstrate that rigidity analysis dramatically reduces the run time and accelerates convergence.
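As a rough illustration of how tree-based exploration can be combined with a Monte Carlo acceptance test, the following toy sketch grows a basic RRT in a 2D space and keeps a new node only if it passes the standard Metropolis criterion. It is not the authors' implementation: it omits the RRT* rewiring step and the rigidity analysis, and the energy function, sampling box, step size, temperature, and all function names are assumptions chosen for illustration.

```python
import math
import random


def toy_energy(q):
    """Hypothetical energy landscape (assumption): a bowl centered at the origin."""
    return sum(x * x for x in q)


def metropolis_accept(e_old, e_new, temperature, rng):
    """Metropolis criterion: always accept downhill moves, sometimes uphill ones."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)


def steer(q_from, q_to, step):
    """Move from q_from toward q_to by at most `step`."""
    d = math.dist(q_from, q_to)
    if d <= step:
        return tuple(q_to)
    t = step / d
    return tuple(a + t * (b - a) for a, b in zip(q_from, q_to))


def rrt_mc(start, goal, n_iter=5000, step=0.2, temperature=1.0,
           goal_bias=0.1, goal_tol=0.2, seed=0):
    """Grow a tree from start; return a path to within goal_tol of goal, or None."""
    rng = random.Random(seed)
    nodes = [tuple(start)]
    parent = {0: None}
    for _ in range(n_iter):
        # Sample a target: the goal with probability goal_bias, else a random point
        # in an assumed box [-2, 2]^d.
        if rng.random() < goal_bias:
            target = tuple(goal)
        else:
            target = tuple(rng.uniform(-2.0, 2.0) for _ in start)
        # Extend the nearest tree node toward the target.
        near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
        q_new = steer(nodes[near], target, step)
        # Monte Carlo filter: reject extensions that fail the Metropolis test.
        if not metropolis_accept(toy_energy(nodes[near]), toy_energy(q_new),
                                 temperature, rng):
            continue
        nodes.append(q_new)
        parent[len(nodes) - 1] = near
        if math.dist(q_new, goal) <= goal_tol:
            # Walk parent links back to the root to recover the pathway.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


if __name__ == "__main__":
    path = rrt_mc((-1.5, -1.5), (1.5, 1.5))
    print("no path found" if path is None else f"path with {len(path)} nodes")
```

In a real protein setting the configuration would be a high-dimensional vector (for example, backbone dihedral angles) and the energy would come from a force field. The 2D bowl only makes the accept/reject logic easy to follow.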
Investigating the conformational space of proteins is essential for relating their structures to their functions. It is nonetheless a challenging task, both experimentally and computationally. Because these conformational changes are transient, experimental methods have fallen short of capturing them. In silico methods, on the other hand, have shown great promise in exploring these conformational pathways. In this article, we provide an extensive evaluation of our previously introduced, robotics-inspired conformational search algorithm (RRT* with Monte Carlo). We then use TDA Mapper, a topological data analysis algorithm, to identify which intermediate conformations appear most often in the generated conformational pathways, and examine how close these intermediate conformations are to existing experimental data.
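For readers unfamiliar with Mapper, the general construction can be sketched as follows: cover the range of a filter function with overlapping intervals, cluster the points that fall in each interval, and connect clusters that share points. The sketch below is a minimal, generic version using only the Python standard library. The one-dimensional filter, the interval cover, the single-linkage clustering threshold `eps`, and all function names are illustrative assumptions, not the configuration used in the article.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n intervals, each padded by `overlap` * interval length."""
    length = (hi - lo) / n_intervals
    pad = length * overlap
    return [(lo + i * length - pad, lo + (i + 1) * length + pad)
            for i in range(n_intervals)]


def single_linkage(indices, points, eps):
    """Connected components of the graph linking points at distance <= eps."""
    clusters = []
    unvisited = set(indices)
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            close = {j for j in unvisited
                     if math.dist(points[i], points[j]) <= eps}
            unvisited -= close
            component |= close
            frontier.extend(close)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, filter_values, n_intervals=4, overlap=0.25, eps=0.5):
    """Return (nodes, edges): nodes are sets of point indices, edges join
    nodes that share at least one point."""
    cover = interval_cover(min(filter_values), max(filter_values),
                           n_intervals, overlap)
    nodes = []
    for a, b in cover:
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage(members, points, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


if __name__ == "__main__":
    # Toy data (assumption): points along a line, filtered by their x coordinate.
    pts = [(k * 0.1, 0.0) for k in range(41)]
    nodes, edges = mapper(pts, [p[0] for p in pts], eps=0.15)
    for idx, node in enumerate(nodes):
        print(f"node {idx}: {len(node)} points")
    print("edges:", sorted(edges))
```

In such a graph, nodes that contain many points correspond to densely sampled regions. This is one plausible way to read "conformations that appear the most", though the article's exact criterion is not stated here.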
Classifying proteins into families is an important task when studying newly discovered proteins. If we can identify the family a protein belongs to, we can predict its features without knowing its exact structure. However, this grouping process is challenging. We propose a two-stage algorithm that classifies proteins into families by combining dimensionality reduction using a variational autoencoder with learned fingerprint representations using a convolutional neural network (CNN). Our models use fewer parameters than existing methods but perform better: the variational autoencoder achieves 94% accuracy in reconstructing the most common amino acid in a sequence alignment, and the neural network achieves 98-100% accuracy in classifying protein families. We developed a software framework to access our algorithms. All code and data are publicly available at https://github.com/ramindehghanpoor/CLI.
An essential step toward understanding how proteins carry out their functions is to explore their conformational space. However, because conformational changes in proteins are fleeting, investigating protein conformational spaces experimentally is challenging. Computational methods, by contrast, have been shown to be practical for exploring these conformational pathways. In this work, we use Topological Data Analysis (TDA) methods to evaluate our previously introduced algorithm, RRTMC, which combines the Rapidly-exploring Random Trees algorithm with Monte Carlo criteria to explore these pathways. TDA is used to identify the intermediate conformations that RRTMC generates most often and to examine how close they are to known intermediate conformations. We conclude that the intermediate conformations generated by RRTMC are close to existing experimental data and that TDA can be a helpful tool for analyzing protein conformation sampling methods.