Molecular dynamics (MD) simulations have been actively used in the study of protein structure and function. However, extensive sampling in the protein conformational space requires large computational resources and takes a prohibitive amount of time. In this study, we demonstrated that variational autoencoders (VAEs), a type of deep learning model, can be employed to explore the conformational space of a protein through MD simulations. VAEs are shown to be superior to autoencoders (AEs) through a benchmark study, with low deviation between the training and decoded conformations. Moreover, we show that the learned latent space in the VAE can be used to generate unsampled protein conformations. Additional simulations starting from these generated conformations accelerated the sampling process and explored hidden spaces in the conformational landscape.
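As a rough illustration of the kind of objective a VAE optimizes, the sketch below computes a per-sample loss as reconstruction error plus a KL divergence term that pulls the latent distribution toward a standard normal. The function name `vae_loss`, the `beta` weight, and the use of mean squared error are assumptions made for this example; the architecture and loss weighting used in the study are not given in this abstract.

```python
import math


def vae_loss(x, x_decoded, mu, log_var, beta=1.0):
    """Per-sample VAE loss (illustrative, not the study's exact loss).

    x, x_decoded : input features and their decoded reconstruction
    mu, log_var  : mean and log-variance of the latent Gaussian
    beta         : hypothetical weight on the KL term
    """
    # Reconstruction term: mean squared error between input and decoded output.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_decoded)) / len(x)
    # KL divergence between N(mu, exp(log_var)) and the standard normal N(0, I).
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
    return recon + beta * kl
```

A perfect reconstruction with a latent distribution equal to the standard normal gives a loss of zero; any deviation in either term increases it.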
Allostery is a fundamental process in regulating protein activities. The discovery, design, and development of allosteric drugs demand better identification of allosteric sites. Several computational methods have been developed previously to predict allosteric sites using static pocket features and protein dynamics. Here, we define a baseline model for allosteric site prediction and present a computational model using automated machine learning. Our model, PASSer2.0, advanced the previous results and performed well across multiple indicators with 82.7% of allosteric pockets appearing among the top three positions. The trained machine learning model has been integrated with the Protein Allosteric Sites Server (PASSer) to facilitate allosteric drug discovery.
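A top-three figure such as the one quoted above can be read as a top-k hit rate over a test set. The sketch below shows one plausible way to compute it, assuming one known allosteric pocket per protein; the function name `top_k_hit_rate` and the input layout are hypothetical, and the exact evaluation protocol behind the reported number is not stated in this abstract.

```python
def top_k_hit_rate(proteins, k=3):
    """Fraction of proteins whose known allosteric pocket ranks in the top k.

    `proteins` is a list of (scores, true_index) pairs: `scores` holds one
    predicted score per pocket and `true_index` marks the allosteric pocket.
    """
    hits = 0
    for scores, true_index in proteins:
        # Rank pocket indices from highest to lowest predicted score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if true_index in ranked[:k]:
            hits += 1
    return hits / len(proteins)
```

For example, if the true pocket lands in the top three for two out of three proteins, the function returns 2/3.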
In this study, we systematically examine the conformational dynamics, binding and allosteric communications in the Omicron BA.1, BA.2, BA.3 and BA.4/BA.5 spike protein complexes with the ACE2 host receptor using...
Allostery refers to the biological process by which an effector modulator binds to a protein at a site distant from the active site, known as the allosteric site. Identifying allosteric sites is essential for discovering allosteric processes and is considered a critical factor in allosteric drug development. To facilitate related research, we developed PASSer (Protein Allosteric Sites Server) at https://passer.smu.edu, a web application for fast and accurate allosteric site prediction and visualization. The website hosts three trained and published machine learning models: (i) an ensemble learning model with extreme gradient boosting and graph convolutional neural network, (ii) an automated machine learning model with AutoGluon and (iii) a learning-to-rank model with LambdaMART. PASSer accepts protein entries directly from the Protein Data Bank (PDB) or user-uploaded PDB files, and can conduct predictions within seconds. The results are presented in an interactive window that displays the protein and pocket structures, as well as a table that summarizes predictions for the three pockets with the highest probabilities/scores. To date, PASSer has been visited over 49 000 times in over 70 countries and has executed over 6 200 jobs.
Molecular dynamics (MD) simulation is widely used to study protein conformations and dynamics. However, conventional simulations tend to become trapped in local energy minima that are hard to escape, so most of the computational time is spent sampling regions that have already been visited. This makes sampling inefficient and hinders the exploration of protein movements within an affordable simulation time. Advances in deep learning provide new opportunities for protein sampling. Variational autoencoders are a class of deep learning models that learn a low-dimensional representation (referred to as the latent space) capturing the key features of the input data. Based on this characteristic, we proposed a new adaptive sampling method, latent space-assisted adaptive sampling for protein trajectories (LAST), to accelerate the exploration of protein conformational space. This method comprises cycles of (i) variational autoencoder training, (ii) seed structure selection on the latent space, and (iii) conformational sampling through additional MD simulations. The proposed approach is validated through the sampling of four structures of two protein systems: two metastable states of Escherichia coli adenosine kinase (ADK) and two native states of Vivid (VVD). In all four conformations, seed structures were shown to lie on the boundary of conformation distributions. Moreover, large conformational changes were observed in a shorter simulation time when compared with structural dissimilarity sampling (SDS) and conventional MD (cMD) simulations in both systems. In metastable ADK simulations, LAST explored two transition paths toward two stable states, while SDS explored only one and cMD neither. In VVD light state simulations, LAST was three times faster than cMD simulation in covering a similar conformational space. Overall, LAST is comparable to SDS and is a promising tool in adaptive sampling. The LAST method is publicly available to facilitate related research.
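Step (ii) of the cycle described above, seed selection on the latent space, can be sketched as picking the latent points that sit farthest from the bulk of the distribution. This is only an assumed stand-in that is consistent with seeds lying on the boundary of the conformation distribution: the function name `select_seeds`, the per-dimension scaling, and the distance criterion are hypothetical, and the selection rule actually used by LAST is not specified in this abstract.

```python
import math


def select_seeds(latent_points, n_seeds=2):
    """Return indices of the n_seeds points farthest from the latent centroid.

    Assumed selection rule for illustration: distance is measured after
    scaling each latent dimension by its standard deviation.
    """
    n = len(latent_points)
    dims = len(latent_points[0])
    mean = [sum(p[d] for p in latent_points) / n for d in range(dims)]
    # Fall back to 1.0 when a dimension has zero spread to avoid division by zero.
    std = [
        math.sqrt(sum((p[d] - mean[d]) ** 2 for p in latent_points) / n) or 1.0
        for d in range(dims)
    ]

    def distance(p):
        return math.sqrt(sum(((p[d] - mean[d]) / std[d]) ** 2 for d in range(dims)))

    # Sort frame indices from most to least outlying and keep the first n_seeds.
    order = sorted(range(n), key=lambda i: distance(latent_points[i]), reverse=True)
    return order[:n_seeds]
```

In a full cycle, the frames at the returned indices would be used as starting structures for the additional MD simulations in step (iii), and the new trajectories would be added to the training set before the next round of autoencoder training.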