In this paper, we use a deep reinforcement learning (DRL) approach to solve the multiple sequence alignment (MSA) problem, which is NP-complete. MSA is the process of arranging DNA, RNA, or protein sequences so as to maximize their regions of similarity. It is the first step in many bioinformatics tasks, such as constructing phylogenetic trees. Our approach models MSA as a DRL problem and uses long short-term memory (LSTM) networks for the estimation phase of the reinforcement learning algorithm. In addition, an actor-critic algorithm with experience replay is used to speed up convergence. Using deep Q-learning (an RL approach) with a Q-network overcomes the complexity of other approaches. The experimental evaluation is performed on 8 real-life datasets; on every dataset, our approach achieves better alignment scores than well-known approaches and tools such as MAFFT and ClustalW, as well as other heuristic approaches.
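The abstract does not say which alignment score is used in the evaluation. As an illustration only, the sketch below computes a sum-of-pairs score, one common way to score a multiple sequence alignment. The function name `sum_of_pairs` and the match, mismatch, and gap values are assumptions chosen for the example, not values taken from the paper.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score for an alignment given as equal-length strings.

    '-' marks a gap. The scoring values are illustrative assumptions.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned rows must have the same length")

    total = 0
    # Walk the alignment column by column and score every pair of rows.
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # a gap aligned with a gap contributes nothing
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


if __name__ == "__main__":
    print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```

In a reinforcement learning formulation, a score of this kind could serve as the reward signal for a candidate alignment, although the abstract does not describe the reward in that detail.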
This article addresses the protein folding problem with a deep reinforcement learning (DRL) approach. The protein folding problem is NP-hard in general, and because we formulate our approach in the hydrophobic-polar (HP) model, we deal with an NP-complete problem. It is also a combinatorial optimization problem; such problems are hard to solve optimally, so any attempt to improve their solutions is beneficial. In general, the problem refers to predicting the structure of a protein from its amino acid sequence. In recent years, the protein folding problem has attracted a lot of attention. The time and expense of identifying the three-dimensional structure with nuclear magnetic resonance imaging and crystallography are the main motivation for the many approaches that have been proposed. In this study, our approach models the problem as a DRL problem and, to enhance performance, adopts long short-term memory (LSTM) networks for the approximation phase of the reinforcement learning algorithm. Using deep Q-learning and an actor-critic algorithm with an experience replay mechanism overcomes the complexity of other proposed approaches, which leads to better accuracy in less time. In addition, we analyze the efficiency and effectiveness of the dueling deep Q-network technique for solving the protein folding problem. The purpose of this study is to provide a step-by-step implementation and model for solving the two-dimensional protein folding problem with the DRL approach, which could also be helpful for other omics and computational biology problems. A comparison between the DRL approach and other notable approaches (Sect. 6) shows that our approach outperforms them in finding the minimum value of the free energy, which is the main factor in the protein folding problem, and does so in less time in every available case.