Prediction and reconstruction of metabolic pathways play significant roles in many fields, such as genetic engineering, metabolic engineering, and drug discovery, and are becoming the most active research topics in synthetic biology. With the growth of related data and the development of machine learning techniques, many machine learning based methods have been proposed for the prediction or reconstruction of metabolic pathways. Machine learning techniques show state-of-the-art performance in handling the rapidly increasing volume of data in synthetic biology. To support researchers in this field, we briefly review the research progress of metabolic pathway reconstruction and prediction based on machine learning. Some challenging issues in the reconstruction of metabolic pathways are also discussed in this paper.
Simulated alignments are alternatives to manually constructed multiple sequence alignments for evaluating the performance of multiple sequence alignment tools. Simulated sequences are valuable because their true evolutionary history is known, which is very helpful for reconstructing accurate phylogenetic trees and alignments. However, generating simulated alignments requires expertise with bioinformatics tools and takes several hours to reconstruct even a few hundred simulated sequences. It becomes a tedious job for an end user who needs a few datasets of a variety of simulated sequences. Currently, no databank is available from which researchers can download simulated sequences or alignments for their studies. The major focus of our study was to develop a database of simulated protein sequences (SAliBASE) based on varying parameters such as insertion rate, deletion rate, sequence length, number of sequences, and indel size. Each dataset has a corresponding alignment as well. This repository is very useful for evaluating multiple alignment methods.
<div># Machine learning classifiers for prediction of pathway module & its classes</div><div>We use the SMILES representation of query molecules to generate relevant fingerprints, which are then fed to the machine learning classifiers ETC to produce binary labels corresponding to the pathway module and its classes. The details of the work are described in our paper.</div><div>The dataset contains 6597 compounds downloaded from KEGG: 4612 compounds are labeled by whether or not they belong to a pathway module in a metabolic pathway, and the remaining 1985 compounds are used for the module class prediction problem.</div><div>### Requirements</div><div>* Chemoinformatics tools</div><div>* Python</div><div>* scikit-learn</div><div>* RDKit</div><div>* Jupyter Notebook</div><div>### Usage</div><div>We provide two folders containing the classifier files, the grid search for hyperparameter optimization, and the datasets (module, module classes)</div>
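The flow described above (SMILES string, then fingerprint, then classifier, then binary label) can be illustrated with a minimal sketch. It is not the repository's code. So that it runs with the Python standard library alone, it swaps in a toy hashed-substring fingerprint where RDKit fingerprints would be used, and a Tanimoto nearest-neighbour rule where the scikit-learn classifiers would be used. The function names, the fingerprint length, and the training pairs with their labels are all hypothetical.

```python
import hashlib

N_BITS = 256  # fingerprint length; an arbitrary choice for this sketch


def toy_fingerprint(smiles, n_bits=N_BITS, max_len=3):
    """Hash every SMILES substring of length 1..max_len to a bit index.

    Toy stand-in for a real chemical fingerprint: it sees characters,
    not atoms and bonds, so it carries no real chemical meaning.
    """
    on_bits = set()
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            chunk = smiles[start:start + size]
            digest = hashlib.sha256(chunk.encode("utf-8")).digest()
            on_bits.add(int.from_bytes(digest[:4], "big") % n_bits)
    return frozenset(on_bits)


def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict(query_smiles, training_set):
    """Return the binary label of the most similar training molecule."""
    query_fp = toy_fingerprint(query_smiles)
    best = max(
        training_set,
        key=lambda item: tanimoto(query_fp, toy_fingerprint(item[0])),
    )
    return best[1]


# Hypothetical (SMILES, label) pairs; the labels are invented for
# illustration and are NOT taken from the KEGG dataset.
TRAINING = [
    ("CCO", 1),        # ethanol
    ("CC(=O)O", 1),    # acetic acid
    ("c1ccccc1", 0),   # benzene
    ("c1ccccc1O", 0),  # phenol
]

label = predict("CCO", TRAINING)
print("predicted label:", label)
```

A nearest-neighbour rule was chosen only because it is short; any classifier that accepts fixed-length bit vectors fits the same slot, and the fingerprint function is the only part that touches the SMILES string.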