Sequence comparison is one of the foundations of bioinformatics and can be used to study evolutionary relationships among sequences. In this study, a 2D spectrum-like graphical representation of protein sequences is presented, based on the hydrophobicity scale of amino acids. The frequencies of the amplitudes of 4-subsequences are adopted to characterize a spectrum-like graph, and a 17D vector is used as the descriptor of a protein sequence. The χ2 value of a compatibility test is then computed. The new similarity analysis approach is illustrated on all the protein sequences encoded by the mitochondrial genomes of 20 different species. Finally, a comparison with the ClustalW method shows the utility of our method.
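The abstract does not state which form of the χ2 compatibility test is used. As a minimal sketch, assuming a standard Pearson χ2 test of homogeneity applied to two vectors of bin counts (for example, the 17 amplitude-frequency counts of two proteins), the statistic could be computed as follows. The function name `chi_square_statistic` and the toy counts are hypothetical and not taken from the paper.

```python
from typing import Sequence


def chi_square_statistic(counts_a: Sequence[float], counts_b: Sequence[float]) -> float:
    """Pearson chi-square statistic for a 2 x k table of counts.

    Assumption: each descriptor is a vector of non-negative counts over the
    same k bins. A smaller value means the two count profiles are more alike.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both descriptors must have the same number of bins")
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each descriptor needs at least one observation")
    grand_total = total_a + total_b

    chi2 = 0.0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        if column_total == 0:
            continue  # a bin that is empty in both rows contributes nothing
        # Expected counts under the hypothesis that both rows share one distribution
        expected_a = total_a * column_total / grand_total
        expected_b = total_b * column_total / grand_total
        chi2 += (a - expected_a) ** 2 / expected_a
        chi2 += (b - expected_b) ** 2 / expected_b
    return chi2


# Toy example with made-up counts over three bins
print(chi_square_statistic([10, 20, 5], [12, 18, 5]))
```

Computing this value for every pair of sequences would give a distance-like matrix, which is one plausible input for the similarity analysis described above.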
The location of a protein in a cell is closely correlated with its biological function. Extracting information from the protein sequence alone for subcellular localization prediction remains a challenge in computational biology. In this paper, we propose a novel method that couples amino acid composition and amino acid hydrophobicity with the position specific scoring matrix to predict the subcellular localizations of prokaryotic and eukaryotic proteins. Using a defined evolutionary difference formula, proteins of varying length are expressed as vectors of uniform dimension, which can represent evolutionary difference information between the u residues of a given protein. To implement and evaluate the proposed method, a support vector machine (SVM) and jackknife tests are employed. The overall accuracies are 95.1% and 90.9% for prokaryotic and eukaryotic proteins, respectively. Comparison with previous methods shows that our method may provide a promising approach to protein subcellular localization prediction.
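Two ideas in this abstract lend themselves to a short illustration: mapping proteins of different lengths to fixed-length vectors, and the jackknife (leave-one-out) test. The sketch below is not the authors' method. It uses only amino acid composition as the feature (no hydrophobicity or position specific scoring matrix terms) and swaps in a 1-nearest-neighbour classifier for the SVM so that it runs with the standard library alone. All function names, sequences, and labels are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Map a protein of any length to a 20-dimensional frequency vector."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in classifier (not an SVM): label of the closest training vector."""
    def squared_distance(i):
        return sum((p - q) ** 2 for p, q in zip(train_x[i], x))
    return train_y[min(range(len(train_x)), key=squared_distance)]


def jackknife_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Leave each sample out once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Toy data: made-up sequences and location labels, for illustration only
toy_sequences = ["AAAA", "AAAC", "WWWW", "WWWY"]
toy_labels = ["cytoplasm", "cytoplasm", "membrane", "membrane"]
toy_features = [composition(s) for s in toy_sequences]
print(jackknife_accuracy(toy_features, toy_labels))
```

Because `jackknife_accuracy` accepts any `predict` callable with the same signature, a real SVM could be substituted without changing the evaluation loop.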