2014
DOI: 10.1371/journal.pone.0106691

iDNA-Prot|dis: Identifying DNA-Binding Proteins by Incorporating Amino Acid Distance-Pairs and Reduced Alphabet Profile into the General Pseudo Amino Acid Composition

Abstract: Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients for both eukaryotic and prokaryotic proteomes. With the avalanche of protein sequences generated in the postgenomic age, it is a critical challenge to develop automated methods for accurately and rapidly identifying DNA-binding proteins based on their sequence information alone. Here, a novel predi…

Cited by 248 publications (211 citation statements)
References 96 publications (116 reference statements)
“…These profile-based methods can significantly improve the protein remote homology detection [7,8], protein fold recognition and so forth. Moreover, added into the amino acid composition category are 3 new modes: they are "DR" [274], "Distance Pair" [271], and "PDT" [270]. DR is the abbreviation for "Distance-based Residue".…”
Section: Category Mode (mentioning)
confidence: 99%
“…It is sequence-based method, in which the generated feature vector for protein sequence is based on the distance between residue pairs and has shown better performance for protein remote homology detection. "Distance Pair" method incorporates the amino acid distance pair coupling information and the amino acid reduced alphabet profile into the general pseudo amino acid composition (PseAAC) [108] vector, which is very useful for analysing DNA-binding proteins [15,170,189,275]. PDT is the abbreviation for "physicochemical distance transformation", which can incorporate considerable sequence-order information or important patterns of protein/peptide sequences into Pseudo components [28], which is very useful for conducting various proteome analyses [17, 23, 215-217, 224, 225, 231, 235, 276-289] and genome analysis as well [216,218,220,223,229,255,277,290].…”
Section: Category Mode (mentioning)
confidence: 99%
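
The "Distance Pair" idea quoted above can be illustrated with a minimal sketch. It assumes the features are the normalized frequencies of single residues plus the frequencies of ordered residue pairs separated by a distance d = 1..max_dist; the function name `distance_pair_features` and the parameter `max_dist` are hypothetical, and the exact formulation in the cited paper may differ.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def distance_pair_features(seq, max_dist=2, alphabet=AMINO_ACIDS):
    """Illustrative distance-pair features (an assumption, not the paper's code).

    Returns a dict with keys (a,) for single-residue frequencies and
    (a, b, d) for the frequency of residue a followed by residue b
    exactly d positions later.
    """
    # Drop any character outside the chosen alphabet (e.g. 'X' or gaps).
    residues = [r for r in seq.upper() if r in alphabet]
    n = len(residues)
    features = {}

    # Plain amino acid composition.
    for a in alphabet:
        features[(a,)] = residues.count(a) / n if n else 0.0

    # Pair frequencies at each distance d.
    for d in range(1, max_dist + 1):
        total = n - d
        counts = Counter((residues[i], residues[i + d]) for i in range(max(total, 0)))
        for a, b in product(alphabet, repeat=2):
            features[(a, b, d)] = counts[(a, b)] / total if total > 0 else 0.0

    return features


if __name__ == "__main__":
    feats = distance_pair_features("ACAC", max_dist=2)
    print(len(feats))             # 20 + 400 * max_dist entries
    print(feats[("A", "C", 1)])   # A followed directly by C
```

With the 20-letter alphabet the vector has 20 + 400 × max_dist entries, which is one motivation for the reduced alphabet mentioned in the same statement: grouping residues shrinks the pair space.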
“…7-10) above were often used in the literature to measure the prediction quality of a prediction method, they are no longer the best ones because they lack intuitiveness and are not easy to understand for most biologists, particularly the MCC (the Matthews correlation coefficient). To make it easy to read, we adopt an additional four metrics proposed by Chou (Chou 2001a, b;Chen et al 2013;Lin et al 2014;Liu et al 2014;Guo et al 2014):…”
Section: Evaluation Indices (mentioning)
confidence: 99%
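
The excerpt above breaks off before listing the four metrics. As a hedged sketch, the code below assumes they are sensitivity (Sn), specificity (Sp), accuracy (Acc) and MCC, written in terms of the number of positive samples, the number of negative samples, and the two kinds of misclassification, a formulation commonly used in this literature. The function and argument names are hypothetical.

```python
import math


def intuitive_metrics(n_pos, n_neg, pos_missed, neg_missed):
    """Sn, Sp, Acc and MCC from sample counts (assumed formulation).

    n_pos      -- total number of positive samples
    n_neg      -- total number of negative samples
    pos_missed -- positives incorrectly predicted as negative
    neg_missed -- negatives incorrectly predicted as positive
    """
    if n_pos <= 0 or n_neg <= 0:
        raise ValueError("both classes must contain at least one sample")

    sn = 1 - pos_missed / n_pos
    sp = 1 - neg_missed / n_neg
    acc = 1 - (pos_missed + neg_missed) / (n_pos + n_neg)

    # Algebraically equivalent to the usual TP/TN/FP/FN form of MCC.
    numerator = 1 - (pos_missed / n_pos + neg_missed / n_neg)
    denominator = math.sqrt(
        (1 + (neg_missed - pos_missed) / n_pos)
        * (1 + (pos_missed - neg_missed) / n_neg)
    )
    mcc = numerator / denominator if denominator > 0 else 0.0

    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


if __name__ == "__main__":
    print(intuitive_metrics(n_pos=100, n_neg=100, pos_missed=10, neg_missed=20))
```

Substituting TP = n_pos - pos_missed, FN = pos_missed, TN = n_neg - neg_missed and FP = neg_missed recovers the conventional definitions, so the two forms give identical values.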
“…More recently the notion of reduce alphabet amino acid composition method (RAAAC) was applied by different researchers and achieved remarkable results. Further it has been used in various area of computational biology, such as prediction of DNA-Binding proteins [21], prediction of defensin family and subfamilies [22] and prediction of bioluminescent proteins [23]. Similarly, Feng et al have used RAAAC and Support Vector Machine (SVM) for prediction of HSPs families and obtained maximum overall accuracy 87.82% [2].…”
Section: Introduction (mentioning)
confidence: 99%
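
A reduced-alphabet composition of the kind mentioned above can be sketched as follows. The six-group partition used here is an illustrative assumption chosen for readability, not the clustering used in any of the cited studies, and `REDUCED_GROUPS` and `reduced_alphabet_composition` are hypothetical names.

```python
# Illustrative grouping of the 20 residues into 6 classes (an assumption).
REDUCED_GROUPS = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}


def reduced_alphabet_composition(seq, groups=REDUCED_GROUPS):
    """Fraction of residues falling into each group of the reduced alphabet."""
    # Invert the grouping: residue letter -> group name.
    residue_to_group = {
        residue: name for name, members in groups.items() for residue in members
    }
    # Map the sequence, skipping characters outside the 20 standard residues.
    mapped = [residue_to_group[r] for r in seq.upper() if r in residue_to_group]
    total = len(mapped)
    return {
        name: (mapped.count(name) / total if total else 0.0) for name in groups
    }


if __name__ == "__main__":
    print(reduced_alphabet_composition("AKDE"))
```

The resulting vector has one entry per group instead of one per residue, which is the dimensionality reduction that makes the representation attractive as classifier input.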