2015
DOI: 10.1109/tcbb.2014.2359451

Disulfide Connectivity Prediction Based on Modelled Protein 3D Structural Information and Random Forest Regression

Abstract: Disulfide connectivity is an important protein structural characteristic. Accurately predicting disulfide connectivity solely from protein sequence helps to improve the intrinsic understanding of protein structure and function, especially in the post-genome era, where a large volume of sequenced proteins without functional annotation is quickly accumulating. In this study, a new feature extracted from the predicted protein 3D structural information is proposed and integrated with traditional features to form …

Cited by 17 publications (7 citation statements)
References: 58 publications
“…A comparison between the different values of n was performed and results reported in Table 8. The n of 6 was found to be effective in prior research [36]. It is a coincidence that RAMseq and RAMmod were fairly close to the number 6 for PSS and PSSM as seen in Table 2. We used 6 as a general number that can be expected to perform well on any new dataset based on the results.…”
Section: Results; citation type: mentioning
confidence: 67%
“…For a window size of 2*k + 1 (k positions to the left of the target cysteine, k to the right, and the target cysteine position itself) and the twenty major amino acids, we get a matrix that is twenty rows long and 2*k + 1 columns wide that can be vectorized for a total of 20*(2*k + 1) features. We chose 6 for k as found in the prior work [36]. Our PSSMs provided 260 entries for the classifier.…”
Section: Methods; citation type: mentioning
confidence: 99%
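The window arithmetic in the statement above (k = 6 gives 2*6 + 1 = 13 columns, and 20 * 13 = 260 features) can be illustrated with a minimal sketch. The function name `window_features`, the list-of-lists PSSM layout, and the zero padding at the sequence ends are assumptions made for illustration; the quoted text does not say how positions near the termini are handled.

```python
from typing import List

NUM_AMINO_ACIDS = 20  # one PSSM score per standard amino acid


def window_features(pssm: List[List[float]], center: int, k: int = 6) -> List[float]:
    """Flatten the 2*k + 1 PSSM columns centred on `center` into one vector.

    pssm[i] holds the 20 scores for sequence position i.
    Positions that fall outside the sequence are zero-padded (an assumption).
    """
    features: List[float] = []
    for pos in range(center - k, center + k + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0.0] * NUM_AMINO_ACIDS)
    return features


# Toy PSSM: 30 positions, each row filled with its own index.
toy_pssm = [[float(i)] * NUM_AMINO_ACIDS for i in range(30)]
vector = window_features(toy_pssm, center=10)
print(len(vector))  # 260
```

With k = 6 the vector length matches the 260 entries mentioned in the quoted statement.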
“…The classification results are voted on by the decision trees. The final feature category relies on the classification result with the largest number of votes [32]. The Gini index is used to measure the classification results.…”
Section: Introduction; citation type: mentioning
confidence: 99%
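The voting and Gini steps described in the statement above can be sketched as follows. This is a generic illustration rather than the cited paper's implementation: it assumes "Gini index" refers to the Gini impurity commonly used as a split criterion in decision trees, and the function names `gini_impurity` and `majority_vote` are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def majority_vote(votes: Sequence[Hashable]) -> Hashable:
    """Return the class predicted by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical per-tree predictions for a single sample.
tree_votes = ["bonded", "free", "bonded", "bonded", "free"]
print(majority_vote(tree_votes))                        # bonded
print(gini_impurity(["bonded", "bonded", "free", "free"]))  # 0.5
```

A pure node (all labels identical) has impurity 0, and a balanced two-class node has impurity 0.5.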
“…Different from the above ab initio approaches which perform predictions by only using the amino acid sequence information, the other trend is using the homology modeling techniques, where some prediction features are extracted from the modeled structures. For instance, the spatial distance between the cysteine residues in the modeled structure can be used as an encoding feature (Yu et al., 2015). Other studies in this trend include: Lin and Tseng (2010) and O'Connor and Yeates (2004).…”
Section: Introduction; citation type: mentioning
confidence: 99%
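The distance feature mentioned in the statement above can be illustrated with a minimal sketch. It assumes each cysteine is represented by a single 3D coordinate taken from the modeled structure; which atom is used (for example Cα, Cβ, or Sγ) is not stated in the quoted text, and the function name and toy coordinates are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def cysteine_pair_distances(coords: Dict[int, Coord]) -> Dict[Tuple[int, int], float]:
    """Euclidean distance for every unordered pair of cysteines.

    `coords` maps a cysteine's sequence position to one representative
    3D coordinate from the modeled structure (atom choice is an assumption).
    """
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i, j in combinations(sorted(coords), 2)
    }


# Toy coordinates for three cysteines at sequence positions 3, 10 and 20.
toy_coords = {3: (0.0, 0.0, 0.0), 10: (3.0, 4.0, 0.0), 20: (0.0, 0.0, 5.0)}
distances = cysteine_pair_distances(toy_coords)
print(distances[(3, 10)])  # 5.0
```

Each pairwise distance could then serve as one input feature for a candidate cysteine pair, alongside sequence-derived features.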