2016
DOI: 10.1007/978-1-4939-3572-7_25

The Recipe for Protein Sequence-Based Function Prediction and Its Implementation in the ANNOTATOR Software Environment

Abstract: As biomolecular sequencing is becoming the main technique in life sciences, functional interpretation of sequences in terms of biomolecular mechanisms with in silico approaches is getting increasingly significant. Function prediction tools are most powerful for protein-coding sequences; yet, the concepts and technologies used for this purpose are not well reflected in bioinformatics textbooks. Notably, protein sequences typically consist of globular domains and non-globular segments. The two types of regions r…

Citations: cited by 18 publications (19 citation statements)
References: 126 publications
“…The prediction accuracy for true positives and negatives is reported to be close to 100%, and the remaining main cause of false positive prediction is hydrophobic α-helices completely buried in the hydrophobic core of proteins. Note that reliable prediction of TMHs and protein topology is a strong restriction for protein function of even otherwise non-characterised proteins [26–28] and thus provides very valuable information.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To resolve the annotation discrepancies, the sequences containing the 389 domain hits were subjected to a dissectHMMER [13–15, 24] analysis where pairs of HMMER E-values are dissected into their fold-critical and remnant contributions. Briefly, it is the fold-critical parts (the supposedly 3D structural part) of a sequence-to-domain alignment that argues for a similar overall fold, hence similar function whereas the remnant part represents disordered, fibrillary or other non-globular regions that might be not obligatory for the domain fold [13, 14, 24].…”
Section: Results
Citation type: mentioning
confidence: 99%
“…SIFT results were first retrieved using “SIFT dbSNP batch tool” which was run on 21 March 2012 to pre-screen the results. After that, orthologue sequences (select only “1:1 orthologs”) were retrieved from either OMA browser [24] or Orthologue search against NCBI Non-redundant protein set on ANNOTATOR [25] and were used to create a multiple sequence alignment with MAFFT (L-INS-I settings) [26]. We deleted those sequences that have large gaps using Jalview [27].…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…A SNP is located in the functional domain of a protein. We used the amino acid sequence of the gene that the SNP is located in as input to do “Prim-Seq-An w/Pfam” analysis in ANNOTATOR [25] using default settings which include HMMER against many protein domain databases e.g. SMART, Pfam to retrieve functional domain information of the protein.…”
Section: Methods
Citation type: mentioning
confidence: 99%