2015
DOI: 10.1093/database/bav012

Improving the consistency of domain annotation within the Conserved Domain Database

Abstract: When annotating protein sequences with the footprints of evolutionarily conserved domains, conservative score or E-value thresholds need to be applied for RPS-BLAST hits, to avoid many false positives. We notice that manual inspection and classification of hits gathered at a higher threshold can add a significant amount of valuable domain annotation. We report an automated algorithm that ‘rescues’ valuable borderline-scoring domain hits that are well-supported by domain architecture (DA, the sequential order o…

Citations: cited by 19 publications (14 citation statements)
References: 12 publications
“…Candidate sequences were aligned using ClustalX2 and a neighbor-joining phylogram was constructed, with both using the default parameters. The NCBI Conserved Domain Search tool was used to examine the sequences for conserved regions against the Conserved Domains Database, v3.13 (Derbyshire et al, 2015).…”
Section: Methods (mentioning)
confidence: 99%
“…A. Schematic of the BB0794 protein. The gray boxes from amino acids 1–34 indicate the putative N-terminal signal peptide predicted by PrediSi (Nielsen et al., 1997) while amino acid 1117–1443 represent the C-terminal DUF490 domain identified by the Conserved Domain Database (CDD) algorithm (Derbyshire et al., 2015).…”
Section: Fig (mentioning)
confidence: 99%
“…Additional clusters were included in the analyzed set based on the phylogenetic representation of their members. Selected representatives of each cluster were checked for the presence of known protein domains in their N-terminal regions using CD-Search (70) with relaxed parameters (expect value cut-off of 10) and the predicted structural similarity to PilZ was checked using HHPred (64, 65). Protein secondary structures were taken from the HHpred outputs or predicted using JPred (71).…”
Section: Methods (mentioning)
confidence: 99%