2004
DOI: 10.1073/pnas.0405612101

Comparative homology agreement search: An effective combination of homology-search methods

Abstract: Many methods have been developed to search for homologous members of a protein family in databases, and the reliability of results and conclusions may be compromised if only one method is used, neglecting the others. Here we introduce a general scheme for combining such methods. Based on this scheme, we implemented a tool called comparative homology agreement search (CHASE) that integrates different search strategies to obtain a combined "E value." Our results show that a consensus method integrating distinc…

Citations: cited by 20 publications (13 citation statements)
References: 29 publications (26 reference statements)

“…All-against-all BLAST searches (Altschul et al 1990) were performed to reveal the similarities among 348,995 predicted protein sequences, identifying 47,342,483 similarities below the maximum E-value 1 × 10 Formulating data sets for fungal species tree Additional homology searches were performed using CHASE (Alam et al 2004) to select universal (i.e., present in all species) protein clusters. Two data sets were formulated: (1) To get a better rooting of the fungal phylogenetic tree, H. sapiens and A. thaliana sequences were collected for 12 universal fungal protein clusters from the HomoloGene orthologs databases of the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.…”
Section: Orthology Assignments (mentioning)
confidence: 99%
“…Therefore, in most cases the existing similarity search algorithms cannot detect all homologues, orthologs, or paralogs of a particular sequence [24]. This is of major significance for biodefense because probes and primers can lose both sensitivity and specificity due to "molecular erosion" of a pathogen's target genomic region. This is further complicated because existing bioinformatics methods cannot estimate how "specific" is a biothreat sequence segment.…”
Section: Challenges (mentioning)
confidence: 99%
“…Similarly, Kolde et al (2012) propose a robust rank aggregation (RRA) method for gene list integration, which yields a higher accuracy than the individual gene lists. Consistently, ensemble methods have been shown to yield a higher accuracy than individual methods in other biological problems, such as finding homologous members of a protein family in databases (Alam et al, 2004), and cancer classification (Tan & Gilbert, 2003). To conclude, ensemble methods have a strong potential in biological data mining.…”
Section: Introduction (mentioning)
confidence: 99%