2009
DOI: 10.1371/journal.pone.0004345

Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies

Abstract: The dramatic increase in heterogeneous types of biological data—in particular, the abundance of new protein sequences—requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to …

Citations: cited by 389 publications (481 citation statements)
References: 48 publications (73 reference statements)
“…Taxonomic annotations for the sequences were obtained from the NCBI Taxonomy database and sequences duplicated at the species level were deleted. Clustering on sequence similarity networks (Atkinson et al, 2009) generated using the Enzyme Function-Initiative Enzyme Similarity Tool (Gerlt et al, 2015) were used to identify homologs of characterized proteins from nonspecific hits. In this analysis, nodes represent individual proteins and edges represent the all-versus-all BLAST E-values (Altschul et al, 1990) between them.…”
Section: Gene Sequence Retrieval (mentioning)
confidence: 99%
“…In this analysis, nodes represent individual proteins and edges represent the all-versus-all BLAST E-values (Altschul et al, 1990) between them. Closely related proteins form visual clusters, allowing the identification of sequences belonging to a protein family from those belonging to related families (Atkinson et al, 2009). Final sequence sets were obtained by decreasing logE-value cutoffs until no major changes in clustering were observed with large increases in cutoff value.…”
Section: Gene Sequence Retrieval (mentioning)
confidence: 99%
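The cutoff procedure described in the statement above can be illustrated with a minimal sketch. It assumes clusters are read off as connected components of the network after dropping edges weaker than a chosen -log10(E-value) score; the function name `cluster_at_cutoff` and the toy E-values are hypothetical and do not come from the cited work.

```python
import math
from collections import defaultdict


def cluster_at_cutoff(pairs, min_score):
    """Return clusters (connected components) of a similarity network.

    pairs: iterable of (protein_a, protein_b, evalue) from pairwise comparisons.
    min_score: an edge is kept only if -log10(evalue) >= min_score.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, evalue in pairs:
        root_a, root_b = find(a), find(b)  # registers both nodes, even if the edge is dropped
        score = math.inf if evalue <= 0 else -math.log10(evalue)
        if score >= min_score:
            parent[root_a] = root_b  # merge the two components

    groups = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy data, not taken from any cited study.
toy_pairs = [
    ("A", "B", 1e-50),
    ("B", "C", 1e-40),
    ("C", "D", 1e-8),
    ("D", "E", 1e-45),
]

# Sweep the cutoff and watch how the partition changes.
for cutoff in (5, 20, 42):
    print(cutoff, cluster_at_cutoff(toy_pairs, cutoff))
```

With this toy input, the permissive cutoff of 5 leaves a single cluster, 20 splits it into two, and 42 yields three. Comparing partitions across cutoffs in this way is one plausible reading of "no major changes in clustering"; the cited authors' exact stopping criterion is not given in the statement.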
“…The resulting 5923 sequences were used to generate sequence similarity networks using previously published methods (31) and visualized using the Cytoscape program. Nodes in the network represent sequences, and edges represent BLAST E-values.…”
Section: Network Analysis Of Clds and Their Closest Sequence Relative (mentioning)
confidence: 99%
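As a rough illustration of how "edges represent BLAST E-values" might be realized in practice, the sketch below turns all-versus-all BLAST results into an undirected edge list. It assumes the results were saved in BLAST+ tabular format (`-outfmt 6`, where the first two columns are query and subject IDs and the eleventh is the E-value); the citing paper does not state which output format it used, and the function name and sample rows are hypothetical.

```python
def edges_from_blast_tabular(lines):
    """Map each unordered protein pair to its best (lowest) E-value.

    lines: iterable of tab-separated rows in BLAST+ -outfmt 6 column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query == subject:
            continue  # self-hits carry no information about relationships
        key = tuple(sorted((query, subject)))  # A-B and B-A are the same edge
        if key not in best or evalue < best[key]:
            best[key] = evalue
    return best


# Hypothetical rows for illustration only.
sample_rows = [
    "P1\tP1\t100.0\t200\t0\t0\t1\t200\t1\t200\t0.0\t400",
    "P1\tP2\t45.0\t180\t90\t3\t5\t184\t2\t181\t1e-30\t120",
    "P2\tP1\t45.0\t180\t90\t3\t2\t181\t5\t184\t3e-31\t122",
]
edges = edges_from_blast_tabular(sample_rows)
```

Keeping the lowest E-value per pair is a design choice made here for simplicity, since the two directions of an all-versus-all search can report slightly different values; other conventions are equally possible.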
“…A sequence similarity network was constructed according to the method of Atkinson and coworkers (Atkinson, Morris, Ferrin, & Babbitt, 2009). Their method of independent pairwise alignments between sequences allows functional relationships to be observed over very large sets of evolutionarily related proteins, such as members of …”
Section: Sequence Similarity Network Analysis (mentioning)
confidence: 99%
“…To examine this apparent discrepancy, we first carried out a sequence similarity network analysis (Atkinson et al, 2009) of COG 1522, which contains the diverse members of the AsnC/Lrp TR family in 66 genomes of unicellular organisms (Figure 3). Bxe_A0736 and the L-kynurenine responsive TR KynR from P. aeruginosa are together in group 24 while AsnC is in group 17.…”
Section: Sequence Similarity Network Analysis (mentioning)
confidence: 99%