2019
DOI: 10.1038/s41587-019-0100-8

Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks

Abstract: Viruses of bacteria and archaea are likely to be critical to all natural, engineered and human ecosystems, and yet their study is hampered by the lack of a universal or scalable taxonomic framework. Here, we introduce vConTACT 2.0, a network-based application to establish prokaryotic virus taxonomy that scales to thousands of uncultivated virus genomes, and integrates confidence scores for all taxonomic predictions. Performance tests using vConTACT 2.0 demonstrate near-identical correspondence to the current o…

Cited by 592 publications (465 citation statements)
References 75 publications
“…To build a gene-sharing network, we retrieved 231,166 protein sequences representing the genomes of 2,304 bacterial and archaeal viruses from NCBI RefSeq (version 85) and used the network analytics tool, vConTACT (version 2.0; https://bitbucket.org/MAVERICLab/vcontact2) (Jang et al, 2019), as an app at iVirus (Bolduc et al, 2017a). Briefly, including protein sequences from PA5oct, a total of 231,627 sequences were subjected to all-to-all BLASTp searches, with an E-value threshold of 10⁻⁴, and defined as the homologous protein clusters (PCs) in the same manner as previously described (Bolduc et al, 2017b).…”
Section: Protein Family Clustering and Construction Of The Relationsh… (mentioning)
confidence: 99%
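
The filtering step in the statement above (all-to-all BLASTp hits kept at an E-value threshold of 10⁻⁴) can be sketched roughly as follows. This is a minimal illustration, not the cited authors' pipeline or vConTACT's own code: it assumes hits are in BLAST's standard 12-column tabular layout (`-outfmt 6`), and the function name `filter_hits` and the choice to drop self-hits are assumptions made for the example.

```python
def filter_hits(lines, max_evalue=1e-4):
    """Keep BLASTp hits at or below an E-value cutoff.

    Assumes BLAST tabular output (-outfmt 6), where column 1 is the query ID,
    column 2 the subject ID, and column 11 the E-value.
    Dropping self-hits is an assumption of this sketch.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if query != subject and evalue <= max_evalue:
            hits.append((query, subject, evalue))
    return hits


# Hypothetical protein IDs, for illustration only.
example = [
    "protA\tprotB\t45.0\t200\t110\t0\t1\t200\t1\t200\t1e-30\t150",
    "protA\tprotC\t25.0\t80\t60\t0\t1\t80\t1\t80\t0.5\t30",
    "protA\tprotA\t100.0\t200\t0\t0\t1\t200\t1\t200\t0.0\t400",
]
kept = filter_hits(example)
```

The surviving pairs would then feed whatever clustering step defines the protein clusters; that step is not shown here because the statement only refers to it as "previously described".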
“…Based on the number of shared PCs between the genomes, vConTACT v2.0 calculated the degree of similarity as the negative logarithmic score by multiplying hypergeometric similarity P-value by the total number of pairwise comparisons. Subsequently, pairs of closely related genomes with a similarity score of ≥ 1 were grouped into viral clusters (VCs), with default parameters of vConTACT v2.0 (Jang et al, 2019). The network was visualized with Cytoscape (version 3.5.1; http://cytoscape.org/), using an edge-weighted spring embedded model, which places the genomes or fragments sharing more PCs closer to each other.…”
Section: Protein Family Clustering and Construction Of The Relationsh… (mentioning)
confidence: 99%
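
The similarity score described in the statement above can be illustrated with a short sketch. It assumes one common reading of that description: the P-value is the hypergeometric upper tail (the probability of two genomes sharing at least the observed number of PCs by chance), the number of pairwise comparisons is n(n-1)/2 for n genomes, and the logarithm is base 10. The function names are hypothetical and this is not vConTACT's actual code.

```python
from math import comb, log10


def hypergeom_upper_tail(shared, pcs_a, pcs_b, total_pcs):
    """P(X >= shared): chance that genome B's pcs_b PCs, drawn from
    total_pcs, include at least `shared` of genome A's pcs_a PCs."""
    upper = min(pcs_a, pcs_b)
    numerator = sum(
        comb(pcs_a, i) * comb(total_pcs - pcs_a, pcs_b - i)
        for i in range(shared, upper + 1)
    )
    return numerator / comb(total_pcs, pcs_b)


def similarity_score(shared, pcs_a, pcs_b, total_pcs, n_genomes):
    """Negative log10 of the P-value multiplied by the number of pairwise
    genome comparisons (assumed here to be n * (n - 1) / 2)."""
    comparisons = n_genomes * (n_genomes - 1) // 2
    p_value = hypergeom_upper_tail(shared, pcs_a, pcs_b, total_pcs)
    if p_value == 0.0:
        return float("inf")  # probability underflowed to zero
    return -log10(p_value * comparisons)


# Illustrative numbers only: two genomes with 10 PCs each, 1,000 PCs overall,
# 100 genomes in the network.
strong = similarity_score(5, 10, 10, 1000, 100)
weak = similarity_score(1, 10, 10, 1000, 100)
```

Under these assumptions, the pair sharing five PCs clears the "similarity score of ≥ 1" cutoff quoted above, while the pair sharing a single PC does not.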
“…To understand the relationships between the large numbers of virophages and PLVs in our dataset, we used a network-based analysis of shared protein clusters using vConTACT v.2.0 [21]. Such an approach is suitable when faced with numerous distantly related viruses which may undergo frequent genetic exchange.…”
Section: Main Text (mentioning)
confidence: 99%