2007
DOI: 10.1371/journal.pbio.0050016

The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families

Abstract: Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only G…

Cited by 753 publications (681 citation statements)
References 150 publications

“…The detection of inteins from all of our samples in this study, the presence of a variety of putative inteins from every station from the Global Ocean Survey data set (Yooseph et al, 2007) and the presence of DNA polymerase I motif C inteins in the phycodnavirus isolates HaV (Nagasaki et al, 2005) and CeV (Monier et al, 2008) all suggest that inteins are common among marine viruses. This raises the question of how DNA polymerase I motif C inteins are acquired by phycodnaviruses.…”
Section: Discussion (mentioning)
confidence: 59%
“…The designations for sequences retrieved in this study are in bold type and begin with 'KB' for Kāne'ohe Bay. Sequences starting with 'JCVI' are environmental sequences from the Global Ocean Survey (Yooseph et al, 2007). A blue circle adjacent to a sequence indicates the intein encodes a homing endonuclease.…”
Section: Discussion (mentioning)
confidence: 99%
“…On the other hand, uncharacterized 'singletons' are rarely saturating, that is, each genome or environment contains a lot of uncharacterizable ORFs (or ORF fragments). That novel families grow linearly if more ocean samples are added was also a major conclusion of a recent metagenomics ocean survey [63•], when applying a similar method to that described in [49•,50]. However, this observed 'novelty', based on the absence of homology to anything we know, could also be inflated by gene fragments that are too short to be recognized or by spurious gene predictions (although stringent criteria were applied in this study [63•]).…”
Section: Function Prediction In Environmental Samples: Lots Of Novelt (mentioning)
confidence: 85%
“…The inventory of genes is even more impressive than that of the species encountered [25][26][27]. But what can be said about it?…”
Section: The Inventory of Functions (unclassified)