2013
DOI: 10.1073/pnas.1222159110

Mapping gene clusters within arrayed metagenomic libraries to expand the structural diversity of biomedically relevant natural products

Abstract: Complex microbial ecosystems contain large reservoirs of unexplored biosynthetic diversity. Here we provide an experimental framework and data analysis tool to facilitate the targeted discovery of natural-product biosynthetic gene clusters from the environment. Multiplex sequencing of barcoded PCR amplicons is followed by sequence similarity directed data parsing to identify sequences bearing close resemblance to biosynthetically or biomedically interesting gene clusters. Amplicons are then mapped onto arrayed…

citations
Cited by 122 publications
(128 citation statements)
references
References 34 publications
“…Our NPST data comprise ∼1 × 10^6 unique environmental sequences that were amplified from soil metagenomes using degenerate primers targeting two of the most common biosynthetic motifs: nonribosomal peptide synthetase (NRPS) adenylation (A) domains and polyketide synthase (PKS) ketosynthase (KS) domains (12). By targeting these very common biosynthetic domains, sequencing resources are focused on generating only data that are relevant to our search strategy, therefore the raw sequencing power required to generate this dataset was quite modest (∼1.5 Gbps) (8,11). We estimate that the diversity of biosynthetic pathways represented in our NPST dataset is at least 50× larger than the NRPS and PKS pathways contained in all publically available sequenced bacterial genomes, as judged by the number of equivalent domains identified by recent systematic analyses (13–15).…”
Section: Results
confidence: 99%
“…A convenient feature of the computational framework is the ability to identify overlapping clones that allow reconstruction of complete pathways by targeting multiple library wells containing the same NPST. The strategy of partially arraying libraries and generating barcoded NPSTs from each library well allows efficient storage and automated in silico screening of cloned metagenomes for diverse biomedically relevant BGCs, as well as facile recovery of entire BGCs identified in computational screens of NPST data (8,9).…”
Section: Recovery Sequencing and In Silico Analysis Of Epoxyketone
confidence: 99%
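
The idea in the statement above, finding library wells that contain the same NPST so that overlapping clones can be recovered, can be illustrated with a small sketch. This is not the authors' tool: the record layout (pairs of well ID and tag ID), the tag and well names, and the function names `wells_by_tag` and `shared_tags` are all assumptions made for illustration, and it presumes reads have already been assigned to a well by barcode and grouped into tag IDs.

```python
from collections import defaultdict


def wells_by_tag(records):
    """Map each tag ID to the sorted list of library wells where it was seen.

    `records` is an iterable of (well_id, tag_id) pairs. This layout is a
    hypothetical stand-in for barcoded amplicon reads that have already been
    demultiplexed by well and grouped into tags.
    """
    index = defaultdict(set)
    for well, tag in records:
        index[tag].add(well)  # a set ignores repeated reads from the same well
    return {tag: sorted(wells) for tag, wells in index.items()}


def shared_tags(records, min_wells=2):
    """Return only the tags observed in at least `min_wells` distinct wells.

    Wells sharing a tag are candidates for holding overlapping clones.
    """
    index = wells_by_tag(records)
    return {tag: wells for tag, wells in index.items() if len(wells) >= min_wells}


# Made-up example data: well IDs and tag IDs are illustrative only.
records = [
    ("A01", "KS_0007"),
    ("A01", "A_0113"),
    ("C05", "KS_0007"),
    ("D11", "A_0042"),
    ("C05", "KS_0007"),  # repeated read from the same well
]

print(shared_tags(records))
```

In this toy input only `KS_0007` appears in two distinct wells, so it is the only tag reported. A real screen would also need a sequence-similarity threshold for deciding when two reads count as the same tag, which this sketch leaves out.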