2021
DOI: 10.1099/mgen.0.000691

Bacterial genomic epidemiology with mixed samples

Abstract: Genomic epidemiology is a tool for tracing transmission of pathogens based on whole-genome sequencing. We introduce the mGEMS pipeline for genomic epidemiology with plate sweeps representing mixed samples of a target pathogen, opening the possibility to sequence all colonies on selective plates with a single DNA extraction and sequencing step. The pipeline includes the novel mGEMS read binner for probabilistic assignments of sequencing reads, and the scalable pseudoaligner Themisto. We demonstrate the effectiv…

Cited by 24 publications (43 citation statements)
References 88 publications
“…Lineage deconvolution was performed via the mSWEEP and mGEMS algorithms (87, 88) using a reference database consisting of a high quality subset of 20,047 genomes from the Global Pneumococcal Sequencing Project database (9). Included in this subset were 2,663 genome assemblies from the original genome sequencing study of the Maela camp that relied on single colony picks (24).…”
Section: Methods (mentioning)
confidence: 99%
“…The PopPUNK algorithm was used to assign each of these genomes to their respective Global Pneumococcal Sequencing Cluster (89). The mSWEEP and mGEMS pipelines were then run using the fastq files for each deep sequencing sample with the exact commands used given in the Rmarkdown provided as part of the accompanying GitHub repository (87, 88). To reduce the possibility of false positives lineages were only called if they were present at a frequency of at least 1%.…”
Section: Methods (mentioning)
confidence: 99%
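The 1% calling threshold described in the statement above can be illustrated with a minimal sketch. It assumes the relative abundance estimates are already available as a mapping from lineage label to frequency; the function name `call_lineages`, the labels, and the values are hypothetical and are not taken from the cited study or from the mSWEEP or mGEMS tools.

```python
def call_lineages(abundances, min_frequency=0.01):
    """Return only the lineages whose estimated frequency is at least min_frequency.

    `abundances` maps a lineage label to its estimated relative abundance
    (a fraction between 0 and 1). Both the input shape and the function
    name are assumptions made for this illustration.
    """
    return {
        lineage: frequency
        for lineage, frequency in abundances.items()
        if frequency >= min_frequency
    }


# Made-up example values, not results from any study.
example = {"lineage_A": 0.62, "lineage_B": 0.375, "lineage_C": 0.005}
called = call_lineages(example)
```

In this sketch `lineage_C` falls below the default cutoff of 0.01 and is dropped, while a lineage at exactly 1% would be kept.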
“…Our study extends the previous analysis by providing lineage-level characterization and subsequent genome assembly from these samples for several important pathogen species (Supplementary Table 1) as well as a more detailed exploration of the diversity within the Klebsiella genus. In both the lineage-level characterization and the Klebsiella species analysis we applied the recent mSWEEP and mGEMS methods [15], [16] with a bespoke set of reference sequences (Methods section). The analysis pipeline is described in more detail in Supplementary Figure 1 and in the Methods section.…”
Section: Results (mentioning)
confidence: 99%
“…Here, we were able to advance this understanding thanks to the deep sequencing of neonatal stool samples in a previous landmark study [14]. Combined with novel methodology [15], [16] and high-precision genomic reference libraries, these results allowed us to identify and assemble single genomes from metagenomic sequencing data at the level of resolution for standard bacterial genomic epidemiology.…”
Section: Discussion (mentioning)
confidence: 99%
“…We then construct the multi-string variant of the SBWT defined in Section 3.1 for each of the datasets. For all new index variants, we extract the data structure from the Wheeler BOSS index constructed using the tool Themisto [20]. This is straightforward as the Wheeler BOSS index gives access to the outgoing edge labels from each k-mer of the data in colexicographic order.…”
Section: Methods (mentioning)
confidence: 99%