SeqSero, launched in 2015, is a software tool for Salmonella serotype determination from whole-genome sequencing (WGS) data. Despite its routine use in public health and food safety laboratories in the United States and other countries, the original SeqSero pipeline is relatively slow (minutes per genome using sequencing reads), is not optimized for draft genome assemblies, and may assign multiple serotypes for a strain. Here, we present SeqSero2 (github.com/denglab/SeqSero2; denglab.info/SeqSero2), an algorithmic transformation and functional update of the original SeqSero. Major improvements include (i) additional sequence markers for identification of Salmonella species and subspecies and certain serotypes, (ii) a k-mer based algorithm for rapid serotype prediction from raw reads (seconds per genome) and improved serotype prediction from assemblies, and (iii) a targeted assembly approach for specific retrieval of serotype determinants from WGS for serotype prediction, new allele discovery, and prediction troubleshooting. Evaluated using 5,794 genomes representing 364 common U.S. serotypes, including 2,280 human isolates of 117 serotypes from the National Antimicrobial Resistance Monitoring System, SeqSero2 is up to 50 times faster than the original SeqSero while maintaining equivalent accuracy for raw reads and substantially improving accuracy for assemblies. SeqSero2 further suggested that 3% of the tested genomes contained reads from multiple serotypes, indicating a use for contamination detection. In addition to short reads, SeqSero2 demonstrated potential for accurate and rapid serotype prediction directly from long nanopore reads despite base call errors. Testing of 40 nanopore-sequenced genomes of 17 serotypes yielded a single H antigen misidentification. IMPORTANCE Serotyping is the basis of public health surveillance of Salmonella. It remains a first-line subtyping method even as surveillance continues to be transformed by whole-genome sequencing. 
SeqSero allows the integration of Salmonella serotyping into a whole-genome-sequencing-based laboratory workflow while maintaining continuity with the classic serotyping scheme. SeqSero2, informed by extensive testing and application of SeqSero in the United States and other countries, incorporates important improvements and updates that further strengthen its application in routine and large-scale surveillance of Salmonella by whole-genome sequencing.
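The k-mer idea mentioned in this abstract can be illustrated with a minimal sketch. It is not SeqSero2's actual implementation: the function names, the toy marker sequences, and the scoring rule (the fraction of a marker's k-mers seen in the reads) are all assumptions made for illustration, and a real tool would use much longer k-mers and curated serotype determinant alleles.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score_markers(reads, markers, k=5):
    """Score each marker by the fraction of its k-mers found in any read.

    `markers` maps a hypothetical allele name to its sequence.
    """
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)

    scores = {}
    for name, seq in markers.items():
        marker_kmers = kmers(seq, k)
        if not marker_kmers:  # marker shorter than k: nothing to match
            scores[name] = 0.0
            continue
        scores[name] = len(marker_kmers & read_kmers) / len(marker_kmers)
    return scores


def best_marker(reads, markers, k=5):
    """Return the name of the marker with the highest k-mer coverage."""
    scores = score_markers(reads, markers, k)
    return max(scores, key=scores.get)


# Toy data (hypothetical sequences, not real antigen alleles).
toy_markers = {
    "allele_A": "ACGTACGGTCAAGT",
    "allele_B": "TTGACCATGGCAAT",
}
toy_reads = ["ACGTACGGTC", "CGGTCAAGT"]

print(score_markers(toy_reads, toy_markers))
print(best_marker(toy_reads, toy_markers))
```

Because each read only needs to be broken into k-mers and looked up in a set, this kind of approach avoids aligning or assembling reads, which is one plausible reason a k-mer method can run in seconds per genome.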
Increasingly, routine surveillance and monitoring of foodborne pathogens using whole-genome sequencing is creating opportunities to study foodborne illness epidemiology beyond routine outbreak investigations and case–control studies. Using a global phylogeny of Salmonella enterica serotype Typhimurium, we found that major livestock sources of the pathogen in the United States can be predicted through whole-genome sequencing data. Relatively steady rates of sequence divergence in livestock lineages enabled the inference of their recent origins. Elevated accumulation of lineage-specific pseudogenes after divergence from generalist populations and possible metabolic acclimation in a representative swine isolate indicate possible emergence of host adaptation. We developed and retrospectively applied a machine learning Random Forest classifier for genomic source prediction of Salmonella Typhimurium that correctly attributed 7 of 8 major zoonotic outbreaks in the United States during 1998–2013. We further identified 50 key genetic features that were sufficient for robust livestock source prediction.
Metagenomics analysis of food samples promises isolation-independent detection and subtyping of foodborne bacterial pathogens in a single workflow. Selective concentration of genomic DNA through immunomagnetic separation (IMS) and multiple displacement amplification (MDA) was shown to shorten culture enrichment of Salmonella-spiked raw chicken breast samples by over 12 h while permitting serotyping and high-fidelity single nucleotide polymorphism (SNP) typing of the pathogen using short shotgun sequencing reads. The herein termed quasi-metagenomics approach was evaluated on Salmonella-spiked lettuce and black peppercorn samples as well as retail chicken parts naturally contaminated with different serotypes of Salmonella. Between 8 and 24 h of culture enrichment was required for detecting and subtyping naturally occurring Salmonella from unspiked chicken parts, compared with 4 to 12 h of culture enrichment when Salmonella-spiked food samples were analyzed, indicating the likely need for longer culture enrichment to revive low levels of stressed or injured cells in food. Further acceleration of the workflow was achieved by real-time nanopore sequencing. After 1.5 h of analysis on a portable sequencer, sufficient data were generated from sequencing the IMS-MDA product of a culture-enriched lettuce sample to allow serotyping and robust phylogenetic placement of the inoculated isolate. Both culture enrichment and next-generation sequencing remain time-consuming processes for food testing, where rapid methods for pathogen detection are widely available. Our study demonstrated substantial acceleration of the respective processes through IMS-MDA and real-time nanopore sequencing. In one example, the combined use of the two methods delivered a less than 24 h turnaround time from a Salmonella-contaminated lettuce sample to phylogenetic identification of the pathogen. Improved efficiency like this is important for further expanding the use of whole-genome and metagenomics sequencing in microbial analysis of food. 
Our results suggest the potential of the quasi-metagenomics approach in areas where rapid detection and subtyping of foodborne pathogens is important, such as foodborne outbreak response and precision tracking and monitoring of foodborne pathogens in production environments and supply chains.
Salmonella is one of the most common causes of food-borne diseases worldwide. While Salmonella molecular subtyping by whole-genome sequencing (WGS) is increasingly used for outbreak and source tracking investigations, serotyping remains a first-line characterization of Salmonella isolates. The traditional phenotypic method for serotyping is logistically challenging, as it requires the use of more than 150 specific antisera and well-trained personnel to interpret the results. Consequently, it is not a routine method for the majority of laboratories. Several rapid molecular methods targeting O and H loci or surrogate genomic markers have been developed as alternative solutions. With the expansion of WGS, in silico Salmonella serotype prediction using WGS data is available. Here, we compared a microarray method using molecular markers, the Check and Trace Salmonella (CTS) assay, and a WGS-based serotype prediction tool that targets molecular determinants of serotype (SeqSero) to the traditional phenotypic method using 100 strains representing 45 common and uncommon serotypes. Compared to the traditional method, the CTS assay correctly serotyped 97% of the strains; four strains gave a double serotype prediction. Among the inconclusive data, one strain was not predicted and two strains were incorrectly identified. SeqSero was evaluated with two versions (SeqSero 1 and the alpha test version of SeqSero 2). The correct antigenic formula was predicted by SeqSero 1 for 96% and 95% of strains using raw reads and assembly, respectively. However, 34% and 33% of these predictions included multiple serotypes by raw reads and assembly, respectively. With raw reads, one strain was not identified and three strains were discordant with the phenotypic serotyping results. With assembly, three strains were not predicted and two strains were incorrectly predicted. 
While still under development, SeqSero 2 maintained the accuracy of antigenic formula prediction at 98% and reduced the multiple serotype prediction rate to 13%. One strain had no prediction and one strain was incorrectly predicted. Our study indicates that the CTS assay is a good alternative for routine laboratories, as it is an easy-to-use method with a short turnaround time. SeqSero is a reliable replacement for phenotypic serotyping if WGS is routinely implemented.
A pandemic of Salmonella enterica serotype Enteritidis emerged in the 1980s due to contaminated poultry products. How Salmonella Enteritidis rapidly swept through continents remains a historical puzzle as the pathogen continues to cause outbreaks and poultry supply becomes globalized. We hypothesize that international trade of infected breeding stocks causes global spread of the pathogen. By integrating over 30,000 Salmonella Enteritidis genomes from 98 countries during 1949–2020 and international trade of live poultry from the 1980s to the late 2010s, we present multifaceted evidence that converges on a high likelihood, global scale, and extended protraction of Salmonella Enteritidis dissemination via centralized sourcing and international trade of breeding stocks. We discovered recent, genetically near-identical isolates from domestically raised poultry in North and South America. We obtained phylodynamic characteristics of global Salmonella Enteritidis populations that lend spatiotemporal support for its dispersal from centralized origins during the pandemic. We identified concordant patterns of international trade of breeding stocks and quantitatively established a driving role of the trade in the geographic dispersal of Salmonella Enteritidis, suggesting that the centralized origins were infected breeding stocks. Here we demonstrate the value of integrative and hypothesis-driven data mining in unravelling otherwise difficult-to-probe pathogen dissemination from hidden origins.