Whole genome sequencing can provide essential public health information. However, it is now known that widely used short-read methods have the potential to miss some randomly distributed segments of genomes. This can prevent phages, plasmids, and virulence factors from being detected or properly identified. Here, we compared assemblies of three complete STEC O26:H11 genomes from two different sequence types (ST21 and ST29), each acquired using the MiSeq-Nextera XT, MinION nanopore-based sequencing, and Pacific Biosciences (PacBio) sequencing. Each closed genome consisted of a single chromosome, approximately 5.7 Mb for CFSAN027343, 5.6 Mb for CFSAN027346, and 5.4 Mb for CFSAN027350. However, short-read WGS using MiSeq-Nextera failed to identify some virulence genes in plasmids and on the chromosome, both of which were detected using the long-read platforms. Results from long-read MinION and PacBio allowed us to identify differences in plasmid content: a single 88 kb plasmid in CFSAN027343; a 157 kb plasmid in CFSAN027350; and two plasmids in CFSAN027346 (one 95 kb, one 72 kb). These data enabled rapid characterization of the virulome, detection of antimicrobial resistance genes, and determination of the composition and location of Stx phages. Taken together, positive correlations between the two long-read methods for determining plasmids, virulome, antimicrobial resistance genes, and phage composition support MinION sequencing as one accurate and economical option for closing STEC genomes and identifying specific virulence markers.
are very complex, containing many virulence genes, insertion sequences, phages, and plasmids; consequently, strains of the same lineage can possess significantly different content (7,9,33,35-38). Missing some of these elements during an investigation can have large impacts on human health (e.g., the hlyA gene in EHECs). Thus, reproducible WGS systems that fully capture and sequence these elements are essential.
Long-read sequencing platforms afford one solution to this challenge. Some systems, such as the Pacific Biosciences (PacBio) RSII or Sequel sequencers (https://www.pacb.com/products-and-services/pacbio-systems/), use single-molecule real-time (SMRT) sequencing technology that allows for real-time observation of DNA synthesis through zero-mode waveguides (ZMWs) and phospho-linked nucleotides (11,12). While comprehensive in their ability to capture entire genomes, extraneous elements included, these systems often require significant investments in machinery, space, and laboratory expertise, all of which may be obstacles to routine use. These systems also require significant quantities of DNA (i.e., 5 µg), require more substantial preparation time (i.e., 8 hr DNA sequencing library protocols), and produce average read lengths of about 11 kb, although reads of greater than 50 kb were possible in our laboratory at the time of this study.
Alternative sequencing platforms based on nanopore technology ma...