Accurate and rapid typing of pathogens is essential for effective surveillance and outbreak detection. Conventional serotyping of Escherichia coli is a delicate, laborious, time-consuming, and expensive procedure. As whole-genome sequencing (WGS) becomes cheaper, it has vast potential in routine typing and surveillance. The aim of this study was to establish a valid and publicly available tool for WGS-based in silico serotyping of E. coli applicable for routine typing and surveillance. A FASTA database of specific O-antigen processing system genes for O typing and flagellin genes for H typing was created as a component of the publicly available Web tools hosted by the Center for Genomic Epidemiology (CGE) (www.genomicepidemiology.org). All available E. coli isolates with both WGS data and conventional serotype information were subjected to WGS-based serotyping using this CGE tool, SerotypeFinder. SerotypeFinder was evaluated on 682 E. coli genomes for which both the whole-genome sequence and the serotype were available; 108 of these were sequenced for this study. In total, 601 and 509 isolates were included for O and H typing, respectively. The O-antigen genes wzx, wzy, wzm, and wzt and the flagellin genes fliC, flkA, fllA, flmA, and flnA were detected in 569 and 508 genome sequences, respectively. SerotypeFinder predicted 560 of 569 O types and 504 of 508 H types consistently with conventional serotyping. In combination with other available WGS typing tools, E. coli serotyping can be performed solely from WGS data, providing faster and cheaper typing than current routine procedures and making WGS typing a superior alternative to conventional typing strategies.
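The counts reported above translate directly into detection and concordance rates. The following Python sketch is only an illustrative check of that arithmetic and is not part of SerotypeFinder; the class name `TypingResult`, its fields, and the `summarize` helper are hypothetical names introduced here, and the only data used are the counts stated in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypingResult:
    """Counts for one antigen, copied from the abstract (names are hypothetical)."""
    antigen: str
    included: int        # isolates included for this typing
    genes_detected: int  # genomes in which the typing genes were detected
    concordant: int      # predictions consistent with conventional serotyping

    @property
    def detection_rate(self) -> float:
        # Fraction of included isolates in which the typing genes were found.
        return self.genes_detected / self.included

    @property
    def concordance(self) -> float:
        # Fraction of gene-positive genomes whose prediction matched serotyping.
        return self.concordant / self.genes_detected


RESULTS = [
    TypingResult("O", included=601, genes_detected=569, concordant=560),
    TypingResult("H", included=509, genes_detected=508, concordant=504),
]


def summarize(result: TypingResult) -> str:
    """Format one result as a single human-readable line."""
    return (
        f"{result.antigen} typing: genes detected in "
        f"{result.detection_rate:.1%} of isolates, "
        f"{result.concordance:.1%} of predictions concordant"
    )


if __name__ == "__main__":
    for result in RESULTS:
        print(summarize(result))
```

Concordance is computed here relative to the genomes in which the typing genes were detected (569 and 508), matching the way the abstract reports "560 of 569" and "504 of 508", rather than relative to all included isolates.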
Escherichia coli is usually a harmless commensal, but some strains have evolved the capability to cause disease in humans and/or animals through specific pathogenic mechanisms. In some cases, infection can be fatal (1).
Serotyping is a method for classification of E. coli that has existed since the 1940s and has since been developed into standardized procedures (2-4). It requires a high level of expertise and access to cross-absorbed antisera, and it is time-consuming and laborious. O:K:H serotyping is based on a combination of three immunogenic structures: the lipopolysaccharide (LPS) (O antigen), the capsular (K) antigen, and the flagellar (H) antigen.
Since few laboratories are able to perform K typing, O:H serotyping has become the gold standard for characterization of pathogenic E. coli. O:H serotyping is crucial for the detection of outbreaks, for epidemiological surveillance, for taxonomic differentiation of E. coli, for detecting pathogenic serotypes within the species, and for clonal and evolutionary studies. In contrast to several more recently developed molecular typing methods, such as pulsed-field gel electrophoresis (PFGE), ribotyping, and to some extent multilocus sequence typing (MLST), serotyping provides information that is directly associated with the antigenic response an...