Tess D. Verschuuren scite author profile

Assembly of bacterial short-read whole-genome sequencing data frequently results in hundreds of contigs for which the origin, plasmid or chromosome, is unclear. Complete genomes resolved by long-read sequencing can be used to generate and label short-read contigs. These labeled contigs were used to train several popular machine learning methods to classify the origin of contigs from Enterococcus faecium, Klebsiella pneumoniae and Escherichia coli using pentamer frequencies. We selected support-vector machine (SVM) models as the best classifier for all three bacterial species (F1-score E. faecium=0.92, F1-score K. pneumoniae=0.90, F1-score E. coli=0.76), which outperformed other existing plasmid prediction tools using a benchmarking set of isolates. We demonstrated the scalability of our models by accurately predicting the plasmidome of a large collection of 1644 E. faecium isolates and illustrated their applicability by predicting the location of antibiotic-resistance genes in all three species. The SVM classifiers are publicly available as an R package and graphical user interface called ‘mlplasmids’. We anticipate that this tool may significantly facilitate research on the dissemination of plasmids encoding antibiotic resistance and/or contributing to host adaptation.
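The abstract names pentamer frequencies as the features given to the classifiers. As a rough illustration only, a contig can be turned into a vector of 1024 relative pentamer frequencies as sketched below. The function name, the skipping of windows with ambiguous bases, and the normalization are assumptions made for this sketch; they are not taken from the mlplasmids source, which is an R package and may compute its features differently.

```python
from collections import Counter
from itertools import product
from typing import List

K = 5  # pentamers
# All 4^5 = 1024 possible pentamers over A, C, G, T, in a fixed order.
PENTAMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def pentamer_frequencies(contig: str) -> List[float]:
    """Return relative frequencies of all 1024 pentamers in a contig.

    Hypothetical helper: windows containing characters other than
    A/C/G/T (for example N) are ignored, which is an assumption.
    """
    seq = contig.upper()
    # Count every overlapping window of length K.
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    # Only windows made of unambiguous bases contribute to the total.
    total = sum(counts[p] for p in PENTAMERS)
    if total == 0:
        return [0.0] * len(PENTAMERS)
    return [counts[p] / total for p in PENTAMERS]


if __name__ == "__main__":
    features = pentamer_frequencies("ACGTACGTACGGTTAACC")
    print(len(features), round(sum(features), 6))
```

Each contig then yields one fixed-length feature vector, which is the kind of input a classifier such as an SVM expects.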


mlplasmids: a user-friendly tool to predict plasmid- and chromosome-derived sequences for single species

Arredondo-Alonso, Rogers, Braat et al. 2018. Preprint


Assembly of bacterial short-read whole genome sequencing (WGS) data frequently results in hundreds of contigs for which the origin, plasmid or chromosome, is unclear. Long-read sequencing has emerged as a solution to resolve plasmid structures and to obtain complete genomes for most bacterial species. This information can be used to generate and label datasets from short-read based contigs as plasmid- or chromosome-derived. We investigated the use of several popular machine learning methods to classify short-read contigs with known plasmid or chromosome origin from Enterococcus faecium, Klebsiella pneumoniae and Escherichia coli using pentamer frequencies. Based on the resulting F1-scores, we selected support-vector machine (SVM) models as the best classifier for all three bacterial species (F1-score E. faecium = 0.94, F1-score K. pneumoniae = 0.90, F1-score E. coli = 0.76), which outperformed other existing plasmid prediction tools using an independent set of isolates (precision E. faecium = 0.92, precision K. pneumoniae = 0.86, precision E. coli = 0.82). We demonstrated the scalability of our model by accurately predicting the plasmidome of a large collection of 1,644 E. faecium isolates with only short-read WGS available, using a standard laptop with a single core. A low number of false-positive predicted sequences suggests that the assignment of a particular gene of interest as plasmid- or chromosome-encoded by the models is plausible. The SVM classifiers are publicly available as a new R package called 'mlplasmids' at https://gitlab.com/sirarredondo/mlplasmids under the GNU General Public License v3.0. We additionally developed a graphical user interface using the Shiny package, which can be accessed at https://sarredondo.shinyapps.io/mlplasmids/. Single genomes can easily be predicted by uploading genome assemblies. We anticipate that this tool may significantly facilitate research on the dissemination of plasmids encoding antibiotic resistance and/or contributing to host adaptation.
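The abstract compares classifiers by F1-score and precision. For readers unfamiliar with these metrics, the sketch below computes precision, recall and F1 (the harmonic mean of precision and recall) for contig labels, treating "plasmid" as the positive class. The function name and the toy labels are made up for illustration; they are not data or code from the study.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    true_labels: Sequence[str],
    predicted_labels: Sequence[str],
    positive: str = "plasmid",
) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 for one positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy labels for five contigs (illustrative only, not study data).
    truth = ["plasmid", "plasmid", "plasmid", "chromosome", "chromosome"]
    predicted = ["plasmid", "plasmid", "chromosome", "plasmid", "chromosome"]
    print(precision_recall_f1(truth, predicted))
```

Precision penalizes chromosome contigs wrongly called plasmid (false positives), which is why the abstract ties a low false-positive count to the plausibility of gene-location assignments.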


Extended-spectrum beta-lactamase (ESBL)-producing and non-ESBL-producing Escherichia coli isolates causing bacteremia in the Netherlands (2014 – 2016) differ in clonal distribution, antimicrobial resistance gene and virulence gene content

et al. 2020


Background: Knowledge on the molecular epidemiology of Escherichia coli causing E. coli bacteremia (ECB) in the Netherlands is mostly based on extended-spectrum beta-lactamase-producing E. coli (ESBL-Ec). We determined differences in clonality and resistance and virulence gene (VG) content between non-ESBL-producing E. coli (non-ESBL-Ec) and ESBL-Ec isolates from ECB episodes with different epidemiological characteristics. Methods: A random selection of non-ESBL-Ec isolates as well as all available ESBL-Ec blood isolates was obtained from two Dutch hospitals between 2014 and 2016. Whole genome sequencing was performed to infer sequence types (STs), serotypes, acquired antibiotic resistance genes and VG scores, based on presence of 49 predefined putative pathogenic VG. Results: ST73 was most prevalent among the 212 non-ESBL-Ec (N = 26, 12.3%) and ST131 among the 69 ESBL-Ec (N = 30, 43.5%). Prevalence of ST131 among non-ESBL-Ec was 10.4% (N = 22, P value < .001 compared to ESBL-Ec). O25:H4 was the most common serotype in both non-ESBL-Ec and ESBL-Ec. Median acquired resistance gene counts were 1 (IQR 1-6) and 7 (IQR 4-9) for non-ESBL-Ec and ESBL-Ec, respectively (P value < .001). Among non-ESBL-Ec, acquired resistance gene count was highest among blood isolates from a


Household acquisition and transmission of extended-spectrum β-lactamase (ESBL)-producing Enterobacteriaceae after hospital discharge of ESBL-positive index patients

Riccio, Verschuuren, Conzelmann et al. 2021. Clinical Microbiology and Infection

