Mycobacterium tuberculosis is a serious human pathogen threat exhibiting complex evolution of antimicrobial resistance (AMR). Accordingly, the many publicly available datasets describing its AMR characteristics demand disparate data-type analyses. Here, we develop a reference strain-agnostic computational platform that uses machine learning approaches, complemented by both genetic interaction analysis and 3D structural mutation-mapping, to identify signatures of AMR evolution to 13 antibiotics. This platform is applied to 1595 sequenced strains to yield four key results. First, a pan-genome analysis shows that M. tuberculosis is highly conserved with sequenced variation concentrated in PE/PPE/PGRS genes. Second, the platform corroborates 33 genes known to confer resistance and identifies 24 new genetic signatures of AMR. Third, 97 epistatic interactions across 10 resistance classes are revealed. Fourth, detailed structural analysis of these genes yields mechanistic bases for their selection. The platform can be used to study other human pathogens.
Salmonella strains are traditionally classified into serovars based on their surface antigens. While increasing availability of whole-genome sequences has allowed for more detailed subtyping of strains, links between genotype, serovar, and host remain elusive. Here we reconstruct genome-scale metabolic models for 410 Salmonella strains spanning 64 serovars. Model-predicted growth capabilities in over 530 different environments demonstrate that: (1) the Salmonella accessory metabolic network includes alternative carbon metabolism, and cell wall biosynthesis; (2) metabolic capabilities correspond to each strain’s serovar and isolation host; (3) growth predictions agree with 83.1% of experimental outcomes for 12 strains (690 out of 858); (4) 27 strains are auxotrophic for at least one compound, including l-tryptophan, niacin, l-histidine, l-cysteine, and p-aminobenzoate; and (5) the catabolic pathways that are important for fitness in the gastrointestinal environment are lost amongst extraintestinal serovars. Our results reveal growth differences that may reflect adaptation to particular colonization sites.
BackgroundThe efficacy of antibiotics against M. tuberculosis has been shown to be influenced by experimental media conditions. Investigations of M. tuberculosis growth in physiological conditions have described an environment that is different from common in vitro media. Thus, elucidating the interplay between available nutrient sources and antibiotic efficacy has clear medical relevance. While genome-scale reconstructions of M. tuberculosis have enabled the ability to interrogate media differences for the past 10 years, recent reconstructions have diverged from each other without standardization. A unified reconstruction of M. tuberculosis H37Rv would elucidate the impact of different nutrient conditions on antibiotic efficacy and provide new insights for therapeutic intervention.ResultsWe present a new genome-scale model of M. tuberculosis H37Rv, named iEK1011, that unifies and updates previous M. tuberculosis H37Rv genome-scale reconstructions. We functionally assess iEK1011 against previous models and show that the model increases correct gene essentiality predictions on two different experimental datasets by 6% (53% to 60%) and 18% (60% to 71%), respectively. We compared simulations between in vitro and approximated in vivo media conditions to examine the predictive capabilities of iEK1011. The simulated differences recapitulated literature defined characteristics in the rewiring of TCA metabolism including succinate secretion, gluconeogenesis, and activation of both the glyoxylate shunt and the methylcitrate cycle. To assist efforts to elucidate mechanisms of antibiotic resistance development, we curated 16 metabolic genes related to antimicrobial resistance and approximated evolutionary drivers of resistance. Comparing simulations of these antibiotic resistance features between in vivo and in vitro media highlighted condition-dependent differences that may influence the efficacy of antibiotics.ConclusionsiEK1011 provides a computational knowledge base for exploring the impact of different environmental conditions on the metabolic state of M. tuberculosis H37Rv. As more experimental data and knowledge of M. tuberculosis H37Rv become available, a unified and standardized M. tuberculosis model will prove to be a valuable resource to the research community studying the systems biology of M. tuberculosis.Electronic supplementary materialThe online version of this article (10.1186/s12918-018-0557-y) contains supplementary material, which is available to authorized users.
BackgroundThe mechanistic description of enzyme kinetics in a dynamic model of metabolism requires specifying the numerical values of a large number of kinetic parameters. The parameterization challenge is often addressed through the use of simplifying approximations to form reaction rate laws with reduced numbers of parameters. Whether such simplified models can reproduce dynamic characteristics of the full system is an important question.ResultsIn this work, we compared the local transient response properties of dynamic models constructed using rate laws with varying levels of approximation. These approximate rate laws were: 1) a Michaelis-Menten rate law with measured enzyme parameters, 2) a Michaelis-Menten rate law with approximated parameters, using the convenience kinetics convention, 3) a thermodynamic rate law resulting from a metabolite saturation assumption, and 4) a pure chemical reaction mass action rate law that removes the role of the enzyme from the reaction kinetics. We utilized in vivo data for the human red blood cell to compare the effect of rate law choices against the backdrop of physiological flux and concentration differences. We found that the Michaelis-Menten rate law with measured enzyme parameters yields an excellent approximation of the full system dynamics, while other assumptions cause greater discrepancies in system dynamic behavior. However, iteratively replacing mechanistic rate laws with approximations resulted in a model that retains a high correlation with the true model behavior. Investigating this consistency, we determined that the order of magnitude differences among fluxes and concentrations in the network were greatly influential on the network dynamics. We further identified reaction features such as thermodynamic reversibility, high substrate concentration, and lack of allosteric regulation, which make certain reactions more suitable for rate law approximations.ConclusionsOverall, our work generally supports the use of approximate rate laws when building large scale kinetic models, due to the key role that physiologically meaningful flux and concentration ranges play in determining network dynamics. However, we also showed that detailed mechanistic models show a clear benefit in prediction accuracy when data is available. The work here should help to provide guidance to future kinetic modeling efforts on the choice of rate law and parameterization approaches.Electronic supplementary materialThe online version of this article (doi:10.1186/s12918-016-0283-2) contains supplementary material, which is available to authorized users.
Current machine learning classifiers have successfully been applied to whole-genome sequencing data to identify genetic determinants of antimicrobial resistance (AMR), but they lack causal interpretation. Here we present a metabolic model-based machine learning classifier, named Metabolic Allele Classifier (MAC), that uses flux balance analysis to estimate the biochemical effects of alleles. We apply the MAC to a dataset of 1595 drug-tested Mycobacterium tuberculosis strains and show that MACs predict AMR phenotypes with accuracy on par with mechanism-agnostic machine learning models (isoniazid AUC = 0.93) while enabling a biochemical interpretation of the genotype-phenotype map. Interpretation of MACs for three antibiotics (pyrazinamide, para-aminosalicylic acid, and isoniazid) recapitulates known AMR mechanisms and suggest a biochemical basis for how the identified alleles cause AMR. Extending flux balance analysis to identify accurate sequence classifiers thus contributes mechanistic insights to GWAS, a field thus far dominated by mechanismagnostic results.
The evolution of antimicrobial resistance (AMR) poses a persistent threat to global public health. Sequencing efforts have already yielded genome sequences for thousands of resistant microbial isolates and require robust computational tools to systematically elucidate the genetic basis for AMR. Here, we present a generalizable machine learning workflow for identifying genetic features driving AMR based on constructing reference strain-agnostic pan-genomes and training random subspace ensembles (RSEs). This workflow was applied to the resistance profiles of 14 antimicrobials across three urgent threat pathogens encompassing 288 Staphylococcus aureus, 456 Pseudomonas aeruginosa, and 1588 Escherichia coli genomes. We find that feature selection by RSE detects known AMR associations more reliably than common statistical tests and previous ensemble approaches, identifying a total of 45 known AMR-conferring genes and alleles across the three organisms, as well as 25 candidate associations backed by domain-level annotations. Furthermore, we find that results from the RSE approach are consistent with existing understanding of fluoroquinolone (FQ) resistance due to mutations in the main drug targets, gyrA and parC, in all three organisms, and suggest the mutational landscape of those genes with respect to FQ resistance is simple. As larger datasets become available, we expect this approach to more reliably predict AMR determinants for a wider range of microbial pathogens.
O-antigens are glycopolymers in lipopolysaccharides expressed on the cell surface of Gram-negative bacteria. Variability in the O-antigen structure constitutes the basis for the establishment of the serotyping schema. We pursued a two-pronged approach to define the basis for O-antigen structural diversity. First, we developed a bottom-up systems biology approach to O-antigen metabolism by building a reconstruction of Salmonella O-antigen biosynthesis and used it to (i) update 410 existing Salmonella strain-specific metabolic models, (ii) predict a strain’s serogroup and its O-antigen glycan synthesis capability (yielding 98% agreement with experimental data), and (iii) extend our workflow to more than 1,400 Gram-negative strains. Second, we used a top-down pangenome analysis to elucidate the genetic basis for intraserogroup O-antigen structural variations. We assembled a database of O-antigen gene islands from over 11,000 sequenced Salmonella strains, revealing (i) that gene duplication, pseudogene formation, gene deletion, and bacteriophage insertion elements occur ubiquitously across serogroups; (ii) novel serotypes in the group O:4 B2 variant, as well as an additional genotype variant for group O:4, and (iii) two novel O-antigen gene islands in understudied subspecies. We thus comprehensively defined the genetic basis for O-antigen diversity. IMPORTANCE Lipopolysaccharides are a major component of the outer membrane in Gram-negative bacteria. They are composed of a conserved lipid structure that is embedded in the outer leaflet of the outer membrane and a polysaccharide known as the O-antigen. O-antigens are highly variable in structure across strains of a species and are crucial to a bacterium’s interactions with its environment. They constitute the first line of defense against both the immune system and bacteriophage infections and have been shown to mediate antimicrobial resistance. The significance of our research is in identifying the metabolic and genetic differences within and across O-antigen groups in Salmonella strains. Our effort constitutes a first step toward characterizing the O-antigen metabolic network across Gram-negative organisms and a comprehensive overview of genetic variations in Salmonella.
Supplementary data are available at Bioinformatics online.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.