Mycobacterium abscessus is an emerging rapidly growing mycobacterium (RGM) causing a pseudotuberculous lung disease to which patients with cystic fibrosis (CF) are particularly susceptible. We report here its complete genome sequence. The genome of M. abscessus (CIP 104536T) consists of a 5,067,172-bp circular chromosome including 4920 predicted coding sequences (CDS), an 81-kb full-length prophage and 5 IS elements, and a 23-kb mercury resistance plasmid almost identical to pMM23 from Mycobacterium marinum. The chromosome encodes many virulence proteins and virulence protein families absent or present in only small numbers in the model RGM species Mycobacterium smegmatis. Many of these proteins are encoded by genes belonging to a “mycobacterial” gene pool (e.g. PE and PPE proteins, MCE and YrbE proteins, lipoprotein LpqH precursors). However, many others (e.g. phospholipase C, MgtC, MsrA, ABC Fe(3+) transporter) appear to have been horizontally acquired from distantly related environmental bacteria with a high G+C content, mostly actinobacteria (e.g. Rhodococcus sp., Streptomyces sp.) and pseudomonads. We also identified several metabolic regions acquired from actinobacteria and pseudomonads (relating to phenazine biosynthesis, homogentisate catabolism, phenylacetic acid degradation, DNA degradation) not present in the M. smegmatis genome. Many of the “non mycobacterial” factors detected in M. abscessus are also present in two of the pathogens most frequently isolated from CF patients, Pseudomonas aeruginosa and Burkholderia cepacia. This study elucidates the genetic basis of the unique pathogenicity of M. abscessus among RGM, and raises the question of similar mechanisms of pathogenicity shared by unrelated organisms in CF patients.
Background: The outermost layer of the bacterial surface is of crucial importance because it is in constant interaction with the host. Glycopeptidolipids (GPLs) are major surface glycolipids present on various mycobacterial species. In the fast-grower model organism Mycobacterium smegmatis, GPL biosynthesis involves approximately 30 genes all mapping to a single region of 65 kb. Results: We have recently sequenced the complete genomes of two fast-growers causing human infections, Mycobacterium abscessus (CIP 104536T) and M. chelonae (CIP 104535T). We show here that these two species contain genes corresponding to all those of the M. smegmatis "GPL locus", with extensive conservation of the predicted protein sequences consistent with the production of GPL molecules indistinguishable by biochemical analysis. However, the GPL locus appears to be split into several parts in M. chelonae and M. abscessus. One large cluster (19 genes) comprises all genes involved in the synthesis of the tripeptide-aminoalcohol moiety, the glycosylation of the lipopeptide and methylation/acetylation modifications. We provide evidence that a duplicated acetyltransferase (atf1 and atf2) in M. abscessus and M. chelonae has evolved through specialization, with each copy transferring a single acetyl group in a sequential manner. There is a second smaller and distant (M. chelonae, 900 kb; M. abscessus, 3 Mb) cluster of six genes involved in the synthesis of the fatty acyl moiety and its attachment to the tripeptide-aminoalcohol moiety. The other genes are scattered throughout the genome, including two genes encoding putative regulatory proteins. Conclusion: Although these three species produce identical GPL molecules, the organization of GPL genes differs between them, thus constituting species-specific signatures. One hypothesis is that the compact organization of the GPL locus in M. smegmatis represents the ancestral form and that evolution has scattered various pieces throughout the genome in M. abscessus and M. chelonae.
Most proteins comprise one or several domains. New domain architectures can be created by combining previously existing domains. The elementary events that create new domain architectures may be categorized into three classes, namely domain(s) insertion or deletion (indel), exchange and repetition. Using 'DomainTeam', a tool dedicated to the search for microsyntenies of domains, we quantified the relative contribution of these events. This tool allowed us to collect homologous bacterial genes encoding proteins that have obviously evolved by modular assembly of domains. We show that indels are the most frequent elementary events and that they occur in most cases at either the N- or C-terminus of the proteins. As revealed by the genomic neighbourhood/context of the corresponding genes, we show that a substantial number of these terminal indels are the consequence of gene fusions/fissions. We provide evidence showing that the contribution of gene fusion/fission to the evolution of multi-domain bacterial proteins is lower-bounded by 27% and upper-bounded by 64%. We conclude that gene fusion/fission is a major contributor to the evolution of multi-domain bacterial proteins.
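The three classes of elementary events named above (indel, exchange, repetition) can be illustrated with a small sketch. This is not the DomainTeam tool or the authors' procedure; it is a simplified, hypothetical classifier that compares two domain architectures written as ordered lists of made-up domain labels, and it assumes the two architectures differ by a single event.

```python
def collapse_repeats(arch):
    """Merge runs of the same consecutive domain into a single occurrence."""
    out = []
    for domain in arch:
        if not out or out[-1] != domain:
            out.append(domain)
    return out


def classify_event(arch_a, arch_b):
    """Classify the difference between two architectures (hypothetical helper)."""
    if arch_a == arch_b:
        return "identical"
    if collapse_repeats(arch_a) == collapse_repeats(arch_b):
        return "repetition"

    shortest = min(len(arch_a), len(arch_b))
    # Length of the shared prefix.
    p = 0
    while p < shortest and arch_a[p] == arch_b[p]:
        p += 1
    # Length of the shared suffix, not overlapping the prefix.
    s = 0
    while s < shortest - p and arch_a[len(arch_a) - 1 - s] == arch_b[len(arch_b) - 1 - s]:
        s += 1

    middle_a = arch_a[p:len(arch_a) - s]
    middle_b = arch_b[p:len(arch_b) - s]
    if middle_a and middle_b:
        return "exchange"  # a block of domains was replaced by another block
    # Exactly one middle block is empty: domains were inserted or deleted.
    if p == 0:
        return "N-terminal indel"
    if s == 0:
        return "C-terminal indel"
    return "internal indel"


# Domain labels below are placeholders, not real Pfam families.
print(classify_event(["A", "B"], ["A", "B", "C"]))
print(classify_event(["B", "C"], ["A", "B", "C"]))
print(classify_event(["A", "B", "C"], ["A", "X", "C"]))
print(classify_event(["A", "B"], ["A", "B", "B"]))
```

Separating terminal from internal indels matters here because, as the abstract notes, terminal indels are the ones that can result from gene fusion or fission.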
DomainSieve is implemented as a web resource and is accessible at http://stat.genopole.cnrs.fr/ds/.
Tenacibaculum maritimum is responsible for tenacibaculosis, a devastating marine fish disease. This filamentous bacterium displays a very broad host range and a worldwide geographical distribution. We analyzed and compared the genomes of 25 T. maritimum strains, including 22 newly draft-sequenced genomes from isolates selected based on available MLST data, geographical origin and host fish. The genome size (~3.356 Mb on average) is very similar across all strains. The core genome is composed of 2116 protein-coding genes accounting for ~75% of the genes in each genome. These conserved regions harbor a moderate level of nucleotide diversity (~0.0071 bp−1), whose analysis reveals an important contribution of recombination (r/m ≥ 7) to the evolutionary process of this cohesive species, which appears subdivided into several subgroups. Association trends between these subgroups and specific geographical origins or ecological niches remain to be clarified. We also evaluated the potential of MALDI-TOF-MS to assess the variability between T. maritimum isolates. Using genome sequence data, several detected mass peaks were assigned to ribosomal proteins. Additionally, variations corresponding to single or multiple amino acid changes in several ribosomal proteins, which explain the detected mass shifts, were identified. By combining nine polymorphic biomarker ions, we identified combinations referred to as MALDI-Types (MTs). By investigating 131 bacterial isolates retrieved from a variety of isolation sources, we identified twenty MALDI-Types as well as four MALDI-Groups (MGs). We propose this MALDI-TOF-MS Multi Peak Shift Typing scheme as a cheap, fast and accurate method for screening T. maritimum isolates for large-scale epidemiological surveys.
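The idea of defining a MALDI-Type as a combination of biomarker variants can be sketched as follows. This is a minimal illustration, not the published typing scheme: the isolate names, the variant labels and the use of three biomarkers instead of nine are all assumptions made for brevity, and type numbers are simply assigned in order of first appearance.

```python
def assign_maldi_types(profiles):
    """Give each distinct combination of biomarker variants its own type label.

    `profiles` maps an isolate name to the ordered list of variants observed
    for each biomarker ion (hypothetical data layout).
    """
    type_ids = {}   # combination of variants -> type label
    assigned = {}   # isolate name -> type label
    for isolate, profile in profiles.items():
        key = tuple(profile)
        if key not in type_ids:
            type_ids[key] = f"MT{len(type_ids) + 1}"
        assigned[isolate] = type_ids[key]
    return assigned


# Placeholder isolates and variant labels, for illustration only.
profiles = {
    "isolate_1": ["a", "a", "b"],
    "isolate_2": ["a", "b", "b"],
    "isolate_3": ["a", "a", "b"],
}
print(assign_maldi_types(profiles))
```

Isolates sharing the same combination receive the same label, so the number of distinct labels grows only when a new combination of peak shifts is observed.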
The detection, across several genomes, of local conservation of gene content and proximity considerably helps the prediction of features of interest, such as gene fusions or physical and functional interactions. Here, we want to process realistic models of chromosomes, in which genes (or genomic segments of several genes) can be duplicated within a chromosome, or be absent from some other chromosome(s). Our approach temporarily sets genes aside and works directly with protein "domains" such as those found in Pfam. This allows the detection of strings of domains that are conserved in their content, but not necessarily in their order, which we refer to as domain teams. The prominent feature of the method is that it relaxes the rigidity of the orthology criterion and avoids many of the pitfalls of gene-family identification methods, which are often hampered by multidomain proteins or low levels of sequence similarity. This approach, which allows both inter- and intrachromosomal comparisons, proves to be more sensitive than the classical methods based on pairwise sequence comparisons, particularly in the simultaneous treatment of many species. The automated and fast detection of domain teams, together with its increased sensitivity at identifying segments of identical (protein-coding) gene content as well as gene fusions, should prove a useful complement to other existing methods.
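The notion of domain strings conserved in content but not in order can be illustrated with a deliberately naive sketch. It is not the domain-teams algorithm itself: it assumes fixed-size windows, ignores windows containing a repeated domain, compares only two chromosomes, and uses placeholder domain labels rather than real Pfam identifiers.

```python
from collections import defaultdict


def flatten(chromosome):
    """Forget gene boundaries: return the chromosome as one list of domains."""
    return [domain for gene in chromosome for domain in gene]


def window_contents(domains, size):
    """Map each order-free set of domains to the window starts where it occurs."""
    seen = defaultdict(list)
    for start in range(len(domains) - size + 1):
        content = frozenset(domains[start:start + size])
        if len(content) == size:  # skip windows with a repeated domain
            seen[content].append(start)
    return seen


def shared_domain_sets(chrom_a, chrom_b, size=3):
    """Return domain sets found in a window of both chromosomes, in any order."""
    a = window_contents(flatten(chrom_a), size)
    b = window_contents(flatten(chrom_b), size)
    return {content: (a[content], b[content]) for content in a.keys() & b.keys()}


# Each chromosome is a list of genes; each gene is a list of domain labels.
chrom_a = [["A"], ["B", "C"], ["D"]]
chrom_b = [["X"], ["C"], ["A", "B"], ["Y"]]
print(shared_domain_sets(chrom_a, chrom_b))
```

In this toy example the set {A, B, C} is found in both chromosomes even though the order differs and the domains are split across genes differently, which is the kind of situation (including possible gene fusions) that a gene-level comparison could miss.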
At the onset of the initiation of chromosome replication, bacterial replicative helicases are recruited and loaded on the DnaA-oriC nucleoprotein platform, assisted by proteins like DnaC/DnaI or DciA. Two orders of bacteria appear, however, to lack either of these factors, raising the question of the essentiality of these factors in bacteria. Through a phylogenomic approach, we identified a pair of genes that could have substituted for dciA. The two genes are specific to the dnaC/dnaI- and dciA-lacking organisms and were apparently domesticated from lambdoid phage genes. They derive from λO and λP and were renamed dopC and dopE, respectively. DopE is expected to bring the replicative helicase to the bacterial origin of replication, while DopC might assist DopE in this function. Confirmation that DopCE is involved in handling the replicative helicase at the onset of replication in these organisms would generalize to all bacteria, and therefore to all living organisms, the need for specific factors dedicated to this function.