Processes of molecular innovation require tinkering and shifting in the function of existing genes. How this occurs in terms of molecular evolution at long evolutionary scales remains poorly understood. Here, we analyse the natural history of a vast group of membrane-associated molecular systems in Bacteria and Archaea, the type IV filament (TFF) superfamily, that diversified into systems involved in flagellar or twitching motility, adhesion, protein secretion, and DNA uptake. The phylogeny of the thousands of detected systems suggests they may have been present in the last universal common ancestor. From there, two lineages, one bacterial and one archaeal, diversified by multiple gene duplications, gene fissions and deletions, and accretion of novel components. Surprisingly, we find that the ‘tight adherence’ (Tad) systems originated from the interkingdom transfer from Archaea to Bacteria of a system resembling the ‘EppA-dependent’ (Epd) pilus and were associated with the acquisition of a secretin. The phylogeny and content of ancestral systems suggest that initial bacterial pili were engaged in cell motility and/or DNA uptake. In contrast, specialised protein secretion systems arose several times independently and much later in natural history. The functional diversification of the TFF superfamily was accompanied by genetic rearrangements with implications for genetic regulation and horizontal gene transfer: systems encoded in fewer loci were more frequently exchanged between taxa. This may have contributed to their rapid evolution and spread across Bacteria and Archaea. Hence, the evolutionary history of the superfamily reveals an impressive catalogue of molecular evolution mechanisms that resulted in remarkable functional innovation and specialisation from a relatively small set of components.
The evolution of protein secretion systems of Bacteria, and related nanomachines, remains enigmatic. Secretion is important for biotic and abiotic interactions, and secretion systems evolved by co-option of machinery for motility, conjugation, injection, or adhesion. Some secretion systems emerged many times, whereas others are unique. Their evolution occurred by successive rounds of gene accretion, deletion, and horizontal transfer, resulting in machines that can be very different from the original ones. The frequency of co-option depends on the complexity of the systems, their differences to the ancestral machines, the availability of genetic material to tinker with, and possibly on the mechanisms of effector recognition. Understanding the evolution of secretion systems illuminates their functional diversification and could drive the discovery of novel systems.
Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. MacSyFinder uses these properties to model and then annotate cellular functions in microbial genomes. This is done by integrating the identification of each individual gene at the level of the molecular system. We hereby present a major release of MacSyFinder (Macromolecular System Finder), MacSyFinder version 2 (v2). This new version is coded in Python 3 (>= 3.7). The code was improved and rationalized to facilitate future maintainability. Several new features were added to allow more flexible modelling of the systems. We introduce a more intuitive and comprehensive search engine to identify all the best candidate systems and sub-optimal ones that respect the models' constraints. We also introduce the novel macsydata companion tool that enables the easy installation and broad distribution of the models developed for MacSyFinder (macsy-models) from GitHub repositories. Finally, we have updated, improved, and made available MacSyFinder popular models for this novel version: TXSScan to identify protein secretion systems, TFFscan to identify type IV filaments, CONJscan to identify conjugative systems, and CasFinder to identify CRISPR associated proteins.
Protein secretion systems are complex molecular machineries that translocate proteins through the outer membrane and sometimes through multiple other barriers. They have evolved by co-option of components from other envelope-associated cellular machineries, which sometimes makes them difficult to identify and discriminate. Here, we describe how to identify protein secretion systems in bacterial genomes using the MacSyFinder program. This flexible computational tool uses the knowledge gathered from experimental studies to identify homologous systems in genome data. It can be used with a set of pre-defined MacSyFinder models, "TXSScan", to identify all major secretion systems of diderm bacteria (i.e., with inner and LPS-containing outer membranes) as well as evolutionarily related cell appendages (pili and flagella). For this, it identifies and clusters co-localized genes encoding proteins of secretion systems using sequence similarity search with Hidden Markov Model (HMM) protein profiles. Finally, it checks if the clusters' genetic content and genomic organization satisfy the constraints of the model. TXSScan models can be altered in the command line or customized to search for variants of known secretion systems. Models can also be built from scratch to identify novel systems. In this chapter, we describe a complete pipeline of analysis, starting from i) the integration of information from a reference set of experimentally studied systems, ii) the identification of conserved proteins and the construction of their HMM protein profiles, iii) the definition and optimization of "macsy-models", and iv) their use and online distribution as tools to search genomic data for secretion systems of interest. MacSyFinder is available here: https://github.com/gem-pasteur/macsyfinder, and MacSyFinder models here: https://github.com/macsy-models.
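The two-step logic described in this abstract (cluster co-localized profile hits, then check the cluster's content against the model's constraints) can be illustrated with a minimal Python sketch. This is not MacSyFinder's code or API: the `Hit` class, the function names, the gene labels, and the thresholds are all hypothetical, and the real program applies richer rules (for example, multi-locus systems and forbidden genes) that are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A profile hit on a replicon (hypothetical, simplified representation)."""
    gene: str      # label of the HMM profile that matched
    position: int  # rank of the matching gene along the replicon


def cluster_hits(hits, max_gap):
    """Group hits whose consecutive positions differ by at most max_gap."""
    clusters = []
    for hit in sorted(hits, key=lambda h: h.position):
        if clusters and hit.position - clusters[-1][-1].position <= max_gap:
            clusters[-1].append(hit)   # close enough: extend current cluster
        else:
            clusters.append([hit])     # too far (or first hit): new cluster
    return clusters


def satisfies_model(cluster, mandatory, accessory, min_mandatory, min_total):
    """Check a simple quorum: enough mandatory genes and enough genes overall."""
    found = {h.gene for h in cluster}
    n_mandatory = len(found & mandatory)
    n_total = len(found & (mandatory | accessory))
    return n_mandatory >= min_mandatory and n_total >= min_total


# Toy input: three neighbouring hits and one isolated hit (made-up labels).
hits = [Hit("geneA", 10), Hit("geneB", 12), Hit("geneC", 13), Hit("geneA", 80)]
clusters = cluster_hits(hits, max_gap=5)

# Assumed toy model: geneA and geneB mandatory, geneC accessory.
model = dict(mandatory={"geneA", "geneB"}, accessory={"geneC"},
             min_mandatory=2, min_total=3)
valid = [c for c in clusters if satisfies_model(c, **model)]
```

With this toy input, the first three hits form one cluster that meets the quorum, while the isolated `geneA` hit forms a second cluster that does not. Changing `max_gap` or the quorum values corresponds, loosely, to the kind of model adjustment the abstract mentions for searching variants of known systems.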
Type IV filaments (T4F), which are helical assemblies of type IV pilins, constitute a superfamily of filamentous nanomachines virtually ubiquitous in prokaryotes that mediate a wide variety of functions. The competence (Com) pilus is a widespread T4F, mediating DNA uptake (the first step in natural transformation) in bacteria with one membrane (monoderms), an important mechanism of horizontal gene transfer. Here, we report the results of genomic, phylogenetic, and structural analyses of ComGC, the major pilin subunit of Com pili. By performing a global comparative analysis, we show that Com pili genes are virtually ubiquitous in Bacilli, a major monoderm class of Firmicutes. This also revealed that ComGC displays extensive sequence conservation, defining a monophyletic group among type IV pilins. We further report ComGC solution structures from two naturally competent human pathogens, Streptococcus sanguinis (ComGCSS) and Streptococcus pneumoniae (ComGCSP), revealing that this pilin displays extensive structural conservation. Strikingly, ComGCSS and ComGCSP exhibit a novel type IV pilin fold that is purely helical. Results from homology modeling analyses suggest that the unusual structure of ComGC is compatible with helical filament assembly. Because ComGC displays such a widespread distribution, these results have implications for hundreds of monoderm species.
Type IV pili (T4P) are dynamic surface appendages that promote virulence, biofilm formation, horizontal gene transfer, and motility in diverse bacterial species. Pilus dynamic activity is best characterized in T4P that use distinct ATPase motors for pilus extension and retraction. Many T4P systems, however, lack a dedicated retraction motor, and the mechanism underlying this motor-independent retraction remains a mystery. Using the Vibrio cholerae competence pilus as a model system, we identify mutations in the major pilin gene that enhance motor-independent retraction. These mutants likely diminish pilin–pilin interactions within the filament to produce less-stable pili. One mutation adds a bulky residue to α1C, a universally conserved feature of T4P. We found that inserting a bulky residue into α1C of the retraction motor–dependent Acinetobacter baylyi competence T4P enhances motor-independent retraction. Conversely, removing bulky residues from α1C of the retraction motor–independent, V. cholerae toxin-coregulated T4P stabilizes the filament and diminishes pilus retraction. Furthermore, alignment of pilins from the broader type IV filament (T4F) family indicated that retraction motor–independent T4P, gram-positive Com pili, and type II secretion systems generally encode larger residues within α1C oriented toward the pilus core compared to retraction motor–dependent T4P. Together, our data demonstrate that motor-independent retraction relies, in part, on the inherent instability of the pilus filament, which may be a conserved feature of diverse T4Fs. This provides evidence for a long-standing yet previously untested model in which pili retract in the absence of a motor by spontaneous depolymerization.
Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. Macromolecular System Finder (MacSyFinder) is a program that uses these properties to model and then annotate cellular functions in microbial genomes. This is done by integrating the identification of each individual gene at the level of the molecular system. We hereby present a major release of MacSyFinder (version 2) coded in Python 3. The code was improved and rationalized to facilitate future maintainability. Several new features were added to allow more flexible modelling of the systems. We introduce a more intuitive and comprehensive search engine to identify all the best candidate systems and sub-optimal ones that respect the models' constraints. We also introduce the novel macsydata companion tool that enables the easy installation and broad distribution of the models developed for MacSyFinder (macsy-models) from GitHub repositories. Finally, we have updated and improved MacSyFinder popular models: TXSScan to identify protein secretion systems, TFFscan to identify type IV filaments, CONJscan to identify conjugative systems, and CasFinder to identify CRISPR associated proteins. MacSyFinder and the updated models are available at: https://github.com/gem-pasteur/macsyfinder and https://github.com/macsy-models.