The last eukaryote common ancestor (LECA) possessed mitochondria and all key traits that make eukaryotic cells more complex than their prokaryotic ancestors, yet the timing of mitochondrial acquisition and the role of mitochondria in the origin of eukaryote complexity remain debated. Here we report evidence from gene duplications in LECA indicating an early origin of mitochondria. Among 163,545 duplications in 24,571 gene trees spanning 150 sequenced eukaryotic genomes, we identify 713 gene duplication events that occurred in LECA. LECA's bacterial derived genes include numerous mitochondrial functions and were duplicated significantly more often than archaeal derived and eukaryote specific genes. The surplus of bacterial derived duplications in LECA most likely reflects the serial copying of genes from the mitochondrial endosymbiont to the archaeal host's chromosomes. Clustering, phylogenies and likelihood ratio tests for 22.4 million genes from 5,655 prokaryotic and 150 eukaryotic genomes reveal no evidence for lineage specific gene acquisitions in eukaryotes, except from the plastid in the plant lineage. That finding, and the functions of bacterial genes duplicated in LECA, suggest that the bacterial genes in eukaryotes are acquisitions from the mitochondrion, followed by vertical gene evolution and differential loss across eukaryotic lineages, flanked by concomitant lateral gene transfer among prokaryotes. Overall, the data indicate that recurrent gene transfer via the copying of genes from a resident mitochondrial endosymbiont to archaeal host chromosomes preceded the onset of eukaryotic cellular complexity, favoring mitochondria-early over mitochondria-late hypotheses for eukaryote origin.
Metagenomic studies permit the exploration of microbial diversity in a defined habitat and binning procedures enable phylogenomic analyses, taxon description and even phenotypic characterizations in the absence of morphological evidence. Such lineages include asgard archaea, which were initially reported to represent archaea with eukaryotic cell complexity, although the first images of such an archaeon show simple cells with prokaryotic characteristics. However, these metagenome-assembled genomes (MAGs) might suffer from data quality problems not encountered in sequences from cultured organisms due to two common analytical procedures of bioinformatics: assembly of metagenomic sequences and binning of assembled sequences on the basis of innate sequence properties and abundance across samples. Consequently, genomic sequences of distantly related taxa, or domains, can in principle be assigned to the same MAG and result in chimeric sequences. The impacts of low-quality or chimeric MAGs on phylogenomic and metabolic prediction remain unknown. Debates that asgard archaeal data are contaminated with eukaryotic sequences are overshadowed by the lack of evidence indicating that individual asgard MAGs stem from the same chromosome. Here we show that universal proteins including ribosomal proteins of asgard archaeal MAGs fail to meet the basic phylogenetic criterion fulfilled by genome sequences of cultured archaea investigated to date: these proteins do not share common evolutionary histories to the same extent as pure culture genomes (PCGs) do, pointing to a chimeric nature of asgard archaeal MAGs. Our analysis suggests that some asgard archaeal MAGs represent unnatural constructs, genome-like patchworks of genes resulting from assembly and/or the binning process.
Background: The origin of eukaryotic cells was an important transition in evolution. The factors underlying the origin and evolutionary success of the eukaryote lineage are still discussed. One camp argues that mitochondria were essential for eukaryote origin because of the unique configuration of internalized bioenergetic membranes that they conferred to the common ancestor of all known eukaryotic lineages. A recent paper by Lynch and Marinov concluded that mitochondria were energetically irrelevant to eukaryote origin, a conclusion based on analyses of previously published numbers of various molecules and ribosomes per cell and cell volumes as a presumed proxy for the role of mitochondria in evolution. Their numbers were purportedly extracted from the literature. Results: We have examined the numbers upon which the recent study was based. We report that for a sample of 80 numbers that were purportedly extracted from the literature and that underlie key inferences of the recent study, more than 50% of the values do not exist in the cited papers to which the numbers are attributed. The published result cannot be independently reproduced. Other numbers that the recent study reports differ inexplicably from those in the literature to which they are ascribed. We list the discrepancies between the recently published numbers and the purported literature sources of those numbers in a head-to-head manner so that the discrepancies are readily evident, although the source of error underlying the discrepancies remains obscure. Conclusion: The data purportedly supporting the view that mitochondria had no impact upon eukaryotic evolution exhibit notable irregularities. The paper in question evokes the impression that the published numbers are accurate to as many as seven significant digits, when in fact more than half the numbers are nowhere to be found in the literature to which they are attributed.
Though the reasons for the discrepancies are unknown, it is important to air these issues, lest the prominent paper in question become a point source of a snowballing error through the literature or become interpreted as a form of evidence that mitochondria were irrelevant to eukaryote evolution. Reviewers: This article was reviewed by Eric Bapteste, Jianzhi Zhang and Martin Lercher.
Metagenomic studies have claimed the existence of novel lineages with unprecedented properties never before observed in prokaryotes. Such lineages include Asgard archaea [1-3], which are purported to represent archaea with eukaryotic cell complexity, and the Candidate Phyla Radiation (CPR), a novel domain level taxon erected solely on the basis of metagenomic data [4]. However, it has escaped the attention of most biologists that these metagenomic sequences are not assembled into genomes by sequence overlap, as for cultured archaea and bacteria. Instead, short contigs are sorted into computer files by a process called binning in which they receive taxonomic assignment on the basis of sequence properties like GC content, dinucleotide frequencies, and stoichiometric co-occurrence across samples. Consequently, they are not genome sequences as we know them, reflecting the gene content of real organisms. Rather they are metagenome assembled genomes (MAGs). Debates that Asgard data are contaminated with individual eukaryotic sequences [5-7] are overshadowed by the more pressing issue that no evidence exists to indicate that any sequences in binned Asgard MAGs actually stem from the same chromosome, as opposed to simply stemming from the same environment. Here we show that Asgard and CPR MAGs fail spectacularly to meet the most basic phylogenetic criterion [8] fulfilled by genome sequences of all cultured prokaryotes investigated to date: the ribosomal proteins of Asgard and CPR MAGs do not share common evolutionary histories. Their phylogenetic behavior is anomalous to a degree never observed with genomes of real organisms. CPR and Asgard MAGs are binning artefacts, assembled from environments where up to 90% of the DNA is from dead cells [9-12].
Asgard and CPR MAGs are unnatural constructs, genome-like patchworks of genes that have been stitched together into computer files by binning.
The sequencing of environmental DNA (metagenomics) has become an essential tool of modern science because it allows microbiologists to uncover the existence of genes and species in environments such as marine sediment or the deep biosphere from which representatives cannot readily be cultured [13,14]. Initially an endeavor involving rRNA sequences [15], metagenomic investigations have led to the binning-assembly and deposition in databases of MAGs, sequences of similar GC content and stoichiometry across samples. Because rRNA has limited phylogenetic resolution, concatenated sequences of ribosomal proteins (r-proteins) and other universally distributed proteins are commonly used for phylogeny. This practice is well established with over 20 years of tradition, whereby the validity of using concatenated r-proteins for phylogeny lies in the reproducible crosschecking result that individual r-proteins from the same sequenced genome give the same or very similar trees [16-21]. Based on such precedent, it became common practice to use concatenated r-proteins from sequenced genomes for microbial phylogeny without first crosschecking w...
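The compositional properties named above as binning criteria (GC content and dinucleotide frequencies) can be illustrated with a minimal sketch. This is not the procedure of any particular binning tool; the function names and the simple distance measure are assumptions made here for illustration only, and co-occurrence across samples is not modeled.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


def dinucleotide_frequencies(seq):
    """Relative frequency of each of the 16 overlapping dinucleotides."""
    seq = seq.upper()
    pairs = ["".join(p) for p in product("ACGT", repeat=2)]
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[p] for p in pairs)
    if total == 0:
        raise ValueError("sequence contains no unambiguous dinucleotide")
    return {p: counts[p] / total for p in pairs}


def composition_distance(seq_a, seq_b):
    """Hypothetical dissimilarity: summed absolute differences in
    dinucleotide frequencies plus the absolute difference in GC content."""
    freq_a = dinucleotide_frequencies(seq_a)
    freq_b = dinucleotide_frequencies(seq_b)
    dinuc_part = sum(abs(freq_a[p] - freq_b[p]) for p in freq_a)
    return dinuc_part + abs(gc_content(seq_a) - gc_content(seq_b))
```

Two contigs with a small `composition_distance` would look alike to a composition-based binner whether or not they come from the same chromosome, which is the concern the text raises.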
In prokaryotes, known mechanisms of lateral gene transfer (transformation, transduction, conjugation, and gene transfer agents) generate new combinations of genes among chromosomes during evolution. In eukaryotes, whose host lineage is descended from archaea, lateral gene transfer from organelles to the nucleus occurs at endosymbiotic events. Recent genome analyses studying gene distributions have uncovered evidence for sporadic, discontinuous events of gene transfer from bacteria to archaea during evolution. Other studies have used traditional models designed to investigate gene family size evolution (Count) to support claims that gene transfer to archaea was continuous during evolution, rather than involving occasional periodic mass gene influx events. Here, we show that the methodology used in analyses favoring continuous gene transfers to archaea was misapplied in other studies and does not recover known events of single simultaneous origin for many genes followed by differential loss in real data: plastid genomes. Using the same software and the same settings, we reanalyzed presence/absence pattern data for proteins encoded in plastid genomes and for eukaryotic protein families acquired from plastids. Contrary to expectations under a plastid origin model, we found that the methodology employed inferred that gene acquisitions occurred uniformly across the plant tree. Sometimes as many as nine different acquisitions by plastid DNA were inferred for the same protein family. That is, the methodology that recovered gradual and continuous lateral gene transfer among lineages for archaea obtains the same result for plastids, even though it is known that massive gains followed by gradual differential loss is the true evolutionary process that generated plastid gene distribution data. 
Our findings caution against the use of models designed to study gene family size evolution for investigating gene transfer processes, especially when transfers involving more than one gene per event are possible.
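The scenario of a single origin followed by differential loss can be made concrete with a small Dollo-style parsimony sketch. This is not the Count software and not the authors' analysis; the nested-tuple tree encoding, the function names, and the example tree are all assumptions made here for illustration. Given a presence/absence pattern at the tips, the sketch places one gain at the smallest clade containing every tip that carries the gene and counts the losses needed to explain the absences.

```python
def leaf_set(node):
    """All leaf names below a node; leaves are strings, clades are tuples."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def origin_node(node, present):
    """Smallest subtree that contains every leaf carrying the gene."""
    if not isinstance(node, str):
        for child in node:
            if present <= leaf_set(child):
                return origin_node(child, present)
    return node


def count_losses(node, present):
    """Number of maximal subtrees below node in which the gene is absent."""
    if not (leaf_set(node) & present):
        return 1  # one loss on the branch leading to this subtree
    if isinstance(node, str):
        return 0
    return sum(count_losses(child, present) for child in node)


def single_gain_losses(tree, present):
    """Losses required if the gene was gained exactly once."""
    present = frozenset(present)
    if not present:
        raise ValueError("gene is absent from every leaf")
    return count_losses(origin_node(tree, present), present)


# Hypothetical six-taxon tree used only for illustration.
EXAMPLE_TREE = ((("A", "B"), ("C", "D")), ("E", "F"))
```

For `EXAMPLE_TREE`, a gene present only in A and C is explained by one gain in the ancestor of A-D plus two losses (in B and in D). A model that permits gains on any branch could instead infer separate acquisitions in A and in C, which is the kind of inference the abstract cautions against when the single origin is known.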