Transcription factors (TFs) play a central role in genome regulation, directing gene transcription by binding specific DNA sequences. Eukaryotic genomes encode a large diversity of TF classes, each defined by unique DNA-interaction domains. Recent advances in genome sequencing and phylogenetic placement of diverse eukaryotic and archaeal species are redefining the evolutionary history of eukaryotic TFs. The emerging view from a comparative genomics perspective is that the Last Eukaryotic Common Ancestor (LECA) had an extensive repertoire of TFs, most of which represent eukaryotic evolutionary novelties. This burst of TF innovation coincides with the emergence of genomic nuclear segregation and complex chromatin organization. [...] pronounced in plants and animals, both of which encode the most diverse and abundant TF repertoires [3]. This review discusses the emergence and diversification of eukaryotic TF classes, as well as the modes of TF acquisition and the evidence of conserved TF functionality across eukaryotes.
Revisiting Transcription Factor diversity across the tree of life
The continuously growing availability of genome sequence data from key branches of the tree of life is transforming our understanding of the evolution of major eukaryotic gene families. For example, several deep-branching eukaryotic species have recently been described and/or sequenced for the first time [5][6][7][8]. Similarly, the discovery and placement of Asgard archaea as the sister group to eukaryotes reshaped our view of eukaryotic origins [9,10]. Although there is not yet a consensus on the phylogenetic root of eukaryotes, phylogenomic analyses have reduced the potential eukaryotic tree topologies to a few alternative options, which chiefly differ in the phylogenetic position of Discoba and Metamonada [5,11]. Taking advantage of these new genomic data, we reviewed the distribution of a curated list of DNA-binding domains (DBDs) representing 74 TF classes in 158 eukaryotic species, 265 archaea and 5,394 bacteria (Figures 1 and 2) [12].
Some TF classes have pre-eukaryotic origins. For example, components of the basal transcription factor machinery are present in multiple archaeal species [13,14], including TBP (TATA-box-binding protein), NFYB (Nuclear transcription factor Y subunit beta) and TFIIB (Figure 2). CSD TFs are also found across all domains of life. Interestingly, some Asgard archaea also encode E2F/TDP, which is a key cell cycle regulator in eukaryotes [15]. This constitutes a new example of a gene family shared between Asgard archaea and eukaryotes but absent from other archaeal lineages [9,10], thus reinforcing the view of an Asgard-like ancestor as the initial step towards eukaryogenesis.
An additional group of TFs is found in a small number of bacterial species. For example, AP2 and Myb TFs are found in 149 and 257 bacterial species, respectively. There are three possible explanations for these observed distributions. First, this could indicate that these TFs have bacterial origins [13].
Second, some bacterial lineages could have acquired eukary...