Functional partnerships between proteins are at the core of complex cellular phenotypes, and the networks formed by interacting proteins provide researchers with crucial scaffolds for modeling, data reduction and annotation. STRING is a database and web resource dedicated to protein–protein interactions, including both physical and functional interactions. It weights and integrates information from numerous sources, including experimental repositories, computational prediction methods and public text collections, thus acting as a meta-database that maps all interaction evidence onto a common set of genomes and proteins. The most important new developments in STRING 8 over previous releases include a URL-based programming interface, which can be used to query STRING from other resources, improved interaction prediction via genomic neighborhood in prokaryotes, and the inclusion of protein structures. Version 8.0 of STRING covers about 2.5 million proteins from 630 organisms, providing the most comprehensive view on protein–protein interactions currently available. STRING can be reached at http://string-db.org/.
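As an illustration of how a URL-based interface of this kind can be queried from another resource, here is a minimal sketch that only builds a request URL and performs no network access. The endpoint layout (`/api/<format>/<method>`), the `identifiers` and `species` parameter names, and the helper name `build_string_url` are assumptions modeled on the present-day public STRING API; the interface shipped with version 8 may have differed.

```python
from urllib.parse import urlencode

# Assumed base address and path layout; not taken from the abstract.
STRING_API_BASE = "https://string-db.org/api"


def build_string_url(identifiers, species, output_format="tsv", method="network"):
    """Build a query URL for a set of protein identifiers (hypothetical helper).

    Identifiers are joined with a carriage return, which urlencode
    percent-encodes as %0D.
    """
    params = {"identifiers": "\r".join(identifiers), "species": species}
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


# Example: human TP53 and MDM2 (NCBI taxonomy ID 9606).
url = build_string_url(["TP53", "MDM2"], 9606)
print(url)
```

The resulting string can be passed to any HTTP client; keeping URL construction separate from the request makes it easy to inspect the query before sending it.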
Changes in gene expression are thought to underlie many of the phenotypic differences between species. However, large-scale analyses of gene expression evolution were until recently prevented by technological limitations. Here we report the sequencing of polyadenylated RNA from six organs across ten species that represent all major mammalian lineages (placentals, marsupials and monotremes) and birds (the evolutionary outgroup), with the goal of understanding the dynamics of mammalian transcriptome evolution. We show that the rate of gene expression evolution varies among organs, lineages and chromosomes, owing to differences in selective pressures: transcriptome change was slow in nervous tissues and rapid in testes, slower in rodents than in apes and monotremes, and rapid for the X chromosome right after its formation. Although gene expression evolution in mammals was strongly shaped by purifying selection, we identify numerous potentially selectively driven expression switches, which occurred at different rates across lineages and tissues and which probably contributed to the specific organ biology of various mammals.
The identification of orthologous genes forms the basis for most comparative genomics studies. Existing approaches either lack functional annotation of the identified orthologous groups, hampering the interpretation of subsequent results, or are manually annotated and thus lag behind the rapid sequencing of new genomes. Here we present the eggNOG database (‘evolutionary genealogy of genes: Non-supervised Orthologous Groups’), which contains orthologous groups constructed from Smith–Waterman alignments through identification of reciprocal best matches and triangular linkage clustering. Applying this procedure to 312 bacterial, 26 archaeal and 35 eukaryotic genomes yielded 43 582 coarse-grained orthologous groups of which 9724 are extended versions of those from the original COG/KOG database. We also constructed more fine-grained groups for selected subsets of organisms, such as the 19 914 mammalian orthologous groups. We automatically annotated our non-supervised orthologous groups with functional descriptions, which were derived by identifying common denominators for the genes based on their individual textual descriptions, annotated functional categories, and predicted protein domains. The orthologous groups in eggNOG contain 1 241 751 genes and provide at least a broad functional description for 77% of them. Users can query the resource for individual genes via a web interface or download the complete set of orthologous groups at http://eggnog.embl.de.
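The two steps named above, reciprocal best matches followed by triangular linkage clustering, can be sketched on toy data as follows. This is a simplified illustration and not the eggNOG pipeline: the function names, the score dictionary format, and the rule that triangles are merged when they share an edge (as in the COG procedure) are assumptions, and the alignment step and in-paralog handling are omitted.

```python
from itertools import combinations


def reciprocal_best_hits(scores, genome_of):
    """Return gene pairs that are each other's best match across genomes.

    scores: {(query, subject): alignment score}, one entry per direction.
    genome_of: {gene: genome label}.
    """
    best = {}  # (query, subject genome) -> (score, subject)
    for (query, subject), score in scores.items():
        if genome_of[query] == genome_of[subject]:
            continue  # only matches between different genomes count
        key = (query, genome_of[subject])
        if key not in best or score > best[key][0]:
            best[key] = (score, subject)
    pairs = set()
    for (query, _), (_, subject) in best.items():
        back = best.get((subject, genome_of[query]))
        if back is not None and back[1] == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def triangle_clusters(pairs):
    """Merge triangles of reciprocal best hits that share an edge."""
    neighbours = {}
    for pair in pairs:
        a, b = tuple(pair)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    triangles = set()
    for pair in pairs:
        a, b = tuple(pair)
        for c in neighbours[a] & neighbours[b]:
            triangles.add(frozenset((a, b, c)))

    # Union-find over triangles, linked through shared edges.
    parent = {t: t for t in triangles}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    edge_owner = {}
    for t in triangles:
        for edge in combinations(sorted(t), 2):
            e = frozenset(edge)
            if e in edge_owner:
                parent[find(t)] = find(edge_owner[e])
            else:
                edge_owner[e] = t
    groups = {}
    for t in triangles:
        groups.setdefault(find(t), set()).update(t)
    return sorted(sorted(g) for g in groups.values())


# Toy data: the first letter of each gene name is its genome (made-up scores).
sym = {("A1", "B1"): 100, ("A1", "C1"): 90, ("B1", "C1"): 95,
       ("B1", "D1"): 80, ("C1", "D1"): 85, ("A2", "B2"): 70}
scores = {}
for (a, b), s in sym.items():
    scores[(a, b)] = s
    scores[(b, a)] = s
genome_of = {g: g[0] for pair in sym for g in pair}

groups = triangle_clusters(reciprocal_best_hits(scores, genome_of))
print(groups)
```

In the toy data, the triangles A1-B1-C1 and B1-C1-D1 share the edge B1-C1 and merge into one group, while the pair A2-B2 forms no triangle and is left unclustered.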
A large-scale comparative gene expression study reveals the different ways in which the chromosome-wide gene dosage reductions resulting from sex chromosome differentiation events were compensated during mammalian and avian evolution.
The identification of orthologous relationships forms the basis for most comparative genomics studies. Here, we present the second version of the eggNOG database, which contains orthologous groups (OGs) constructed through identification of reciprocal best BLAST matches and triangular linkage clustering. We applied this procedure to 630 complete genomes (529 bacteria, 46 archaea and 55 eukaryotes), which is a 2-fold increase relative to the previous version. The pipeline yielded 224 847 OGs, including 9724 extended versions of the original COG and KOG. We computed OGs for different levels of the tree of life; in addition to the species groups included in our first release (i.e. fungi, metazoa, insects, vertebrates and mammals), we have now constructed OGs for archaea, fishes, rodents and primates. We automatically annotate the non-supervised orthologous groups (NOGs) with functional descriptions, protein domains, and functional categories as defined initially for the COG/KOG database. In-depth analysis is facilitated by precomputed high-quality multiple sequence alignments and maximum-likelihood trees for each of the available OGs. Altogether, eggNOG covers 2 242 035 proteins (built from 2 590 259 proteins) and provides a broad functional description for at least 1 966 709 (88%) of them. Users can access the complete set of orthologous groups via a web interface at: http://eggnog.embl.de.
Sex chromosomes differentiated from different ancestral autosomes in various vertebrate lineages. Here, we trace the functional evolution of the XY Chromosomes of the green anole lizard (Anolis carolinensis), on the basis of extensive high-throughput genome, transcriptome and histone modification sequencing data and revisit dosage compensation evolution in representative mammals and birds with substantial new expression data. Our analyses show that Anolis sex chromosomes represent an ancient XY system that originated at least ≈160 million years ago in the ancestor of Iguania lizards, shortly after the separation from the snake lineage. The age of this system approximately coincides with the ages of the avian and two mammalian sex chromosome systems. To compensate for the almost complete Y Chromosome degeneration, X-linked genes have become twofold up-regulated, restoring ancestral expression levels. The highly efficient dosage compensation mechanism of Anolis represents the only vertebrate case identified so far to fully support Ohno's original dosage compensation hypothesis. Further analyses reveal that X up-regulation occurs only in males and is mediated by a male-specific chromatin machinery that leads to global hyperacetylation of histone H4 at lysine 16 specifically on the X Chromosome. The green anole dosage compensation mechanism is highly reminiscent of that of the fruit fly, Drosophila melanogaster. Altogether, our work unveils the convergent emergence of a Drosophila-like dosage compensation mechanism in an ancient reptilian sex chromosome system and highlights that the evolutionary pressures imposed by sex chromosome dosage reductions in different amniotes were resolved in fundamentally different ways.
The properties of genotype–phenotype landscapes are crucial for understanding evolution but are not characterized for most traits. Here, we present a >95% complete local landscape for a defined molecular function—the alternative splicing of a human exon (FAS/CD95 exon 6, involved in the control of apoptosis). The landscape provides important mechanistic insights, revealing that regulatory information is dispersed throughout nearly every nucleotide in an exon, that the exon is more robust to the effects of mutations than its immediate neighbours in genotype space, and that high mutation sensitivity (evolvability) will drive the rapid divergence of alternative splicing between species unless it is constrained by selection. Moreover, the extensive epistasis in the landscape predicts that exonic regulatory sequences may diverge between species even when exon inclusion levels are functionally important and conserved by selection.