We performed a systematic, large-scale analysis of human protein complexes comprising gene products implicated in many different categories of human disease to create a phenome-interactome network. This was done by integrating quality-controlled interactions of human proteins with a validated, computationally derived phenotype similarity score, permitting identification of previously unknown complexes likely to be associated with disease. Using a phenomic ranking of protein complexes linked to human disease, we developed a Bayesian predictor that in 298 of 669 linkage intervals correctly ranks the known disease-causing protein as the top candidate, and in 870 intervals with no identified disease-causing gene, provides novel candidates implicated in disorders such as retinitis pigmentosa, epithelial ovarian cancer, inflammatory bowel disease, amyotrophic lateral sclerosis, Alzheimer disease, type 2 diabetes and coronary heart disease. Our publicly available draft of protein complexes associated with pathology comprises 506 complexes, which reveal functional relationships between disease-promoting genes that will inform future experimentation.
The hyper-variable Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) family, encoded by the var genes, mediates cytoadhesion of infected erythrocytes to human endothelium. Antibodies blocking cytoadhesion are important mediators of malaria immunity acquired by endemic populations. The development of a PfEMP1-based vaccine mimicking naturally acquired immunity depends on a thorough understanding of the evolved PfEMP1 diversity, balancing antigenic variation against conserved receptor binding affinities. This study redefines and reclassifies the domains of PfEMP1 from seven genomes. Analysis of domains in 399 different PfEMP1 sequences allowed identification of several novel domain classes, and a high degree of PfEMP1 domain compositional order, including conserved domain cassettes not always associated with the established group A–E division of PfEMP1. A novel iterative homology block (HB) detection method was applied, allowing identification of 628 conserved minimal PfEMP1 building blocks, describing on average 83% of a PfEMP1 sequence. Using the HBs, similarities between domain classes were determined, and Duffy binding-like (DBL) domain subclasses were found in many cases to be hybrids of major domain classes. Related to this, a recombination hotspot was uncovered between DBL subdomains S2 and S3. The VarDom server is introduced, from which information on domain classes and homology blocks can be retrieved, and new sequences can be classified. Several conserved sequence elements were found, including: (1) residues conserved in all DBL domains predicted to interact and hold together the three DBL subdomains, (2) potential integrin binding sites in DBLα domains, (3) an acylation motif conserved in group A var genes suggesting N-terminal N-myristoylation, (4) PfEMP1 inter-domain regions proposed to be elastic disordered structures, and (5) several conserved predicted phosphorylation sites. Ideally, this comprehensive categorization of PfEMP1 will provide a platform for future studies on var/PfEMP1 expression and function.
The bacterium Yersinia pestis is the etiological agent of plague and has caused human pandemics with millions of deaths in historical times. How and when it originated remains contentious. Here, we report the oldest direct evidence of Yersinia pestis identified by ancient DNA in human teeth from Asia and Europe dating from 2,800 to 5,000 years ago. By sequencing the genomes, we find that these ancient plague strains are basal to all known Yersinia pestis. We find the origins of the Yersinia pestis lineage to be at least two times older than previous estimates. We also identify a temporal sequence of genetic changes that led to increased virulence and the emergence of the bubonic plague. Our results show that plague infection was endemic in the human populations of Eurasia at least 3,000 years before any historical recordings of pandemics.
The description of comammox Nitrospira spp., which perform complete ammonia-to-nitrate oxidation, and their co-occurrence with canonical β-proteobacterial ammonia-oxidizing bacteria (β-AOB) in the environment raise questions about the metabolic potential of comammox Nitrospira and the evolutionary history of their ammonia oxidation pathway. We report four new comammox Nitrospira genomes, constituting two novel species, and the first comparative genomic analysis of comammox Nitrospira. Unlike canonical Nitrospira, comammox Nitrospira genomes lack genes for assimilatory nitrite reduction, suggesting that they have lost the potential to use external nitrite nitrogen sources. By contrast, compared to canonical Nitrospira, comammox Nitrospira harbor a higher diversity of urea transporters and copper homeostasis genes and lack cyanate hydratase genes. Additionally, the two comammox clades differ in their ammonium uptake systems. Contrary to β-AOB, comammox Nitrospira genomes have single copies of the two central ammonia oxidation pathway operons. Similar to ammonia-oxidizing archaea and some oligotrophic AOB strains, they lack genes involved in nitric oxide reduction. Furthermore, comammox Nitrospira genomes encode genes that might allow efficient growth at low oxygen concentrations. Regarding the evolutionary history of comammox Nitrospira, our analyses indicate that several genes belonging to the ammonia oxidation pathway could have been laterally transferred from β-AOB to comammox Nitrospira. We postulate that the absence of comammox genes in other sublineage II Nitrospira genomes is the result of subsequent loss.
Pregnancy-associated malaria is a major health problem, which mainly affects primigravidae living in malaria-endemic areas. The syndrome is precipitated by accumulation of infected erythrocytes in placental tissue through an interaction between chondroitin sulphate A on syncytiotrophoblasts and a parasite-encoded protein on the surface of infected erythrocytes, believed to be VAR2CSA. VAR2CSA is a polymorphic protein of approximately 3,000 amino acids forming six Duffy-binding-like (DBL) domains. For vaccine development it is important to define the antigenic targets for protective antibodies and to characterize the consequences of sequence variation. In this study, we used a combination of in silico tools, peptide arrays, and structural modeling to show that sequence variation mainly occurs in regions under strong diversifying selection, predicted to form flexible loops. These regions are the main targets of naturally acquired immunoglobulin gamma and are accessible to antibodies reacting with native VAR2CSA on infected erythrocytes. Interestingly, surface-reactive anti-VAR2CSA antibodies also target a conserved DBL3X region predicted to form an α-helix. Finally, we identified DBL3X sequence motifs that were more likely to occur in parasites isolated from primi- and multigravidae, respectively. These findings strengthen the vaccine candidacy of VAR2CSA and will be important for choosing epitopes and variants of DBL3X to be included in a vaccine protecting women against pregnancy-associated malaria.
Some proteins are highly conserved across all species, whereas others diverge significantly even between closely related species. Attempts have been made to correlate the rate of protein evolution to amino acid composition, protein dispensability, and the number of protein-protein interactions, but in all cases, conflicting studies have shown that the theories are hard to confirm experimentally. The only correlation that is undisputed so far is that highly/broadly expressed proteins seem to evolve at a lower rate. Consequently, it has been suggested that correlations between evolution rate and factors like protein dispensability or the number of protein-protein interactions could be just secondary effects due to differences in expression. The purpose of this study was to analyze mammalian proteins/genes with known subcellular location for variations in evolution rates. We show that proteins that are exported (extracellular proteins) evolve faster than proteins that reside inside the cell (intracellular proteins). We find weak, but significant, correlations between evolution rates and expression levels, percentage of tissues in which the proteins are expressed (expression broadness), and the number of protein interaction partners. More importantly, we show that the observed difference in evolution rate between extra- and intracellular proteins is largely independent of expression levels, expression broadness, and the number of protein-protein interactions. We also find that the difference is not caused by an overrepresentation of immunological proteins or disulfide bridge-containing proteins in the extracellular data set. We conclude that the subcellular location of a mammalian protein has a larger effect on its evolution rate than any of the other factors studied in this paper, including expression levels/patterns. We observe a difference in evolution rates between extracellular and intracellular proteins for a yeast data set as well and again show that it is completely independent of expression levels.
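The abstract above does not say how independence from expression level was tested. One common approach, sketched here as an illustration only, is to stratify proteins by expression level and check whether the extracellular/intracellular difference persists within each stratum. The function name `stratified_rate_difference`, the record layout, and the toy values are all hypothetical; they are not taken from the study.

```python
from statistics import median


def stratified_rate_difference(records, n_bins=3):
    """Median rate difference (extra minus intra) within expression bins.

    records: list of (location, expression, rate) tuples, where location
    is "extra" or "intra". This layout is a hypothetical convention.
    """
    # Sort by expression so each bin holds proteins of similar expression.
    ordered = sorted(records, key=lambda r: r[1])
    n = len(ordered)
    diffs = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        extra = [rate for loc, _, rate in chunk if loc == "extra"]
        intra = [rate for loc, _, rate in chunk if loc == "intra"]
        # Skip bins that do not contain both groups.
        if extra and intra:
            diffs.append(median(extra) - median(intra))
    return diffs


# Synthetic toy data (not real measurements): rate falls with expression,
# and extracellular proteins get a constant offset of 0.2.
toy = []
for level in range(1, 13):
    toy.append(("intra", level, 1.0 - 0.01 * level))
    toy.append(("extra", level, 1.2 - 0.01 * level))

diffs = stratified_rate_difference(toy)
print(diffs)
```

If the difference stays roughly constant across bins, as it does by construction in this toy data, expression level alone is unlikely to explain it. A real analysis would also need a significance test within each bin.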