Congenital heart disease (CHD) is the most frequent birth defect, affecting 0.8% of live births [1]. Many cases occur sporadically and impair reproductive fitness, suggesting a role for de novo mutations. By analysis of exome sequencing of parent-offspring trios, we compared the incidence of de novo mutations in 362 severe CHD cases and 264 controls. CHD cases showed a significant excess of protein-altering de novo mutations in genes expressed in the developing heart, with an odds ratio of 7.5 for damaging mutations. Similar odds ratios were seen across major classes of severe CHD. We found a marked excess of de novo mutations in genes involved in production, removal or reading of H3K4 methylation (H3K4me), or ubiquitination of H2BK120, which is required for H3K4 methylation [2–4]. There were also two de novo mutations in SMAD2; SMAD2 signaling in the embryonic left-right organizer induces demethylation of H3K27me [5]. H3K4me and H3K27me mark 'poised' promoters and enhancers that regulate expression of key developmental genes [6]. These findings implicate de novo point mutations in several hundred genes that collectively contribute to ~10% of severe CHD.
Universal trees of life based on small-subunit (SSU) ribosomal RNA (rRNA) support the separate mono/holophyly of the domains Archaea (archaebacteria), Bacteria (eubacteria) and Eucarya (eukaryotes) and the placement of extreme thermophiles at the base of the Bacteria. The concept of universal tree reconstruction recently has been upset by protein trees that show intermixing of species from different domains. Such tree topologies have been attributed to either extensive horizontal gene transfer or degradation of phylogenetic signals because of saturation for amino acid substitutions. Here we use large combined alignments of 23 orthologous proteins conserved across 45 species from all domains to construct highly robust universal trees. Although individual protein trees are variable in their support of domain integrity, trees based on combined protein data sets strongly support separate monophyletic domains. Within the Bacteria, we placed spirochaetes as the earliest derived bacterial group. However, elimination from the combined protein alignment of nine protein data sets, which were likely candidates for horizontal gene transfer, resulted in trees showing thermophiles as the earliest evolved bacterial lineage. Thus, combined protein universal trees are highly congruent with SSU rRNA trees in their strong support for the separate monophyly of domains as well as the early evolution of thermophilic Bacteria.
Rationale: Congenital heart disease (CHD) is among the most common birth defects. Most cases are of unknown etiology. Objective: To determine the contribution of de novo copy number variants (CNVs) to the etiology of sporadic CHD. Methods and Results: We studied 538 CHD trios using genome-wide dense single nucleotide polymorphism (SNP) arrays and/or whole exome sequencing (WES). Results were experimentally validated using digital droplet PCR. We compared validated CNVs in CHD cases to CNVs in 1,301 healthy control trios. The two complementary high-resolution technologies identified 63 validated de novo CNVs in 51 CHD cases. A significant increase in CNV burden was observed when comparing CHD trios with healthy trios, using either SNP array (p = 7 × 10⁻⁵, odds ratio (OR) = 4.6) or WES data (p = 6 × 10⁻⁴, OR = 3.5), and remained after removing 16% of de novo CNV loci previously reported as pathogenic (p = 0.02, OR = 2.7). We observed recurrent de novo CNVs on 15q11.2 encompassing CYFIP1, NIPA1, and NIPA2 and single de novo CNVs encompassing DUSP1, JUN, JUP, MED15, MED9, PTPRE, SREBF1, TOP2A, and ZEB2, genes that interact with established CHD proteins NKX2-5 and GATA4. Integrating de novo variants in WES and CNV data suggests that ETS1 is the pathogenic gene altered by 11q24.2-q25 deletions in Jacobsen syndrome and that CTBP2 is the pathogenic gene in 10q sub-telomeric deletions. Conclusions: We demonstrate a significantly increased frequency of rare de novo CNVs in CHD patients compared with healthy controls and suggest several novel genetic loci for CHD.
Horizontal gene transfer (HGT) has long been recognized as a principal force in the evolution of genomes. Genome sequences of Archaea and Bacteria have revealed the existence of genes whose similarity to loci in distantly related organisms is explained most parsimoniously by HGT events. In most multicellular organisms, such genetic fixation can occur only in the germ line. Therefore, it is notable that the publication of the human genome reports 113 incidents of direct HGT between bacteria and vertebrates, without any apparent occurrence in evolutionary intermediates, that is, non-vertebrate eukaryotes. Phylogenetic analysis arguably provides the most objective approach for determining the occurrence and directionality of HGT. Here we report a phylogenetic analysis of 28 proposed HGT genes, whose presence in the human genome had been confirmed by polymerase chain reaction (PCR). The results indicate that most putative HGT genes are present in more anciently derived eukaryotes (many such sequences available in non-vertebrate EST databases) and can be explained in terms of descent through common ancestry. They are, therefore, unlikely to be examples of direct HGT from bacteria to vertebrates.
We have developed a method for the prediction of an amino acid sequence that is compatible with a three-dimensional backbone structure. Using only a backbone structure of a protein as input, the algorithm is capable of designing sequences that closely resemble natural members of the protein family to which the template structure belongs. In general, the predicted sequences are shown to have multiple sequence profile scores that are dramatically higher than those of random sequences, and sometimes better than some of the natural sequences that make up the superfamily. As anticipated, highly conserved but poorly predicted residues are often those that contribute to the functional rather than structural properties of the protein. Overall, our analysis suggests that statistical profile scores of designed sequences are a novel and valuable figure of merit for assessing and improving protein design algorithms.
Keywords: genetic algorithm; homeodomain; multiple sequence alignment; Pfam; profile; protein design; RRM; SH3
There has been considerable recent success in the development of computational methods for the design of protein sequences, at various degrees of sophistication. Several groups have presented results in which computer algorithms were used to design novel