Structural variants (SVs) are a major source of genetic and phenotypic variation, but remain challenging to accurately type and are hence poorly characterized in most species. We present an approach for reliable SV discovery in non-model species using whole genome sequencing and report 15,483 high-confidence SVs in 492 Atlantic salmon (Salmo salar L.) sampled from a broad phylogeographic distribution. These SVs recover population genetic structure with high resolution, include an active DNA transposon, widely affect functional features, and overlap more duplicated genes retained from an ancestral salmonid autotetraploidization event than expected. Changes in SV allele frequency between wild and farmed fish indicate polygenic selection on behavioural traits during domestication, targeting brain-expressed synaptic networks linked to neurological disorders in humans. This study offers novel insights into the role of SVs in genome evolution and the genetic architecture of domestication traits, along with resources supporting reliable SV discovery in non-model species.
Background: Whole genome duplication (WGD) events have played a major role in eukaryotic genome evolution, but the consequence of these extreme events in adaptive genome evolution is still not well understood. To address this knowledge gap, we used a comparative phylogenetic model and transcriptomic data from seven species to infer selection on gene expression in duplicated genes (ohnologs) following the salmonid WGD 80–100 million years ago. Results: We find rare cases of tissue-specific expression evolution but pervasive expression evolution affecting many tissues, reflecting strong selection on maintenance of genome stability following genome doubling. Ohnolog expression levels have evolved mostly asymmetrically, by diverting one ohnolog copy down a path towards lower expression and possible pseudogenization. Loss of expression in one ohnolog is significantly associated with transposable element insertions in promoters and likely driven by selection on gene dosage including selection on stoichiometric balance. We also find symmetric expression shifts, and these are associated with genes under strong evolutionary constraints such as ribosome subunit genes. This possibly reflects selection operating to achieve a gene dose reduction while avoiding accumulation of “toxic mutations”. Mechanistically, ohnolog regulatory divergence is dictated by the number of bound transcription factors in promoters, with transposable elements being one likely source of novel binding sites driving tissue-specific gains in expression. Conclusions: Our results imply pervasive adaptive expression evolution following WGD to overcome the immediate challenges posed by genome doubling and to exploit the long-term genetic opportunities for novel phenotype evolution.
Main
Modern genetics remains primarily focused on single nucleotide polymorphism (SNP) analyses, with a growing recognition of the importance of larger structural variants (SVs) including inversions, insertions, deletions and copy number variations (CNVs) (defined here as variants ≥100 bp), among others [1]. SVs affect a larger proportion of bases in human genomes than SNPs [4], are not always reliably tagged by SNPs [5], more frequently have regulatory impacts [6], and have been shown to alter the structure, presence, number, dosage, and regulation of many genes [1]. Nonetheless, SVs remain challenging to accurately type using whole genome sequence data [2-3], limiting our understanding of their biological roles and exploitation as genetic markers.
Consequently, there is a need for reliable SV detection approaches to fully exploit the fast-accumulating genome sequencing datasets in both model and non-model species, allowing for more complete genetics investigations. Many tools exist for SV discovery using short-read sequencing data, but all suffer from high false discovery rates (10-89%) [2,3,7]. This poses a challenge for truly de novo SV detection in previously unstudied species lacking 'gold standard' reference SVs to help distinguish true from false calls. Most studies rely on combining an ensemble of signals from different SV detection methods, although this strategy does not reliably improve performance and can in some cases aggravate false discovery [3]. Researchers therefore often apply independent experimental [8-9] or visualization methods [10] to validate a subset of SV calls. Overall, there remains an unsatisfactory lack of consensus on how to validate the quality of de novo SV datasets in most species [3].

Salmonids have the highest combined economic, ecological and scientific importance among all fish lineages, and have consequently been subject to hundreds of genetics stu...
The enzymatic degradation of recalcitrant polysaccharides is accomplished by synergistic enzyme cocktails of glycoside hydrolases (GHs) and accessory enzymes. Many GHs are processive, meaning that they remain attached to the substrate between successive hydrolytic reactions. Chitinases are GHs that catalyze the hydrolysis of chitin (β-1,4-linked N-acetylglucosamine). A relationship between active site topology and processivity has previously been suggested, while recent computational efforts have suggested a link between the degree of processivity and ligand binding free energy. We have investigated these relationships by employing computational (molecular dynamics (MD)) and experimental (isothermal titration calorimetry (ITC)) approaches to gain insight into the thermodynamics of substrate binding to Serratia marcescens chitinases ChiA, ChiB, and ChiC. We show that increased processive ability indeed corresponds to more favorable binding free energy and that this is likely a general feature of GHs. Moreover, ligand binding in ChiB is entropically driven; in ChiC it is enthalpically driven, and the enthalpic and entropic contributions to ligand binding in ChiA are equal. Furthermore, water is shown to be especially important in ChiA binding. This work provides new insight into oligosaccharide binding, bringing us one step closer to understanding how GHs efficiently degrade recalcitrant polysaccharides.
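The distinction between entropically and enthalpically driven binding in the abstract above rests on the standard relations ΔG = −RT ln K_a and ΔG = ΔH − TΔS, where ITC supplies K_a and ΔH. The following is a minimal sketch of that bookkeeping, not the authors' analysis: the function names, the 0.5 kcal/mol tolerance for calling the two terms "balanced", and all input values are hypothetical and are not taken from the paper.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def binding_thermodynamics(k_a, delta_h, temperature=298.15):
    """Return (dG, -T*dS) in kcal/mol.

    k_a      : association constant in 1/M (hypothetical input)
    delta_h  : binding enthalpy in kcal/mol (hypothetical input)
    """
    delta_g = -R_KCAL * temperature * math.log(k_a)  # dG = -RT ln(Ka)
    minus_t_delta_s = delta_g - delta_h              # from dG = dH - T*dS
    return delta_g, minus_t_delta_s


def driving_force(delta_h, minus_t_delta_s, tol=0.5):
    """Label which term contributes more favourably to binding.

    tol is an assumed tolerance (kcal/mol) for calling the terms balanced.
    """
    if abs(delta_h - minus_t_delta_s) <= tol:
        return "balanced"
    # The more negative term is the larger favourable contribution.
    return "enthalpic" if delta_h < minus_t_delta_s else "entropic"


if __name__ == "__main__":
    # Illustrative numbers only, not measured values.
    dg, mts = binding_thermodynamics(k_a=1.0e6, delta_h=-2.0)
    print(f"dG = {dg:.2f} kcal/mol, -TdS = {mts:.2f} kcal/mol, "
          f"driving force: {driving_force(-2.0, mts)}")
```

With these illustrative inputs the entropic term dominates, which is the pattern the abstract reports for ChiB; equal terms would correspond to the ChiA case.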
Microorganisms use a host of enzymes, including processive glycoside hydrolases, to deconstruct recalcitrant polysaccharides to sugars. Processive glycoside hydrolases closely associate with polymer chains and repeatedly cleave glycosidic linkages without dissociating from the crystalline surface after each hydrolytic step; they are typically the most abundant enzymes in both natural secretomes and industrial cocktails by virtue of their significant hydrolytic potential. The ubiquity of aromatic residues lining the enzyme catalytic tunnels and clefts is a notable feature of processive glycoside hydrolases. We hypothesized that these aromatic residues have uniquely defined roles, such as substrate chain acquisition and binding in the catalytic tunnel, that are defined by their local environment and position relative to the substrate and the catalytic center. Here, we investigated this hypothesis with variants of Serratia marcescens family 18 processive chitinases ChiA and ChiB. We applied molecular simulation and free energy calculations to assess active site dynamics and ligand binding free energies. Isothermal titration calorimetry provided further insight into enthalpic and entropic contributions to ligand binding free energy. Thus, the roles of six aromatic residues, Trp-167, Trp-275, and Phe-396 in ChiA, and Trp-97, Trp-220, and Phe-190 in ChiB, have been examined. We observed that point mutation of the tryptophan residues to alanine results in unfavorable changes in the free energy of binding relative to wild-type. The most drastic effects were observed for residues positioned at the "entrances" of the deep substrate-binding clefts and known to be important for processivity. Interestingly, phenylalanine mutations in ChiA and ChiB had little to no effect on chito-oligomer binding, in accordance with the limited effects of their removal on chitinase functionality.