Screening amino acid sequence space via experiments to discover peptides that self-assemble into amyloid fibrils is challenging. We have developed a computational peptide assembly design (PepAD) algorithm that enables the discovery of amyloid-forming peptides. Discontinuous molecular dynamics (DMD) simulation with the PRIME20 force field, combined with the FoldAmyloid tool, is used to examine the fibrilization kinetics of PepAD-generated peptides. PepAD screening of ∼10,000 7-mer peptides resulted in twelve top-scoring peptides with two distinct hydration properties. Our studies revealed that eight of the twelve in-silico discovered peptides spontaneously form amyloid fibrils in the DMD simulations and that all eight have at least five residues that the FoldAmyloid tool classifies as being aggregation-prone. Based on these observations, we re-examined the sequence pool returned by PepAD and extracted five sequence patterns as well as associated sequence signatures for the 7-mer amyloid-forming peptides. Experimental results from Fourier transform infrared spectroscopy (FTIR), thioflavin T (ThT) fluorescence, circular dichroism (CD), and transmission electron microscopy (TEM) indicate that all the peptides predicted in silico to assemble do form antiparallel β-sheet nanofibers in a concentration-dependent manner. This is the first attempt to use a computational approach to search for amyloid-forming peptides based on customized settings. Our efforts facilitate the identification of β-sheet-based self-assembling peptides, and contribute insights towards answering a fundamental scientific question: “What does it take, sequence-wise, for a peptide to self-assemble?”
Gene duplication and alternative splicing are important sources of proteomic diversity. Despite research indicating that gene duplication and alternative splicing are negatively correlated, the evolutionary relationship between the two remains unclear. One manner in which alternative splicing and gene duplication may be related is through the process of subfunctionalization, in which an alternatively spliced gene upon duplication divides distinct splice isoforms among the newly generated daughter genes, in this way reducing the number of alternatively spliced transcripts duplicate genes produce. Previously, it has been shown that splice form subfunctionalization will result in duplicate pairs with divergent exon structure when distinct isoforms become fixed in each paralog. However, the effects of exon structure divergence between paralogs have never before been studied on a genome-wide scale. Here, using genomic data from human, mouse, and zebrafish, we demonstrate that gene duplication followed by exon structure divergence between paralogs results in a significant reduction in levels of alternative splicing. In addition, by comparing the exon structure of zebrafish duplicates to the co-orthologous human gene, we have demonstrated that a considerable fraction of exon divergent duplicates maintain the structural signature of splice form subfunctionalization. Furthermore, we find that paralogs with divergent exon structure demonstrate reduced breadth of expression in a variety of tissues when compared to paralogs with identical exon structures and singletons. Taken together, our results are consistent with subfunctionalization partitioning alternatively spliced isoforms among duplicate genes and as such highlight the relationship between gene duplication and alternative splicing.
Background: The common ancestor of salmonid fishes, including rainbow trout (Oncorhynchus mykiss), experienced a whole genome duplication between 20 and 100 million years ago, and many of the duplicated genes have been retained in the trout genome. This retention complicates efforts to detect allelic variation in salmonid fishes. Specifically, single nucleotide polymorphism (SNP) detection is problematic because nucleotide variation can be found between the duplicate copies (paralogs) of a gene as well as between alleles.
Results: We present a method of differentiating between allelic and paralogous (gene copy) sequence variants, allowing identification of SNPs in organisms with multiple copies of a gene or set of genes. The basic strategy is to: 1) identify windows of unique cDNA sequences with homology to each other, 2) compare these unique cDNAs if they are not shared between individuals (i.e. the cDNA is homozygous in one individual and homozygous for another cDNA in the other individual), and 3) give a “SNP score” value between zero and one to each candidate sequence variant based on six criteria. Using this strategy we were able to detect about seven thousand potential SNPs from the transcriptomes of several clonal lines of rainbow trout. When directly compared to a pre-validated set of SNPs in polyploid wheat, we were also able to estimate the false-positive rate of this strategy as 0 to 28% depending on parameters used.
Conclusions: This strategy has an advantage over traditional techniques of SNP identification because another dimension of sequencing information is utilized. This method is especially well suited for identifying SNPs in polyploids, both outbred and inbred, but would tend to be conservative for diploid organisms.
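The idea behind step 2 of the strategy above (a variant that is fixed within each individual but differs between individuals is a candidate allele, whereas variation inside a single homozygous individual more likely comes from paralogs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it works on a single aligned position rather than on windows of cDNA sequence, it does not compute the "SNP score" (the six criteria are not listed in the abstract), and the function name `classify_site` and the thresholds `min_depth` and `min_major_fraction` are hypothetical choices made only for illustration.

```python
from collections import Counter


def classify_site(reads_by_individual, min_depth=4, min_major_fraction=0.9):
    """Classify one aligned position across several homozygous (clonal) individuals.

    reads_by_individual maps an individual's name to a string of the bases
    observed at this position in that individual's reads.
    The thresholds are illustrative assumptions, not values from the source.
    """
    fixed_bases = {}
    for name, bases in reads_by_individual.items():
        if len(bases) < min_depth:
            # Too few reads to decide whether this individual is fixed.
            return "insufficient_data"
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) < min_major_fraction:
            # Mixed bases inside a homozygous individual: more likely
            # a difference between gene copies than between alleles.
            return "paralogous_candidate"
        fixed_bases[name] = base
    if len(set(fixed_bases.values())) > 1:
        # Every individual is fixed, but they are fixed for different bases.
        return "allelic_candidate"
    return "monomorphic"


if __name__ == "__main__":
    print(classify_site({"lineA": "AAAAAA", "lineB": "GGGGGG"}))
    print(classify_site({"lineA": "AAAGGG", "lineB": "AAGGGA"}))
```

A full pipeline along the lines described would additionally combine several lines of evidence into a score between zero and one for each candidate; that part is left out here because the abstract does not specify the criteria.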