Do the frequencies of disease mutations in human populations reflect a simple balance between mutation and purifying selection? What other factors shape the prevalence of disease mutations? To begin to answer these questions, we focused on one of the simplest cases: recessive mutations that alone cause lethal diseases or complete sterility. To this end, we generated a hand-curated set of 417 Mendelian mutations in 32 genes reported to cause a recessive, lethal Mendelian disease. We then considered analytic models of mutation-selection balance in infinite and finite populations of constant sizes and simulations of purifying selection in a more realistic demographic setting, and tested how well these models fit allele frequencies estimated from 33,370 individuals of European ancestry. In doing so, we distinguished between CpG transitions, which occur at a substantially elevated rate, and three other mutation types. Intriguingly, the observed frequency for CpG transitions is slightly higher than expectation but close, whereas the frequencies observed for the three other mutation types are an order of magnitude higher than expected, with a bigger deviation from expectation seen for less mutable types. This discrepancy is even larger when subtle fitness effects in heterozygotes or lethal compound heterozygotes are taken into account. In principle, higher than expected frequencies of disease mutations could be due to widespread errors in reporting causal variants, compensation by other mutations, or balancing selection. It is unclear why these factors would have a greater impact on disease mutations that occur at lower rates, however. 
We argue instead that the unexpectedly high frequency of disease mutations and the relationship to the mutation rate likely reflect an ascertainment bias: of all the mutations that cause recessive lethal diseases, those that by chance have reached higher frequencies are more likely to have been identified and thus to have been included in this study. Beyond the specific application, this study highlights the parameters likely to be important in shaping the frequencies of Mendelian disease alleles.
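The infinite-population model mentioned above can be illustrated with the textbook deterministic result for a fully recessive allele: at mutation-selection balance the equilibrium frequency is approximately q = sqrt(u / s), which reduces to sqrt(u) for a lethal allele (s = 1). The sketch below is illustrative and is not the authors' analysis. The function names and the two mutation rates are assumptions chosen to show how a tenfold difference in mutation rate changes the expectation; they are not values taken from the study.

```python
import math


def recessive_equilibrium_frequency(u: float, s: float = 1.0) -> float:
    """Deterministic equilibrium frequency q = sqrt(u / s) for a fully
    recessive deleterious allele in an infinite, randomly mating population.

    u: mutation rate per generation toward the deleterious allele
    s: selection coefficient against homozygotes (1.0 means lethal)
    """
    if u < 0 or s <= 0:
        raise ValueError("u must be non-negative and s must be positive")
    return math.sqrt(u / s)


# ASSUMED, illustrative per-site mutation rates; not taken from the study.
ASSUMED_RATES = {
    "CpG transition (assumed)": 1.0e-7,
    "less mutable type (assumed)": 1.0e-8,
}


def expected_frequencies(rates: dict) -> dict:
    """Map each mutation type to its expected equilibrium frequency."""
    return {name: recessive_equilibrium_frequency(u) for name, u in rates.items()}


if __name__ == "__main__":
    for name, q in expected_frequencies(ASSUMED_RATES).items():
        print(f"{name}: expected q = {q:.2e}")
```

Because q scales with the square root of u, a tenfold lower mutation rate lowers the expected frequency only by a factor of about 3.2 under this model. Finite population size and fitness effects in heterozygotes both reduce the expectation further, which is why the abstract describes the discrepancy as larger once those effects are considered.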
Hypermutable strains of Drosophila simulans have been studied for 20 years. Several mutants were isolated and characterized, some of which had phenotypes associated with alterations in development; for example, showing ectopic legs with eyes being expressed in place of antennae. The causal agent of this hypermutability is a non-autonomous hobo-related sequence (hoboVA). Around 100 mobilizable copies of this element are present in the D. simulans genome, and these are likely mobilized by the autonomous and canonical hobo element. We have shown that hoboVA has transcription factor binding sites for the developmental genes hunchback and even-skipped, and that this transposon is expressed in embryos, following the patterns of these genes. We suggest that hobo and hobo-related elements can provide material for the emergence of new regulatory networks.
The mobilome, the portion of the genome composed of transposable elements (TEs), of Anopheles darlingi was described together with the genome of this species. Here, this mobilome was revised using similarity and de novo search approaches. A total of 5.6% of the A. darlingi genome is derived from TEs. Class I gypsy and copia were the most abundant superfamilies, corresponding to 22.36% of the mobilome. Non-LTR elements of the R1 and Jockey superfamilies account for 11% of the TEs. Among Class II TEs, the mariner superfamily is the most abundant (16.01%). Approximately 87% of the A. darlingi mobilome consists of short, truncated and/or degenerated copies of TEs. Only three retrotransposons, two belonging to the gypsy superfamily and one to the copia superfamily, are putatively active elements. Only one Class II element, belonging to the mariner superfamily, is putatively active, having 12 copies in the genome. The TE landscape of A. darlingi is formed primarily by degenerated elements and is therefore somewhat stable. Future applications of TE-based vectors for genetic transformation of A. darlingi should take into consideration mariner and piggyBac transposons, because full-length and putatively active copies of these elements are present in its genome.
Do the frequencies of disease mutations in human populations reflect a simple balance between mutation and purifying selection? What other factors shape the prevalence of disease mutations? To begin to answer these questions, we focused on one of the simplest cases: recessive mutations that alone cause lethal diseases or complete sterility. To this end, we generated a hand-curated set of 417 Mendelian mutations in 32 genes, reported to cause a recessive, lethal Mendelian disease. We then considered analytic models of mutation-selection balance in infinite and finite populations of constant sizes and simulations of purifying selection in a more realistic demographic setting, and tested how well these models fit allele frequencies estimated from 33,370 individuals of European ancestry. In doing so, we distinguished between CpG transitions, which occur at a substantially elevated rate, and three other mutation types. The observed frequency for CpG transitions is slightly higher than expectation but close, whereas the frequencies observed for the three other mutation types are an order of magnitude higher than expected. This discrepancy is even larger when subtle fitness effects in heterozygotes or lethal compound heterozygotes are taken into account. In principle, higher than expected frequencies of disease mutations could be due to widespread errors in reporting causal variants, compensation by other mutations, or balancing selection. It is unclear why these factors would have a greater impact on variants with lower mutation rates, however. We argue instead that the unexpectedly high frequency of disease mutations and the relationship to the mutation rate likely reflect an ascertainment bias: of all the mutations that cause recessive lethal diseases, those that by chance have reached higher
Macrocystis pyrifera (giant kelp), a haplodiplontic brown macroalga that alternates between a macroscopic diploid (sporophyte) and a microscopic haploid (gametophyte) phase, provides an ideal system to investigate how ploidy background affects the evolutionary history of a gene. In M. pyrifera, the same genome is subjected to different selective pressures and environments as it alternates between haploid and diploid life stages. We assembled M. pyrifera gene models using available expression data and validated 8,292 gene models using the model alga Ectocarpus siliculosus. Differential expression analysis identified gene models expressed in either or both the haploid and diploid life stages, while functional annotation identified processes enriched in each stage. Genes expressed preferentially or exclusively in the gametophyte stage were found to have higher nucleotide diversity (π = 2.3 × 10⁻³ and 2.8 × 10⁻³, respectively) than those for sporophytes (π = 1.1 × 10⁻³ and 1 × 10⁻³, respectively). While gametophyte-biased genes show faster sequence evolution, their sequence evolution exhibits fewer signatures of adaptation than that of sporophyte-biased genes. Our findings contrast with the standing masking hypothesis, which predicts higher standing genetic variation at the sporophyte stage, and support the strength of expression theory, which posits that genes expressed more strongly are expected to evolve more slowly. We argue that the sporophyte stage undergoes more stringent selection than the gametophyte stage, which carries a heavy genetic load associated with broadcast spawning. Furthermore, using whole-genome sequencing, we confirm the strong population structure in wild M. pyrifera populations previously established using microsatellite markers, and estimate population genetic parameters, such as pairwise genetic diversity and Tajima’s D, that are important for conservation and domestication of M. pyrifera.
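The nucleotide diversity statistic π reported above is, by its standard definition, the average number of pairwise differences per site among sampled sequences. The following is a minimal sketch of that definition, assuming a small set of aligned, equal-length sequences with no gaps or missing data. It is not the pipeline used in the study, and the function name and toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list) -> float:
    """Average pairwise differences per site (pi) for aligned sequences.

    Assumes all sequences have equal length and contain no gaps or
    missing data; real pipelines need to handle both.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and equally long")

    # Sum the mismatches over every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "AATT"]  # hypothetical toy data
    print(nucleotide_diversity(toy_alignment))
```

Values on the order of 10⁻³, as reported for the kelp gene sets, correspond to roughly one to three differences per thousand sites between a random pair of sequences.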
The use of Drosophila as a scientific model is well established, but the use of cockroaches as experimental organisms has been increasing, mainly in toxicology research. Nauphoeta cinerea is one of the species that has been studied, and among its advantages is its easy laboratory maintenance. However, only a limited amount of genetic data about N. cinerea is available, impeding gene identification and expression analyses, genetic manipulation, and a deeper understanding of its functional biology. Here we describe the N. cinerea fat body and head transcriptome, in order to provide a database of genetic sequences to better understand the metabolic role of these tissues, and describe detoxification and stress response genes. After removing low-quality sequences, we obtained 62,121 transcripts, of which more than 50% had a length of 604 bp. The assembled sequences were annotated according to their gene ontology (GO). We identified 367 genes related to stress and detoxification; among these, the most frequent were P450 genes. The results presented here are the first large-scale sequencing of N. cinerea and will facilitate the genetic understanding of the species' biochemical processes in future work.