Each human carries a large number of deleterious mutations. Together, these mutations make a significant contribution to human disease. Identification of deleterious mutations within individual genome sequences could substantially impact an individual's health through personalized prevention and treatment of disease. Yet, distinguishing deleterious mutations from the massive number of nonfunctional variants that occur within a single genome is a considerable challenge. Using a comparative genomics data set of 32 vertebrate species we show that a likelihood ratio test (LRT) can accurately identify a subset of deleterious mutations that disrupt highly conserved amino acids within protein-coding sequences, which are likely to be unconditionally deleterious. The LRT is also able to identify known human disease alleles and performs as well as two commonly used heuristic methods, SIFT and PolyPhen. Application of the LRT to three human genomes reveals 796-837 deleterious mutations per individual, ~40% of which are estimated to be at <5% allele frequency. However, the overlap between predictions made by the LRT, SIFT, and PolyPhen is low; 76% of predictions are unique to one of the three methods, and only 5% of predictions are shared across all three methods. Our results indicate that only a small subset of deleterious mutations can be reliably identified, but that this subset provides the raw material for personalized medicine.
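As a rough illustration of how a likelihood ratio test turns two model fits into a significance value, the sketch below compares a null and an alternative log-likelihood. It assumes the two models are nested and differ by a single free parameter, so that the statistic is approximately chi-square distributed with one degree of freedom. The function name and the log-likelihood values are hypothetical, and this is not necessarily the exact test used in the study.

```python
import math
from typing import Tuple


def likelihood_ratio_test(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Return (statistic, p-value) for two nested models.

    Assumption: the alternative model has exactly one more free parameter
    than the null, so the statistic is compared to a chi-square
    distribution with 1 degree of freedom.
    """
    # The LRT statistic is twice the gain in log-likelihood; clamp at zero
    # in case numerical noise makes the alternative fit slightly worse.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for one codon: a neutral (null) model and
# a model that allows selective constraint (alternative).
stat, p = likelihood_ratio_test(lnl_null=-100.0, lnl_alt=-98.0)
print(f"LRT statistic = {stat:.2f}, p = {p:.4f}")
```

A small p-value here would favor the constrained model for that codon; in practice a threshold would need to account for the very large number of sites tested.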
Saccharomyces cerevisiae is predominantly found in association with human activities, particularly the production of alcoholic beverages. S. paradoxus, the closest known relative of S. cerevisiae, is commonly found on exudates and bark of deciduous trees and in associated soils. This has led to the idea that S. cerevisiae is a domesticated species, specialized for the fermentation of alcoholic beverages, and isolates of S. cerevisiae from other sources simply represent migrants from these fermentations. We have surveyed DNA sequence diversity at five loci in 81 strains of S. cerevisiae that were isolated from a variety of human and natural fermentations as well as sources unrelated to alcoholic beverage production, such as tree exudates and immunocompromised patients. Diversity within vineyard strains and within saké strains is low, consistent with their status as domesticated stocks. The oldest lineages and the majority of variation are found in strains from sources unrelated to wine production. We propose a model whereby two specialized breeds of S. cerevisiae have been created, one for the production of grape wine and one for the production of saké wine. We estimate that these two breeds have remained isolated from one another for thousands of years, consistent with the earliest archeological evidence for winemaking. We conclude that although there are clearly strains of S. cerevisiae specialized for the production of alcoholic beverages, these have been derived from natural populations unassociated with alcoholic beverage production, rather than the opposite.
The cohesin complex is a chromosomal component required for sister chromatid cohesion that is conserved from yeast to man. The similarly conserved Nipped-B protein is needed for cohesin to bind to chromosomes. In higher organisms, Nipped-B and cohesin regulate gene expression and
Although positive selection has been detected in many genes, its overall contribution to protein evolution is debatable. If the bulk of molecular evolution is neutral, then the ratio of amino-acid (A) to synonymous (S) polymorphism should, on average, equal that of divergence. A comparison of the A/S ratio of polymorphism in Drosophila melanogaster with that of divergence from Drosophila simulans shows that the A/S ratio of divergence is twice as high, a difference that is often attributed to positive selection. But an increase in selective constraint owing to an increase in effective population size could also explain this observation, and, if so, all genes should be affected similarly. Here we show that the difference between polymorphism and divergence is limited to only a fraction of the genes, which are also evolving more rapidly, and this implies that positive selection is responsible. A higher A/S ratio of divergence than of polymorphism is also observed in other species, which suggests a rate of adaptive evolution that is far higher than permitted by the neutral theory of molecular evolution.
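The comparison of A/S ratios described above can be made concrete with a small calculation. The sketch below computes the two ratios and a commonly used estimate of the fraction of amino-acid substitutions driven by positive selection, 1 - (A/S of polymorphism) / (A/S of divergence). The counts are hypothetical, chosen only so that the divergence ratio is twice the polymorphism ratio as in the abstract, and the function names are illustrative rather than taken from the study.

```python
def a_to_s_ratio(amino_acid: int, synonymous: int) -> float:
    """Ratio of amino-acid (A) to synonymous (S) changes."""
    if synonymous <= 0:
        raise ValueError("synonymous count must be positive")
    return amino_acid / synonymous


def adaptive_fraction(pn: int, ps: int, dn: int, ds: int) -> float:
    """Estimate the fraction of amino-acid divergence due to adaptation.

    pn, ps: amino-acid and synonymous polymorphisms within a species.
    dn, ds: amino-acid and synonymous fixed differences between species.
    Under strict neutrality the two A/S ratios are equal and the estimate
    is 0; an excess of amino-acid divergence gives a positive value.
    """
    if dn <= 0:
        raise ValueError("amino-acid divergence count must be positive")
    return 1.0 - a_to_s_ratio(pn, ps) / a_to_s_ratio(dn, ds)


# Hypothetical counts: A/S of polymorphism = 0.5, A/S of divergence = 1.0.
alpha = adaptive_fraction(pn=20, ps=40, dn=50, ds=50)
print(f"estimated adaptive fraction = {alpha:.2f}")
```

An A/S ratio of divergence that is twice that of polymorphism corresponds to an estimate of one half under this formula, provided that a change in selective constraint is not responsible for the difference.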
The abundance and identity of functional variation segregating in natural populations are paramount to dissecting the molecular basis of quantitative traits as well as human genetic diseases. Genome sequencing of multiple organisms of the same species provides an efficient means of cataloging rearrangements, insertion or deletion polymorphisms (InDels), and single-nucleotide polymorphisms (SNPs). While inbreeding depression and heterosis imply that a substantial amount of polymorphism is deleterious, distinguishing deleterious from neutral polymorphism remains a significant challenge. To identify deleterious and neutral DNA sequence variation within Saccharomyces cerevisiae, we sequenced the genomes of a vineyard strain and an oak tree strain and compared them to a reference genome. Among these three strains, 6% of the genome is variable, mostly attributable to variation in genome content that results from large InDels. Out of the 88,000 polymorphisms identified, 93% are SNPs and a small but significant fraction can be attributed to recent interspecific introgression and ectopic gene conversion. In comparison to the reference genome, there is substantial evidence for functional variation in gene content and structure that results from large InDels, frame-shifts, and polymorphic start and stop codons. Comparison of polymorphism to divergence reveals scant evidence for positive selection but an abundance of evidence for deleterious SNPs. We estimate that 12% of coding and 7% of noncoding SNPs are deleterious. Based on divergence among 11 yeast species, we identified 1,666 nonsynonymous SNPs that disrupt conserved amino acids and 1,863 noncoding SNPs that disrupt conserved noncoding motifs. The deleterious coding SNPs include those known to affect quantitative traits, and a subset of the deleterious noncoding SNPs occurs in the promoters of genes that show allele-specific expression, implying that some cis-regulatory SNPs are deleterious.
Our results show that the genome sequences of both closely and distantly related species provide a means of identifying deleterious polymorphisms that disrupt functionally conserved coding and noncoding sequences.
Humans have had a significant impact on the distribution and abundance of Saccharomyces cerevisiae through its widespread use in beer, bread and wine production. Yet, similar to other Saccharomyces species, S. cerevisiae has also been isolated from habitats unrelated to fermentations. Strains of S. cerevisiae isolated from grapes, wine must and vineyards worldwide are genetically differentiated from strains isolated from oak-tree bark, exudate and associated soil in North America. However, the causes and consequences of this differentiation have not yet been resolved. Historical differentiation of these two groups may have been influenced by geographic, ecological or human-associated barriers to gene flow. Here, we make use of the relatively recent establishment of vineyards across North America to identify and characterize any active barriers to gene flow between these two groups. We examined S. cerevisiae strains isolated from grapes and oak-trees within three North American vineyards and compared them to those isolated from oak-trees outside of vineyards. Within vineyards we found evidence of migration between grapes and oak-trees and potential gene flow between the divergent oak-tree and vineyard groups. Yet, we found no vineyard genotypes on oak-trees outside of vineyards. In contrast, S. paradoxus isolated from the same sources showed population structure characterized by isolation by distance. The apparent absence of ecological or genetic barriers between sympatric vineyard and oak-tree populations of S. cerevisiae implies that vineyards play an important role in the mixing between these two groups.
The budding yeast Saccharomyces cerevisiae is important for human food production and as a model organism for biological research. The genetic diversity contained in the global population of yeast strains represents a valuable resource for a number of fields, including genetics, bioengineering, and studies of evolution and population structure. Here, we apply a multiplexed, reduced genome sequencing strategy (restriction site-associated sequencing, or RAD-seq) to genotype a large collection of S. cerevisiae strains isolated from a wide range of geographical locations and environmental niches. The method permits the sequencing of the same 1% of all genomes, producing a multiple sequence alignment of 116,880 bases across 262 strains. We find diversity among these strains is principally organized by geography, with European, North American, Asian, and African/S. E. Asian populations defining the major axes of genetic variation. At a finer scale, small groups of strains from cacao, olives, and sake are defined by unique variants not present in other strains. One population, containing strains from a variety of fermentations, exhibits high levels of heterozygosity and a mixture of alleles from European and Asian populations, indicating an admixed origin for this group. We propose a model of geographic differentiation followed by human-associated admixture, primarily between European and Asian populations and more recently between European and North American populations. The large collection of genotyped yeast strains characterized here will provide a useful resource for the broad community of yeast researchers.
The genome sequences of multiple species have enabled functional inferences from comparative genomics. A primary objective is to infer biological functions from the conservation of homologous DNA sequences between species. A second, more difficult, objective is to understand what functional DNA sequences have changed over time and are responsible for species' phenotypic differences. The neutral theory of molecular evolution provides a theoretical framework in which both objectives can be explicitly tested. Development of statistical tests within this framework has provided insight into the evolutionary forces that constrain and in some cases change DNA sequences and the resulting patterns that emerge. In this article, we review recent work on how functional constraint and changes in protein function are inferred from protein polymorphism and divergence data. We relate these studies to our understanding of the neutral theory and adaptive evolution.