Recent advances in molecular genetic techniques will make dense marker maps available and genotyping many individuals for these markers feasible. Here we attempted to estimate the effects of ∼50,000 marker haplotypes simultaneously from a limited number of phenotypic records. A genome of 1000 cM was simulated with a marker spacing of 1 cM. The markers surrounding every 1-cM region were combined into marker haplotypes. Due to finite population size (Ne = 100), the marker haplotypes were in linkage disequilibrium with the QTL located between the markers. Using least squares, all haplotype effects could not be estimated simultaneously. When only the biggest effects were included, they were overestimated and the accuracy of predicting genetic values of the offspring of the recorded animals was only 0.32. Best linear unbiased prediction of haplotype effects assumed equal variances associated to each 1-cM chromosomal segment, which yielded an accuracy of 0.73, although this assumption was far from true. Bayesian methods that assumed a prior distribution of the variance associated with each chromosome segment increased this accuracy to 0.85, even when the prior was not correct. It was concluded that selection on genetic values predicted from markers could substantially increase the rate of genetic gain in animals and plants, especially if combined with reproductive techniques to shorten the generation interval.
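The contrast drawn above between least squares and best linear unbiased prediction can be sketched in a few lines. With more marker effects than records, the least-squares normal equations are singular, whereas assuming the same variance for every effect adds a constant to the diagonal and makes them solvable (this is ridge regression). This is a minimal sketch and not the study's implementation: the function name `ridge_blup`, the shrinkage parameter `lam` (taken here as the ratio of residual variance to per-marker variance), the 0/1/2 genotype coding, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def ridge_blup(X, y, lam):
    """Estimate marker effects b by solving (X'X + lam*I) b = X'y on centred data.

    lam is an assumed shrinkage parameter: residual variance divided by the
    (equal) variance assigned to each marker effect.
    """
    Xc = X - X.mean(axis=0)          # centre genotype columns
    yc = y - y.mean()                # centre phenotypes
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

# Hypothetical toy data: 50 records, 200 markers, 5 markers with real effects.
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.integers(0, 3, size=(n, p)).astype(float)   # genotypes coded 0/1/2
true_b = np.zeros(p)
true_b[:5] = 1.0
y = X @ true_b + rng.normal(size=n)

b_hat = ridge_blup(X, y, lam=10.0)   # one estimate per marker, all shrunk toward zero
```

Because `lam` is the same for every marker, large and small effects are shrunk by the same rule, which is the equal-variance assumption the abstract describes as far from true.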
A method was derived that maximizes the genetic level of selected animals while constraining their average coancestry to a predefined value. The average coancestry of the selected parents equals the inbreeding level in the next generation, so that rates of inbreeding were controlled. When this method was applied for several generations of selection, stable rates of genetic gain were attained, which indicates that the method could control the short- and long-term effects of selection on inbreeding. At equal rates of inbreeding, genetic gains were 21 to 60% greater than that with selection for BLUP-EBV, because of increased selection differentials. The difference was larger when the desirable rate of inbreeding was smallest. Selection with a constraint on inbreeding required only EBV of, and relationships between, the selection candidates and is therefore easy to apply in practice. The optimal solution is expressed in genetic contributions of selection candidates to the next generation, which is equivalent to numbers of offspring per candidate. These optimal numbers of offspring may be difficult to attain because of female reproductive limitations. The optimal method could be adapted to situations with additional reproductive constraints. The method can also be used to constrain the variance of response by restricting the average prediction error variance of the selected animals.
Summary: The inflammatory bowel diseases (IBD) are chronic gastrointestinal inflammatory disorders that affect millions worldwide. Genome-wide association studies have identified 200 IBD-associated loci, but few have been conclusively resolved to specific functional variants. Here we report fine-mapping of 94 IBD loci using high-density genotyping in 67,852 individuals. We pinpointed 18 associations to a single causal variant with >95% certainty, and an additional 27 associations to a single variant with >50% certainty. These 45 variants are significantly enriched for protein-coding changes (n=13), direct disruption of transcription factor binding sites (n=3) and tissue-specific epigenetic marks (n=10), with the latter category showing enrichment in specific immune cells among associations stronger in Crohn's disease (CD) and in gut mucosa among associations stronger in ulcerative colitis (UC). The results of this study suggest that high-resolution fine-mapping in large samples can convert many GWAS discoveries into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms.
Genomic selection uses total breeding values for juvenile animals, predicted from a large number of estimated marker haplotype effects across the whole genome. In this study the accuracy of predicting breeding values is compared for four different models including a large number of markers, at different marker densities for traits with heritabilities of 50 and 10%. The models estimated the effect of (1) each single-marker allele [single-nucleotide polymorphism (SNP)1], (2) haplotypes constructed from two adjacent marker alleles (SNP2), and (3) haplotypes constructed from 2 or 10 markers, including the covariance between haplotypes by combining linkage disequilibrium and linkage analysis (HAP_IBD2 and HAP_IBD10). Between 119 and 2343 polymorphic SNPs were simulated on a 3-Morgan genome. For the trait with a heritability of 10%, the differences between models were small and none of them yielded the highest accuracies across all marker densities. For the trait with a heritability of 50%, the HAP_IBD10 model yielded the highest accuracies of estimated total breeding values for juvenile and phenotyped animals at all marker densities. It was concluded that genomic selection is considerably more accurate than traditional selection, especially for a low-heritability trait.
The availability of many thousands of single-nucleotide polymorphisms (SNPs) spread across the genome for different livestock species opens up possibilities to include genomewide marker information in prediction of total breeding values, to perform genomic selection. Compared to traditional breeding practice, including genomic information yields a considerable increase in selection responses for juvenile animals that do not have phenotypic records and potentially can reduce the costs of a breeding program up to 90% (Schaeffer 2006).
Genomic selection as described by predicts total breeding values on the basis of a large number of marker haplotypes across the entire genome.
The underlying assumption of genomic selection is that haplotypes at some loci are in linkage disequilibrium (LD) with QTL alleles that affect the traits that are subject to selection. Different ways of deriving haplotypes of combinations of marker alleles, and the relationship between haplotypes at a locus, have been described. One method (SNP1) is to consider each different marker allele at a single locus to be a different haplotype, considering no relationships between different haplotypes, and thus breeding values are estimated directly for the marker alleles (Xu 2003). A second method is to construct haplotypes from two alleles at adjacent markers, assuming a zero relation between haplotypes at the same locus (SNP2). A third method is to construct haplotypes (HAP_IBD) using two or more surrounding marker alleles and derive identical-by-descent (IBD) probabilities between the different haplotypes at the same locus (Meuwissen and Goddard 2001).
The SNP1 model considers only two haplotypes at a locus and therefore may be suited for applications in, for instance, double-haploid ...
Whole-genome resequencing technology has improved rapidly during recent years and is expected to improve further such that the sequencing of an entire human genome sequence for $1000 is within reach. Our main aim here is to use whole-genome sequence data for the prediction of genetic values of individuals for complex traits and to explore the accuracy of such predictions. This is relevant for the fields of plant and animal breeding and, in human genetics, for the prediction of an individual's risk for complex diseases. Here, population history and genomic architectures were simulated under the Wright-Fisher population and infinite-sites mutation model, and prediction of genetic value was by the genomic selection approach, where a Bayesian nonlinear model was used to predict the effects of individual SNPs. The Bayesian model assumed a priori that only few SNPs are causative, i.e., have an effect different from zero. When using whole-genome sequence data, accuracies of prediction of genetic value were >40% increased relative to the use of dense ∼30K SNP chips. At equal high density, the inclusion of the causative mutations yielded an extra increase of accuracy of 2.5-3.7%. Predictions of genetic value remained accurate even when the training and evaluation data were 10 generations apart. Best linear unbiased prediction (BLUP) of SNP effects does not take full advantage of the genome sequence data, and nonlinear predictions, such as the Bayesian method used here, are needed to achieve maximum accuracy. On the basis of theoretical work, the results could be extended to more realistic genome and population sizes.
Genome resequencing technologies are currently developing at a very rapid rate, which we for simplicity call genome sequencing even though it is used on a species with a reference sequence.
The current generation sequencing technology is two orders of magnitude faster and more cost effective than the technologies used for the sequencing of the human genome (Shendure and Ji 2008; TenBosch and Grody 2008). Future technologies are expected to reduce cost by another 100-fold so that sequencing an entire human genome for $1000 is considered achievable in the near future (Mardis 2008). The question arises: How can we make best use of entire genome sequence data on many individuals? One use will be the ability to predict the genetic value of an individual for complex traits. In the fields of animal and plant breeding, this would be of great practical benefit because most important traits are complex, quantitative traits, i.e., traits that are affected by many genes and by the environment. In humans the promise of personalized medicine relies on the ability to predict an individual's genetic risk for complex, multifactorial diseases, such as Crohn's disease (Barrett et al. 2008), and the ability to predict response to alternative treatments. The first aim of this article is to explore the accuracy of this prediction using the full genome sequence of the individual.
The use of high-density SNP genotype data to...
Background: Animal breeding, i.e., the selective breeding for economically important traits, was traditionally based on phenotypic recordings. Best linear unbiased prediction (BLUP) combined individual records and those of relatives into estimates of breeding values (EBV). From 1990 onward, advances in molecular genetics held the promise that information at the DNA level would lead to more genetic improvement than using only phenotypic records. This resulted in research into marker-assisted selection (MAS), which consists of two steps: 1) detect and (fine) map genes underlying the traits of interest, i.e., so-called quantitative trait loci (QTL); 2) include the QTL information into the BLUP-EBV (Fernando and Grossman, 1989). The QTL mapping step (1) was successful in the sense that most mapping studies detected QTL. But the repeatability of the mapping studies was low, i.e., QTL positions moved/(dis)appeared from one study to the next. One reason for this is that the majority of QTL have very small effects. When this is combined with testing a large number of markers, there is a marked "Beavis effect" in which the estimated effect of significant markers is overestimated (Beavis, 1994). For instance, if we test 100 markers for their statistical significance using a P-value of 1%, we expect one (false) positive result even if all true marker effects are zero. Conversely, if all of the markers have very small effects, few (randomly picked) markers will reach higher levels of significance and most will fail to reach the threshold and be declared nonsignificant. In genome-wide association studies (GWAS), the number of tests equals the number of genotyped independent SNPs, which is typically many thousands in livestock and hundreds of thousands in human genetics. With so many SNPs, the multiple-testing problem becomes so large that in human genetics, P-values of < 5 × 10⁻⁸ are commonly used. In addition, human genetics journals demand a confirmation of the QTL in an independent dataset.
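The multiple-testing arithmetic in the passage above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the Bonferroni correction with one million independent tests at a family-wise level of 5% is an assumption used to show where a threshold of about 5 × 10⁻⁸ can come from, not a description of how any particular study set its threshold.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of significant tests when every true effect is zero."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, family_alpha=0.05):
    """Per-test P-value threshold that keeps the family-wise error near family_alpha."""
    return family_alpha / n_tests

# 100 markers tested at P = 1%: one false positive expected under the null.
fp_100_markers = expected_false_positives(100, 0.01)

# Assumed one million independent tests at a family-wise level of 5%.
gwas_threshold = bonferroni_threshold(1_000_000)
```

The same calculation shows why thresholds in livestock GWAS, with many thousands rather than a million independent SNPs, are usually less extreme.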
These very stringent tests resulted in only the largest QTL being found. For some traits, such large QTL were detected, e.g., DGAT1 affecting fat content in milk (Grisart et al., 2001) and CDH1 affecting infectious pancreatic necrosis virus (IPNV) resistance in Atlantic salmon (Moen et al., 2015). However, for many other traits, no reliable QTL were found, and less than 10% of the variation of the overall breeding objective, i.e., a combination of all the economically important traits, was explained by QTL. This was even the case for dairy cattle, where many powerful QTL mapping studies were conducted. Less than 10% of the genetic variance of the breeding objective explained by QTL implied that more than 90% of the genetic differences between animals had to be handled by traditional selection. Hence, by 2005, the uptake of MAS in livestock breeding was very limited. In human genetics, the result that very powerful GWAS studies (e.g., 160,000 individuals genotyped for 500,000 SNPs) explained only a (very) limited fraction of the total genetic variance was termed the ...
Three recent breakthroughs have resulted in the current widespread use of DNA information: the genomic selection (GS) methodology, which is a form of marker-assisted selection on a genome-wide scale; the discovery of large numbers of single-nucleotide markers; and cost-effective methods to genotype them. GS estimates the effect of thousands of DNA markers simultaneously. Nonlinear estimation methods yield higher accuracy, especially for traits with major genes. The marker effects are estimated in a genotyped and phenotyped training population and are used for the estimation of breeding values of selection candidates by combining their genotypes with the estimated marker effects. The benefits of GS are greatest when selection is for traits that are not themselves recorded on the selection candidates before they can be selected. In the future, genome sequence data may replace SNP genotypes as markers. This could increase GS accuracy because the causative mutations should be included in the data.
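The prediction step described above, combining candidates' genotypes with marker effects estimated in the training population, amounts to a weighted sum over markers. A minimal sketch follows, assuming genotypes are coded as allele counts (0, 1, 2) and that the effects have already been estimated; the function name `predict_gebv` and the numbers are hypothetical.

```python
import numpy as np

def predict_gebv(genotypes, marker_effects):
    """Genomic estimated breeding value: sum over markers of genotype code times effect."""
    genotypes = np.asarray(genotypes, dtype=float)
    marker_effects = np.asarray(marker_effects, dtype=float)
    return genotypes @ marker_effects

# Two hypothetical selection candidates genotyped at three markers.
candidates = np.array([[0, 1, 2],
                       [2, 1, 0]])
effects = np.array([0.5, -0.2, 0.1])   # assumed to come from the training population

gebv = predict_gebv(candidates, effects)
```

No phenotype of the candidate enters the calculation, which is why the approach suits traits that cannot be recorded on candidates before selection.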
Background: Recent developments in SNP discovery and high-throughput genotyping technology have made the use of high-density SNP markers to predict breeding values feasible. This involves estimation of the SNP effects in a training data set, and use of these estimates to evaluate the breeding values of other 'evaluation' individuals. Simulation studies have shown that these predictions of breeding values can be accurate when training and evaluation individuals are (closely) related. However, many general applications of genomic selection require the prediction of breeding values of 'unrelated' individuals, i.e., individuals from the same population, but not particularly closely related to the training individuals.