Background: High density (HD) SNP genotyping arrays are an important tool for genetic analyses of animals and plants. Although the chicken is one of the most important farm animals, no HD array is yet available for high resolution genetic analysis of this species.

Results: We report here the development of a 600 K Affymetrix® Axiom® HD genotyping array designed using SNPs segregating in a wide variety of chicken populations. In order to generate a large catalogue of segregating SNPs, we re-sequenced 243 chickens from 24 chicken lines derived from diverse sources (experimental, commercial broiler and layer lines) by pooling 10–15 samples within each line. About 139 million (M) putative SNPs were detected by mapping sequence reads to the new reference genome (Gallus_gallus_4.0), of which ~78 M appeared to be segregating in different lines. Using criteria such as high SNP-quality score, acceptable design scores predicting high conversion performance in the final array and uniformity of distribution across the genome, we selected ~1.8 M SNPs for validation through genotyping on an independent set of samples (n = 282). About 64% of the SNPs were polymorphic with high call rates (>98%), good cluster separation and stable Mendelian inheritance. Polymorphic SNPs were further analysed for their population characteristics and genomic effects. SNPs with extreme breach of Hardy-Weinberg equilibrium (P < 0.00001) were excluded from the panel. The final array, designed on the basis of these analyses, consists of 580,954 SNPs and includes 21,534 coding variants. SNPs were selected to achieve an essentially uniform distribution based on genetic map distance for both broiler and layer lines. Due to a lower extent of LD in broilers compared to layers, as reported in previous studies, the ratio of broiler and layer SNPs in the array was kept as 3:2. The final panel was shown to genotype a wide range of samples including broilers and layers with over 100 K to 450 K informative SNPs per line. A principal component analysis was used to demonstrate the ability of the array to detect the expected population structure, which is an important pre-investigation step for many genome-wide analyses.

Conclusions: This Affymetrix® Axiom® array is the first SNP genotyping array for chicken that has been made commercially available to the public as a product. This array is expected to find widespread usage both in research and commercial application such as in genomic selection, genome-wide association studies, selection signature analyses, fine mapping of QTLs and detection of copy number variants.
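The Hardy-Weinberg exclusion step in the chicken array abstract (SNPs with P < 0.00001 were dropped) can be illustrated with a small sketch. The abstract does not say which test was used, so the chi-square test with one degree of freedom below is an assumption, and the function names `hwe_chi2_pvalue` and `passes_hwe` are hypothetical.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1 d.f. chi-square test of Hardy-Weinberg proportions.

    Assumption: a plain chi-square test; the source abstract does not
    name the test that was actually applied.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: test undefined, report no departure
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_hwe(n_aa, n_ab, n_bb, threshold=1e-5):
    """Keep a SNP unless its departure from HWE is extreme (P < threshold)."""
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= threshold
```

For example, `passes_hwe(25, 50, 25)` keeps a SNP whose genotype counts match the expected proportions exactly, while `passes_hwe(50, 0, 50)` rejects a SNP with no heterozygotes at all.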
Maize (Zea mays L.) serves as a model plant for heterosis research and is the crop where hybrid breeding was pioneered. We analyzed genomic and phenotypic data of 1254 hybrids of a typical maize hybrid breeding program based on the important Dent × Flint heterotic pattern. Our main objectives were to investigate genome properties of the parental lines (e.g., allele frequencies, linkage disequilibrium, and phases) and examine the prospects of genomic prediction of hybrid performance. We found high consistency of linkage phases and large differences in allele frequencies between the Dent and Flint heterotic groups in pericentromeric regions. These results can be explained by the Hill-Robertson effect and support the hypothesis of differential fixation of alleles due to pseudooverdominance in these regions. In pericentromeric regions we also found indications for consistent marker-QTL linkage between heterotic groups. With prediction methods GBLUP and BayesB, the cross-validation prediction accuracy ranged from 0.75 to 0.92 for grain yield and from 0.59 to 0.95 for grain moisture. The prediction accuracy of untested hybrids was highest if both parents were parents of other hybrids in the training set, and lowest if neither of them was involved in any training set hybrid. Optimizing the composition of the training set in terms of number of lines and hybrids per line could further increase prediction accuracy. We conclude that genomic prediction facilitates a paradigm shift in hybrid breeding by focusing on the performance of experimental hybrids rather than the performance of parental lines in testcrosses.

Hybrid breeding was pioneered in maize (Shull 1908) and plays an ever increasing role in other globally important field (Duvick 1999) and vegetable crops (Silva Dias 2010).
Maize has also served as a model species for research in heterosis, the phenomenon behind the success of hybrid varieties, for which the genetic mechanisms have been elusive (Duvick 1999; Lippman and Zamir 2006). In recent years, evidence emerged for the importance of (pseudo-)overdominance in the manifestation of heterosis in maize (Lippman and Zamir 2006; Schön et al. 2010) and the particular role of the centromeres in this process (Gore et al. 2009; McMullen et al. 2009). Today, the availability of high-density marker data and whole-genome regression methods developed in the context of genomic prediction (Meuwissen et al. 2001) allows us to revisit this hypothesis by studying key genome properties such as allele frequencies and linkage phases.

Consistency of linkage phases between quantitative trait loci (QTL) and markers is a key prerequisite for pooling of diverse breeds and germplasm to increase sample size for genetic studies and transferability of their results to different populations (De Roos et al. 2008). Weber et al. (2012) used whole-genome estimates of marker effects of several cattle breeds to investigate across-breed marker-QTL linkage phase consistency. Such a study is still missing for maize and other important crops. For o...
This is the first large-scale experimental study on genome-based prediction of testcross values in an advanced cycle breeding population of maize. The study comprised testcross progenies of 1,380 doubled haploid lines of maize derived from 36 crosses and phenotyped for grain yield and grain dry matter content in seven locations. The lines were genotyped with 1,152 single nucleotide polymorphism markers. Pedigree data were available for three generations. We used best linear unbiased prediction and stratified cross-validation to evaluate the performance of prediction models differing in the modeling of relatedness between inbred lines and in the calculation of genome-based coefficients of similarity. The choice of similarity coefficient did not affect prediction accuracies. Models including genomic information yielded significantly higher prediction accuracies than the model based on pedigree information alone. Average prediction accuracies based on genomic data were high even for a complex trait like grain yield (0.72-0.74) when the cross-validation scheme allowed for a high degree of relatedness between the estimation and the test set. When predictions were performed across distantly related families, prediction accuracies decreased significantly (0.47-0.48). Prediction accuracies decreased with decreasing sample size but were still high when the population size was halved (0.67-0.69). The results from this study are encouraging with respect to genome-based prediction of the genetic value of untested lines in advanced cycle breeding populations and the implementation of genomic selection in the breeding process.
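The contrast in the testcross abstract between cross-validation with closely related estimation and test sets and prediction across distantly related families comes down to how lines are assigned to folds. The sketch below shows two possible assignment schemes and a correlation-based accuracy measure. It is an illustration under assumptions: the function names are hypothetical, "family" stands for the cross a line was derived from, and the study's exact stratification and accuracy definition are not given in the abstract.

```python
import math
import random
from collections import defaultdict

def within_family_folds(families, k, seed=0):
    """Spread every family over all k folds, so each test line has
    close relatives in the estimation set."""
    rng = random.Random(seed)
    by_family = defaultdict(list)
    for idx, fam in enumerate(families):
        by_family[fam].append(idx)
    folds = [0] * len(families)
    for members in by_family.values():
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[idx] = pos % k
    return folds

def across_family_folds(families, k, seed=0):
    """Assign whole families to folds, so a test line never has a
    member of its own family in the estimation set."""
    rng = random.Random(seed)
    unique = sorted(set(families))
    rng.shuffle(unique)
    fold_of_family = {fam: pos % k for pos, fam in enumerate(unique)}
    return [fold_of_family[fam] for fam in families]

def pearson(x, y):
    """Pearson correlation between predicted and observed values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

In either scheme, the lines in fold `f` form the test set and all remaining lines form the estimation set; `pearson` is then computed on the test-set predictions.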
Predicting organismal phenotypes from genotype data is important for plant and animal breeding, medicine, and evolutionary biology. Genomic-based phenotype prediction has been applied for single-nucleotide polymorphism (SNP) genotyping platforms, but not using complete genome sequences. Here, we report genomic prediction for starvation stress resistance and startle response in Drosophila melanogaster, using ∼2.5 million SNPs determined by sequencing the Drosophila Genetic Reference Panel population of inbred lines. We constructed a genomic relationship matrix from the SNP data and used it in a genomic best linear unbiased prediction (GBLUP) model. We assessed predictive ability as the correlation between predicted genetic values and observed phenotypes by cross-validation, and found a predictive ability of 0.239±0.008 (0.230±0.012) for starvation resistance (startle response). The predictive ability of BayesB, a Bayesian method with internal SNP selection, was not greater than GBLUP. Selection of the 5% SNPs with either the highest absolute effect or variance explained did not improve predictive ability. Predictive ability decreased only when fewer than 150,000 SNPs were used to construct the genomic relationship matrix. We hypothesize that predictive power in this population stems from the SNP–based modeling of the subtle relationship structure caused by long-range linkage disequilibrium and not from population structure or SNPs in linkage disequilibrium with causal variants. We discuss the implications of these results for genomic prediction in other organisms.
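The genomic relationship matrix used in a GBLUP model can be sketched in a few lines. The abstract does not state which formula was used, so the example assumes the widely used allele-frequency-centred form G = ZZ' / (2 Σ p(1 − p)) with genotypes coded as 0, 1 or 2 copies of an allele; the function name is hypothetical, and a panel of inbred lines may call for a different scaling.

```python
def genomic_relationship_matrix(genotypes):
    """Build G = Z Z' / (2 * sum_j p_j (1 - p_j)) from a lines x SNPs table.

    genotypes: list of rows, one per line, each a list of 0/1/2 allele counts.
    Assumption: frequency-centred coding; not necessarily the exact
    formula used in the study summarised above.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each SNP
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic")
    # Centre each genotype by twice the allele frequency
    centred = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]
```

With millions of SNPs a numerical library would be used instead of nested lists; the pure-Python form is only meant to make the arithmetic visible.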
Human driven selection during domestication and subsequent breed formation has likely left detectable signatures within the genome of modern cattle. The elucidation of these signatures of selection is of interest from the perspective of evolutionary biology, and for identifying domestication-related genes that ultimately may help to further genetically improve this economically important animal. To this end, we employed a panel of more than 15 million autosomal SNPs identified from re-sequencing of 43 Fleckvieh animals. We mainly applied two somewhat complementary statistics, the integrated Haplotype Homozygosity Score (iHS) reflecting primarily ongoing selection, and the Composite of Likelihood Ratio (CLR) having the most power to detect completed selection after fixation of the advantageous allele. We find 106 candidate selection regions, many of which are harboring genes related to phenotypes relevant in domestication, such as coat coloring pattern, neurobehavioral functioning and sensory perception including KIT, MITF, MC1R, NRG4, Erbb4, TMEM132D and TAS2R16, among others. To further investigate the relationship between genes with signatures of selection and genes identified in QTL mapping studies, we use a sample of 3062 animals to perform four genome-wide association analyses using appearance traits, body size and somatic cell count. We show that regions associated with coat coloring significantly (P<0.0001) overlap with the candidate selection regions, suggesting that the selection signals we identify are associated with traits known to be affected by selection during domestication. Results also provide further evidence regarding the complexity of the genetics underlying coat coloring in cattle. 
This study illustrates the potential of population genetic approaches for identifying genomic regions affecting domestication-related phenotypes and further helps to identify specific regions targeted by selection during speciation, domestication and breed formation of cattle. We also show that Linkage Disequilibrium (LD) decays in cattle at a much faster rate than previously thought.
Utilizing the whole genomic variation of complex traits to predict the yet-to-be observed phenotypes or unobserved genetic values via whole genome prediction (WGP) and to infer the underlying genetic architecture via genome wide association study (GWAS) is an interesting and fast developing area in the context of human disease studies as well as in animal and plant breeding. Though thousands of significant loci for several species were detected via GWAS in the past decade, they were not used directly to improve WGP due to lack of proper models. Here, we propose a generalized way of building trait-specific genomic relationship matrices which can exploit GWAS results in WGP via a best linear unbiased prediction (BLUP) model for which we suggest the name BLUP|GA. Results from two illustrative examples show that using already existing GWAS results from public databases in BLUP|GA improved the accuracy of WGP for two out of the three model traits in a dairy cattle data set, and for nine out of the 11 traits in a rice diversity data set, compared to the reference methods GBLUP and BayesB. While BLUP|GA outperforms BayesB, its required computing time is comparable to GBLUP. Further simulation results suggest that accounting for publicly available GWAS results is potentially more useful for WGP utilizing smaller data sets and/or traits of low heritability, depending on the genetic architecture of the trait under consideration. To our knowledge, this is the first study incorporating public GWAS results formally into the standard GBLUP model and we think that the BLUP|GA approach deserves further investigations in animal breeding, plant breeding as well as human genetics.
This study presents a second generation of linkage disequilibrium (LD) map statistics for the whole genome of the Holstein-Friesian population, which has a four times higher resolution compared with that of the maps available so far. We used DNA samples of 810 German Holstein-Friesian cattle genotyped by the Illumina Bovine SNP50K BeadChip to analyse LD structure. A panel of 40 854 (75.6%) markers was included in the final analysis. The pairwise r² statistic of SNPs up to 5 Mb apart across the genome was estimated. A mean value of r² = 0.30 ± 0.32 was observed in pairwise distances of <25 kb and it dropped to 0.20 ± 0.24 at 50-75 kb, which is nearly the average inter-marker space in this study. The proportion of SNPs in useful LD (r² ≥ 0.25) was 26% for the distance of 50 and 75 kb between SNPs. We found a lower level of LD for SNP pairs at the distance ≤100 kb than previously thought. Analysis revealed 712 haplo-blocks spanning 4.7% of the genome and containing 8.0% of all SNPs. Mean and median block length were estimated as 164 ± 117 kb and 144 kb respectively. Allele frequencies of the SNPs have a considerable and systematic impact on the estimate of r². It is shown that minimizing the allele frequency difference between SNPs reduces the influence of frequency on r² estimates. Analysis of past effective population size based on the direct estimates of recombination rates from SNP data showed a decline in effective population size to Nₑ = 103 up to approximately 4 generations ago. Systematic effects of marker density and effective population size on observed LD and haplotype structure are discussed.
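The pairwise LD statistic reported in the Holstein-Friesian abstract is the squared correlation between alleles at two loci, r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. A minimal sketch follows, assuming phased haplotypes coded 0/1 are available; the function names are hypothetical, and the 0.25 cut-off is the "useful LD" threshold quoted in the abstract.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased haplotypes.

    hap_a, hap_b: equal-length lists of 0/1 alleles, one entry per
    haplotype (assumption: phase is known).
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when either SNP is monomorphic
    return d * d / denom

def fraction_useful_ld(r2_values, threshold=0.25):
    """Share of SNP pairs with r^2 at or above the threshold."""
    defined = [v for v in r2_values if v == v]  # drop NaN entries
    return sum(1 for v in defined if v >= threshold) / len(defined)
```

Because the denominator depends on both allele frequencies, two SNPs can only reach r² = 1 when their frequencies match, which is consistent with the abstract's remark that allele frequency differences systematically affect r² estimates.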
The data from the newly available 50 K SNP chip was used for tagging the genome-wide footprints of positive selection in Holstein-Friesian cattle. For this purpose, we employed the recently described Extended Haplotype Homozygosity test, which detects selection by measuring the characteristics of haplotypes within a single population. To assess formally the significance of these results, we compared the combination of frequency and the Relative Extended Haplotype Homozygosity value of each core haplotype with equally frequent haplotypes across the genome. A subset of the putative regions showing the highest significance in the genome-wide EHH tests was mapped. We annotated genes to identify possible influence they have in beneficial traits by using the Gene Ontology database. A panel of genes, including FABP3, CLPN3, SPERT, HTR2A5, ABCE1, BMP4 and PTGER2, was detected, which overlapped with the most extreme P-values. This panel comprises some interesting candidate genes and QTL, representing a broad range of economically important traits such as milk yield and composition, as well as reproductive and behavioural traits. We also report high values of linkage disequilibrium and a slower decay of haplotype homozygosity for some candidate regions harbouring major genes related to dairy quality. The results of this study provide a genome-wide map of selection footprints in the Holstein genome, and can be used to better understand the mechanisms of selection in dairy cattle breeding.
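Extended Haplotype Homozygosity can be made concrete with a short sketch. It follows the usual definition: EHH is the probability that two randomly chosen chromosomes carrying the same core haplotype are identical over the whole interval from the core to a given marker. The function name and input format are assumptions made for illustration, and the relative statistic (REHH) used in the study is not reproduced here.

```python
from collections import Counter
from math import comb

def ehh(extended_haplotypes):
    """EHH for a set of chromosomes that share one core haplotype.

    extended_haplotypes: one hashable item (e.g. a string of alleles)
    per chromosome, covering the core out to the marker of interest.
    """
    n = len(extended_haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(extended_haplotypes)
    # Pairs of identical chromosomes divided by all possible pairs
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)
```

Evaluating `ehh` at markers of increasing distance from the core gives the decay curve of haplotype homozygosity; a core haplotype that is both frequent and slow to decay is the kind of signal the test looks for.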