We report the mapping of a quantitative trait locus (QTL) with a major effect on bovine stature to a ∼780-kb interval using a hidden Markov model-based approach that simultaneously exploits linkage and linkage disequilibrium. We re-sequenced the interval in six sires with known QTL genotype and identified 13 clustered candidate quantitative trait nucleotides (QTNs) out of >9,572 discovered variants. We eliminated five candidate QTNs by studying the phenotypic effect of a recombinant haplotype identified in a breed diversity panel. We show that the QTL influences fetal expression of seven of the nine genes mapping to the ∼780-kb interval. We further show that two of the eight remaining candidate QTNs, mapping to the PLAG1-CHCHD7 intergenic region, influence bidirectional promoter strength and affect binding of nuclear factors. By performing expression QTL analyses, we identified a splice site variant in CHCHD7 and exploited this naturally occurring null allele to exclude CHCHD7 as the single causative gene.
Faithful reconstruction of haplotypes from diploid marker data (phasing) is important for many kinds of genetic analyses, including mapping of trait loci, prediction of genomic breeding values, and identification of signatures of selection. In human genetics, phasing most often exploits population information (linkage disequilibrium), while in animal genetics the primary source of information is familial (Mendelian segregation and linkage). We herein develop and evaluate a method that simultaneously exploits both sources of information. It builds on hidden Markov models that were initially developed to exploit population information only. We demonstrate that the approach improves the accuracy of allele phasing as well as imputation of missing genotypes. Reconstructed haplotypes are assigned to hidden states that are shown to correspond to clusters of genealogically related chromosomes. We show that these cluster states can directly be used to fine map QTL. The method is computationally effective at handling large data sets based on high-density SNP panels. Present-day genotyping platforms do not directly provide information about linkage phase; i.e., co-inherited alleles at adjacent heterozygous markers (haplotypes) are not identified as such. As haplotype information may considerably empower genetic analyses, indirect phasing strategies have been devised: haplotypes can be reconstructed from unphased genotypes using familial information (Mendelian segregation and linkage) and/or population information (linkage disequilibrium, LD, and surrogate parents) (e.g., Windig and Meuwissen 2004; Scheet and Stephens 2006; Kong et al. 2008). Haplotype-based approaches are routinely applied in animal genetics for combined linkage and LD mapping of QTL (e.g., Meuwissen and Goddard 2000; Blott et al. 2003). In these studies, phasing has so far relied on familial information provided by the extended pedigrees typical of livestock (e.g., Windig and Meuwissen 2004).
This approach, however, leaves a nonnegligible proportion of genotypes unphased, especially for the less connected individuals. After phasing, identity-by-descent (IBD) probabilities conditional on haplotype data, which are needed for QTL mapping, are computed for all chromosome pairs, using familial as well as population information (hence combined linkage and LD mapping, L + LD) (e.g., . However, the use of high-density SNP chips and the analysis of ever larger cohorts render the computation of pairwise IBD probabilities a bottleneck. We herein propose a more efficient, heuristic approach based on hidden Markov models (HMM). It simultaneously phases and sorts haplotypes in clusters that can be used directly for mapping or other purposes. The proposed method exploits familial as well as population information, and imputes missing genotypes. We herein describe the accuracy of the proposed method and its use for L + LD mapping of QTL. MATERIALS AND METHODS Haplotype reconstruction and clustering: Due to systematic recording of familial relationship in domestic animals, i...
Genomic prediction from whole-genome sequence data is attractive, as the accuracy of genomic prediction is no longer bounded by the extent of linkage disequilibrium between DNA markers and causal mutations affecting the trait, provided the causal mutations are in the data set. A cost-effective strategy could be to sequence a small proportion of the population, and impute sequence data to the rest of the reference population. Here, we describe strategies for selecting individuals for sequencing, based on either pedigree relationships or haplotype diversity. Performance of these strategies (number of variants detected and accuracy of imputation) was evaluated in sequence data simulated through a real Belgian Blue cattle pedigree. A strategy (AHAP) that selected a subset of individuals for sequencing so as to maximize the number of unique haplotypes sequenced (from single-nucleotide polymorphism panel data) gave good performance across a range of variant minor allele frequencies. We then investigated the optimum number of individuals to sequence by fold coverage given a maximum total sequencing effort. At 600 total fold coverage (×600), the optimum strategy was to sequence 75 individuals at eightfold coverage. Finally, we investigated the accuracy of genomic predictions that could be achieved. The advantage of using imputed sequence data compared with dense SNP array genotypes was highly dependent on the allele frequency spectrum of the causative mutations affecting the trait. When this followed a neutral distribution, the advantage of the imputed sequence data was small; however, when the causal mutations all had low minor allele frequencies, using the sequence data improved the accuracy of genomic prediction by up to 30%.
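The trade-off between the number of animals sequenced and the fold coverage per animal under a fixed total effort can be illustrated with a few lines of Python. This is only a sketch of the budget arithmetic: the function name `sequencing_designs` and the list of candidate coverages are hypothetical, and only the 600-fold budget and the 75 × 8 design come from the abstract. It does not evaluate which design is optimal, which in the study depended on simulated variant detection and imputation accuracy.

```python
def sequencing_designs(total_fold, coverages):
    """List (coverage, n_individuals, fold_used) for a fixed sequencing budget.

    total_fold : total fold coverage available (e.g. 600)
    coverages  : candidate per-individual fold coverages (hypothetical grid)
    """
    designs = []
    for cov in coverages:
        n_individuals = total_fold // cov  # whole animals only
        designs.append((cov, n_individuals, n_individuals * cov))
    return designs


# Hypothetical grid of coverages; 8x is the value reported in the abstract.
designs = sequencing_designs(600, [2, 4, 8, 12, 20, 30])
for cov, n, used in designs:
    print(f"{cov:>2}x coverage -> {n:>3} individuals ({used} fold used)")
```

Each row spends the same budget in a different way: low coverage on many animals samples more haplotypes, while high coverage on few animals gives more reliable genotype calls per animal.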
Several functions were used to model the fixed part of the lactation curve and to estimate genetic parameters of milk test-day records using French Holstein data. Parametric curves (Legendre polynomials, Ali-Schaeffer curve, Wilmink curve), fixed-class curves (5-d classes), and regression splines were tested. The latter were appealing because they fitted the data well, were relatively insensitive to outliers, were flexible, and resulted in smooth curves without requiring the estimation of a large number of parameters. Genetic parameters were estimated with an Average Information REML algorithm where the average information matrix and the first derivatives of the likelihood functions were pooled over 10 samples. This approach made it possible to handle larger data sets. The residual variance was modeled as a quadratic function of days in milk. Quartic Legendre polynomials were used to estimate (co)variances of random effects. The estimates were within the range of most other studies. The greatest genetic variance was in the middle of the lactation while residual and permanent environmental variances mostly decreased during the lactation. The resulting heritability ranged from 0.15 to 0.40. The genetic correlation between the extreme parts of the lactation was 0.35 but genetic correlations were higher than 0.90 for a large part of the lactation. The use of the pooling approach resulted in smaller standard errors for the genetic parameters when compared to those obtained with a single sample.
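As an illustration of what "quartic Legendre polynomials" means in a test-day model, the sketch below builds the five covariates (orders 0 to 4) for a given day in milk. It is a minimal example under stated assumptions, not the authors' implementation: the lactation range of 5 to 305 days, the function name `legendre_covariates`, and the optional normalization by sqrt((2n+1)/2) (a common convention in random regression models) are all assumptions not stated in the abstract.

```python
import math


def legendre_covariates(dim, dim_min=5, dim_max=305, order=4, normalized=True):
    """Legendre polynomial covariates P_0..P_order for one day in milk.

    dim_min / dim_max are assumed lactation bounds (not from the abstract).
    """
    # Standardize days in milk to the interval [-1, 1].
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    # Bonnet recurrence: (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    p = [1.0, x]
    for n in range(1, order):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    p = p[: order + 1]
    if normalized:
        p = [math.sqrt((2 * n + 1) / 2.0) * value for n, value in enumerate(p)]
    return p


# Covariates at mid-lactation (day 155 maps to x = 0 under the assumed bounds).
print(legendre_covariates(155, normalized=False))
```

In a random regression model, each animal's genetic and permanent environmental curves are linear combinations of these covariates, so a 5 × 5 (co)variance matrix of coefficients determines the variance at every day in milk.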
Background: Size of the reference population and reliability of phenotypes are crucial factors influencing the reliability of genomic predictions. It is therefore useful to combine closely related populations. Increased accuracies of genomic predictions depend on the number of individuals added to the reference population, the reliability of their phenotypes, and the relatedness of the populations that are combined. Methods: This paper assesses the increase in reliability achieved when combining four Holstein reference populations of 4000 bulls each, from European breeding organizations, i.e. UNCEIA (France), VikingGenetics (Denmark, Sweden, Finland), DHV-VIT (Germany) and CRV (The Netherlands, Flanders). Each partner validated its own bulls using their national reference data and the combined data, respectively. Results: Combining the data significantly increased the reliability of genomic predictions for bulls in all four populations. Reliabilities increased by 10%, compared to reliabilities obtained with national reference populations alone, when they were averaged over countries and the traits evaluated. For different traits and countries, the increase in reliability ranged from 2% to 19%. Conclusions: Genomic selection programs benefit greatly from combining data from several closely related populations into a single large reference population.
Colour sidedness is a dominantly inherited phenotype of cattle characterized by the polarization of pigmented sectors on the flanks, snout and ear tips. It is also referred to as 'lineback' or 'witrik' (which means white back), as colour-sided animals typically display a white band along their spine. Colour sidedness has been documented since at least the Middle Ages and is presently segregating in several cattle breeds around the globe, including Belgian Blue and Brown Swiss. Here we report that colour sidedness is determined by a first allele on chromosome 29 (Cs(29)), which results from the translocation of a 492-kilobase chromosome 6 segment encompassing KIT to chromosome 29, and a second allele on chromosome 6 (Cs(6)), derived from the first by repatriation of fused 575-kilobase chromosome 6 and 29 sequences to the KIT locus. We provide evidence that both translocation events involved circular intermediates. This is the first example, to our knowledge, of a phenotype determined by homologous yet non-syntenic alleles that result from a novel copy-number-variant-generating mechanism.
Inbreeding results from the mating of related individuals and may be associated with reduced fitness because it brings together deleterious variants in one individual. In general, inbreeding is estimated with respect to an arbitrary base population consisting of ancestors that are assumed unrelated. We herein propose a model-based approach to estimate and characterize individual inbreeding at both global and local genomic scales by assuming the individual genome is a mosaic of homozygous-by-descent (HBD) and non-HBD segments. The HBD segments may originate from ancestors tracing back to different periods in the past, defining distinct age-related classes. The lengths of the HBD segments are exponentially distributed with class-specific parameters, reflecting that inbreeding of older origin generates on average shorter stretches of observed homozygous markers. The model is implemented in a hidden Markov model framework that uses marker allele frequencies, genetic distances, genotyping error rates and the sequences of observed genotypes. Note that genotyping errors, low-fold sequencing or genotype-by-sequencing data are easily accommodated under this framework. Based on simulations under the inference model, we show that the genomewide inbreeding coefficients and the parameters of the model are accurately estimated. In addition, when several inbreeding classes are simulated, the model captures them if their ages are sufficiently different. Complementary analyses, either on data sets simulated under more realistic models or on real human, dog and sheep data, illustrate the range of applications of the approach and how it can reveal recent demographic histories among populations (e.g., very recent bottlenecks or founder effects). The method also allows clear identification of individuals resulting from extreme consanguineous matings.
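The following Python sketch illustrates the general idea of an HBD/non-HBD hidden Markov model with a single HBD class. It is a simplified illustration written for this listing, not the published implementation: the function name `hbd_posteriors`, the parameter names (`rate`, `rho`, `eps`), their default values, and the exact emission and transition forms are assumptions chosen to match the ingredients named in the abstract (allele frequencies, genetic distances, genotyping error, exponentially distributed segment lengths).

```python
import math


def hbd_posteriors(genotypes, freqs, dist_morgans, rate=10.0, rho=0.1, eps=0.001):
    """Posterior probability of being HBD at each marker (forward-backward).

    genotypes    : counts of allele A per marker (0, 1 or 2)
    freqs        : frequency of allele A per marker
    dist_morgans : genetic distance between marker m and m + 1 (length n - 1)
    rate, rho, eps : assumed segment rate (per Morgan), HBD mixing
                     proportion and genotyping error rate (illustrative)
    """
    def emit(g, p):
        # HBD: homozygous by descent, heterozygotes only through error.
        hbd = eps if g == 1 else (1 - eps) * (p if g == 2 else 1 - p)
        # Non-HBD: Hardy-Weinberg genotype proportions.
        non = [(1 - p) ** 2, 2 * p * (1 - p), p * p][g]
        return (hbd, non)

    def trans(m):
        # Probability of no change of ancestry between markers m and m + 1.
        stay = math.exp(-rate * dist_morgans[m])
        return [[stay * (s == t) + (1 - stay) * pi[t] for t in range(2)]
                for s in range(2)]

    def normalize(v):
        total = sum(v)
        return [x / total for x in v]

    pi = (rho, 1 - rho)  # state 0 = HBD, state 1 = non-HBD
    n = len(genotypes)
    e = [emit(g, p) for g, p in zip(genotypes, freqs)]

    alpha = [normalize([pi[s] * e[0][s] for s in range(2)])]
    for m in range(1, n):
        T = trans(m - 1)
        alpha.append(normalize(
            [e[m][t] * sum(alpha[-1][s] * T[s][t] for s in range(2))
             for t in range(2)]))

    beta = [[1.0, 1.0] for _ in range(n)]
    for m in range(n - 2, -1, -1):
        T = trans(m)
        beta[m] = normalize(
            [sum(T[s][t] * e[m + 1][t] * beta[m + 1][t] for t in range(2))
             for s in range(2)])

    post = []
    for a, b in zip(alpha, beta):
        w = [a[0] * b[0], a[1] * b[1]]
        post.append(w[0] / (w[0] + w[1]))
    return post


# Toy data: a run of 30 homozygous markers flanked by heterozygous markers.
toy_genotypes = [1] * 5 + [2] * 30 + [1] * 5
toy_post = hbd_posteriors(toy_genotypes, [0.5] * 40, [0.001] * 39)
print([round(p, 3) for p in toy_post])
```

Averaging the per-marker posteriors gives a genome-wide inbreeding estimate under this toy model; extending the sketch to several HBD classes would mean adding one state per class, each with its own rate.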
In dairy cattle, the widespread use of artificial insemination has resulted in increased selection intensity, which has led to a spectacular increase in productivity. However, cow fertility has concomitantly declined severely. It is generally assumed that this reduction is primarily due to the negative energy balance of high-producing cows at the peak of lactation. We herein describe the fine-mapping of a major fertility QTL in Nordic Red cattle, and identify a 660-kb deletion encompassing four genes as the causative variant. We show that the deletion is a recessive embryonically lethal mutation. This probably results from the loss of RNASEH2B, which is known to cause embryonic death in mice. Despite its dramatic effect on fertility, 13%, 23% and 32% of the animals carry the deletion in Danish, Swedish and Finnish Red Cattle, respectively. To explain this, we searched for favorable effects on other traits and found that the deletion has strong positive effects on milk yield. This study demonstrates that embryonic lethal mutations account for a non-negligible fraction of the decline in fertility of domestic cattle, and that associated positive effects on milk yield may account for part of the negative genetic correlation. Our study adds to the evidence that structural variants contribute to animal phenotypic variation, and that balancing selection might be more common in livestock species than previously appreciated.
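To give a feel for what these carrier frequencies imply, the sketch below computes the expected fraction of embryos homozygous for a recessive lethal under a deliberately simple model. The assumptions are not from the abstract and are strong: random mating, equal carrier frequency in sires and dams, and all live carriers being heterozygous. Real breeding schemes based on artificial insemination deviate from random mating, so the numbers are illustrative only. The function name `expected_embryo_loss` is hypothetical.

```python
def expected_embryo_loss(carrier_freq):
    """Expected fraction of embryos homozygous for a recessive lethal.

    Assumes random mating and the same carrier frequency in both sexes:
    a mating is carrier x carrier with probability carrier_freq ** 2,
    and one quarter of the embryos from such matings are homozygous.
    """
    return carrier_freq ** 2 / 4.0


# Carrier frequencies reported in the abstract.
carrier_freqs = {"Danish Red": 0.13, "Swedish Red": 0.23, "Finnish Red": 0.32}
for population, c in carrier_freqs.items():
    print(f"{population}: {100 * expected_embryo_loss(c):.2f}% of embryos")
```

Under these assumptions the expected loss rises with the square of the carrier frequency, which is why a lethal allele can remain at a sizeable frequency when carriers have an advantage for another trait.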