Genome-wide association studies (GWAS) based on high throughput SNP genotyping technologies open a broad avenue for exploring genes associated with milk production traits in dairy cattle. Motivated by pinpointing novel quantitative trait nucleotide (QTN) across Bos Taurus genome, the present study is to perform GWAS to identify genes affecting milk production traits using current state-of-the-art SNP genotyping technology, i.e., the Illumina BovineSNP50 BeadChip. In the analyses, the five most commonly evaluated milk production traits are involved, including milk yield (MY), milk fat yield (FY), milk protein yield (PY), milk fat percentage (FP) and milk protein percentage (PP). Estimated breeding values (EBVs) of 2,093 daughters from 14 paternal half-sib families are considered as phenotypes within the framework of a daughter design. Association tests between each trait and the 54K SNPs are achieved via two different analysis approaches, a paternal transmission disequilibrium test (TDT)-based approach (L1-TDT) and a mixed model based regression analysis (MMRA). In total, 105 SNPs were detected to be significantly associated genome-wise with one or multiple milk production traits. Of the 105 SNPs, 38 were commonly detected by both methods, while four and 63 were solely detected by L1-TDT and MMRA, respectively. The majority (86 out of 105) of the significant SNPs is located within the reported QTL regions and some are within or close to the reported candidate genes. In particular, two SNPs, ARS-BFGL-NGS-4939 and BFGL-NGS-118998, are located close to the DGAT1 gene (160bp apart) and within the GHR gene, respectively. Our findings herein not only provide confirmatory evidences for previously findings, but also explore a suite of novel SNPs associated with milk production traits, and thus form a solid basis for eventually unraveling the causal mutations for milk production traits in dairy cattle.
BackgroundWith the availability of high density whole-genome single nucleotide polymorphism chips, genomic selection has become a promising method to estimate genetic merit with potentially high accuracy for animal, plant and aquaculture species of economic importance. With markers covering the entire genome, genetic merit of genotyped individuals can be predicted directly within the framework of mixed model equations, by using a matrix of relationships among individuals that is derived from the markers. Here we extend that approach by deriving a marker-based relationship matrix specifically for the trait of interest.Methodology/Principal FindingsIn the framework of mixed model equations, a new best linear unbiased prediction (BLUP) method including a trait-specific relationship matrix (TA) was presented and termed TABLUP. The TA matrix was constructed on the basis of marker genotypes and their weights in relation to the trait of interest. A simulation study with 1,000 individuals as the training population and five successive generations as candidate population was carried out to validate the proposed method. The proposed TABLUP method outperformed the ridge regression BLUP (RRBLUP) and BLUP with realized relationship matrix (GBLUP). It performed slightly worse than BayesB with an accuracy of 0.79 in the standard scenario.Conclusions/SignificanceThe proposed TABLUP method is an improvement of the RRBLUP and GBLUP method. It might be equivalent to the BayesB method but it has additional benefits like the calculation of accuracies for individual breeding values. The results also showed that the TA-matrix performs better in predicting ability than the classical numerator relationship matrix and the realized relationship matrix which are derived solely from pedigree or markers without regard to the trait. This is because the TA-matrix not only accounts for the Mendelian sampling term, but also puts the greater emphasis on those markers that explain more of the genetic variance in the trait.
Genome-wide association (GWA) study is becoming a powerful tool in deciphering genetic basis of complex human diseases/traits. Currently, the univariate analysis is the most commonly used method to identify genes associated with a certain disease/phenotype under study. A major limitation with the univariate analysis is that it may not make use of the information of multiple correlated phenotypes, which are usually measured and collected in practical studies. The multivariate analysis has proven to be a powerful approach in linkage studies of complex diseases/traits, but it has received little attention in GWA. In this study, we aim to develop a bivariate analytical method for GWAS, which can be used for a complex situation that a continuous trait and a binary trait measured are under study. Based on the modified extended generalized estimating equation (EGEE) method we proposed herein, we assessed the performance of our bivariate analyses through extensive simulations as well as real data analyses. In the study, to develop an EGEE approach for bivariate genetic analyses, we combined two different generalized linear models corresponding to phenotypic variables using a Seemingly Unrelated Regression (SUR) model. The simulation results demonstrated that our EGEE-based bivariate analytical method outperforms univariate analyses in increasing statistical power under a variety of simulation scenarios. Notably, EGEE-based bivariate analyses have consistent advantages over univariate analyses whether or not there exits a phenotypic correlation between the two traits. Our study has practical importance, as one can always use multivariate analyses as a screening tool when multiple phenotypes are available, without extra costs of statistical power and false positive rate. Analyses on empirical GWA data further affirm the advantages of our bivariate analytical method.
BackgroundCopy Number Variations (CNVs) have been shown important in both normal phenotypic variability and disease susceptibility, and are increasingly accepted as another important source of genetic variation complementary to single nucleotide polymorphism (SNP). Comprehensive identification and cataloging of pig CNVs would be of benefit to the functional analyses of genome variation.ResultsIn this study, we performed a genome-wide CNV detection based on the Porcine SNP60 genotyping data of 474 pigs from three pure breed populations (Yorkshire, Landrace and Songliao Black) and one Duroc × Erhualian crossbred population. A total of 382 CNV regions (CNVRs) across genome were identified, which cover 95.76Mb of the pig genome and correspond to 4.23% of the autosomal genome sequence. The length of these CNVRs ranged from 5.03 to 2,702.7kb with an average of 250.7kb, and the frequencies of them varied from 0.42 to 20.87%. These CNVRs contains 1468 annotated genes, which possess a great variety of molecular functions, making them a promising resource for exploring the genetic basis of phenotypic variation within and among breeds. To confirmation of these findings, 18 CNVRs representing different predicted status and frequencies were chosen for validation via quantitative real time PCR (qPCR). Accordingly, 12 (66.67%) of them was successfully confirmed.ConclusionsOur results demonstrated that currently available Porcine SNP60 BeadChip can be used to capture CNVs efficiently. Our study firstly provides a comprehensive map of copy number variation in the pig genome, which would be of help for understanding the pig genome and provide preliminary foundation for investigating the association between various phenotypes and CNVs.
BackgroundCopy number variations (CNVs) confer significant effects on genetic innovation and phenotypic variation. Previous CNV studies in swine seldom focused on in-depth characterization of global CNVs.ResultsUsing whole-genome assembly comparison (WGAC) and whole-genome shotgun sequence detection (WSSD) approaches by next generation sequencing (NGS), we probed formation signatures of both segmental duplications (SDs) and individualized CNVs in an integrated fashion, building the finest resolution CNV and SD maps of pigs so far. We obtained copy number estimates of all protein-coding genes with copy number variation carried by individuals, and further confirmed two genes with high copy numbers in Meishan pigs through an enlarged population. We determined genome-wide CNV hotspots, which were significantly enriched in SD regions, suggesting evolution of CNV hotspots may be affected by ancestral SDs. Through systematically enrichment analyses based on simulations and bioinformatics analyses, we revealed CNV-related genes undergo a different selective constraint from those CNV-unrelated regions, and CNVs may be associated with or affect pig health and production performance under recent selection.ConclusionsOur studies lay out one way for characterization of CNVs in the pig genome, provide insight into the pig genome variation and prompt CNV mechanisms studies when using pigs as biomedical models for human diseases.Electronic supplementary materialThe online version of this article (doi:10.1186/1471-2164-15-593) contains supplementary material, which is available to authorized users.
Recent studies of mammalian genomes have uncovered the vast extent of copy number variations (CNVs) that contribute to phenotypic diversity. Compared to SNP, a CNV can cover a wider chromosome region, which may potentially incur substantial sequence changes and induce more significant effects on phenotypes. CNV has been becoming an alternative promising genetic marker in the field of genetic analyses. Here we firstly report an account of CNV regions in the cattle genome in Chinese Holstein population. The Illumina Bovine SNP50K Beadchips were used for screening 2047 Holstein individuals. Three different programes (PennCNV, cnvPartition and GADA) were implemented to detect potential CNVs. After a strict CNV calling pipeline, a total of 99 CNV regions were identified in cattle genome. These CNV regions cover 23.24 Mb in total with an average size of 151.69 Kb. 52 out of these CNV regions have frequencies of above 1%. 51 out of these CNV regions completely or partially overlap with 138 cattle genes, which are significantly enriched for specific biological functions, such as signaling pathway, sensory perception response and cellular processes. The results provide valuable information for constructing a more comprehensive CNV map in the cattle genome and offer an important resource for investigation of genome structure and genomic variation underlying traits of interest in cattle.
BackgroundCopy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing.ResultsA total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson’s correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding.ConclusionsOur results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.Electronic supplementary materialThe online version of this article (doi:10.1186/1471-2164-15-962) contains supplementary material, which is available to authorized users.
Genomic selection has been widely implemented in national and international genetic evaluation in the dairy cattle industry, because of its potential advantages over traditional selection methods and the availability of commercial high-density (HD) single nucleotide polymorphism (SNP) panels. However, this method may not be cost-effective for cow selection and for other livestock species, because the cost of HD SNP panels is still relatively high. One possible solution that can enable other species to benefit from this promising method is genomic selection with low-density (LD) SNP panels. In this simulation study, LD SNP panels designed with different strategies and different SNP densities were compared. The effects of number of quantitative trait loci, heritability, and effective population size were evaluated in the framework of genomic selection with LD SNP panels. Methodologies of Bayesian variable selection; BLUP with a trait-specific, marker-derived relationship matrix; and BLUP with a realized relationship matrix were employed to predict genomic estimated breeding values with both HD and LD SNP panels. Up to 95% of accuracy obtained by using an HD panel can be obtained by using only a small proportion of markers. The LD panel with markers selected on the basis of their effects always performs better than the LD panel with evenly spaced markers. Both the genetic architecture of the trait and the effective population size have a significant effect on the performance of the LD panels. We concluded that, to implement genomic selection with LD panels, a training population of sufficient size and genotyped with an HD panel is necessary. The trade-off between the LD panels with evenly spaced markers and selected markers must be considered, which depends on the number of target traits in a breeding program and the genetic architecture of these traits. Genomic selection with LD panels could be feasible and cost-effective, though before implementation, a further detailed genetic and economic analysis is recommended.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.