Machine learning (ML) is perhaps the most useful tool for the interpretation of large genomic datasets. However, the performance of a single machine learning method in genomic selection (GS) is currently unsatisfactory. To improve genomic predictions, we constructed a stacking ensemble learning framework (SELF), integrating three machine learning methods, to predict genomic estimated breeding values (GEBVs). The present study evaluated the prediction ability of SELF by analyzing three real datasets with different genetic architectures and comparing the prediction accuracy of SELF, its base learners, genomic best linear unbiased prediction (GBLUP) and BayesB. For each trait, SELF performed better than the base learners, which included support vector regression (SVR), kernel ridge regression (KRR) and elastic net (ENET). The prediction accuracy of SELF was, on average, 7.70% higher than that of GBLUP across the three datasets. Except for the milk fat percentage (MFP) trait of the German Holstein dairy cattle dataset, SELF was more robust than BayesB for all remaining traits. Therefore, we believe that SELF has the potential to be promoted to estimate GEBVs in other animals and plants.
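The stacking idea in the abstract above can be illustrated with a minimal sketch. It assumes scikit-learn's `StackingRegressor`, a linear-regression meta-learner, untuned hyperparameters, and simulated 0/1/2 genotypes; none of these choices come from the abstract, which names only the three base learners (SVR, KRR, ENET).

```python
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.svm import SVR


def build_stack():
    """Stack SVR, KRR and ENET; the meta-learner here is an assumption."""
    base_learners = [
        ("svr", SVR(kernel="rbf", C=1.0)),
        ("krr", KernelRidge(kernel="rbf", alpha=1.0)),
        ("enet", ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000)),
    ]
    # cv=5: the meta-learner is trained on out-of-fold base predictions.
    return StackingRegressor(
        estimators=base_learners,
        final_estimator=LinearRegression(),
        cv=5,
    )


def simulate(n=200, m=500, n_qtl=20, seed=0):
    """Hypothetical data: n individuals, m SNPs coded 0/1/2, additive trait."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n, m)).astype(float)
    effects = np.zeros(m)
    effects[rng.choice(m, n_qtl, replace=False)] = rng.normal(size=n_qtl)
    y = X @ effects + rng.normal(scale=1.0, size=n)
    return X, y


X, y = simulate()
model = build_stack().fit(X[:150], y[:150])   # training individuals
pred = model.predict(X[150:])                 # predicted values for the rest
# Prediction accuracy as the Pearson correlation with observed phenotypes.
accuracy = np.corrcoef(pred, y[150:])[0, 1]
```

In practice the base-learner hyperparameters would be tuned and the accuracy estimated by repeated cross-validation rather than a single split.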
The objective of the present study was to perform a genome-wide association study (GWAS) for growth curve parameters using nonlinear models that fit original weight–age records. In this study, data from 808 Chinese Simmental beef cattle that were weighed at 0, 6, 12, and 18 months of age were used to fit the growth curve. The Gompertz model showed the highest coefficient of determination (R² = 0.954). The parameters mature body weight (A), time-scale parameter (b), and maturity rate (K) were treated as phenotypes for single-trait GWAS and multi-trait GWAS. In total, 9, 49, and 7 significant single nucleotide polymorphisms (SNPs) associated with A, b, and K, respectively, were identified by single-trait GWAS; 22 significant SNPs were identified by multi-trait GWAS. Among them, we observed several candidate genes, including PLIN3, KCNS3, TMCO1, PRKAG3, ANGPTL2, IGF-1, SHISA9, and STK3, which were previously reported to be associated with growth and development. Further research on these candidate genes may be useful for exploring the full genetic architecture underlying growth and development traits in livestock.
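As an illustration of how the three growth curve parameters in the abstract above could be obtained for one animal, here is a minimal sketch. It assumes the common Gompertz parameterization W(t) = A·exp(−b·exp(−K·t)), uses SciPy's `curve_fit`, and fits noise-free weights generated from hypothetical parameter values; the abstract does not state the exact model form or fitting software.

```python
import numpy as np
from scipy.optimize import curve_fit


def gompertz(t, A, b, K):
    """Gompertz growth curve (assumed form): A = mature weight,
    b = time-scale parameter, K = maturity rate."""
    return A * np.exp(-b * np.exp(-K * t))


# Weighing ages in months, as in the abstract.
ages = np.array([0.0, 6.0, 12.0, 18.0])

# Hypothetical "true" parameters, used only to generate example weights (kg).
true_params = (700.0, 2.8, 0.12)
weights = gompertz(ages, *true_params)

# Rough starting values: mature weight somewhat above the heaviest record.
p0 = (weights.max() * 1.5, 2.0, 0.1)
(A_hat, b_hat, K_hat), _ = curve_fit(gompertz, ages, weights, p0=p0, maxfev=10000)
```

The fitted `A_hat`, `b_hat`, and `K_hat` for each animal would then serve as the phenotypes in the association analysis.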
Genomic selection (GS) involves estimating genomic estimated breeding values (GEBVs) using molecular markers spanning the whole genome (Meuwissen et al., 2001), and is not limited to traits determined by a few major genes (Montesinos-López et al., 2019). Compared with previous selection methods that are based on pedigree information and progeny testing, GS has the natural advantage that the genomic breeding values can be obtained as soon as the offspring is born, which dramatically accelerates the breeding process. Numerous studies have shown that GS facilitates the rapid selection of superior genotypes and accelerates genetic gain by shortening the breeding cy-
Sex reversal has been studied extensively in vertebrate species, particularly in domestic goats, because polled intersex syndrome (PIS) has seriously affected their production efficiency. In the present study, we used histopathologically diagnosed cases of PIS to identify correlated genomic regions and variants using representative selection signatures and performed GWAS using Restriction-Site Associated Resequencing DNA. Using a general linear model, we identified 171 single-nucleotide polymorphisms (SNPs) that may have contributed to this phenotype, 53 of which were located in coding regions. The transcriptome data sets of differentially expressed genes (DEGs) in the pituitary tissues of intersexual and nonintersexual goats were examined using high-throughput technology. A total of 10,063 DEGs and 337 long noncoding RNAs were identified. The DEGs were clustered into 56 GO categories and determined to be significantly enriched in 53 signaling pathways by KEGG analysis. In addition, according to qPCR results, PSPO2 and FSH were significantly more highly expressed in sexually mature pituitary tissues of intersexual goats than in those of healthy (nonintersexual) controls. These results demonstrate that certain novel genomic regions may be responsible for the intersex phenotype in goats, and the transcriptome data indicate that the regulation of various physiological systems is involved in intersexual goat development. Therefore, these results provide helpful data for understanding the molecular mechanisms of intersex syndrome in goats.
Advances in high-throughput sequencing have driven the increasing application of genomic prediction (GP) in breeding programs. In this research, we designed a Cosine kernel-based KRR, named KCRR, to perform GP. This paper assessed the prediction accuracies for 12 traits with various heritabilities and genetic architectures from four populations using genomic best linear unbiased prediction (GBLUP), BayesB, support vector regression (SVR), and KCRR. On the whole, KCRR performed stably for all traits across multiple species, indicating that the hypothesis of KCRR has the potential to be adapted to a wide range of genetic architectures. Moreover, we defined a modified genomic similarity matrix named the Cosine similarity matrix (CS matrix). The results indicated that the accuracies of GBLUP_kinship and GBLUP_CS were almost identical for all traits, but the computing efficiency increased 20-fold on average. Our approach may be a promising strategy for future GP.
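A cosine kernel combined with kernel ridge regression, as described in the abstract above, can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the authors' KCRR implementation: the genotype coding (0/1/2, uncentered), the ridge penalty `lam`, and the simulated data are all hypothetical.

```python
import numpy as np


def cosine_kernel(A, B):
    """Cosine similarity between every row of A and every row of B."""
    A_unit = A / np.linalg.norm(A, axis=1, keepdims=True)
    B_unit = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A_unit @ B_unit.T


def krr_fit(K_train, y, lam=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients."""
    n = K_train.shape[0]
    return np.linalg.solve(K_train + lam * np.eye(n), y)


def krr_predict(K_test_train, dual_coef):
    """Predict as the kernel-weighted sum of the dual coefficients."""
    return K_test_train @ dual_coef


# Hypothetical data: 100 individuals, 300 SNPs coded 0/1/2.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(100, 300)).astype(float)
y = X[:, :10].sum(axis=1) + rng.normal(size=100)

X_train, X_test = X[:80], X[80:]
y_mean = y[:80].mean()                       # center phenotypes on the training mean
K = cosine_kernel(X_train, X_train)          # cosine similarity (CS) matrix
alpha = krr_fit(K, y[:80] - y_mean, lam=1.0)
pred = krr_predict(cosine_kernel(X_test, X_train), alpha) + y_mean
```

The same `K` could in principle stand in for a kinship matrix in a GBLUP-style model, which is the comparison the abstract reports as GBLUP_CS.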
Summary As one of the best‐known commercial goat breeds in the world, the Boer goat has undergone long‐term artificial selection for nearly 100 years, and its excellent growth rate and meat production performance have attracted considerable worldwide attention. Herein, we used single nucleotide polymorphisms (SNPs) called from the whole‐genome sequencing data of 46 Australian Boer goats to detect polymorphisms and identify genomic regions related to muscle development in comparison with those of 81 non‐specialized meat goat individuals from Europe, Africa, and Asia. A total of 13 795 202 SNPs were identified, and a whole‐genome selective signal screen was performed using the ratio of nucleotide diversity (πcase/πcontrol) and the pairwise fixation index (FST). We identified 1741 candidate selective windows based on the top 5% threshold of both parameters; 449 candidate genes were found in only 727 of these regions. A total of 433 of the 449 genes were annotated to 2729 gene ontology terms, of which 51 were directly linked to muscle development (e.g., muscle organ development, muscle cell differentiation) by 30 candidate genes (e.g., JAK2, KCNQ1, PDE5A, PDLIM5, TBX5). In addition, 246 signaling pathways were annotated by 178 genes, and two pathways related to muscle contraction, vascular smooth muscle contraction (ADCY7, PRKCB, PLA2G4E, ROCK2) and cardiac muscle contraction (CACNA2D3, CASQ2, COX6B1), were identified. The results could improve the current understanding of the genetic effects of artificial selection on muscle development in goats. More importantly, this study provides valuable candidate genes for future breeding of goats.
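The window selection step in the abstract above (top 5% of both the π ratio and FST) can be sketched as follows. The function name, the use of the upper tail for both statistics, and the toy input values are assumptions for illustration; the abstract does not specify window size or how ties are handled.

```python
import numpy as np


def candidate_windows(pi_ratio, fst, top_fraction=0.05):
    """Return indices of windows in the top `top_fraction` of BOTH statistics.

    pi_ratio and fst hold one value per genomic window, in the same order.
    """
    pi_ratio = np.asarray(pi_ratio, dtype=float)
    fst = np.asarray(fst, dtype=float)
    # Empirical cutoffs: the (1 - top_fraction) quantile of each statistic.
    pi_cut = np.quantile(pi_ratio, 1.0 - top_fraction)
    fst_cut = np.quantile(fst, 1.0 - top_fraction)
    # A window is a candidate only if it passes both cutoffs.
    return np.flatnonzero((pi_ratio >= pi_cut) & (fst >= fst_cut))


# Toy example: 100 windows whose two statistics are perfectly concordant.
values = np.arange(100, dtype=float)
selected = candidate_windows(values, values)
```

Candidate genes would then be looked up by intersecting the selected windows with a gene annotation, a step not shown here.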
Background: Machine learning (ML) is perhaps the most useful tool for the interpretation of large genomic datasets. However, the performance of a single machine learning method in genomic selection (GS) has been unsatisfactory in existing research. To improve genomic predictions, we constructed a stacking ensemble learning framework (SELF) integrating three machine learning methods to predict genomic estimated breeding values (GEBVs). Results: We evaluated the prediction ability of SELF using three real datasets and compared the prediction accuracy of SELF, its base learners, GBLUP and BayesB. For each trait, SELF performed better than the base learners, which included support vector regression (SVR), kernel ridge regression (KRR) and elastic net (ENET). The prediction accuracy of SELF was, on average, 7.70% higher than that of GBLUP across the three datasets. Except for the milk fat percentage (MFP) trait of the German Holstein dairy cattle dataset, SELF was more robust than BayesB for the remaining traits. Conclusions: In this study, we applied a stacking ensemble learning framework (SELF) to genomic prediction, and it performed much better than GBLUP and BayesB in three real datasets with different genetic architectures. Therefore, we believe SELF has the potential to be promoted to estimate GEBVs in other animals and plants.