Core Ideas: Pumpkin, with rich nutritional compounds, is a staple food in many developing countries. Genomic prediction is a practical tool for hybrid performance evaluation in plant breeding. Genomic prediction can reduce cost and accelerate a breeding program. Genomic prediction is applied to select potential hybrid combinations and superior parental lines. Genomic prediction has become an increasingly popular tool for hybrid performance evaluation in plant breeding, mainly because it can reduce cost and accelerate a breeding program. In this study, we propose a systematic procedure to predict hybrid performance using a genomic selection (GS) model that takes both additive and dominance marker effects into account. We first demonstrate the advantage of the additive–dominance effects model over the additive-only effects model through a simulation study. Based on the additive–dominance model, we predict genomic estimated breeding values (GEBVs) for individual hybrid combinations and their parental lines. The GEBV‐based specific combining ability (SCA) for each hybrid and general combining ability (GCA) for its parental lines are then derived to quantify the degree of midparent heterosis (MPH) or better‐parent heterosis (BPH) of the hybrid. Finally, we estimate the variance components resulting from additive and dominance gene action effects, as well as heritability, using a genomic best linear unbiased predictor (g‐BLUP) model. These estimates are used to justify the results of the genomic prediction study. A pumpkin (Cucurbita spp.) data set is used to illustrate the proposed procedure. The data set consists of 320 parental lines with 61,179 single nucleotide polymorphism (SNP) markers; 119, 120, and 120 phenotypic values of hybrids for three quantitative traits within C. maxima Duchesne; and 89, 111, and 90 phenotypic values of hybrids for the same three quantitative traits within C. moschata Duchesne.
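The additive–dominance model and the heterosis measures named in this abstract can be illustrated with a small sketch. It assumes inbred parents coded 0 or 2 (allele copies), an additive code of -1/0/1, a dominance code of 1 for heterozygotes only, and the common textbook definitions of MPH and BPH as absolute differences. The marker effects, the function names, and the toy genotypes are all hypothetical; the abstract does not give the authors' exact formulas.

```python
def hybrid_genotype(parent1, parent2):
    """F1 marker genotypes from two inbred parents coded 0 or 2 (allele copies)."""
    return [(a + b) // 2 for a, b in zip(parent1, parent2)]


def predict_value(genotype, mu, add_effects, dom_effects):
    """Genotypic value under an additive-dominance marker model (assumed coding)."""
    total = mu
    for g, a, d in zip(genotype, add_effects, dom_effects):
        total += (g - 1) * a            # additive code: -1, 0, 1
        total += d if g == 1 else 0.0   # dominance code: 1 for heterozygotes only
    return total


def heterosis(f1, p1, p2):
    """Absolute midparent and better-parent heterosis (higher value = better)."""
    mid_parent = (p1 + p2) / 2
    better_parent = max(p1, p2)
    return f1 - mid_parent, f1 - better_parent


# Hypothetical toy data: three markers, two inbred lines.
mu = 10.0
add_effects = [1.0, 0.5, 2.0]
dom_effects = [0.5, 0.2, 1.0]
line_a = [0, 2, 2]
line_b = [2, 2, 0]

f1_geno = hybrid_genotype(line_a, line_b)
value_a = predict_value(line_a, mu, add_effects, dom_effects)
value_b = predict_value(line_b, mu, add_effects, dom_effects)
value_f1 = predict_value(f1_geno, mu, add_effects, dom_effects)
mph, bph = heterosis(value_f1, value_a, value_b)
```

Relative heterosis is often reported instead, dividing each difference by the midparent or better-parent value. For traits where lower values are preferred, `max` would be replaced by `min`.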
Background Genomic prediction (GP) based on single nucleotide polymorphisms (SNP) has become a broadly used tool to increase the gain of selection in plant breeding. However, using predictors that are biologically closer to the phenotypes, such as the transcriptome and metabolome, may increase the prediction ability in GP. The objectives of this study were to (i) assess the prediction ability for three yield-related phenotypic traits using different omic datasets as single predictors compared to a SNP array, where these omic datasets included different types of sequence variants (full-SV, deleterious-dSV, and tolerant-tSV), different types of transcriptome data (expression presence/absence variation-ePAV, gene expression-GE, and transcript expression-TE) sampled from two tissues, leaf and seedling, and metabolites (M); (ii) investigate the improvement in prediction ability when combining information from multiple omic datasets to predict phenotypic variation in barley breeding programs; (iii) explore the predictive performance when using SV, GE, and ePAV from simulated 3’ end mRNA sequencing of different lengths as predictors. Results The prediction ability from genomic best linear unbiased prediction (GBLUP) for the three traits using dSV information was higher than when using tSV, all SV information, or the SNP array. All predictors from the transcriptome (GE, TE, and ePAV) and the metabolome provided higher prediction abilities than the SNP array and SV on average across the three traits. In addition, some (dis)similarity existed between different omic datasets, which therefore provided complementary biological perspectives on phenotypic variation. Optimally combining the information of dSV, TE, ePAV, and metabolites in GP models could improve the prediction ability over that of the single predictors alone. Conclusions The use of integrated omic datasets in GP models is highly recommended. Furthermore, we evaluated a cost-effective approach, 3’ end mRNA sequencing of transcriptome data extracted from seedlings, which did not lose prediction ability in comparison to full-length mRNA sequencing, paving the way for the use of such prediction methods in commercial breeding programs.
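Two quantities in this abstract lend themselves to a short illustration: prediction ability and the combination of several omic predictors. The sketch below assumes prediction ability is measured as the Pearson correlation between observed and predicted phenotypes (a common convention, not stated in the abstract). It also assumes omic datasets are combined as a weighted sum of per-dataset relationship matrices. Both function names and the weighting scheme are hypothetical.

```python
import math


def pearson(observed, predicted):
    """Pearson correlation, used here as an assumed measure of prediction ability."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy / math.sqrt(sxx * syy)


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized relationship matrices (weights must sum to 1)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for k, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2 x 2 relationship matrices from two omic layers.
k_snp = [[1.0, 0.0], [0.0, 1.0]]
k_expr = [[1.0, 0.5], [0.5, 1.0]]
k_combined = combine_kernels([k_snp, k_expr], [0.5, 0.5])
```

In practice the weights would be tuned, for example over a grid evaluated by cross-validation, and the combined matrix would replace the single relationship matrix in a GBLUP-type model.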
In human genetics, several studies have shown that phenotypic variation is more likely to be caused by structural variants (SV) than by single nucleotide variants (SNV). However, accurate yet cost-efficient discovery of SV in complex genomes remains challenging. The objectives of our study were to (i) facilitate SV discovery studies by benchmarking SV callers and their combinations with respect to their sensitivity and precision in detecting SV in the barley genome, (ii) characterize the occurrence and distribution of SV clusters in the genomes of 23 barley inbreds that are the parents of a unique resource for mapping quantitative traits, the double round robin population, (iii) quantify the association of SV clusters with transcript abundance, and (iv) evaluate the use of SV clusters for the prediction of phenotypic traits. In our computer simulations based on a sequencing coverage of 25x, a sensitivity >70% and a precision >95% were observed for all combinations of SV types and SV length categories when the best combination of SV callers was used. We observed a significant (P < 0.05) association of gene-associated SV clusters with global gene-specific gene expression. Furthermore, about 9% of all SV clusters that were within 5 kb of a gene were significantly (P < 0.05) associated with the expression of the corresponding gene. The prediction ability of SV clusters was higher than that of single nucleotide polymorphisms from an array across the seven studied phenotypic traits. These findings suggest the usefulness of exploiting SV information when fine mapping and cloning the causal genes underlying quantitative traits, as well as the high potential of using SV clusters for the prediction of phenotypes in diverse germplasm sets.
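The benchmarking metrics mentioned here follow the standard definitions: sensitivity is the fraction of true SV that are recovered, and precision is the fraction of calls that are true. The sketch below is a minimal illustration. The matching rule (same chromosome and SV type, both breakpoints within a tolerance) and the record layout are assumptions, not the criteria used in the study.

```python
def matches(call, truth, tolerance):
    """Assumed matching rule: same chromosome and type, breakpoints within tolerance."""
    return (
        call["chrom"] == truth["chrom"]
        and call["type"] == truth["type"]
        and abs(call["start"] - truth["start"]) <= tolerance
        and abs(call["end"] - truth["end"]) <= tolerance
    )


def sensitivity_precision(calls, truths, tolerance=50):
    """Return (sensitivity, precision); each true SV can be matched at most once."""
    matched_truth = set()
    true_positives = 0
    for call in calls:
        for i, truth in enumerate(truths):
            if i not in matched_truth and matches(call, truth, tolerance):
                matched_truth.add(i)
                true_positives += 1
                break
    sensitivity = true_positives / len(truths) if truths else 0.0
    precision = true_positives / len(calls) if calls else 0.0
    return sensitivity, precision


# Hypothetical simulated truth set and caller output.
truths = [
    {"chrom": "1H", "type": "DEL", "start": 1000, "end": 2000},
    {"chrom": "1H", "type": "DUP", "start": 5000, "end": 6500},
    {"chrom": "2H", "type": "INV", "start": 300, "end": 900},
    {"chrom": "3H", "type": "DEL", "start": 7000, "end": 7400},
]
calls = [
    {"chrom": "1H", "type": "DEL", "start": 1010, "end": 1990},   # true positive
    {"chrom": "2H", "type": "INV", "start": 320, "end": 880},     # true positive
    {"chrom": "4H", "type": "DEL", "start": 100, "end": 400},     # false positive
]
sens, prec = sensitivity_precision(calls, truths)
```

Combining callers, as benchmarked in the study, would amount to building `calls` from the union or intersection of several callers' outputs before scoring.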
Quantitative trait loci (QTL) hotspots (genomic locations enriched in QTL) are a common and notable feature when many QTL for various traits are collected in many areas of biological studies. QTL hotspots are important and attractive since they are highly informative and may harbor genes for the quantitative traits. Current statistical methods for QTL hotspot detection use either individual-level data from genetical genomics experiments or summarized data from public QTL databases. These methods may ignore the correlation structure among traits, neglect the magnitude of LOD scores for the QTL, or incur a very high computational cost. These problems often lead, respectively, to the detection of excessive spurious hotspots, failure to discover biologically interesting hotspots composed of a small to moderate number of QTL with strong LOD scores, and computational intractability. In this article, we describe a statistical framework for QTL hotspot detection that can handle both types of data and address all of these problems at once. Our statistical framework operates directly on the QTL matrix, and hence has a very low computational cost, and it takes advantage of the QTL mapping results to assist the detection analysis. Two special devices, trait grouping and the top γ_{n,α} profile, are introduced into the framework. Trait grouping places the traits controlled by closely linked or pleiotropic QTL into the same trait group, and then randomly allocates these QTL together across the genomic positions, separately by trait group, to account for the correlation structure among traits. This yields much stricter thresholds and dismisses spurious hotspots. The top γ_{n,α} profile is designed to outline the LOD-score pattern of QTL in a hotspot across the different hotspot architectures, so that it can serve to identify and characterize the types of QTL hotspots with varying sizes and LOD-score distributions. Real examples, numerical analyses, and a simulation study are used to validate our statistical framework, investigate its detection properties, and compare it with current methods for QTL hotspot detection. The results demonstrate that the proposed statistical framework can effectively accommodate the correlation structure among traits and identify the types of hotspots, while retaining easy implementation and fast computation for practical QTL hotspot detection.
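The effect of trait grouping on the permutation threshold can be sketched in a simplified form. The version below works on a binary QTL matrix (traits by genomic bins) and moves all QTL of one trait group together by a single random circular shift. It then takes the (1 − α) quantile of the maximum per-bin QTL count across permutations as the hotspot threshold. The circular shift, the function name, and the parameters are assumptions made for illustration; the published procedure may differ in its details.

```python
import math
import random


def hotspot_threshold(qtl_matrix, groups, n_perm=200, alpha=0.05, seed=0):
    """Permutation threshold for per-bin QTL counts (simplified, assumed scheme)."""
    rng = random.Random(seed)
    n_bins = len(qtl_matrix[0])
    maxima = []
    for _ in range(n_perm):
        counts = [0] * n_bins
        for group in groups:
            # One offset per trait group: its QTL are relocated together.
            offset = rng.randrange(n_bins)
            for trait in group:
                for b, has_qtl in enumerate(qtl_matrix[trait]):
                    if has_qtl:
                        counts[(b + offset) % n_bins] += 1
        maxima.append(max(counts))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]


# Hypothetical data: four correlated traits, each with a QTL in bin 2 of 10.
qtl_matrix = [[1 if b == 2 else 0 for b in range(10)] for _ in range(4)]

grouped = hotspot_threshold(qtl_matrix, groups=[[0, 1, 2, 3]])
ungrouped = hotspot_threshold(qtl_matrix, groups=[[0], [1], [2], [3]])
```

When the four traits form one group, their QTL always land in the same bin, so the threshold equals the observed count of 4 and the apparent hotspot is not declared significant. Permuting each trait independently ignores that correlation and cannot give a higher threshold than the grouped version for this toy matrix.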
Grain number, size, and weight primarily determine the yield of barley. Although the genes regulating grain number are well studied in barley, the genetic loci and the causal genes for sink capacity are poorly understood. Therefore, the primary objective of our work was to dissect the genetic architecture of grain size and weight in barley. We used a multi-parent population developed from crosses among 23 diverse barley inbreds in a double round-robin design. Seed size-related parameters such as grain length, grain width, grain area, and thousand-grain weight were evaluated in the HvDRR population comprising 45 recombinant inbred line sub-populations. We found significant genotypic variation for all seed size characters and observed heritabilities of 84% or higher across four environments. The quantitative trait locus (QTL) detection results indicate that the genetic architecture of grain size is more complex than reported previously. In addition, both cultivars and landraces contributed positive alleles at grain size QTLs. Candidate genes identified using genome-wide variant calling data for all parental inbred lines indicated overlapping and potentially novel regulators of grain size in cereals. Furthermore, our results indicated that sink capacity was the primary determinant of grain weight in barley.