At present, single-trait best linear unbiased prediction (BLUP) is the standard method for genetic selection in soybean. However, when selection is based on two or more genetically correlated traits that are analyzed individually, selection bias may arise. Under these conditions, accounting for the correlation structure between the evaluated traits may provide more accurate estimates of genetic parameters, even under environmental influences. The present study therefore examined the efficiency and applicability of multi-trait multi-environment (MTME) models, fitted by residual maximum likelihood (REML/BLUP) and Bayesian approaches, in the genetic selection of segregating soybean progeny. The study used data from 203 soybean F2:4 progeny assessed in two environments for the following traits: number of days to maturity (DM), 100-seed weight (SW), and average seed yield per plot (SY). Variance components and genetic and non-genetic parameters were estimated via the REML/BLUP and Bayesian methods. The variance components estimated by the Bayesian procedure, and the breeding values and genetic gains from selection that it predicted, were similar to those obtained by REML/BLUP. The frequentist and Bayesian MTME models provided higher estimates of broad-sense heritability per plot (or heritability of total effects of progeny) and of mean accuracy of progeny than their respective single-trait versions. Bayesian analysis also provided credibility intervals for the heritability estimates. Therefore, MTME led to greater predicted gains from selection. On this basis, this procedure can be efficiently applied in the genetic selection of segregating soybean progeny.
Flowering is an important agronomic trait. Quantile regression (QR) can be used to fit models for all portions of a probability distribution. In genome-wide association studies (GWAS), QR can estimate single nucleotide polymorphism (SNP) effects at each quantile of interest. The objectives of this study were to estimate genetic parameters and to use QR to identify genomic regions associated with phenological traits in common bean: days to first flower (DFF), days for flowering (DTF), and days to end of flowering (DEF). A total of 80 common bean genotypes, with 3 replicates, were grown at 4 locations and seasons. Plants were genotyped for 384 SNPs. A traditional single-SNP model and 9 QR models, at equally spaced quantiles (τ) from 0.1 to 0.9, were used to associate SNPs with phenotype. Heritabilities were moderate to high, ranging from 0.32 to 0.58. Genetic and phenotypic correlations were all high, averaging 0.66 and 0.98, respectively. The traditional single-SNP GWAS model did not detect any SNP-trait association. In contrast, QR at one extreme quantile (τ = 0.1) identified 1 and 7 significant SNPs associated with DFF and DTF, respectively. Significant SNPs were found on chromosomes Pv01, Pv02, Pv03, Pv07, Pv10 and Pv11. We investigated potential candidate genes in the regions around these significant SNPs. Three genes involved in flowering pathways were identified: Phvul.001G214500, Phvul.007G229300 and Phvul.010G142900.1, on Pv01, Pv07 and Pv10, respectively. These results indicate that QR-based GWAS was able to improve the understanding of the genetic architecture of phenological traits (DFF and DTF) in common bean.
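To make the idea of a quantile-specific SNP effect concrete, the following is a minimal sketch, not the authors' analysis. It assumes a single SNP coded as 0, 1 or 2 allele copies, no covariates, and hypothetical toy phenotypes. The function names (`check_loss`, `total_loss`, `fit_quantile_line`) are introduced here for illustration only. The fit relies on a known property of quantile regression: with two parameters, some optimal line passes through two observations, so a small data set can be solved by enumerating pairs of points.

```python
from itertools import combinations


def check_loss(residual, tau):
    """Pinball (check) loss of one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(x, y, intercept, slope, tau):
    """Sum of check losses of the line intercept + slope * x."""
    return sum(check_loss(yi - (intercept + slope * xi), tau)
               for xi, yi in zip(x, y))


def fit_quantile_line(x, y, tau):
    """Fit y = intercept + slope * x at quantile tau by pair enumeration.

    Every line through two observations with distinct x values is tried,
    and the one with the smallest total check loss is returned. This is
    O(n^3) and only meant for small illustrative data sets.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = total_loss(x, y, intercept, slope, tau)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("x must take at least two distinct values")
    return best[1], best[2]


# Hypothetical toy data: allele copies at one SNP and days to flowering.
genotype = [0, 0, 0, 1, 1, 1, 2, 2, 2]
days_to_flower = [38.0, 41.0, 47.0, 37.0, 40.0, 45.0, 36.0, 39.0, 42.0]

for tau in (0.1, 0.5, 0.9):
    intercept, slope = fit_quantile_line(genotype, days_to_flower, tau)
    print(f"tau={tau}: estimated SNP effect = {slope:.2f} days per allele copy")
```

A SNP whose slope differs across quantiles affects early- and late-flowering plants differently, which is the kind of signal a mean-based single-SNP model can miss. A real analysis would use a dedicated quantile regression routine, covariates, and multiple-testing control.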
Technical efficiency of milk production in Minas Gerais: an application of quantile regression
ABSTRACT - The objective of this study was to evaluate the influence of technical and economic variables on the technical efficiency indices of milk producers in Minas Gerais at distinct points of the distribution of these indices, using quantile regression. The technical efficiency indices were estimated with a stochastic frontier model, using data from 875 milk producers in Minas Gerais state, Brazil, collected in 2005. The main results from the production frontier indicated that land is possibly being used extensively. Overall, the percentage of lactating cows was the most relevant variable in explaining technical efficiency in all analyzed quantiles, whereas the percentage of family labor was important only for explaining the lower levels of efficiency. Moreover, significant differences were found between the coefficients estimated for the quantiles under study, which shows that the explanatory variables do not have the same impact on increasing efficiency at all points of the distribution. Keywords: determinants of efficiency, stochastic frontier, dairy farming
Genomic selection (GS) emphasizes the simultaneous prediction of the genetic effects of thousands of markers scattered over the genome. Several statistical methodologies have been used in GS for the prediction of genetic merit. In general, such methodologies require certain assumptions about the data, such as the normality of the distribution of phenotypic values. To circumvent the non-normality of phenotypic values, the literature suggests the use of Bayesian Generalized Linear Regression (GBLASSO). Another alternative is models based on machine learning, represented by methodologies such as Artificial Neural Networks (ANN), Decision Trees (DT) and their refinements, such as Bagging, Random Forest and Boosting. This study aimed to use DT and its refinements to predict resistance to orange rust in Arabica coffee. Additionally, DT and its refinements were used to identify the importance of markers related to the trait of interest. The results were compared with those from GBLASSO and ANN. Data on coffee rust resistance of 245 Arabica coffee plants genotyped for 137 markers were used. The DT refinements presented Apparent Error Rate values equal to or lower than those obtained by DT, GBLASSO, and ANN. Moreover, the DT refinements were able to identify important markers for the trait of interest. Of the 14 most important markers analyzed in each methodology, 9.3 markers on average were in regions of quantitative trait loci (QTLs) related to disease resistance listed in the literature.
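As an illustration of the Apparent Error Rate and of bagging as a tree refinement, here is a minimal sketch under strong simplifying assumptions. It is not the authors' pipeline. Each "tree" is reduced to a one-split decision stump, markers are assumed to be coded 0/1, resistance is coded 1 and susceptibility 0, and the toy data are hypothetical. All function names are introduced here for illustration.

```python
import random


def fit_stump(X, y):
    """Pick the single marker (and orientation) with the fewest training errors.

    Returns (marker_index, flip); the prediction is marker_value XOR flip.
    """
    best = None
    for m in range(len(X[0])):
        for flip in (0, 1):
            errors = sum((row[m] ^ flip) != label for row, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, m, flip)
    return best[1], best[2]


def predict_stump(stump, row):
    marker, flip = stump
    return row[marker] ^ flip


def fit_bagging(X, y, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample (sampling plants with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over the bagged stumps."""
    votes = sum(predict_stump(s, row) for s in stumps)
    return 1 if 2 * votes > len(stumps) else 0


def apparent_error_rate(predict, X, y):
    """Share of training plants misclassified by the fitted model."""
    wrong = sum(predict(row) != label for row, label in zip(X, y))
    return wrong / len(y)


# Hypothetical toy data: 8 plants x 3 markers, 1 = resistant.
markers = [
    [1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1],
    [0, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 0],
]
resistant = [1, 1, 1, 0, 0, 0, 1, 1]

stump = fit_stump(markers, resistant)
ensemble = fit_bagging(markers, resistant)
print("stump AER:", apparent_error_rate(lambda r: predict_stump(stump, r), markers, resistant))
print("bagging AER:", apparent_error_rate(lambda r: predict_bagging(ensemble, r), markers, resistant))
```

Because the Apparent Error Rate is computed on the same plants used for fitting, it tends to be optimistic. Counting how often each marker is selected across the bagged stumps would give a crude marker-importance measure in the same spirit as the one described above.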
Motivation: In a microarray time series analysis, due to the large number of genes evaluated, the first step toward understanding the complex time network is the clustering of genes that share similar expression patterns over time. Until now, the proposed methods have not simultaneously addressed the temporal autocorrelation of gene expression and model-based clustering. We present a Bayesian method that jointly considers the fit of autoregressive panel data models and hierarchical gene clustering. Results: The proposed methodology was able to cluster genes that share similar expression over time, which was determined jointly by the estimates of the autoregression parameters, by the average level of expression and by the quality of the fitted model. Availability and implementation: The R codes for implementation of the proposed clustering method and for the simulation study, as well as the real and simulated datasets, are freely accessible on the Web.
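The general idea of clustering genes on autoregressive summaries can be sketched as follows. This is a simplified, non-Bayesian stand-in for illustration only: it swaps the Bayesian panel data model for a moment-style AR(1) estimate per gene, uses plain average-linkage clustering, and ignores the model-fit criterion mentioned above. The function names and the toy expression series are hypothetical.

```python
import math


def ar1_features(series):
    """Return (mean level, AR(1) coefficient) for one gene's time series."""
    mean = sum(series) / len(series)
    dev = [value - mean for value in series]
    denom = sum(d * d for d in dev[:-1])
    if denom == 0:
        raise ValueError("series is constant; AR(1) coefficient is undefined")
    # Regress each centred value on the previous centred value.
    phi = sum(a * b for a, b in zip(dev[1:], dev[:-1])) / denom
    return mean, phi


def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage on Euclidean distance."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [(dist(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)  # merge the two closest clusters
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical toy data: four genes measured at six time points.
expression = [
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],  # oscillating
    [2.0, -2.0, 2.0, -2.0, 2.0, -2.0],  # oscillating
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],     # steadily increasing
    [2.0, 3.0, 4.0, 5.0, 6.0, 7.0],     # steadily increasing
]

features = [ar1_features(series) for series in expression]
clusters = average_linkage(features, n_clusters=2)
print(clusters)
```

In practice the features would usually be standardized before clustering so that the mean level and the autoregressive coefficient contribute on comparable scales.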
Gene expression time series (GETS) analysis aims to characterize sets of genes according to their longitudinal patterns of expression. Due to the large number of genes evaluated in GETS analysis, a useful strategy for summarizing biological functional processes and regulatory mechanisms is to cluster genes that present similar expression patterns over time. Traditional clustering methods usually ignore the challenges in GETS, such as the lack of data normality and the small number of temporal observations. Independent Component Analysis (ICA) is a statistical procedure that uses a transformation to convert raw time series data into sets of values of independent variables, which can be used for cluster analysis to identify sets of genes with similar temporal expression patterns. ICA allows clustering of short series of distribution-free data while accounting for the dependence between subsequent time points. Using simulated and real temporal RNA-seq data sets (four libraries of two pig breeds at 21, 40, 70 and 90 days of gestation), we present a methodology (ICAclust) that jointly considers ICA and a hierarchical method for clustering GETS. We compare ICAclust results with those obtained by K-means clustering. ICAclust presented, on average, an absolute gain of 5.15% over the best K-means scenario. Considering the worst scenario for K-means, the gain was 84.85% when compared with the best ICAclust result. For the real data set, genes were grouped into six distinct clusters with 89, 51, 153, 67, 40, and 58 genes, respectively. In general, the six clusters presented very distinct expression patterns. Overall, the proposed two-step clustering method (ICAclust) performed well compared to K-means, a traditional method used for cluster analysis of temporal gene expression data. In ICAclust, genes with similar expression patterns over time were clustered together.