Grapevine (Vitis vinifera) berry development involves a succession of physiological and biochemical changes reflecting the transcriptional modulation of thousands of genes. Although recent studies have investigated the dynamic transcriptome during berry development, most have focused on a single grapevine variety, so there is a lack of comparative data representing different cultivars. Here, we report, to our knowledge, the first genome-wide transcriptional analysis of 120 RNA samples corresponding to 10 Italian grapevine varieties collected at four growth stages. The 10 varieties, representing five red-skinned and five white-skinned berries, were all cultivated in the same experimental vineyard to reduce environmental variability. The comparison of transcriptional changes during berry formation and ripening allowed us to determine the transcriptomic traits common to all varieties, thus defining the core transcriptome of berry development, as well as the transcriptional dynamics underlying differences between red and white berry varieties. A greater variation among the red cultivars than between red and white cultivars at the transcriptome level was revealed, suggesting that anthocyanin accumulation during berry maturation has a direct impact on the transcriptomic regulation of multiple biological processes. The expression of genes related to phenylpropanoid/flavonoid biosynthesis clearly distinguished the behavior of red and white berry genotypes during ripening but also reflected the differential accumulation of anthocyanins in the red berries, indicating some form of cross talk between the activation of stilbene biosynthesis and the accumulation of anthocyanins in ripening berries.
This article considers a measure of variable importance frequently used in variable selection methods based on decision trees and tree-based ensemble models. These models include CART, random forests, and gradient boosting machines. The measure of variable importance is defined as the total heterogeneity reduction produced by a given covariate on the response variable when the sample space is recursively partitioned. Despite its popularity, some authors have shown that this measure is biased to the extent that, under certain conditions, there may be dangerous effects on variable selection. Here we present a simple and effective method for bias correction, focusing on the easily generalizable case of the Gini index as a measure of heterogeneity.
M. SANDRI AND P. ZUCCOLOTTO
(a) Evaluation of the reduction of (out-of-bag) predictive accuracy after a random permutation of the values assumed by X_i; and (b) the total heterogeneity reduction produced by X_i on the response variable, obtained by adding up all the decreases of the heterogeneity index in the tree nodes where X_i is selected for splitting. This article focuses on the class of VI measures described in (b), originally introduced by Breiman et al. (1984) in the context of CART. There are several influential theoretical investigations (Breiman 2001a; Friedman 2001) and many empirical applications (e.g., Friedman and Meulman 2003; Svetnik et al. 2005; Menze et al. 2007; De'ath 2007) of these measures in the literature. Much of this work centered on the original form of the measures introduced by Breiman et al. (1984). In addition, these measures are often set as the default in software for data mining, like the randomForest package in R (Breiman et al. 2006), the gbm package in R (Ridgeway 2007), the boost Stata command (Schonlau 2005), and the MART package in S-Plus and R (Friedman 2002).
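The class (b) measure described above can be illustrated with a minimal sketch in plain Python. It grows a small classification tree on numeric covariates and, for each covariate, adds up the Gini decreases at the nodes where that covariate is chosen for splitting. The function names (`gini`, `best_split`, `gini_vi`), the midpoint thresholds, the depth limit, and the weighting of each decrease by the node's share of the sample are assumptions made for illustration; they are not taken from the article or from any particular package.

```python
from collections import Counter


def gini(labels):
    """Gini heterogeneity index, 1 - sum_k p_k^2, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, var):
    """Return (gain, threshold) of the best binary split on covariate `var`.

    The gain is the parent Gini index minus the size-weighted Gini
    indices of the two child nodes. (0.0, None) means no useful split.
    """
    n = len(labels)
    parent = gini(labels)
    best = (0.0, None)
    values = sorted(set(r[var] for r in rows))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # assumed convention: midpoint thresholds
        left = [y for r, y in zip(rows, labels) if r[var] <= t]
        right = [y for r, y in zip(rows, labels) if r[var] > t]
        gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        if gain > best[0]:
            best = (gain, t)
    return best


def gini_vi(rows, labels, n_total=None, max_depth=3, vi=None):
    """Recursively partition the sample and accumulate Gini decreases per covariate."""
    p = len(rows[0])
    if n_total is None:
        n_total = len(labels)
    if vi is None:
        vi = [0.0] * p
    if max_depth == 0 or len(set(labels)) < 2:
        return vi
    candidates = [(best_split(rows, labels, v), v) for v in range(p)]
    (gain, threshold), var = max(candidates, key=lambda c: c[0][0])
    if threshold is None:
        return vi
    # Assumed convention: weight the decrease by the node's share of the sample.
    vi[var] += len(labels) / n_total * gain
    left = [(r, y) for r, y in zip(rows, labels) if r[var] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[var] > threshold]
    for part in (left, right):
        gini_vi([r for r, _ in part], [y for _, y in part],
                n_total, max_depth - 1, vi)
    return vi


# Toy data: covariate 0 separates the classes, covariate 1 carries no information.
rows = [(0, 5), (0, 3), (1, 5), (1, 3)]
labels = [0, 0, 1, 1]
print(gini_vi(rows, labels))  # prints [0.5, 0.0]
```

In an ensemble such as a random forest, the same per-tree totals would typically be averaged over all trees; that step is omitted here to keep the sketch short.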
Some authors have shown that these VI measures are biased in a way that may have, under certain conditions, potentially dangerous effects on variable selection. Breiman et al. (1984) first noted that they are biased in favor of variables that have more values (i.e., fewer missing values, more categories, or distinct numerical values) and thus offer more splits. This means that variable selection may be affected by covariate characteristics other than information content. Subsequently, White and Liu (1994), Kononenko (1995), Dobra and Gehrke (2001), and Strobl (2005) investigated in greater detail the nature of the bias in information-based VI measures and elucidated the relation between bias and the covariate's number of values. When the Gini gain is used as the splitting criterion for the tree nodes, the resulting total heterogeneity reduction is called the "Gini VI measure." Strobl et al. (2007b) reinterpreted and systematized previous results about this measure and identified three fundamental sources of bias: (a) the bias of the Gini estimator, (b) the variance of the Gini estimator, and (c) the effects of multiple comparisons. Recently, several authors have proposed...
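The bias toward covariates that offer more splits can be made concrete with a small simulation. The sketch below is an illustration only, not the experiment or the correction method of any cited author. It draws a binary response that is independent of two covariates, one binary and one continuous, and compares the average best Gini gain each covariate achieves at a single node. The sample size, the number of replications, the seed, and the function names (`gini_binary`, `best_gain`, `simulate`) are arbitrary assumptions.

```python
import random


def gini_binary(labels):
    """Gini index 2p(1-p) for a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_gain(x, y):
    """Largest Gini gain over all binary splits of x into 'low' and 'high' values."""
    n = len(y)
    parent = gini_binary(y)
    pairs = sorted(zip(x, y))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate tied values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = parent - i / n * gini_binary(left) - (n - i) / n * gini_binary(right)
        best = max(best, gain)
    return best


def simulate(n=60, reps=200, seed=0):
    """Mean best gain of a binary and a continuous covariate, both pure noise."""
    rng = random.Random(seed)
    total_bin = total_cont = 0.0
    for _ in range(reps):
        y = [rng.randint(0, 1) for _ in range(n)]
        x_bin = [rng.randint(0, 1) for _ in range(n)]   # at most one possible split
        x_cont = [rng.random() for _ in range(n)]       # up to n - 1 possible splits
        total_bin += best_gain(x_bin, y)
        total_cont += best_gain(x_cont, y)
    return total_bin / reps, total_cont / reps


mean_bin, mean_cont = simulate()
print(f"binary covariate:     mean best gain = {mean_bin:.4f}")
print(f"continuous covariate: mean best gain = {mean_cont:.4f}")
```

Neither covariate is informative about the response, yet the continuous one should show the larger average gain because its best split is the maximum over many candidate thresholds. This corresponds to the multiple-comparisons source of bias listed above.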
Changes in the performance of genotypes in different environments are defined as genotype × environment (G×E) interactions. In grapevine (Vitis vinifera), complex interactions between different genotypes and climate, soil and farming practices yield unique berry qualities. However, the molecular basis of this phenomenon remains unclear. To dissect the basis of grapevine G×E interactions we characterized berry transcriptome plasticity, the genome methylation landscape and within-genotype allelic diversity in two genotypes cultivated in three different environments over two vintages. We identified, through a novel data-mining pipeline, genes with expression profiles that were: unaffected by genotype or environment, genotype-dependent but unaffected by the environment, environmentally-dependent regardless of genotype, and G×E-related. The G×E-related genes showed different degrees of within-cultivar allelic diversity in the two genotypes and were enriched for stress responses, signal transduction and secondary metabolism categories. Our study unraveled the mutual relationships between genotypic and environmental variables during G×E interaction in a woody perennial species, providing a reference model to explore how cultivated fruit crops respond to diverse environments. Also, the pivotal role of vineyard location in determining the performance of different varieties, by enhancing berry quality traits, was unraveled.
One of the main topics in the development of predictive models is the identification of variables that predict a given outcome. Automated model selection methods, such as backward or forward stepwise regression, are classical solutions to this problem, but they generally rely on strong assumptions about the functional form of the model or the distribution of residuals. In this paper, an alternative selection method based on Random Forests is proposed in the context of classification, with an application to a real dataset.