A role for heritable transcriptomic variation in maize adaptation to temperate environments

Sun, Guangchao; Yu, Huihui; Wang, Peng; Guerrero, Martha G. Lopez; Mural, Ravi V.; Mizero, Olivier N.; Grzybowski, Marcin; Song, Baoxing; Dijk, Karin van; Schachtman, Daniel P.; Zhang, Chi; Schnable, James C.

doi:10.1101/2022.01.28.478212

Cited by 12 publications

(25 citation statements)

References 117 publications


“…BWA-MEM (v0.7) with default parameter settings [66] was employed to align the resulting trimmed resequencing data to v4 of the B73 maize reference genome [19, 20]. STAR (v2.7) [67] was used to align the trimmed RNA-seq reads to v4 of the B73 maize reference genome in 2 rounds as described in Sun et al. [68]. Apparent polymerase chain reaction duplicates were marked within the resulting BAM alignments using picard (v2.22) [69].…”

Section: Methods (mentioning)

confidence: 99%

Association mapping across a multitude of traits collected in diverse environments in maize

et al. 2022

Self Cite


Classical genetic studies have identified many cases of pleiotropy where mutations in individual genes alter many different phenotypes. Quantitative genetic studies of natural genetic variants frequently examine one or a few traits, limiting their potential to identify pleiotropic effects of natural genetic variants. Widely adopted community association panels have been employed by plant genetics communities to study the genetic basis of naturally occurring phenotypic variation in a wide range of traits. High-density genetic marker data (18M markers) from 2 partially overlapping maize association panels comprising 1,014 unique genotypes grown in field trials across at least 7 US states and scored for 162 distinct trait data sets enabled the identification of 2,154 suggestive marker-trait associations and 697 confident associations in the maize genome using a resampling-based genome-wide association strategy. The precision of individual marker-trait associations was estimated to be 3 genes based on a reference set of genes with known phenotypes. Examples were observed of both genetic loci associated with variation in diverse traits (e.g., above-ground and below-ground traits), as well as individual loci associated with the same or similar traits across diverse environments. Many significant signals are located near genes whose functions were previously entirely unknown or estimated purely via functional data on homologs. This study demonstrates the potential of mining community association panel data using new higher-density genetic marker sets combined with resampling-based genome-wide association tests to develop testable hypotheses about gene functions, identify potential pleiotropic effects of natural genetic variants, and study genotype-by-environment interaction.
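The alignment steps quoted in the citation statement above (BWA-MEM for resequencing reads, STAR in 2 rounds for RNA-seq reads, picard for duplicate marking) can be sketched as command lines. This is a minimal sketch, not the authors' pipeline: the quote does not say how the 2 STAR rounds were run, so feeding first-round splice junctions to the second round via `--sjdbFileChrStartEnd` is an assumption, and every file name, the index directory, and the helper function names are hypothetical. The code only builds argument lists and does not run any tool.

```python
from typing import List, Optional, Sequence


def bwa_mem_cmd(reference: str, reads1: str, reads2: str) -> List[str]:
    """BWA-MEM with default parameters; the aligner writes SAM to stdout."""
    return ["bwa", "mem", reference, reads1, reads2]


def star_cmd(genome_dir: str, reads1: str, reads2: str, prefix: str,
             junctions: Optional[Sequence[str]] = None) -> List[str]:
    """One STAR alignment round producing a coordinate-sorted BAM."""
    cmd = [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", reads1, reads2,
        "--outFileNamePrefix", prefix,
        "--outSAMtype", "BAM", "SortedByCoordinate",
    ]
    if junctions:
        # Assumption: the second round reuses junctions found in the first.
        cmd += ["--sjdbFileChrStartEnd", *junctions]
    return cmd


def two_round_star(genome_dir: str, reads1: str, reads2: str,
                   prefix: str) -> List[List[str]]:
    """Build both rounds; round 1 writes <prefix>round1_SJ.out.tab."""
    round1_prefix = prefix + "round1_"
    round1 = star_cmd(genome_dir, reads1, reads2, round1_prefix)
    round2 = star_cmd(genome_dir, reads1, reads2, prefix + "round2_",
                      junctions=[round1_prefix + "SJ.out.tab"])
    return [round1, round2]


def mark_duplicates_cmd(in_bam: str, out_bam: str, metrics: str) -> List[str]:
    """Picard MarkDuplicates (legacy KEY=value syntax).

    Assumes a `picard` wrapper script is on PATH; otherwise the same
    arguments follow `java -jar picard.jar`.
    """
    return ["picard", "MarkDuplicates",
            f"I={in_bam}", f"O={out_bam}", f"M={metrics}"]
```

Once the tools are installed, each list can be handed to `subprocess.run(cmd, check=True)`. The BWA-MEM output would still need to be captured from stdout and sorted into a BAM before duplicate marking, a step left out here.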


“…The V1 stage (approximately 10 days after planting) whole-seedling RNA-Seq data from 502 genotypes (PRJNA189400), Hirsch et al. (2014), and 14-day-old root RNA-Seq data on 350 genotypes (PRJNA793045), Sun et al. (2022).…”

Section: Methods (mentioning)

confidence: 99%

“…(2014) and another from roots of maize plants at 14 days after planting Sun et al (2022). These gene expression data-sets include different subsets of the expanded 942-WiDiv panel Mazaheri et al (2019).…”

Section: Flowering Time and Gene Expression Data-sets (mentioning)

confidence: 99%

Measurement of expression from a limited number of genes is sufficient to predict flowering time in maize

Torres-Rodríguez, Sun, Mural, et al. 2022

Preprint

Self Cite


Changing patterns of weather and climate are limiting breeders' ability to conduct trials in the same environments in which their released varieties will be grown 7-10 years later. Flowering time plays a crucial role in determining regional adaptation, and mismatch between flowering time and environment can substantially impair yield. Different approaches based on genetic markers or gene expression can be used to predict flowering time before conducting large-scale field evaluation and phenotyping. More accurate prediction of a trait using genetic markers could be hindered by the intermediate steps (e.g., transcription, translation, epigenetic modification, and epistasis, among others) connecting the trait and its genetic basis. The use of some intermediate steps as predictors could improve the accuracy of the model. Here, we use two public gene expression (RNA-Seq) data-sets from 14-day-old maize seedling roots and whole-seedling tissue at the V1 stage (~10 days after planting) for which flowering data (days to anthesis and days to silking expressed in growing degree days) and genetic markers were also available to test the predictability of flowering time. In total, 20 different combinations between phenotypic and gene expression data-sets were evaluated. To explore prediction accuracy, a random forest model was trained with the expression values of 44,303 gene models hosted in the current B73 maize reference version 5, and then the feature importance was scored based on the decrease in root mean squared error. Later, several random forest models with different subsets of the most important features (genes) were trained, and this process was repeated ten times. Results from these analyses show a curve in the prediction accuracy, with an increase in the prediction accuracy as the most important genes were added. The maximum accuracy was attained when 500 genes for whole-seedling and 100 genes for root gene expression data were used in the analysis, and thereafter adding more genes led to a decrease in the prediction accuracy. The highest prediction accuracy using the most important genes was higher than that obtained using 400,000 randomly selected whole-genome SNPs. Finally, we described the genes controlling flowering time by looking at the most important genes in the random forest model with the expression data from all genes. We further found MADS-transcription factor 69 (Mads69) using whole-seedling gene expression, and MADS-transcription factor 67 (Mads67) using root gene expression data, both genes previously described as affecting flowering time. Here, we aim to demonstrate the potential of selecting and using the expression of the most informative genes to predict a complex trait, and to demonstrate the robustness and limitations of this analysis by using phenotypic data-sets from different environments.


“…BWA-MEM (v0.7) with default parameter settings [14] was employed to align the resulting trimmed resequencing data to v4 of the B73 maize reference genome [15, 16]. STAR (v2.7) [17] was used to align the trimmed RNA-seq reads to v4 of the B73 maize reference genome in two rounds as described in Sun et al. [18]. Apparent PCR duplicates were marked within the resulting BAM alignments using picard (v2.22) [19].…”

Section: Unified Genetic Marker Data (mentioning)

confidence: 99%

Association Mapping Across a Multitude of Traits Collected in Diverse Environments Identifies Pleiotropic Loci in Maize

Mural, Sun, Grzybowski, et al. 2022

Preprint

Self Cite


Classical genetic studies have identified many cases of pleiotropy where mutations in individual genes alter many different phenotypes. Quantitative genetic studies of natural genetic variants frequently examine one or a few traits, limiting their potential to identify pleiotropic effects of natural genetic variants. Widely adopted community association panels have been employed by plant genetics communities to study the genetic basis of naturally occurring phenotypic variation in a wide range of traits. High-density genetic marker data -- 18M markers -- from two partially overlapping maize association panels comprising 1,014 unique genotypes grown in field trials across at least seven US states and scored for 162 distinct trait datasets enabled the identification of 2,154 suggestive marker-trait associations and 697 confident associations in the maize genome using a resampling-based genome-wide association strategy. The precision of individual marker-trait associations was estimated to be three genes based on a reference set of genes with known phenotypes. Examples were observed of both genetic loci associated with variation in diverse traits (e.g. above-ground and below-ground traits), as well as individual loci associated with the same or similar traits across diverse environments. Many significant signals are located near genes whose functions were previously entirely unknown or estimated purely via functional data on homologs. This study demonstrates the potential of mining community association panel data using new higher density genetic marker sets combined with resampling-based genome-wide association tests to develop testable hypotheses about gene functions, identify potential pleiotropic effects of natural genetic variants, and study genotype by environment interaction.

