We thank J. Li from the China Agricultural University for providing the seeds of SK, X. Li for helping to conduct ChIA-PET sequencing and K. Kremling for critical comments on the manuscript.
Maize was domesticated from lowland teosinte (Zea mays ssp. parviglumis), but the contribution of highland teosinte (Zea mays ssp. mexicana, hereafter mexicana) to modern maize is not clear. Here, two genomes for Mo17 (a modern maize inbred) and mexicana are assembled using a meta-assembly strategy after sequencing of 10 lines derived from a maize-teosinte cross. Comparative analyses reveal a high level of diversity between Mo17, B73, and mexicana, including three Mb-size structural rearrangements. The maize spontaneous mutation rate is estimated to be 2.17 × 10−8 ~3.87 × 10−8 per site per generation with a nonrandom distribution across the genome. A higher deleterious mutation rate is observed in the pericentromeric regions, and might be caused by differences in recombination frequency. Over 10% of the maize genome shows evidence of introgression from the mexicana genome, suggesting that mexicana contributed to maize adaptation and improvement. Our data offer a rich resource for constructing the pan-genome of Zea mays and genetic improvement of modern maize varieties.
Variation in gene expression contributes to the diversity of phenotype. The construction of the pan-transcriptome is especially necessary for species with complex genomes, such as maize. However, knowledge of the regulation mechanisms and functional consequences of the pan-transcriptome is limited. In this study, we identified 13,382 nuclear expression presence and absence variation candidates (ePAVs, expressed in 5%~95% lines; based on the reference genome) by re-analyzing the RNA sequencing data from the kernels (15 days after pollination) of 368 maize diverse inbreds. It was estimated that only ~1% of the ePAVs are explained by DNA sequence presence and absence variations (PAV). The ePAV genes tend to be regulated by distant eQTLs when compared with non-ePAV genes (called here core expression genes, expressed in more than 95% lines). When the expression presence/absence status was used as the "genotype" to perform genome-wide association study, 56 (0.42%) ePAVs were significantly associated with 15 agronomic traits and 1,967 (14.74%) with 526 metabolic traits, measured from the mature kernels. While the above was majorly based on the reference genome, by using a modified 'assemble-then-align' strategy, 2,355 high confidence novel sequences with a total length of 1.9Mb were found absent in the current B73 reference genome (v2). Ten randomly selected novel sequences were validated with genomic PCR. A simulation analysis suggested that the pan-transcriptome of the maize whole kernel is approaching a maximum value of 63,000 genes. Two novel validated sequences annotated as NBS_LRR like genes were found to associate with flavonoid content and their homologs in rice were also found to affect flavonoids and disease-resistance. Novel sequences absent in the present reference genome might be functionally important and deserve more attentions. This study provides novel perspectives and resources to discover maize quantitative trait variations and help us to better understand the kernel regulation networks, thus enhancing maize breeding.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.