Qiang Gao scite author profile

Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence-absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.

Asian cultivated rice is grown worldwide and comprises the staple food for half of the global population. It is envisaged that by the year 2035 [1] feeding this growing population will necessitate that an additional 112 million metric tons of rice be produced on a smaller area of land, using less water and under more fluctuating climatic conditions, which will require that future rice cultivars be higher yielding and resilient to multiple abiotic and biotic stresses. The foundation of the continued improvement of rice cultivars is the rich genetic diversity within domesticated populations and wild relatives [2][3][4]. For over 2,000 years, two major types of O. sativa, the O. sativa Xian group (here referred to as Xian/Indica (XI) and also known as Hsien or Indica) and the O. sativa Geng group (here referred to as Geng/Japonica (GJ) and also known as Keng or Japonica), have historically been recognized [5][6][7]. Varied degrees of post-reproductive barriers exist between XI and GJ rice accessions [8]; this differentiation between XI and GJ rice types and the presence of different varietal groups are well-documented at isozyme and DNA levels [6][9]. Two other distinct groups have also been recognized using molecular markers [10]; one of these encompasses the Aus, Boro and Rayada ecotypes from Bangladesh and India (which we term the circum-Aus group (cA)) and the other comprises the famous Basmati and Sadri aromatic varieties (which we term the circum-Basmati group (cB)).

Approximately 780,000 rice accessions are available in gene banks worldwide [11]. To enable the more efficient use of these accessions in future rice improvement, the Chinese Academy of Agricultural Sciences, BGI-Shenzhen and International Rice Research Institute sequenced over 3,000 rice genomes (3K-RG) as part of the 3,000 Rice Genomes Project [12]. Here we present analyses of genetic variation in the 3K-RG that focus on important aspects of O. sativa diversity, single nucleotide polymorphisms (SNPs) and structural variation (deletions, duplications, inversions and translocations). We also construct a species pangenome consisting of 'core...


Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality

Wei, Yang, Wang, et al. 2018, Proc. Natl. Acad. Sci. U.S.A.

700

843


Significance: A high-quality genome assembly of Camellia sinensis var. sinensis facilitates genomic, transcriptomic, and metabolomic analyses of the quality traits that make tea one of the world’s most-consumed beverages. The specific gene family members critical for biosynthesis of key tea metabolites, monomeric galloylated catechins and theanine, are indicated and found to have evolved specifically for these functions in the tea plant lineage. Two whole-genome duplications, critical to gene family evolution for these two metabolites, are identified and dated, but are shown to account for less amplification than subsequent paralogous duplications. These studies lay the foundation for future research to understand and utilize the genes that determine tea quality and its diversity within tea germplasm.


Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice

Jin, Zong, Gao, et al. 2019, Science

509

383


Cytosine and adenine base editors (CBEs and ABEs) are promising new tools for achieving the precise genetic changes required for disease treatment and trait improvement. However, genome-wide and unbiased analyses of their off-target effects in vivo are still lacking. Our whole genome sequencing (WGS) analysis of rice plants treated with BE3, high-fidelity BE3 (HF1-BE3), or ABE revealed that BE3 and HF1-BE3, but not ABE, induce substantial genome-wide off-target mutations, which are mostly the C→T type of single nucleotide variants (SNVs) and appear to be enriched in genic regions. Notably, treatment of rice with BE3 or HF1-BE3 in the absence of single-guide RNA also results in the rise of genome-wide SNVs. Thus, the base editing unit of BE3 or HF1-BE3 needs to be optimized in order to attain high fidelity.


Phylogenomics reveals multiple losses of nitrogen-fixing root nodule symbiosis

Griesmann, Chang, Liu, et al. 2018, Science

313

319


The root nodule symbiosis of plants with nitrogen-fixing bacteria affects global nitrogen cycles and food production but is restricted to a subset of genera within a single clade of flowering plants. To explore the genetic basis for this scattered occurrence, we sequenced the genomes of 10 plant species covering the diversity of nodule morphotypes, bacterial symbionts, and infection strategies. In a genome-wide comparative analysis of a total of 37 plant species, we discovered signatures of multiple independent loss-of-function events in the indispensable symbiotic regulator in 10 of 13 genomes of nonnodulating species within this clade. The discovery that multiple independent losses shaped the present-day distribution of nitrogen-fixing root nodule symbiosis in plants reveals a phylogenetically wider distribution in evolutionary history and a so-far-underestimated selection pressure against this symbiosis.


Genome of wild olive and the evolution of oil biosynthesis

Ünver, Wu, Sterck, et al. 2017, Proc. Natl. Acad. Sci. U.S.A.

211

253


Significance: We sequenced the genome and transcriptomes of the wild olive (oleaster). More than 50,000 genes were predicted, and evidence was found for two relatively recent whole-genome duplication events, dated at about 28 and 59 million years ago. Whole genome sequencing, as well as gene expression studies, provide further insights into the evolution of oil biosynthesis, and will aid future studies aimed at further increasing the production of olive oil, which is a key ingredient of the healthy Mediterranean diet and has been granted a qualified health claim by the FDA.

Abstract: Here, we present the genome sequence and annotation of the wild olive tree (Olea europaea var. sylvestris), called oleaster, which is considered an ancestor of cultivated olive trees. More than 50,000 protein-coding genes were predicted, a majority of which could be anchored to 23 pseudo-chromosomes obtained through a newly constructed genetic map. The oleaster genome contains signatures of two Oleaceae-lineage specific paleopolyploidy events, dated at approximately 28 and 59 million years ago. These events contributed to the expansion and neofunctionalization of genes and gene families that play important roles in oil biosynthesis. The functional divergence of oil biosynthesis pathway genes, such as FAD2, SACPD, EAR and ACPTE, following duplication, has been responsible for the differential accumulation of oleic and linoleic acids produced in olive compared to sesame, a closely related oil crop. Duplicated oleaster FAD2 genes are regulated by a short-interfering RNA (siRNA) derived from a transposable element-rich region, leading to suppressed levels of FAD2 gene expression. Additionally, neofunctionalization of members of the SACPD gene family has led to increased expression of SACPD2, 3, 5 and 7, consequently resulting in an increased desaturation of stearic acid. Taken together, decreased FAD2 expression and increased SACPD expression likely explain the accumulation of exceptionally high levels of oleic acid in olive. The oleaster genome thus provides important insights into the evolution of oil biosynthesis and will be a valuable resource for oil crop genomics.

As a symbol of peace, fertility, health and longevity, the olive tree (Olea europaea L.) is a socio-economically important oil crop that is widely grown in the Mediterranean Basin. Belonging to the Oleaceae family (order Lamiales), it can biosynthesize essential unsaturated fatty acids and other important secondary metabolites, such as vitamins and phenolic compounds (1). The olive tree is a diploid (2n = 46) allogamous crop that can be vegetatively propagated and live for thousands of years (2). Paleobotanical evidence suggests that olive oil was already produced in the Bronze Age (3). It has been thought that cultivated varieties were derived from the wild olive tree, called oleaster (O. europaea var. sylvestris), in Asia Minor, which then spread to Greece (4). Nevertheless, the exact domestication history of the olive tree is unknown (5). Due to their longevity, oleaster...


Genome sequence of the progenitor of wheat A subgenome Triticum urartu

Ling, Shi, et al. 2018, Nature

359

269


Triticum urartu (diploid, AA) is the progenitor of the A subgenome of tetraploid (Triticum turgidum, AABB) and hexaploid (Triticum aestivum, AABBDD) wheat. Genomic studies of T. urartu have been useful for investigating the structure, function and evolution of polyploid wheat genomes. Here we report the generation of a high-quality genome sequence of T. urartu by combining bacterial artificial chromosome (BAC)-by-BAC sequencing, single molecule real-time whole-genome shotgun sequencing, linked reads and optical mapping. We assembled seven chromosome-scale pseudomolecules and identified protein-coding genes, and we suggest a model for the evolution of T. urartu chromosomes. Comparative analyses with genomes of other grasses showed gene loss and amplification in the numbers of transposable elements in the T. urartu genome. Population genomics analysis of 147 T. urartu accessions from across the Fertile Crescent showed clustering of three groups, with differences in altitude and biostress, such as powdery mildew disease. The T. urartu genome assembly provides a valuable resource for studying genetic variation in wheat and related grasses, and promises to facilitate the discovery of genes that could be useful for wheat improvement.


Liriodendron genome sheds light on angiosperm phylogeny and species–pair differentiation

et al. 2018


The genus Liriodendron belongs to the family Magnoliaceae, which resides within the magnoliids, an early diverging lineage of the Mesangiospermae. However, the phylogenetic relationship of magnoliids with eudicots and monocots has not been conclusively resolved and thus remains to be determined [1–6]. Liriodendron is a relict lineage from the Tertiary with two distinct species, one East Asian (L. chinense (Hemsley) Sargent) and one eastern North American (L. tulipifera Linn), identified as a vicariad species pair. However, the genetic divergence and evolutionary trajectories of these species remain to be elucidated at the whole-genome level [7]. Here, we report the first de novo genome assembly of a plant in the Magnoliaceae, L. chinense. Phylogenetic analyses suggest that magnoliids are sister to the clade consisting of eudicots and monocots, with rapid diversification occurring in the common ancestor of these three lineages. Analyses of population genetic structure indicate that L. chinense has diverged into two lineages, the eastern and western groups, in China. While L. tulipifera in North America is genetically positioned between the two L. chinense groups, it is closer to the eastern group. This result is consistent with phenotypic observations that suggest that the eastern and western groups of China may have diverged long ago, possibly before the intercontinental differentiation between L. chinense and L. tulipifera. Genetic diversity analyses show that L. chinense has tenfold higher genetic diversity than L. tulipifera, suggesting that the complicated regions comprising east–west-orientated mountains and the Yangtze river basin (especially near 30° N latitude) in East Asia offered more successful refugia than the south–north-orientated mountain valleys in eastern North America during the Quaternary glacial period.


Sequencing and de novo assembly of a near complete indica rice genome

et al. 2017, Nat Commun

258

218


A high-quality reference genome is critical for understanding genome structure, genetic variation and evolution of an organism. Here we report the de novo assembly of an indica rice genome Shuhui498 (R498) through the integration of single-molecule sequencing and mapping data, genetic map and fosmid sequence tags. The 390.3 Mb assembly is estimated to cover more than 99% of the R498 genome and is more continuous than the current reference genomes of japonica rice Nipponbare (MSU7) and Arabidopsis thaliana (TAIR10). We annotate high-quality protein-coding genes in R498 and identify genetic variations between R498 and Nipponbare and presence/absence variations by comparing them to 17 draft genomes in cultivated rice and its closest wild relatives. Our results demonstrate how to de novo assemble a highly contiguous and near-complete plant genome through an integrative strategy. The R498 genome will serve as a reference for the discovery of genes and structural variations in rice.


Genomic variation in 3,010 diverse accessions of Asian cultivated rice
