Oil palm is the most productive oil-bearing crop. Planted on only 5% of the total vegetable oil acreage, palm oil accounts for 33% of vegetable oil, and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8 gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators1, which are highly expressed in the kernel. We also report the draft sequence of the S. American oil palm Elaeis oleifera, which has the same number of chromosomes (2n=32) and produces fertile interspecific hybrids with E. guineensis2, but appears to have diverged in the new world. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations which restrict the use of clones in commercial plantings3, and thus helps achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop.
A key event in the domestication and breeding of the oil palm, Elaeis guineensis, was loss of the thick coconut-like shell surrounding the kernel. Modern E. guineensis has three fruit forms, dura (thick-shelled), pisifera (shell-less) and tenera (thin-shelled), a hybrid between dura and pisifera1–4. The pisifera palm is usually female-sterile but the tenera yields far more oil than dura, and is the basis for commercial palm oil production in all of Southeast Asia5. Here, we describe the mapping and identification of the Shell gene responsible for the different fruit forms. Using homozygosity mapping by sequencing we found two independent mutations in the DNA binding domain of a homologue of the MADS-box gene SEEDSTICK (STK) which controls ovule identity and seed development in Arabidopsis. The Shell gene is responsible for the tenera phenotype in both cultivated and wild palms from sub-Saharan Africa, and our findings provide a genetic explanation for the single gene heterosis attributed to Shell, via heterodimerization. This gene mutation explains the single most important economic trait in oil palm, and has implications for the competing interests of global edible oil production, biofuels and rainforest conservation6.
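The single-gene inheritance of fruit form described above can be illustrated with a small cross calculator. This is a minimal sketch, not the authors' method: the allele labels `Sh` (wild-type) and `sh` (mutant) and the function name `cross` are assumptions made for illustration, and the two independent mutations reported in the abstract are collapsed into one mutant allele class.

```python
from collections import Counter
from itertools import product

# Hypothetical allele labels: "Sh" = wild-type Shell, "sh" = mutant allele.
# The abstract reports two independent mutations; both are treated as "sh" here.
PHENOTYPE = {
    ("Sh", "Sh"): "dura",      # thick-shelled
    ("Sh", "sh"): "tenera",    # thin-shelled hybrid
    ("sh", "sh"): "pisifera",  # shell-less
}


def cross(parent_a, parent_b):
    """Return expected fruit-form proportions among offspring of two genotypes."""
    counts = Counter()
    # Each parent contributes one of its two alleles with equal probability.
    for allele_a, allele_b in product(parent_a, parent_b):
        genotype = tuple(sorted((allele_a, allele_b)))
        counts[PHENOTYPE[genotype]] += 1
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}


if __name__ == "__main__":
    print(cross(("Sh", "Sh"), ("sh", "sh")))  # dura x pisifera
    print(cross(("Sh", "sh"), ("Sh", "sh")))  # tenera x tenera
```

Under these assumptions, a dura × pisifera cross yields only tenera offspring, while selfing or intercrossing tenera segregates all three fruit forms.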
Oil palm, a plantation crop of major economic importance in Southeast Asia, is the predominant source of edible oil worldwide. We report the identification of the VIRESCENS (VIR) gene, which controls fruit exocarp colour and is an indicator of ripeness. VIR is an R2R3-MYB transcription factor with homology to Lilium LhMYB12 and similarity to Arabidopsis PRODUCTION OF ANTHOCYANIN PIGMENT1 (PAP1). We identify five independent mutant alleles of VIR in over 400 accessions from sub-Saharan Africa that account for the dominant-negative virescens phenotype. Each mutation results in premature termination of the carboxy-terminal domain of VIR, resembling McClintock’s C1-I allele in maize. The abundance of alleles likely reflects cultural practices, by which fruits were venerated for magical and medicinal properties. The identification of VIR will allow selection of the trait at the seed or early-nursery stage, 3-6 years before fruits are produced, greatly advancing introgression into elite breeding material.
Clonal reproduction of oil palm by means of tissue culture is a very inefficient process. Tissue culturability is known to be genotype-dependent, with some genotypes being more amenable to tissue culture than others. In this study, genetic linkage maps enriched with simple sequence repeat (SSR) markers were developed for dura (ENL48) and pisifera (ML161), the two fruit forms of oil palm, Elaeis guineensis. The SSR markers were mapped onto earlier reported parental maps based on amplified fragment length polymorphism (AFLP) and restriction fragment length polymorphism (RFLP) markers. The new linkage map of ENL48 contains 148 markers (33 AFLPs, 38 RFLPs and 77 SSRs) in 23 linkage groups (LGs), covering a total map length of 798.0 cM. The ML161 map contains 240 markers (50 AFLPs, 71 RFLPs and 119 SSRs) in 24 LGs covering a total of 1,328.1 cM. Using the improved maps, two quantitative trait loci (QTLs) associated with tissue culturability were identified, one each for callusing rate and embryogenesis rate. A QTL for callogenesis was identified in LGD4b of ENL48 and explained 17.5% of the phenotypic variation. For embryogenesis rate, a QTL was detected on LGP16b in ML161 and explained 20.1% of the variation. This study is the first attempt to identify QTLs associated with tissue culture amenability in oil palm, which is an important step towards understanding the molecular processes underlying clonal regeneration of oil palm.
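The marker counts and map lengths reported above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary layout and the `summarize` helper are assumptions, and the average spacing is a crude figure (total map length divided by marker count) that ignores how markers are distributed across linkage groups.

```python
# Figures copied from the abstract; the data structure itself is hypothetical.
MAPS = {
    "ENL48 (dura)": {
        "AFLP": 33, "RFLP": 38, "SSR": 77,
        "reported_total": 148, "length_cM": 798.0,
    },
    "ML161 (pisifera)": {
        "AFLP": 50, "RFLP": 71, "SSR": 119,
        "reported_total": 240, "length_cM": 1328.1,
    },
}


def summarize(entry):
    """Sum the marker classes and derive simple per-map statistics."""
    total = entry["AFLP"] + entry["RFLP"] + entry["SSR"]
    return {
        "total": total,
        "matches_reported": total == entry["reported_total"],
        "ssr_fraction": entry["SSR"] / total,
        # Crude average spacing: map length divided by number of markers.
        "cM_per_marker": entry["length_cM"] / total,
    }


if __name__ == "__main__":
    for name, entry in MAPS.items():
        stats = summarize(entry)
        print(
            f"{name}: {stats['total']} markers, "
            f"{stats['ssr_fraction']:.0%} SSR, "
            f"{stats['cM_per_marker']:.2f} cM per marker"
        )
```

Both reported totals agree with the sum of their marker classes, and SSRs make up roughly half of the markers on each map.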
Background: Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools.
Results: Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures.
Conclusions: We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database (http://palmxplore.mpob.gov.my), will provide important resources for studies on the genomes of oil palm and related crops.
Reviewers: This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.
Electronic supplementary material: The online version of this article (doi:10.1186/s13062-017-0191-4) contains supplementary material, which is available to authorized users.
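The GC3 measure defined in the Results (the fraction of cytosine and guanine at the third position of a codon) is straightforward to compute for a single coding sequence. The following is a minimal sketch, not the pipeline used in the study: the function names `gc3` and `is_gc3_rich` are hypothetical, the input is assumed to be an in-frame coding sequence, and whether the stop codon is counted is a convention the abstract does not specify (it is counted here). The 0.75286 cutoff is the GC3-rich threshold quoted in the abstract.

```python
GC3_RICH_THRESHOLD = 0.75286  # cutoff quoted in the abstract


def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("expected a non-empty sequence whose length is a multiple of 3")
    third_positions = seq[2::3]  # every third base, starting at codon position 3
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


def is_gc3_rich(cds, threshold=GC3_RICH_THRESHOLD):
    """Classify a coding sequence as GC3-rich using the assumed threshold."""
    return gc3(cds) >= threshold


if __name__ == "__main__":
    toy_cds = "ATGGCCAAGTAA"  # codons: ATG GCC AAG TAA (made-up example)
    print(gc3(toy_cds), is_gc3_rich(toy_cds))
```

For the toy sequence, three of the four third-position bases are G or C, so GC3 is 0.75, just below the GC3-rich cutoff.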