Background: Anoectochilus roxburghii is a medicinal plant that contains a variety of bioactive components, including triterpenes, which exhibit important pharmacological properties with low toxicity. However, little is known about the triterpene biosynthetic pathway, the genome, or the transcriptome of A. roxburghii.
Results: To analyze transcriptional determinants related to the biosynthesis of the bioactive components, we performed transcriptome sequencing of three A. roxburghii samples (SRX1818644, SRX1818642 and SRX1818641) and annotated the resulting sequences. In total, 137,679,059 clean reads were obtained, corresponding to 12.20 Gb of nucleotides. These were assembled into 86,382 contigs and 68,938 unigenes, which were annotated by sequence similarity to known genes in the COG, EST, Nr, Pfam and Uniprot databases, yielding 10,040, 29,442, 39,551, 34,991 and 28,082 annotated unigenes, respectively. GO analysis classified all unigenes into three functional categories: biological processes (43,206 unigenes in 22 categories), molecular functions (46,978 unigenes in 15 categories) and cellular components (20,951 unigenes in 18 categories). Candidate triterpene biosynthetic genes, namely ArHMGR1 in the MEV pathway; ArDXS1, ArDXS4, ArDXS5, ArDXS8-10, ArDXR1-2 and ArHDR1-2 in the MEP pathway; and ArFDS1, ArSM and ArOCS, were selected based on RNA-seq and gene-to-metabolite correlation analysis.
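To put the per-database annotation counts in proportion, the short sketch below expresses each as a percentage of the 68,938 assembled unigenes. It is an illustrative calculation only: the counts are taken from the figures reported above in the order the databases are listed, while the function and variable names are hypothetical and not part of the original analysis.

```python
# Illustrative arithmetic on the reported figures; not the authors' pipeline.
TOTAL_UNIGENES = 68_938

# Annotated unigene counts, in the order the databases are listed in the text.
ANNOTATED = {
    "COG": 10_040,
    "EST": 29_442,
    "Nr": 39_551,
    "Pfam": 34_991,
    "Uniprot": 28_082,
}


def annotation_percentages(counts, total):
    """Return the share of unigenes annotated in each database, in percent."""
    return {db: 100.0 * n / total for db, n in counts.items()}


percentages = annotation_percentages(ANNOTATED, TOTAL_UNIGENES)

# Print databases from highest to lowest annotation coverage.
for db, pct in sorted(percentages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{db:8s}{pct:6.2f}%")
```

Under these figures, Nr gives the highest coverage and COG the lowest, so a substantial share of unigenes has no hit in any single database.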
Conclusion: The A. roxburghii transcriptome assembly comprises 86,382 contigs and 68,938 unigenes. The assembled dataset allowed identification of genes encoding enzymes involved in the biosynthesis of bioactive components in A. roxburghii, and candidate genes encoding enzymes important in the triterpene biosynthetic pathway were selected. These results will facilitate the study of the expression and regulation of bioactive component biosynthesis in A. roxburghii.