SUMMARY The ability to evolve novel metabolites has been instrumental for the defence of plants against antagonists. A few species in the Barbarea genus are the only crucifers known to produce saponins, some of which make plants resistant to specialist herbivores, like Plutella xylostella, the diamondback moth. Genetic mapping in Barbarea vulgaris revealed that genes for saponin biosynthesis are not clustered but are located in different linkage groups. Using co-location with quantitative trait loci (QTLs) for resistance, transcriptome and genome sequences, we identified two 2,3-oxidosqualene cyclases that form the major triterpenoid backbones. LUP2 mainly produces lupeol, and is preferentially expressed in insect-susceptible B. vulgaris plants, whereas LUP5 produces β-amyrin and α-amyrin, and is preferentially expressed in resistant plants; β-amyrin is the backbone for the resistance-conferring saponins in Barbarea. Two loci for cytochromes P450, predicted to add functional groups to the saponin backbone, were identified: CYP72As co-localized with insect resistance, whereas CYP716As did not. When B. vulgaris sapogenin biosynthesis genes were transiently expressed by CPMV-HT technology in Nicotiana benthamiana, high levels of hydroxylated and carboxylated triterpenoid structures accumulated, including oleanolic acid, which is a precursor of the major resistance-conferring saponins. When the B. vulgaris gene for sapogenin 3-O-glucosylation was co-expressed, the insect deterrent 3-O-oleanolic acid monoglucoside accumulated, as well as triterpene structures with up to six hexoses, demonstrating that N. benthamiana further decorates the monoglucosides. We argue that saponin biosynthesis in the Barbarea genus evolved by a neofunctionalized glucosyl transferase, whereas the difference between resistant and susceptible B. vulgaris chemotypes evolved by different expression of oxidosqualene cyclases (OSCs).
Background The yield obtained from next generation sequencers has increased almost exponentially in recent years, making sample multiplexing common practice. While barcodes (known sequences of fixed length) primarily encode the sample identity of sequenced DNA fragments, barcodes made of random sequences (Unique Molecular Identifiers, or UMIs) are often used to distinguish between PCR duplicates and transcript abundance in, for example, single-cell RNA sequencing (scRNA-seq). In paired-end sequencing, different barcodes can be inserted at each fragment end to either increase the number of multiplexed samples in the library or to use one of the barcodes as a UMI. Alternatively, UMIs can be combined with the sample barcodes into composite barcodes, or with standard Illumina® indexing. Subsequent analysis must take read duplicates and sample identity into account by identifying UMIs. Results Existing tools do not support these complex barcoding configurations and custom code development is frequently required. Here, we present Je, a suite of tools that accommodates complex barcoding strategies, extracts UMIs and filters read duplicates taking UMIs into account. Using Je on publicly available scRNA-seq and iCLIP data containing UMIs, the number of unique reads increased by up to 36%, compared to when UMIs are ignored. Conclusions Je is implemented in Java and uses the Picard API. Code, executables and documentation are freely available at http://gbcs.embl.de/Je. Je can also be easily installed in Galaxy through the Galaxy toolshed. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1284-2) contains supplementary material, which is available to authorized users.
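The idea of filtering read duplicates while taking UMIs into account can be illustrated with a minimal sketch. This is not Je's implementation or API: the `Read` record, the `dedup` function and the toy reads are all hypothetical, and the sketch assumes that two reads are duplicates only when their mapping position and UMI match exactly (no mismatch tolerance).

```python
from collections import namedtuple

# Hypothetical minimal read record; real tools work on aligned BAM records.
Read = namedtuple("Read", ["name", "chrom", "pos", "strand", "umi"])


def dedup(reads, use_umi=True):
    """Keep the first read seen for each duplicate key.

    Without UMIs, reads at the same position and strand are duplicates.
    With UMIs, they must also carry the same UMI (exact match assumed).
    """
    seen = set()
    kept = []
    for r in reads:
        key = (r.chrom, r.pos, r.strand)
        if use_umi:
            key += (r.umi,)
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


reads = [
    Read("r1", "chr1", 100, "+", "ACGT"),
    Read("r2", "chr1", 100, "+", "ACGT"),  # same position and UMI as r1
    Read("r3", "chr1", 100, "+", "TTAG"),  # same position, different UMI
    Read("r4", "chr2", 500, "-", "ACGT"),
]

without_umi = dedup(reads, use_umi=False)
with_umi = dedup(reads, use_umi=True)
print(len(without_umi), len(with_umi))
```

In this toy input, ignoring UMIs discards `r3` as a positional duplicate, whereas the UMI-aware key retains it as a distinct molecule. This is the kind of effect behind the increase in unique reads reported above.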
Human cancer cell lines are an important resource for research and drug development. However, the available annotations of cell lines are sparse, incomplete, and distributed in multiple repositories. Re-analyzing publicly available raw RNA-Seq data, we determined the human leukocyte antigen (HLA) type and abundance, identified expressed viruses and calculated gene expression of 1,082 cancer cell lines. Using the determined HLA types, public databases of cell line mutations, and existing HLA binding prediction algorithms, we predicted antigenic mutations in each cell line. We integrated the results into a comprehensive knowledgebase. Using the Django web framework, we provide an interactive user interface with advanced search capabilities to find and explore cell lines and an application programming interface to extract cell line information. The portal is available at http://celllines.tron-mainz.de. Electronic supplementary material The online version of this article (doi:10.1186/s13073-015-0240-5) contains supplementary material, which is available to authorized users.
The code and documentation are available at http://tron-mainz.de/tron-facilities/computational-medicine/galaxy-lims/
Background The vast ecosystem of single-cell RNA-sequencing tools has until recently been plagued by an excess of diverging analysis strategies, inconsistent file formats, and compatibility issues between different software suites. The uptake of 10x Genomics datasets has begun to calm this diversity, and the bioinformatics community leans once more towards the large computing requirements and the statistically driven methods needed to process and understand these ever-growing datasets. Results Here we outline several Galaxy workflows and learning resources for single-cell RNA-sequencing, with the aim of providing a comprehensive analysis environment paired with a thorough user learning experience that bridges the knowledge gap between the computational methods and the underlying cell biology. The Galaxy reproducible bioinformatics framework provides tools, workflows, and trainings that not only enable users to perform 1-click 10x preprocessing but also empower them to demultiplex raw sequencing from custom tagged and full-length sequencing protocols. The downstream analysis supports a range of high-quality interoperable suites separated into common stages of analysis: inspection, filtering, normalization, confounder removal, and clustering. The teaching resources cover concepts from computer science to cell biology. Access to all resources is provided at the singlecell.usegalaxy.eu portal. Conclusions The reproducible and training-oriented Galaxy framework provides a sustainable high-performance computing environment for users to run flexible analyses on both 10x and alternative platforms. The tutorials from the Galaxy Training Network along with the frequent training workshops hosted by the Galaxy community provide a means for users to learn, publish, and teach single-cell RNA-sequencing analysis.
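Two of the analysis stages named above, filtering and normalization, can be sketched in a few lines. This is an illustrative toy, not one of the Galaxy workflows: the function names, the `min_total` threshold, the scale factor of 10,000 and the count matrix are all assumptions chosen for the example.

```python
import math


def filter_cells(counts, min_total=3):
    """Drop cells whose total count falls below a (hypothetical) threshold."""
    return {cell: genes for cell, genes in counts.items()
            if sum(genes) >= min_total}


def normalize(counts, scale=10_000):
    """Scale each cell to a common library size, then apply log(1 + x)."""
    out = {}
    for cell, genes in counts.items():
        total = sum(genes)
        out[cell] = [math.log1p(g / total * scale) for g in genes]
    return out


# Toy count matrix: cell name -> counts for three genes.
counts = {
    "cell_a": [5, 0, 5],
    "cell_b": [0, 1, 0],  # very low total, removed by the filter
    "cell_c": [2, 2, 0],
}

kept = filter_cells(counts)
norm = normalize(kept)
print(sorted(norm))
```

Filtering runs before normalization here so that near-empty cells are removed before their counts are rescaled; the later stages (confounder removal and clustering) would then operate on `norm`.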