MINTbase is a repository that comprises nuclear and mitochondrial tRNA-derived fragments (‘tRFs’) found in multiple human tissues. The original version of MINTbase comprised tRFs obtained from 768 transcriptomic datasets. We used our deterministic and exhaustive tRF mining pipeline to process all of The Cancer Genome Atlas (TCGA) datasets. We identified 23 413 tRFs with an abundance of ≥ 1.0 reads-per-million (RPM). To facilitate further studies of tRFs by the community, we have released version 2.0 of MINTbase, which contains information about 26 531 distinct human tRFs from 11 719 human datasets as of October 2017. Key new elements include: the ability to filter tRFs on-the-fly by minimum abundance thresholding; the ability to filter tRFs by tissue keywords; easy access to information about a tRF’s maximum abundance and the datasets that contain it; the ability to generate relative abundance plots for tRFs across cancer types and convert them into embeddable figures; MODOMICS information about modifications of the parental tRNA, etc. Version 2.0 of MINTbase contains 15x more datasets and nearly 4x more distinct tRFs than the original version, yet continues to offer fast, interactive access to its contents. Version 2.0 is freely available at http://cm.jefferson.edu/MINTbase/.
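The minimum-abundance filter described above can be illustrated with a minimal Python sketch. It is not MINTbase's implementation: the function names, the toy read counts, and the choice of normalization denominator (`total_reads`) are all assumptions made for illustration, since the abstract does not state which read total the RPM values are normalized against.

```python
def reads_per_million(counts, total_reads):
    """Convert raw read counts to reads-per-million (RPM)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {seq: n * 1_000_000 / total_reads for seq, n in counts.items()}


def filter_by_min_rpm(rpm, min_rpm=1.0):
    """Keep only fragments whose RPM meets or exceeds the threshold."""
    return {seq: value for seq, value in rpm.items() if value >= min_rpm}


# Hypothetical toy dataset: fragment sequence -> raw read count.
raw_counts = {
    "GCATTGGTGGTTCAGTGGTAGA": 50,
    "TCCCTGGTGGTCTAGTGGTTAGG": 3,
    "GGGGGTGTAGCTCAGTGG": 1,
}
total = 2_000_000  # assumed normalization denominator for this toy dataset

rpm = reads_per_million(raw_counts, total)
kept = filter_by_min_rpm(rpm, min_rpm=1.0)
```

With these toy numbers, the fragment with a single read falls below 1.0 RPM and is dropped, while the other two are retained.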
tRNA-derived fragments (tRFs) are a class of potent regulatory RNAs. We mined the datasets from The Cancer Genome Atlas (TCGA) representing 32 cancer types with a deterministic and exhaustive pipeline for tRNA fragments. We found that mitochondrial tRNAs contribute disproportionately more tRFs than nuclear tRNAs. Through integrative analyses, we uncovered a multitude of statistically significant and context-dependent associations between the identified tRFs and mRNAs. In many of the 32 cancer types, these associations involve mRNAs from developmental processes, receptor tyrosine kinase signaling, the proteasome, and metabolic pathways that include glycolysis, oxidative phosphorylation, and ATP synthesis. Even though the pathways are common to multiple cancers, the association of specific mRNAs with tRFs depends on and differs from cancer to cancer. The associations between tRFs and mRNAs extend to genomic properties as well; specifically, tRFs are positively correlated with shorter genes that have a higher density in repeats, such as ALUs, MIRs, and ERVLs. Conversely, tRFs are negatively correlated with longer genes that have a lower repeat density, suggesting a possible dichotomy between cell proliferation and differentiation. Analyses of bladder, lung, and kidney cancer data indicate that the tRF-mRNA wiring can also depend on a patient's sex. Sex-dependent associations involve cyclin-dependent kinases in bladder cancer, the MAPK signaling pathway in lung cancer, and purine metabolism in kidney cancer. Taken together, these findings suggest diverse and wide-ranging roles for tRFs and highlight the extensive interconnections of tRFs with key cellular processes and human genomic architecture. Significance: Across 32 TCGA cancer contexts, nuclear and mitochondrial tRNA fragments exhibit associations with mRNAs that belong to concrete pathways, encode proteins with particular destinations, have a biased repeat content, and are sex dependent.
Motivation: It has been known that mature transfer RNAs (tRNAs) that are encoded in the nuclear genome give rise to short molecules, collectively known as tRNA fragments or tRFs. Recently, we reported that, in healthy individuals and in patients, tRFs are constitutive, arise from mitochondrial as well as from nuclear tRNAs, and have composition and abundances that depend on a person’s sex, population origin and race as well as on tissue, disease and disease subtype. Our findings as well as similar work by other groups highlight the importance of tRFs and presage an increase in the community’s interest in elucidating the roles of tRFs in health and disease. Results: We created MINTbase, a web-based framework that serves the dual-purpose of being a content repository for tRFs and a tool for the interactive exploration of these newly discovered molecules. A key feature of MINTbase is that it deterministically and exhaustively enumerates all possible genomic locations where a sequence fragment can be found and indicates which fragments are exclusive to tRNA space, and thus can be considered as tRFs: this is a very important consideration given that the genomes of higher organisms are riddled with partial tRNA sequences and with tRNA-lookalikes whose aberrant transcripts can be mistaken for tRFs. MINTbase is extremely flexible and integrates and presents tRF information from multiple yet interconnected vantage points (‘vistas’). Vistas permit the user to interactively personalize the information that is returned and the manner in which it is displayed. MINTbase can report comparative information on how a tRF is distributed across all anticodon/amino acid combinations, provides alignments between a tRNA and multiple tRFs with which the user can interact, provides details on published studies that reported a tRF as expressed, etc.
Importantly, we designed MINTbase to contain all possible tRFs that could ever be produced by mature tRNAs: this allows us to report on their genomic distributions, anticodon/amino acid properties, alignments, etc. while giving users the ability to investigate candidate tRF molecules at will before embarking on focused experimental explorations. Lastly, we also introduce a new labeling scheme that is tRF-sequence-based and allows users to associate a tRF with a universally unique label (‘tRF-license plate’) that is independent of a genome assembly and does not require any brokering mechanism. Availability and Implementation: MINTbase is freely accessible at http://cm.jefferson.edu/MINTbase/. Dataset submissions to MINTbase can be initiated at http://cm.jefferson.edu/MINTsubmit/. Contact: isidore.rigoutsos@jefferson.edu Supplementary information: Supplementary data are available at Bioinformatics online.
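The two ideas in this abstract, enumerating every candidate fragment of a mature tRNA and checking whether a fragment occurs only inside tRNA space, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the MINTbase pipeline: the function names, the default length range of 16 to 50 nt, the toy sequences, and the 0-based half-open interval convention are all hypothetical, and the search covers only the forward strand of a single toy "genome" string.

```python
def enumerate_fragments(mature_trna, min_len=16, max_len=50):
    """Yield (start, end, subsequence) for every substring whose length is
    within [min_len, max_len]; start is 0-based and end is exclusive."""
    n = len(mature_trna)
    for length in range(min_len, min(max_len, n) + 1):
        for start in range(n - length + 1):
            yield start, start + length, mature_trna[start:start + length]


def find_all(genome, fragment):
    """Return every 0-based start position of fragment in genome,
    including overlapping occurrences (forward strand only)."""
    positions = []
    i = genome.find(fragment)
    while i != -1:
        positions.append(i)
        i = genome.find(fragment, i + 1)
    return positions


def is_exclusive_to_trna_space(genome, fragment, trna_intervals):
    """True if the fragment occurs at least once and every occurrence lies
    entirely inside one of the annotated tRNA intervals."""
    hits = find_all(genome, fragment)
    return bool(hits) and all(
        any(s <= h and h + len(fragment) <= e for s, e in trna_intervals)
        for h in hits
    )


# Hypothetical toy data: one 20-nt "tRNA" plus a partial lookalike copy.
trna = "GCATTGGTGGTTCAGTGGTA"
genome = "AAAA" + trna + "CCCC" + trna[:8] + "TTTT"
trna_intervals = [(4, 24)]  # assumed annotation of the tRNA locus

candidates = list(enumerate_fragments(trna, min_len=18, max_len=50))
lookalike_is_exclusive = is_exclusive_to_trna_space(genome, trna[:8], trna_intervals)
interior_is_exclusive = is_exclusive_to_trna_space(genome, trna[5:15], trna_intervals)
```

In this toy setup, the first eight nucleotides of the tRNA also appear in the lookalike copy outside the annotated locus, so that fragment is not exclusive to tRNA space, whereas the interior fragment is.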
Background: The advent of next generation sequencing (NGS) has allowed the discovery of short and long noncoding RNAs (ncRNAs) in an unbiased manner using reverse genetics approaches, enabling the discovery of multiple categories of ncRNAs and characterization of the way their expression is regulated. We previously showed that the identities and abundances of microRNA isoforms (isomiRs) and transfer RNA-derived fragments (tRFs) are tightly regulated, and that they depend on a person's sex and population origin, as well as on tissue type, tissue state, and disease type. Here, we characterize the regulation and distribution of fragments derived from ribosomal RNAs (rRNAs). rRNAs form a group that includes four (5S, 5.8S, 18S, 28S) rRNAs encoded by the human nuclear genome and two (12S, 16S) by the mitochondrial genome. rRNAs constitute the most abundant RNA type in eukaryotic cells. Results: We analyzed rRNA-derived fragments (rRFs) across 434 transcriptomic datasets obtained from lymphoblastoid cell lines (LCLs) derived from healthy participants of the 1000 Genomes Project. The 434 datasets represent five human populations and both sexes. We examined each of the six rRNAs and their respective rRFs, and did so separately for each population and sex. Our analysis shows that all six rRNAs produce rRFs with unique identities, normalized abundances, and lengths. The rRFs arise from the 5′-end (5′-rRFs), the interior (i-rRFs), and the 3′-end (3′-rRFs) or straddle the 5′ or 3′ terminus of the parental rRNA (x-rRFs). Notably, a large number of rRFs are produced in a population-specific or sex-specific manner. Preliminary evidence suggests that rRF production is also tissue-dependent. Of note, we find that rRF production is not affected by the identity of the processing laboratory or the library preparation kit.
Conclusions: Our findings suggest that rRFs are produced in a regimented manner by currently unknown processes that are influenced by both ubiquitous as well as population-specific and sex-specific factors. The properties of rRFs mirror the previously reported properties of isomiRs and tRFs and have implications for the study of homeostasis and disease.
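The four structural categories named in the Results (5′-rRFs, i-rRFs, 3′-rRFs, and x-rRFs) can be expressed as a small classification rule. The Python sketch below is one possible reading of those definitions, not the authors' code: it assumes 1-based inclusive coordinates relative to the parental rRNA, treats any fragment that extends past either terminus as an x-rRF, and uses a hypothetical rRNA length of 120 nt. How a fragment that spans the entire rRNA should be labeled is not stated in the abstract; here it falls into the 5′ category simply because that check comes first.

```python
def classify_rrf(start, end, rrna_length):
    """Classify a fragment by its position on the parental rRNA.

    Coordinates are 1-based and inclusive, relative to the rRNA, which
    spans positions 1..rrna_length (an assumed convention).
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if end < 1 or start > rrna_length:
        raise ValueError("fragment does not overlap the rRNA")
    if start < 1 or end > rrna_length:
        return "x-rRF"   # straddles the 5' or 3' terminus
    if start == 1:
        return "5'-rRF"  # begins exactly at the 5' end
    if end == rrna_length:
        return "3'-rRF"  # ends exactly at the 3' end
    return "i-rRF"       # lies wholly in the interior


# Hypothetical rRNA length, used only for illustration.
LENGTH = 120
labels = [
    classify_rrf(1, 20, LENGTH),
    classify_rrf(40, 60, LENGTH),
    classify_rrf(101, 120, LENGTH),
    classify_rrf(-5, 15, LENGTH),
    classify_rrf(110, 130, LENGTH),
]
```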
Background: CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR associated nucleases) is a powerful component of the prokaryotic immune system that has been adapted for targeted genetic engineering in higher organisms. A key element of CRISPR/Cas is the “guide” RNA (gRNA) that is ~20 nucleotides (nts) in length and designed to be complementary to the intended target site. An integral requirement of the CRISPR/Cas system is that the target site be followed by a protospacer adjacent motif (PAM). Care needs to be exercised during gRNA design to avoid unintended (“off-target”) interactions. Results: We designed and implemented the Off-Spotter algorithm to assist with the design of optimal gRNAs. When presented with a candidate gRNA sequence and a PAM, Off-Spotter quickly and exhaustively identifies all genomic sites that satisfy the PAM constraint and are identical or nearly-identical to the provided gRNA. Off-Spotter achieves its extreme performance through purely algorithmic means and not through hardware accelerators such as graphical processing units (GPUs). Off-Spotter also allows the user to identify on-the-fly how many and which nucleotides of the gRNA comprise the “seed”. Off-Spotter’s output includes a histogram showing the number of potential off-targets as a function of the number of mismatches. The output also includes for each potential off-target the site’s genomic location, a human genome browser hyperlink to the corresponding location, genomic annotation in the vicinity of the off-target, GC content, etc. Conclusion: Off-Spotter is very fast and flexible and can help in the design of optimal gRNAs by providing several PAM choices, a run-time definition of the seed and of the allowed number of mismatches, and a flexible output interface that allows sorting of the results, optional viewing/hiding of columns, etc.
A key element of Off-Spotter is that it does not have a rigid definition of the seed: instead, the user can declare both the seed’s location and extent on-the-fly. We expect that this flexibility in combination with Off-Spotter’s speed and richly annotated output will enable experimenters to interactively and quickly explore different scenarios and gRNA possibilities. Reviewed: This article was reviewed by Dr Eugene Koonin and Dr Frank Eisenhaber. Electronic supplementary material: The online version of this article (doi:10.1186/s13062-015-0035-z) contains supplementary material, which is available to authorized users.
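The task that Off-Spotter solves (finding every site that is followed by the PAM and matches the gRNA with at most a given number of mismatches, then summarizing the hits by mismatch count) can be illustrated with a naive linear scan in Python. This is explicitly not the Off-Spotter algorithm, which the abstract does not describe and which is far faster than a brute-force pass. The function names, the toy sequences, the forward-strand-only search, and the representation of the seed as a 0-based half-open index range are assumptions for illustration.

```python
from collections import Counter


def pam_matches(site, pam):
    """True if site matches the PAM pattern; 'N' in the pattern matches any base."""
    return len(site) == len(pam) and all(p == "N" or p == s for p, s in zip(pam, site))


def find_off_targets(genome, grna, pam="NGG", max_mismatches=2, seed=None):
    """Brute-force scan of the forward strand for sites followed by the PAM
    that differ from the gRNA at no more than max_mismatches positions.

    seed is an optional (start, end) range of gRNA indices, 0-based and
    half-open, used only to report how many mismatches fall in the seed.
    """
    k = len(grna)
    hits = []
    for i in range(len(genome) - k - len(pam) + 1):
        if not pam_matches(genome[i + k:i + k + len(pam)], pam):
            continue
        mismatched = [j for j in range(k) if genome[i + j] != grna[j]]
        if len(mismatched) > max_mismatches:
            continue
        in_seed = sum(1 for j in mismatched if seed is not None and seed[0] <= j < seed[1])
        hits.append({"pos": i, "mismatches": len(mismatched), "seed_mismatches": in_seed})
    return hits


# Hypothetical toy data: a perfect site, a one-mismatch site, and a copy
# that lacks a PAM and therefore must not be reported.
grna = "GACCTTGAACTCATGCATCA"
near = "GACCTTGAACTCATGCATCT"  # differs from grna at the last position
genome = "TT" + grna + "AGG" + "CC" + near + "TGG" + "AA" + grna + "TTT"

hits = find_off_targets(genome, grna, pam="NGG", max_mismatches=2, seed=(8, 20))
histogram = Counter(h["mismatches"] for h in hits)
```

The `histogram` counter plays the role of the mismatch histogram mentioned in the abstract, and changing `seed` at call time mirrors the idea of declaring the seed's location and extent on the fly.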