The identification of disease-related microRNAs is vital for understanding the pathogenesis of disease at the molecular level and may lead to the design of specific molecular tools for diagnosis, treatment and prevention. Experimental identification of disease-related microRNAs is difficult, and computational prediction of microRNA-disease associations is a complementary approach. However, one major issue in microRNA studies is the lack of bioinformatics programs that accurately predict microRNA-disease associations. Herein, we present a machine-learning-based approach for distinguishing positive microRNA-disease associations from negative ones. A set of features was extracted for each positive and negative microRNA-disease association, and a Support Vector Machine (SVM) classifier was trained, which achieved an area under the ROC curve of up to 0.8884 in 10-fold cross-validation. This indicates that the SVM-based approach described here can be used to predict potential microRNA-disease associations and to formulate testable hypotheses to guide future biological experiments.
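The evaluation metric named in this abstract, area under the ROC curve under 10-fold cross-validation, can be sketched in plain Python. This is only an illustration of the metric and the fold split, not the authors' pipeline: the SVM training step and the feature extraction are omitted, and the function names `roc_auc` and `k_fold_indices` are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example gets the
    higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.
    Assumption: the caller has already shuffled the samples."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

In a full evaluation, a classifier would be fitted on each `train` split, scored on the matching `test` split, and `roc_auc` applied to the held-out scores.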
Background: The GENCODE project has collected over 10,000 human long non-coding RNA (lncRNA) genes. However, the vast majority of them remain to be functionally characterized. Computational investigation of potential functions of human lncRNA genes is helpful to guide further experimental studies on lncRNAs. Results: In this study, based on expression correlation between lncRNAs and protein-coding genes across 19 human normal tissues, we used the hypergeometric test to functionally annotate a single lncRNA or a set of lncRNAs with significantly enriched functional terms among the protein-coding genes that are significantly co-expressed with the lncRNA(s). The functional terms include all nodes in the Gene Ontology (GO) and 4,380 human biological pathways collected from 12 pathway databases. We successfully mapped 9,625 human lncRNA genes to GO terms and biological pathways, and then developed the first ontology-driven user-friendly web interface named LncRNA2Function, which enables researchers to browse the lncRNAs associated with a specific functional term, the functional terms associated with a specific lncRNA, or to assign functional terms to a set of human lncRNA genes, such as a cluster of co-expressed lncRNAs. LncRNA2Function is freely available at http://mlg.hit.edu.cn/lncrna2function. Conclusions: LncRNA2Function is an important resource for further investigating the functions of a single human lncRNA, or functionally annotating a set of human lncRNAs of interest.
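The hypergeometric enrichment test mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the LncRNA2Function implementation: the function name, the argument names, and the example counts are all hypothetical, and multiple-testing correction is left out.

```python
from math import comb


def hypergeom_enrichment_pvalue(population_size, annotated_in_population,
                                sample_size, annotated_in_sample):
    """Upper-tail hypergeometric p-value P(X >= k).

    population_size         N: all protein-coding genes considered
    annotated_in_population K: genes annotated with the functional term
    sample_size             n: genes co-expressed with the lncRNA(s)
    annotated_in_sample     k: co-expressed genes that carry the term
    """
    N, K = population_size, annotated_in_population
    n, k = sample_size, annotated_in_sample
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(K, n)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)  # number of ways to draw n genes from N
    # Sum the probabilities of drawing i annotated genes for i = k .. min(K, n).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / total


# Hypothetical counts, for illustration only: 20,000 background genes,
# 150 of them annotated with a term, 400 co-expressed genes, 12 of which
# carry the term.
p_value = hypergeom_enrichment_pvalue(20000, 150, 400, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the individual terms; a term would be called enriched when its p-value, after whatever correction is chosen, falls below the significance threshold.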
Long non-coding RNAs (lncRNAs) have emerged as critical regulators of genes at epigenetic, transcriptional and post-transcriptional levels, yet which genes are regulated by a specific lncRNA remains to be characterized. To assess the effects of an lncRNA on gene expression, an increasing number of researchers have profiled genome-wide or individual gene expression changes after knocking down or overexpressing the lncRNA. Herein, we describe a curated database named LncRNA2Target, which stores lncRNA-to-target gene relationships and is publicly accessible at http://www.lncrna2target.org. A gene is considered a target of an lncRNA if it is differentially expressed after knockdown or overexpression of that lncRNA. LncRNA2Target provides a web interface through which its users can search for the targets of a particular lncRNA or for the lncRNAs that target a particular gene. Both search types are performed either by browsing a provided catalog of lncRNA names or by entering lncRNA/target gene IDs or names in a search box.
The existing large-scale genome-wide association studies (GWAS) datasets provide strong support for investigating the mechanisms of Alzheimer's disease (AD) by applying multiple methods of pathway analysis. Previous studies using selected single nucleotide polymorphisms (SNPs) with several thresholds of nominal significance for pathway analysis determined that the threshold chosen for SNPs can reflect the disease model. Presumably, then, pathway analysis with a stringent threshold to define "associated" SNPs would test the hypothesis that highly associated SNPs are enriched in one or more particular pathways. Here, we selected 599 AD variants (P < 5.00E-08) to investigate the pathways in which these variants are enriched and the cell types in which these variants are active. Our results showed that AD variants are significantly enriched in pathways of the immune system. Further analysis indicated that AD variants are significantly enriched for enhancers in a number of cell types, in particular the B-lymphocyte, which is the most substantially enriched cell type. This cell type maintains its dominance among the strongest enhancers. AD SNPs also display significant enrichment for DNase in 12 cell types, among which the top 6 significant signals are from immune cell types, including 4 B cells (top 4 significant signals) and CD14+ and CD34+ cells. In summary, our results show that these AD variants with P < 5.00E-08 are significantly enriched in pathways of the immune system and active in immune cells. To a certain degree, the genetic predisposition for development of AD is rooted in the immune system, rather than in neuronal cells.
Summary. Background: COVID-19, caused by SARS-CoV-2, has swept across more than 200 countries; more than 2,150,000 people are suffering from the disease and 140,000 have died. ACE2 is a receptor protein of SARS-CoV-2, and TMPRSS2 promotes virus proliferation and transmission. Some patients developed dysfunction in multiple organs other than the lungs. Therefore, studying the viral susceptibility of other organs is important for a deeper understanding of viral pathogenesis. Methods: An advantage of scRNA-seq data is that cell types can be identified by clustering cells on their gene expression. Because ACE2 and TMPRSS2 are highly expressed in AT2 cells of the lungs, we compared the ACE2 and TMPRSS2 expression levels of cell types from 31 organs with those of lung AT2 cells to evaluate the risk of viral infection using scRNA-seq data. Findings: For the first time, we found that the brain, gall bladder, and fallopian tube are vulnerable to COVID-19 infection. In addition, the nose, heart, small intestine, large intestine, esophagus, testis and kidney were identified as high-risk organs with high expression levels of ACE2 and TMPRSS2. Moreover, the susceptible organs were grouped into three risk levels based on TMPRSS2 expression. As a result, the respiratory system, digestive system and reproductive system are at the top risk level for COVID-19 infection. Interpretation: This study provides evidence for COVID-19 infection in the human nervous system, digestive system, reproductive system, respiratory system, circulatory system and urinary system using scRNA-seq data, which helps in the clinical diagnosis and treatment of patients.
Background: Epithelial-to-mesenchymal transition (EMT) is a process linked to metastasis and drug resistance with non-coding RNAs (ncRNAs) playing pivotal roles. We previously showed that miR-100 and miR-125b, embedded within the third intron of the ncRNA host gene MIR100HG, confer resistance to cetuximab, an anti-epidermal growth factor receptor (EGFR) monoclonal antibody, in colorectal cancer (CRC). However, whether the MIR100HG transcript itself has a role in cetuximab resistance or EMT is unknown. Methods: The correlation between MIR100HG and EMT was analyzed by curating public CRC data repositories. The biological roles of MIR100HG in EMT, metastasis and cetuximab resistance in CRC were determined both in vitro and in vivo. The expression patterns of MIR100HG, hnRNPA2B1 and TCF7L2 in CRC specimens from patients who progressed on cetuximab and patients with metastatic disease were analyzed by RNAscope and immunohistochemical staining. Results: The expression of MIR100HG was strongly correlated with EMT markers and acted as a positive regulator of EMT. MIR100HG sustained cetuximab resistance and facilitated invasion and metastasis in CRC cells both in vitro and in vivo. hnRNPA2B1 was identified as a binding partner of MIR100HG. Mechanistically, MIR100HG maintained mRNA stability of TCF7L2, a major transcriptional coactivator of Wnt/β-catenin signaling, by interacting with hnRNPA2B1. hnRNPA2B1 recognized the N6-methyladenosine (m6A) site of TCF7L2 mRNA in the presence of MIR100HG. TCF7L2, in turn, activated MIR100HG transcription, forming a feed-forward regulatory loop. The MIR100HG/hnRNPA2B1/TCF7L2 axis was augmented in specimens from CRC patients who either developed local or distant metastasis or had disease progression that was associated with cetuximab resistance. Conclusions: MIR100HG and hnRNPA2B1 interact to control the transcriptional activity of Wnt signaling in CRC via regulation of TCF7L2 mRNA stability.
Our findings identified MIR100HG as a potent EMT inducer in CRC that may contribute to cetuximab resistance and metastasis by activation of a MIR100HG/hnRNPA2B1/TCF7L2 feedback loop.
COVID-19 patients often develop dysfunction in multiple organs other than the lungs, suggesting that the novel virus SARS-CoV-2 also invades other organs. Therefore, studying the viral susceptibility of other organs is important for a deeper understanding of viral pathogenesis. Angiotensin-converting enzyme II (ACE2) is the receptor protein of SARS-CoV-2, and TMPRSS2 promotes virus proliferation and transmission. We investigated the ACE2 and TMPRSS2 expression levels of cell types from 31 organs to evaluate the risk of viral infection using single-cell RNA sequencing (scRNA-seq) data. For the first time, we found that the gall bladder and fallopian tube are vulnerable to SARS-CoV-2 infection. In addition, the nose, heart, small intestine, large intestine, esophagus, brain, testis, and kidney were identified as high-risk organs with high expression levels of ACE2 and TMPRSS2. Moreover, the susceptible organs were grouped into three risk levels based on ACE2 and TMPRSS2 expression. As a result, the respiratory system, digestive system, and urinary system are at the top risk level for SARS-CoV-2 infection. This study provides evidence for SARS-CoV-2 infection in the human nervous system, digestive system, reproductive system, respiratory system, circulatory system, and urinary system using scRNA-seq data, which helps in the clinical diagnosis and treatment of patients.