LncRNAs are regulatory noncoding RNAs that play crucial roles in many biological processes. The dysregulation of lncRNAs is thought to be involved in many complex diseases, and lncRNAs are often the targets of miRNAs in the indirect regulation of gene expression. Numerous studies have indicated that miRNA-lncRNA interactions are closely related to the occurrence and development of cancers. Thus, it is important to develop an effective method for the identification of cancer-related miRNA-lncRNA interactions. In this study, we compiled 155653 experimentally validated and predicted miRNA-lncRNA associations, which we defined as basic interactions. We next constructed an individual-specific miRNA-lncRNA network (ISMLN) for each cancer sample and a basic miRNA-lncRNA network (BMLN) for each type of cancer by examining the expression profiles of miRNAs and lncRNAs in The Cancer Genome Atlas (TCGA) database. We then selected potential miRNA-lncRNA biomarkers based on the BMLN. Using this method, we identified cancer-related miRNA-lncRNA biomarkers and modules specific to a given cancer. This method of profiling will contribute to the diagnosis and treatment of cancers at the level of gene regulatory networks.
Motivation Recent studies have shown that DNA N6-methyladenine (6mA) plays an important role in the epigenetic modification of eukaryotic organisms. 6mA has been found to be closely related to processes such as embryonic development and stress response. Developing a new algorithm to quickly and accurately identify 6mA sites in genomes is important for exploring their biological functions. Results In this paper, we propose a new classification method called MM-6mAPred, based on a Markov model that uses the transition probabilities between adjacent nucleotides to identify 6mA sites. The sensitivity and specificity of our method are 89.32% and 90.11%, respectively. The overall accuracy of our method is 89.72%, which is 6.59% higher than that of the previous method, i6mA-Pred. This indicates that, compared with the 41 nucleotide chemical properties used by i6mA-Pred, the transition probabilities between adjacent nucleotides capture more discriminative sequence information. Availability and implementation The web server of MM-6mAPred is freely accessible at http://www.insect-genome.com/MM-6mAPred/ Supplementary information Supplementary data are available at Bioinformatics online.
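The idea of classifying a site from transition probabilities between adjacent nucleotides can be illustrated with a minimal sketch. It is not the MM-6mAPred implementation: it assumes a single position-independent first-order Markov chain per class, add-one smoothing, a plain ACGT alphabet, and a log-likelihood-ratio decision rule. The function names and the toy training sequences are hypothetical.

```python
import math

NUCLEOTIDES = "ACGT"  # assumption: sequences contain only these four letters


def train_markov(sequences, pseudocount=1.0):
    """Estimate initial and transition log-probabilities from a list of sequences."""
    init = {a: pseudocount for a in NUCLEOTIDES}
    trans = {a: {b: pseudocount for b in NUCLEOTIDES} for a in NUCLEOTIDES}
    for seq in sequences:
        init[seq[0]] += 1
        # Count every adjacent pair (previous nucleotide -> next nucleotide).
        for prev, nxt in zip(seq, seq[1:]):
            trans[prev][nxt] += 1
    init_total = sum(init.values())
    log_init = {a: math.log(c / init_total) for a, c in init.items()}
    log_trans = {}
    for a in NUCLEOTIDES:
        row_total = sum(trans[a].values())
        log_trans[a] = {b: math.log(trans[a][b] / row_total) for b in NUCLEOTIDES}
    return log_init, log_trans


def log_likelihood(seq, model):
    """Log-probability of a sequence under a first-order Markov chain."""
    log_init, log_trans = model
    score = log_init[seq[0]]
    for prev, nxt in zip(seq, seq[1:]):
        score += log_trans[prev][nxt]
    return score


def predict_6ma(seq, pos_model, neg_model, threshold=0.0):
    """Call a 6mA site when the positive model explains the sequence better."""
    ratio = log_likelihood(seq, pos_model) - log_likelihood(seq, neg_model)
    return ratio > threshold


# Toy, made-up training data used only to exercise the code.
positives = ["GAGGAGGAGG", "AGGAGGAGGA"]
negatives = ["TTTTCTTTTC", "CTTTTCTTTT"]
pos_model = train_markov(positives)
neg_model = train_markov(negatives)
print(predict_6ma("GAGGAGG", pos_model, neg_model))
```

A real predictor would train on fixed-length windows centred on candidate adenines and tune the threshold on held-out data.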
As a novel class of noncoding RNAs, long noncoding RNAs (lncRNAs) have been verified to be associated with various diseases. Because large numbers of transcripts are generated every year, it is important to accurately and quickly identify lncRNAs among thousands of assembled transcripts. To accurately discover new lncRNAs, we developed a random forest (RF) classification tool named LncRNApred based on a new hybrid feature set. This hybrid feature set includes three newly proposed features: MaxORF, RMaxORF and SNR. LncRNApred classifies lncRNAs and protein-coding transcripts accurately and quickly. Moreover, our RF model only requires training on human coding and noncoding transcripts, yet transcripts from other species can also be predicted using LncRNApred. The results show that our method is more effective than the Coding Potential Calculator (CPC). The web server of LncRNApred is available for free at http://mm20132014.wicp.net:57203/LncRNApred/home.jsp.
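The abstract does not define its features, so the following is only a sketch of what an ORF-length feature such as MaxORF might compute. It assumes, hypothetically, that the feature is the length in nucleotides of the longest open reading frame (ATG through the first in-frame stop codon, stop included) across the three forward reading frames. The function name is illustrative, and RMaxORF and SNR are not covered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def max_orf_length(seq):
    """Longest ATG..stop ORF length (nt) over the three forward frames.

    Assumed definition for illustration; the published MaxORF feature may differ.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    best = 0
    for frame in range(3):
        start = None  # index of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


print(max_orf_length("CCATGAAATTTTAGGG"))
```

A value like this would be one column of the feature matrix passed to a random forest classifier alongside the other features.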
N6-methyladenine (6mA) is an important form of DNA modification associated with a wide range of biological processes. Accurately identifying 6mA sites on a genomic scale is crucial for understanding the biological functions of 6mA. However, the existing experimental techniques for detecting 6mA sites are not cost-effective, which creates a great need for new computational methods for this problem. In this paper, we developed a deep learning framework named Deep6mA to identify DNA 6mA sites without requiring any prior knowledge of 6mA or manually crafted sequence features; its performance is superior to that of other DNA 6mA prediction tools. Specifically, 5-fold cross-validation on a benchmark dataset of rice gives a sensitivity and specificity for Deep6mA of 92.96% and 95.06%, respectively, and an overall prediction accuracy of 94%. Importantly, we find that sequences with 6mA sites share similar patterns across different species. The model trained with rice data predicts the 6mA sites of three other species well (Arabidopsis thaliana, Fragaria vesca and Rosa chinensis), with a prediction accuracy over 90%. In addition, we find that (1) 6mA tends to occur at GAGG motifs, which suggests that the sequence near the 6mA site may be conserved; and (2) 6mA is enriched in the TATA box of the promoter, which may be the main mechanism by which it regulates downstream gene expression.
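The reported sensitivity, specificity and accuracy are related by simple arithmetic, sketched below. The helper names are hypothetical, and the consistency check assumes a benchmark with equal numbers of positive and negative samples, which the abstract does not state.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true 6mA sites recovered
    specificity = tn / (tn + fp)  # fraction of non-6mA sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def accuracy_from_rates(sensitivity, specificity, positive_fraction=0.5):
    """Overall accuracy implied by the two rates and the class balance.

    positive_fraction=0.5 is an assumption (balanced dataset).
    """
    return sensitivity * positive_fraction + specificity * (1 - positive_fraction)


# Under the balanced-dataset assumption, the reported rates imply about 94% accuracy.
print(round(accuracy_from_rates(0.9296, 0.9506), 4))
```

With unequal class sizes, the same two rates would give a different overall accuracy, which is why the class balance matters when comparing tools.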
Protein lysine crotonylation (Kcr) is an important type of posttranslational modification that is associated with a wide range of biological processes. The identification of Kcr sites is critical to better understanding their functional mechanisms. However, the existing experimental techniques for detecting Kcr sites are not cost-effective, creating a great need for new computational methods to address this problem. Here we describe Adapt-Kcr, an advanced deep learning model that uses adaptive embedding and is based on a convolutional neural network together with a bidirectional long short-term memory network and an attention architecture. On the independent testing set, Adapt-Kcr outperformed the current state-of-the-art Kcr prediction model, with an improvement of 3.2% in accuracy and 1.9% in the area under the receiver operating characteristic curve. Compared to other Kcr models, Adapt-Kcr also had a more robust ability to distinguish between crotonylation and other lysine modifications. Another model (Adapt-ST) was trained to predict phosphorylation sites in SARS-CoV-2 and outperformed the equivalent state-of-the-art phosphorylation site prediction model. These results indicate that self-adaptive embedding features perform better than handcrafted features in capturing discriminative information; combined with an attention architecture, they could be an effective way of identifying protein Kcr sites. Together, our Adapt framework (including learned embedding features and an attention architecture) has strong potential for the prediction of other protein posttranslational modification sites.
Summary Noncoding RNAs, in particular miRNAs and lncRNAs, play important roles in transcriptional processes and participate in the regulation of various biological functions. Despite their importance, the existing signaling pathway databases do not include information on miRNAs and lncRNAs. Here, we redesigned a novel pathway database named NcPath by integrating and visualizing a total of 178,308 human experimentally validated miRNA-target interactions (MTIs), 32,282 experimentally verified lncRNA-target interactions (LTIs), and 4,837 experimentally validated human ceRNA networks across 222 KEGG pathways (including 27 sub-categories). To expand the application potential of the redesigned NcPath database, we identified 556,798 reliable lncRNA-PCG (protein-coding gene) interaction pairs by integrating co-expression relations, ceRNA relations, co-TF-binding interactions, co-histone-modification interactions, cis-regulation relations and lncPro Tool predictions between lncRNAs and protein-coding genes. In addition, to determine the pathways in which miRNA/lncRNA targets are involved, we performed a KEGG enrichment analysis using a hypergeometric test. The NcPath database also provides information on MTIs/LTIs/ceRNA networks, PubMed IDs, gene annotations and the experimental verification method used. In summary, the NcPath database will serve as an important and continually updated platform that provides annotation and visualization of the pathways in which noncoding RNAs (miRNAs and lncRNAs) are involved, and provides support for multimodal noncoding RNA enrichment analysis. The NcPath database is freely accessible at http://ncpath.pianlab.cn/. Availability and implementation The NcPath database is freely available at http://ncpath.pianlab.cn/. The code and manual for NcPath can be found at https://github.com/Marscolono/NcPath/. Supplementary information Supplementary data are available at Bioinformatics online.
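A hypergeometric enrichment test of this kind can be sketched as follows. This is a generic one-sided test written with the Python standard library, not the NcPath code; the function and parameter names are hypothetical, and the background gene universe is an assumption the analyst must choose.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, pathway_size, n_targets, overlap):
    """One-sided p-value P(X >= overlap) for pathway enrichment.

    population   : number of genes in the background universe
    pathway_size : number of background genes annotated to the pathway
    n_targets    : number of target genes of the miRNA/lncRNA of interest
    overlap      : number of target genes that fall in the pathway
    """
    upper = min(pathway_size, n_targets)
    total = comb(population, n_targets)
    # Sum the probabilities of observing `overlap` or more pathway genes
    # when drawing n_targets genes without replacement.
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, n_targets - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers for illustration only.
print(hypergeom_enrichment_pvalue(population=10, pathway_size=5, n_targets=5, overlap=4))
```

When many pathways are tested at once, the resulting p-values would normally be corrected for multiple testing before pathways are reported as enriched.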
miRNAs are a type of small noncoding RNA. Many studies have shown that miRNAs are widely involved in the regulation of various pathways. The key to fully understanding the regulatory function of miRNAs is determining the pathways in which they participate. However, major pathway databases such as KEGG only include information on protein-coding genes. Here, we redesigned a pathway database (called miR+Pathway) by integrating and visualizing 8882 human experimentally validated miRNA-target interactions (MTIs) and 150 KEGG pathways. This database is freely accessible at http://www.insect-genome.com/miR-pathway. Researchers can intuitively determine the pathways and the genes in those pathways that are regulated by miRNAs, as well as the miRNAs that target the pathways. To determine the pathways in which the targets of one or more miRNAs are enriched, we performed a KEGG enrichment analysis of miRNA targets using the hypergeometric test. In addition, miR+Pathway provides information on MTIs, PubMed IDs and the experimental verification method. Users can retrieve the pathways regulated by an miRNA or a gene by entering its name.
Long non-coding RNAs (lncRNAs) are endogenous molecules that are longer than 200 nucleotides and lack coding potential. LncRNAs that interact with microRNAs (miRNAs) are known as competing endogenous RNAs (ceRNAs) and have the ability to regulate the expression of target genes. ceRNAs play an important role in the initiation and progression of various cancers. However, until now there has been no database that collects experimentally verified human ceRNAs. We developed the LncCeRBase database, which encompasses 432 lncRNA–miRNA–mRNA interactions, including 130 lncRNAs, 214 miRNAs and 245 genes from 300 publications. In addition, we compiled the signaling pathways associated with the included lncRNA–miRNA–mRNA interactions as a tool to explore their functions. LncCeRBase is useful for understanding the regulatory mechanisms of lncRNAs. Database URL: http://lnccerbase.it1004.com