MicroRNAs (miRNAs) play critical roles in regulating gene expression at the post-transcriptional level. Predicting disease-related miRNAs is vital to further investigation of the involvement of miRNAs in disease pathogenesis. In previous years, biological experimentation was the main method used to identify whether a miRNA was associated with a given disease. With increasing biological information and the appearance of new miRNAs every year, experimental identification of disease-related miRNAs poses considerable difficulties (e.g. time consumption and high cost). Because of these limitations of experimental methods in determining the relationship between miRNAs and diseases, computational methods have been proposed. A key to network-based prediction of potential disease-related miRNAs is the calculation of similarity among diseases and miRNAs over the networks. Different strategies lead to different results. In this review, we summarize the existing computational approaches and present the difficulties they confront, which helps clarify the current state of research. We also discuss the principles, efficiency and differences among these methods. The comprehensive comparison and discussion in this work provide constructive insights into the matter.
MicroRNAs (miRNAs) play critical roles in regulating gene expression at the post-transcriptional level. Predicting potential miRNA-disease associations is beneficial not only for exploring the pathogenesis of diseases, but also for understanding biological processes. In this work, we propose two methods that can effectively predict potential miRNA-disease associations using our reconstructed miRNA and disease similarity networks, which are based on the latest experimental data. We reconstruct a miRNA functional similarity network using the following biological information: miRNA family information, miRNA cluster information, experimentally validated miRNA-target associations and disease-miRNA information. We also reconstruct a disease similarity network using disease functional information and disease semantic information. We present two methods on the comprehensive heterogeneous network: Katz with specific weights and Katz with machine learning. These methods achieve AUC values of 0.897 and 0.919, respectively, and outperform the existing methods. Comprehensive data networks and reasonable considerations guarantee the high performance of our methods. Unlike several existing methods, the proposed methods can also predict associations for diseases without any known related miRNAs. A web service for the download and prediction of relationships between diseases and miRNAs is available at http://lab.malab.cn/soft/MDPredict/.
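To make the idea of a Katz measure on a heterogeneous network concrete, the following is a minimal sketch of a truncated Katz index, which sums damped walk counts between a miRNA node and a disease node. It is not the authors' implementation: the function name `katz_scores`, the damping factor `beta`, the walk-length cap `max_len`, and the toy matrices are all assumptions made for illustration, and the abstract does not specify the weights or the machine-learning variant.

```python
import numpy as np


def katz_scores(mirna_sim, disease_sim, assoc, beta=0.01, max_len=3):
    """Truncated Katz index between miRNAs and diseases (illustrative only).

    mirna_sim:   (n_m, n_m) miRNA similarity matrix
    disease_sim: (n_d, n_d) disease similarity matrix
    assoc:       (n_m, n_d) known miRNA-disease associations (0/1)
    beta, max_len: assumed damping factor and maximum walk length
    """
    mirna_sim = np.asarray(mirna_sim, dtype=float)
    disease_sim = np.asarray(disease_sim, dtype=float)
    assoc = np.asarray(assoc, dtype=float)
    n_m = mirna_sim.shape[0]

    # Adjacency matrix of the heterogeneous network:
    # miRNA-miRNA and disease-disease similarities on the diagonal blocks,
    # known associations on the off-diagonal blocks.
    adj = np.block([[mirna_sim, assoc], [assoc.T, disease_sim]])

    total = np.zeros_like(adj)
    walk = np.eye(adj.shape[0])
    for length in range(1, max_len + 1):
        walk = walk @ adj                   # weighted walks of this length
        total += (beta ** length) * walk    # longer walks are damped more

    # The miRNA-by-disease block holds the association scores.
    return total[:n_m, n_m:]


# Toy data (hypothetical): 3 miRNAs, 2 diseases.
# Disease 1 has no known miRNA but is similar to disease 0.
toy_mirna_sim = [[1.0, 0.5, 0.0],
                 [0.5, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]
toy_disease_sim = [[1.0, 0.8],
                   [0.8, 1.0]]
toy_assoc = [[1, 0],
             [0, 0],
             [0, 0]]
scores = katz_scores(toy_mirna_sim, toy_disease_sim, toy_assoc)
```

In this toy setting, the miRNA linked to disease 0 receives a non-zero score for disease 1 purely through the disease similarity edge, which illustrates how a walk-based score can rank candidates for a disease with no known related miRNAs.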
Raw sequencing reads of miRNAs contain machine-made substitution errors, or even insertions and deletions (indels). Although the error rate can be as low as 0.1%, precise rectification of these errors is critically important because isoform variation analysis at single-base resolution, such as novel isomiR discovery, understanding of editing events, differential expression analysis, or tissue-specific isoform identification, is very sensitive to base positions and copy counts of the reads. Existing error correction methods do not work for miRNA sequencing data because the length and per-read-coverage properties of miRNAs are distinct from those of DNA or mRNA sequencing reads. We present a novel lattice structure combining k-mers, (k – 1)-mers and (k + 1)-mers to address this problem. The method is particularly effective for the correction of indel errors. Extensive tests on datasets with known ground truth of errors demonstrate that the method is able to remove almost all of the errors, without introducing any new error, improving the data quality from one error in every 50 reads to one error in every 1300 reads. Studies on experimental miRNA sequencing datasets show that the errors are often rectified at the 5′ ends and the seed regions of the reads, and that there are remarkable changes after the correction in miRNA isoform abundance, volume of singleton reads, overall entropy, isomiR families, tissue-specific miRNAs, and rare-miRNA quantities.
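The abstract does not describe how the lattice is built, so the following is only a rough sketch of the underlying intuition: count substrings at three sizes and link each k-mer to the observed (k – 1)-mers and (k + 1)-mers that differ from it by a single-base deletion or insertion. The function names (`count_mers`, `deletion_neighbors`, `build_links`) and the toy reads are hypothetical, and no correction step is attempted here.

```python
from collections import Counter


def count_mers(reads, size):
    """Count every substring of the given size across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - size + 1):
            counts[read[i:i + size]] += 1
    return counts


def deletion_neighbors(mer):
    """All distinct strings obtained by deleting exactly one base."""
    return {mer[:i] + mer[i + 1:] for i in range(len(mer))}


def build_links(reads, k):
    """Link observed k-mers to observed (k-1)-mers and (k+1)-mers.

    Returns (levels, down, up):
      levels[size] -> Counter of mers of that size
      down[kmer]   -> observed (k-1)-mers reachable by one deletion
      up[kmer]     -> observed (k+1)-mers reachable by one insertion
    """
    levels = {size: count_mers(reads, size) for size in (k - 1, k, k + 1)}

    down = {
        mer: [s for s in sorted(deletion_neighbors(mer)) if s in levels[k - 1]]
        for mer in levels[k]
    }

    up = {mer: [] for mer in levels[k]}
    for longer in levels[k + 1]:
        # A (k+1)-mer is an insertion neighbor of every k-mer obtained
        # by deleting one of its bases.
        for shorter in deletion_neighbors(longer):
            if shorter in up:
                up[shorter].append(longer)

    return levels, down, up


# Toy reads (hypothetical): five identical reads plus one read
# carrying a single inserted G.
toy_reads = ["ACGTAC"] * 5 + ["ACGGTAC"]
levels, down, up = build_links(toy_reads, k=4)
```

Here the frequent 4-mer `ACGT` is linked upward to the rare 5-mer `ACGGT`, the kind of count imbalance across neighboring sizes that an indel-aware corrector could use as evidence of an insertion error.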