Motivation: MicroRNAs (miRNAs) are a highly abundant class of non-coding RNA genes involved in cellular regulation and thus also diseases. Despite miRNAs being important disease factors, miRNA–disease associations remain low in number and of variable reliability. Furthermore, existing databases and prediction methods do not explicitly facilitate forming hypotheses about the possible molecular causes of the association, thereby making the path to experimental follow-up longer. Results: Here we present miRPD in which miRNA–Protein–Disease associations are explicitly inferred. Besides linking miRNAs to diseases, it directly suggests the underlying proteins involved, which can be used to form hypotheses that can be experimentally tested. The inference of miRNAs and diseases is made by coupling known and predicted miRNA–protein associations with protein–disease associations text mined from the literature. We present scoring schemes that allow us to rank miRNA–disease associations inferred from both curated and predicted miRNA targets by reliability and thereby to create high- and medium-confidence sets of associations. Analyzing these, we find statistically significant enrichment for proteins involved in pathways related to cancer and type I diabetes mellitus, suggesting either a literature bias or a genuine biological trend. We show by example how the associations can be used to extract proteins for disease hypothesis. Availability and implementation: All datasets, software and a searchable Web site are available at http://mirpd.jensenlab.org. Contact: lars.juhl.jensen@cpr.ku.dk or gorodkin@rth.dk
Alternative splicing (AS) is an important contributor to proteome diversity and is regarded as an explanatory factor for the relatively low number of human genes compared with less complex animals. To assess the evolutionary conservation of AS and its developmental regulation, we have investigated the qualitative and quantitative expression of 21 orthologous alternative splice events through the development of 2 nematode species separated by 85-110 Myr of evolutionary time. We demonstrate that most of these alternative splice events present in Caenorhabditis elegans are conserved in Caenorhabditis briggsae. Moreover, we find that relative isoform expression levels vary significantly during development for 78% of the AS events and that this quantitative variation is highly conserved between the 2 species. Our results suggest that AS is generally tightly regulated through development and that the regulatory mechanisms controlling AS are to a large extent conserved during the evolution of Caenorhabditis. This strong conservation indicates that both major and minor splice forms have important functional roles and that the relative quantities in which they are expressed are crucial. Our results therefore suggest that the quantitative regulation of isoform expression levels is an intrinsic part of most AS events. Moreover, our results indicate that AS contributes little to transcript variation in Caenorhabditis genes and that gene duplication may be the major evolutionary mechanism for the origin of novel transcripts in these 2 species.
Molecular detection of viruses has been aided by high-throughput sequencing, permitting the genomic characterization of emerging strains. In this study, we comprehensively screened 500 respiratory secretions from children with upper and/or lower respiratory tract infections for viral pathogens. The viruses detected are described, including a divergent human parainfluenza virus type 4 (HPIV4) detected by GS FLX pyrosequencing of 92 specimens. Full-genome characterization of the virus followed, using Single Molecule, Real-Time (SMRT) sequencing. Subsequent “primer walking” combined with Sanger sequencing validated the RS platform's utility in viral sequencing from complex clinical samples. Comparative genomics reveals that the divergent strain clusters with the only completely sequenced HPIV4a subtype. However, it also exhibits various structural features present in one of the HPIV4b reference strains, raising questions about the lifecycle of these viruses and the evolutionary relationships among them. Clinical data from patients infected with the strain, as well as viral prevalence estimates obtained using real-time PCR, are also described.