Transcriptomic analyses have identified tens of thousands of intergenic, intronic, and cis-antisense long noncoding RNAs (lncRNAs) that are expressed from mammalian genomes. Despite progress in functional characterization, little is known about the post-transcriptional regulation of lncRNAs and their half-lives. Although many are easily detectable by a variety of techniques, it has been assumed that lncRNAs are generally unstable, but this has not been examined genome-wide. Utilizing a custom noncoding RNA array, we determined the half-lives of ~800 lncRNAs and ~12,000 mRNAs in the mouse Neuro-2a cell line. We find only a minority of lncRNAs are unstable. LncRNA half-lives vary over a wide range, comparable to, although on average less than, that of mRNAs, suggestive of complex metabolism and widespread functionality. Combining half-lives with comprehensive lncRNA annotations identified hundreds of unstable (half-life < 2 h) intergenic, cis-antisense, and intronic lncRNAs, as well as lncRNAs showing extreme stability (half-life > 16 h). Analysis of lncRNA features revealed that intergenic and cis-antisense RNAs are more stable than those derived from introns, as are spliced lncRNAs compared to unspliced (single exon) transcripts. Subcellular localization of lncRNAs indicated widespread trafficking to different cellular locations, with nuclear-localized lncRNAs more likely to be unstable. Surprisingly, one of the least stable lncRNAs is the well-characterized paraspeckle RNA Neat1, suggesting Neat1 instability contributes to the dynamic nature of this subnuclear domain. We have created an online interactive resource (http://stability.matticklab.com) that allows easy navigation of lncRNA and mRNA stability profiles and provides a comprehensive annotation of ~7200 mouse lncRNAs.
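As a rough illustration of how a half-life and the stability classes quoted above relate, the sketch below fits a first-order decay to a time course and applies the < 2 h and > 16 h thresholds from the abstract. The abstract does not describe the fitting procedure, so the log-linear least-squares fit, the function names, and the example time course are all assumptions made for illustration only.

```python
import math

def half_life_hours(times_h, abundances):
    """Estimate a half-life assuming first-order decay (an assumption,
    not the authors' stated method): fit ln(abundance) = a - k * t by
    ordinary least squares and return ln(2) / k."""
    logs = [math.log(a) for a in abundances]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = -sxy / sxx  # decay rate constant per hour
    if k <= 0:
        return math.inf  # no measurable decay over the time course
    return math.log(2) / k

def stability_class(t_half):
    """Apply the thresholds quoted in the abstract (< 2 h, > 16 h).
    The label 'intermediate' is a hypothetical name for everything else."""
    if t_half < 2:
        return "unstable"
    if t_half > 16:
        return "highly stable"
    return "intermediate"

# Hypothetical time course: abundance halves every hour.
example = half_life_hours([0, 1, 2, 4], [100, 50, 25, 6.25])
print(round(example, 3), stability_class(example))
```

A transcript whose abundance does not fall over the time course yields an infinite half-life here, which the classifier reports as highly stable.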
Background: Several lines of evidence suggest that transcription factors are involved in the pathogenesis of Multiple Sclerosis (MS) but complete mapping of the whole network has been elusive. One of the reasons is that there are several clinical subtypes of MS and transcription factors that may be involved in one subtype may not be in others. We investigate the possibility that this network could be mapped using microarray technologies and contemporary bioinformatics methods on a dataset derived from whole blood in 99 untreated MS patients (36 Relapsing-Remitting MS, 43 Primary Progressive MS, and 20 Secondary Progressive MS) and 45 age-matched healthy controls. Methodology/Principal Findings: We have used two different analytical methodologies: a non-standard differential expression analysis and a differential co-expression analysis, which have converged on a significant number of regulatory motifs that are statistically overrepresented in genes that are either differentially expressed (or differentially co-expressed) in cases and controls (e.g., V$KROX_Q6, p-value <3.31E-6; V$CREBP1_Q2, p-value <9.93E-6; V$YY1_02, p-value <1.65E-5). Conclusions/Significance: Our analysis uncovered a network of transcription factors that potentially dysregulate several genes in MS or one or more of its disease subtypes. The most significant transcription factor motifs were for the Early Growth Response EGR/KROX family, ATF2, YY1 (Yin and Yang 1), E2F-1/DP-1 and E2F-4/DP-2 heterodimers, SOX5, and CREB and ATF families. These transcription factors are involved in early T-lymphocyte specification and commitment as well as in oligodendrocyte dedifferentiation and development, both pathways that have significant biological plausibility in MS causation.
The public health system has limited economic resources, so it is necessary to know how those resources are being used and whether they are properly distributed. Several works have applied classical approaches based on Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA) for this purpose. However, these approaches are not well suited to comparing hospitals with different casemix. To avoid biases in the comparisons, other works have recommended using hospital production data corrected by the weights from Diagnosis Related Groups (DRGs) to adjust for the casemix of hospitals. However, not all countries have this tool fully implemented, which limits the efficiency evaluation. This paper proposes a new approach for evaluating the efficiency of hospitals. It uses a graph-based clustering algorithm to find groups of hospitals that have similar production profiles. Then, DEA is used to evaluate the technical efficiency of each group. The proposed approach is tested using the 2014 production data of 193 Chilean public hospitals. The results identify different performance profiles for each group, which differ from those of other studies that employ data from partially implemented DRGs. Our results deliver a better description of the resource management of the different groups of hospitals. We have created a website with the results (bioinformatic.diinf.usach.cl/publichealth). Data can be requested from the authors.
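The idea of scoring hospitals only against peers in the same cluster can be sketched in a few lines. The example below is a deliberately simplified stand-in for DEA: it covers only the special case of one input and one output under constant returns to scale, where the CCR efficiency score reduces to a unit's output/input ratio divided by the best ratio in its group. The abstract does not state the DEA model, inputs, or outputs used, and the hospital names, group labels, and figures are hypothetical.

```python
from collections import defaultdict

def ccr_efficiency_by_group(units):
    """units: list of (name, group, input_amount, output_amount) tuples,
    with strictly positive inputs and outputs (an assumption).

    Single-input, single-output special case of constant-returns DEA:
    efficiency = (output / input) / (best output / input in the group).
    A full DEA with several inputs and outputs needs a linear program
    per unit; this sketch does not attempt that."""
    best_ratio = defaultdict(float)
    for _, group, x, y in units:
        best_ratio[group] = max(best_ratio[group], y / x)
    return {name: (y / x) / best_ratio[group] for name, group, x, y in units}

# Hypothetical hospitals, already assigned to clusters "g1" and "g2".
hospitals = [
    ("H1", "g1", 10.0, 100.0),
    ("H2", "g1", 10.0, 50.0),
    ("H3", "g2", 20.0, 40.0),
]
print(ccr_efficiency_by_group(hospitals))
```

Because each unit is compared only with its own group, H3 scores as fully efficient even though its raw output/input ratio is lower than H2's, which is the point of grouping by production profile before benchmarking.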
Background: Biologists aim to understand the genetic background of diseases, metabolic disorders or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information under different conditions. Clustering is one of the main techniques used to analyse this data; it aims at finding groups of genes that have some criterion in common, such as a similar expression profile. However, the problem of finding groups is normally multi-dimensional, making it necessary to approach the clustering as a multi-objective problem where various cluster validity indexes are simultaneously optimised. These indexes are usually based on criteria like compactness and separation, which may not be sufficient since they cannot guarantee the generation of clusters that have both similar expression patterns and biological coherence. Method: We propose a Multi-Objective Clustering algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) to find clusters of genes with high levels of co-expression, biological coherence, and also good compactness and separation. Cluster quality indexes are used to optimise simultaneously gene relationships at expression level and biological functionality. Our proposal also includes intensification and diversification strategies to improve the search process. Results: The effectiveness of the proposed algorithm is demonstrated on four publicly available datasets. Comparative studies of the use of different objective functions and other widely used microarray clustering techniques are reported. Statistical, visual and biological significance tests are carried out to show the superiority of the proposed algorithm. Conclusions: Integrating a-priori biological knowledge into a multi-objective approach and using intensification and diversification strategies allow the proposed algorithm to find solutions with higher quality than other microarray clustering techniques available in the literature in terms of co-expression, biological coherence, compactness and separation.
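Optimising several cluster validity indexes at once usually means that no single clustering is best on every index, so candidate solutions are compared by Pareto dominance. The sketch below shows that generic comparison only; it is not MOC-GaPBK, the abstract does not say how the algorithm selects solutions, and the objective vectors are hypothetical scores where lower is better on both axes.

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b on every
    objective and strictly better on at least one (minimisation)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]

# Hypothetical (compactness cost, biological incoherence) scores for
# four candidate clusterings; lower is better for both.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(pareto_front(candidates))  # (3, 3) is dominated by (2, 2)
```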
In microbiology, identification of all isolates by sequencing is still unfeasible in small research laboratories. Therefore, many yeast diversity studies follow a screening procedure consisting of clustering the yeast isolates using MSP-PCR fingerprinting, followed by identification of one or a few selected representatives of each cluster by sequencing. Although this procedure has been widely applied in the literature, it has not been properly validated. We evaluated a standardized protocol using MSP-PCR fingerprinting with the primers (GTG)5 and M13 for the discrimination of wine-associated yeasts in South Brazil. Two datasets were used: yeasts isolated from bottled wines and yeasts isolated from vineyard environments. We compared the discriminatory power of both primers in a subset of 16 strains, choosing the primer (GTG)5 for further evaluation. Afterwards, we applied this technique to 245 strains, and compared the results with the identification obtained by partial sequencing of the LSU rRNA gene, considered as the gold standard. An array matrix was constructed for each dataset and used as input for clustering with two methods (hierarchical dendrograms and QAPGrid layout). For both yeast datasets, unrelated species were clustered in the same group. The sensitivity score of (GTG)5 MSP-PCR fingerprinting was high, but specificity was low. In conclusion, the yeast diversity inferred in several previous studies may have been underestimated and some isolates were probably misidentified owing to reliance on this screening procedure.
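One way to see how a fingerprint clustering can have high sensitivity but low specificity is to score it over pairs of isolates against the sequencing-based species labels. The abstract does not say how the two scores were computed, so the pairwise definition below is an assumption: a true positive is a pair of the same species placed in the same cluster, and a true negative is a pair of different species placed in different clusters. The labels in the example are hypothetical.

```python
from itertools import combinations

def pairwise_sensitivity_specificity(cluster_labels, species_labels):
    """Score a clustering against gold-standard species labels using
    all unordered pairs of isolates (an assumed definition)."""
    tp = fn = tn = fp = 0
    for i, j in combinations(range(len(cluster_labels)), 2):
        same_cluster = cluster_labels[i] == cluster_labels[j]
        same_species = species_labels[i] == species_labels[j]
        if same_species and same_cluster:
            tp += 1
        elif same_species:
            fn += 1
        elif same_cluster:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity

# Hypothetical case: two species lumped into a single fingerprint cluster.
print(pairwise_sensitivity_specificity([0, 0, 0, 0], ["A", "A", "B", "B"]))
```

When unrelated species fall into one cluster, every same-species pair is still grouped together (sensitivity stays high) while different-species pairs are never separated (specificity drops), which matches the pattern the abstract reports.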
Background: The visualization of large volumes of data is a computationally challenging task that often promises rewarding new insights. There is great potential in the application of new algorithms and models from combinatorial optimisation. Datasets often contain “hidden regularities” and a combined identification and visualization method should reveal these structures and present them in a way that helps analysis. While several methodologies exist, including those that use non-linear optimization algorithms, severe limitations exist even when working with only a few hundred objects. Methodology/Principal Findings: We present a new data visualization approach (QAPgrid) that reveals patterns of similarities and differences in large datasets of objects for which a similarity measure can be computed. Objects are assigned to positions on an underlying square grid in a two-dimensional space. We use the Quadratic Assignment Problem (QAP) as a mathematical model to provide an objective function for assignment of objects to positions on the grid. We employ a Memetic Algorithm (a powerful metaheuristic) to tackle the large instances of this NP-hard combinatorial optimization problem, and we show its performance on the visualization of real data sets. Conclusions/Significance: Overall, the results show that the QAPgrid algorithm is able to produce a layout that represents the relationships between objects in the data set. Furthermore, it also represents the relationships between clusters that are fed into the algorithm. We apply QAPgrid to the 84 Indo-European languages instance, producing a near-optimal layout. Next, we produce a layout of 470 world universities with an observed high degree of correlation with the score used by the Academic Ranking of World Universities compiled by Shanghai Jiao Tong University, without the need for an ad hoc weighting of attributes. Finally, our Gene Ontology-based study on Saccharomyces cerevisiae fully demonstrates the scalability and precision of our method as a novel alternative tool for functional genomics.
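The role of the QAP objective in a grid layout can be illustrated on a toy instance. The sketch below uses the generic QAP cost, the sum over object pairs of a pairwise weight times the distance between their assigned grid cells, so that minimising it places similar objects close together. Using similarity as the weight and Manhattan distance between cells are assumptions, since the abstract does not give the exact formulation, and exhaustive search over permutations replaces the Memetic Algorithm here purely because the example has four objects.

```python
from itertools import permutations

def grid_distances(rows, cols):
    """Manhattan distances between all cells of a rows x cols grid,
    with cells indexed in row-major order."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    return [[abs(r1 - r2) + abs(c1 - c2) for (r2, c2) in cells]
            for (r1, c1) in cells]

def qap_cost(similarity, distance, assignment):
    """Generic QAP objective: assignment[i] is the cell of object i."""
    n = len(assignment)
    return sum(similarity[i][j] * distance[assignment[i]][assignment[j]]
               for i in range(n) for j in range(n))

def brute_force_layout(similarity, distance):
    """Exhaustive search, feasible only for tiny instances; it stands in
    for the metaheuristic that real problem sizes would require."""
    n = len(similarity)
    return min(permutations(range(len(distance)), n),
               key=lambda p: qap_cost(similarity, distance, p))

# Hypothetical similarities: objects 0 and 1 are alike, as are 2 and 3.
sim = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
dist = grid_distances(2, 2)
best = brute_force_layout(sim, dist)
print(best, qap_cost(sim, dist, best))
```

On the 2 x 2 grid the best layouts put each similar pair in adjacent cells. The number of permutations grows factorially with the number of objects, which is why exhaustive search does not scale and the abstract turns to a metaheuristic.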