Background: microRNAs (miRNAs) are a class of small non-coding RNAs which have been recognized as ubiquitous post-transcriptional regulators. The analysis of interactions between different miRNAs and their target genes is necessary for the understanding of miRNAs' role in the control of cell life and death. In this paper we propose a novel data mining algorithm, called HOCCLUS2, specifically designed to bicluster miRNAs and target messenger RNAs (mRNAs) on the basis of their experimentally-verified and/or predicted interactions. Indeed, existing biclustering approaches, typically used to analyze gene expression data, fail when applied to miRNA:mRNA interactions since they usually do not extract possibly overlapping biclusters (miRNAs and their target genes may have multiple roles), extract a huge number of biclusters (difficult to browse and rank on the basis of their importance) and work on similarities of feature values (do not limit the analysis to reliable interactions).
Results: To overcome these limitations, HOCCLUS2 i) extracts possibly overlapping biclusters, to catch multiple roles of both miRNAs and their target genes; ii) extracts hierarchically organized biclusters, to facilitate bicluster browsing and to distinguish between universe and pathway-specific miRNAs; iii) extracts highly cohesive biclusters, to consider only reliable interactions; iv) ranks biclusters according to the functional similarities, computed on the basis of Gene Ontology, to facilitate bicluster analysis.
Conclusions: Our results show that HOCCLUS2 is a valid tool to support biologists in the identification of context-specific miRNA regulatory modules and in the detection of possibly unknown miRNA target genes.
Indeed, results prove that HOCCLUS2 is able to extract cohesiveness-preserving biclusters, when compared with competitive approaches, and statistically confirm (at a confidence level of 99%) that mRNAs which belong to the same biclusters are, on average, more functionally similar than mRNAs which belong to different biclusters. Finally, the hierarchy of biclusters provides useful insights to understand the intrinsic hierarchical organization of miRNAs and their potential multiple interactions on target genes.
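To make the notion of a "highly cohesive" bicluster concrete, the following is a minimal sketch and not the HOCCLUS2 implementation. It assumes that cohesiveness can be read as the mean interaction score over all miRNA:mRNA pairs of a bicluster, with missing pairs counted as 0; the function name `cohesiveness`, the identifiers and the scores are all hypothetical.

```python
from itertools import product


def cohesiveness(mirnas, mrnas, scores):
    """Mean interaction score over all miRNA:mRNA pairs of a bicluster.

    Assumption: pairs missing from `scores` count as 0 (no known interaction).
    """
    pairs = list(product(mirnas, mrnas))
    if not pairs:
        return 0.0
    return sum(scores.get(pair, 0.0) for pair in pairs) / len(pairs)


# Hypothetical interaction scores in [0, 1]; the names are placeholders.
scores = {
    ("miRNA_1", "mRNA_a"): 1.0,
    ("miRNA_1", "mRNA_b"): 0.5,
    ("miRNA_2", "mRNA_a"): 1.0,
    ("miRNA_2", "mRNA_b"): 0.5,
    ("miRNA_3", "mRNA_c"): 1.0,
}

# A small bicluster where every pair interacts.
tight = cohesiveness(["miRNA_1", "miRNA_2"], ["mRNA_a", "mRNA_b"], scores)

# A larger bicluster that also covers many non-interacting pairs.
loose = cohesiveness(
    ["miRNA_1", "miRNA_2", "miRNA_3"],
    ["mRNA_a", "mRNA_b", "mRNA_c"],
    scores,
)
```

Under this reading, merging biclusters tends to lower cohesiveness because the merged bicluster covers more pairs without a reliable interaction, which is one way a hierarchy of progressively larger, less cohesive biclusters could arise.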
Networks have an increasing impact on modern life, and network cybersecurity has therefore become an important research field. Several machine learning techniques have been developed to build network intrusion detection systems that correctly detect unforeseen cyber-attacks at the network level. For example, deep artificial neural network architectures have recently achieved state-of-the-art results. In this paper a novel deep neural network architecture is defined, in order to learn flexible and effective intrusion detection models, by combining an unsupervised stage for multi-channel feature learning with a supervised one exploiting feature dependencies across channels. The aim is to investigate whether class-specific features of the network flows can be learned and added to the original ones in order to increase the model accuracy. In particular, in the unsupervised stage, two autoencoders are learned separately, one on normal flows and one on attack flows. As the top layer in the decoder of these autoencoders reconstructs samples in the same space as the input, the autoencoders can be used to define two new feature vectors, allowing each network flow to be represented as a multi-channel sample. In the supervised stage, a multi-channel parametric convolution is adopted, in order to learn the effect of each channel on the others. In particular, as the samples belong to two different distributions (normal and attack flows), a sample labelled as normal should be more similar to its reconstruction by the normal autoencoder than to its reconstruction by the attack autoencoder, and vice versa. This expected dependency is exploited to better disentangle the differences between normal and attack flows. The proposed neural network architecture leads to better predictive accuracy when compared to competitive intrusion detection architectures on three benchmark datasets.
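The construction of a multi-channel sample can be illustrated with a toy sketch. This is not the paper's architecture: the two trained autoencoders are replaced by a deliberately crude stand-in (an assumption) that pulls a sample halfway toward the mean of the class it was fitted on, and all names and values are hypothetical. It only shows how an original flow and its two reconstructions are stacked, and why a normal flow is expected to lie closer to its "normal" reconstruction.

```python
def fit_mean(samples):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def make_reconstructor(samples, shrink=0.5):
    """Stand-in for a trained autoencoder (assumption, not a neural network):
    reconstruct x by pulling it toward the mean of the fitted class."""
    mean = fit_mean(samples)

    def reconstruct(x):
        return [m + shrink * (v - m) for v, m in zip(x, mean)]

    return reconstruct


def multi_channel(x, rec_normal, rec_attack):
    """Stack the original flow and its two reconstructions as three channels."""
    return [list(x), rec_normal(x), rec_attack(x)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


# Hypothetical, already normalised flow features.
normal_flows = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
attack_flows = [[0.9, 0.8], [1.0, 0.9], [0.8, 1.0]]

rec_normal = make_reconstructor(normal_flows)
rec_attack = make_reconstructor(attack_flows)

# A flow that resembles normal traffic, represented as a 3-channel sample.
sample = multi_channel([0.15, 0.15], rec_normal, rec_attack)
```

In the described architecture the supervised stage would then learn from the relationship between these channels; here the relationship is simply visible as a smaller distance between channel 0 and channel 1 than between channel 0 and channel 2.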
Sensor networks, communication and financial networks, web and social networks are becoming increasingly important in our day-to-day life. They contain entities which may interact with one another. These interactions are often characterized by a form of autocorrelation, where the value of an attribute at a given entity depends on the values at the entities it is interacting with. In this situation, the collective inference paradigm offers a unique opportunity to improve the performance of predictive models on network data, as interacting instances are labeled simultaneously by dealing with autocorrelation. Several recent works have shown that collective inference is a powerful paradigm, but it has mainly been developed for a fully labeled training network. In practice, while it may be cheap to acquire the network topology, it may be costly to acquire node labels for training. In this paper, we examine how to explicitly consider autocorrelation when performing regression inference within network data. In particular, we study collective regression in the transductive setting, where a sparsely labeled network is the common situation. We present an algorithm, called CORENA (COllective REgression in Network dAta), to assign a numeric label to each instance in the network. In particular, we iteratively augment the representation of each instance with instances sharing correlated representations across the network. In this way, the proposed learning model is able to capture autocorrelations of labels over a group of related instances and to feed the more reliable labels predicted by the transduction back into the labeled network. Empirical studies demonstrate that the proposed approach can boost regression performance in several spatial and social tasks.
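The idea of exploiting label autocorrelation in a sparsely labeled network can be sketched with a much simpler scheme than CORENA. The code below is a generic neighbour-averaging propagation, named here plainly as a stand-in and not the algorithm from the paper: unlabeled nodes repeatedly take the mean of their neighbours' current values while labeled nodes stay fixed. The function name, the graph and the labels are hypothetical.

```python
def propagate_labels(edges, known, n_iter=50):
    """Estimate numeric labels for unlabeled nodes by repeated neighbour averaging.

    edges: iterable of undirected (u, v) pairs.
    known: dict mapping labeled nodes to their numeric label.
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    # Unlabeled nodes start from the mean of the known labels.
    start = sum(known.values()) / len(known)
    values = {node: known.get(node, start) for node in neighbours}

    for _ in range(n_iter):
        updated = dict(values)
        for node, nbrs in neighbours.items():
            if node in known:
                continue  # labeled nodes keep their observed value
            updated[node] = sum(values[n] for n in nbrs) / len(nbrs)
        values = updated
    return values


# Hypothetical path network a-b-c-d-e with labels only at the two ends.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
known = {"a": 0.0, "e": 4.0}
predicted = propagate_labels(edges, known, n_iter=200)
```

On this path the interior nodes settle at evenly spaced values between the two labeled ends, which is the behaviour one expects when labels are positively autocorrelated along the edges.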
Most work on learning from networked data assumes that the network is static. In this paper we consider a different scenario, where the network is dynamic, i.e. nodes/relationships can be added or removed and relationships can change in their type over time. We assume that the “core” of the network is more stable than the “marginal” part of the network; nevertheless, it can change with time. These changes are of interest for this work, since they reflect a crucial step in the network evolution. Indeed, we tackle the problem of discovering evolution chains, which express the temporal evolution of the “core” of the network. To describe the “core” of the network, we follow a frequent pattern-mining approach, with the critical difference that the frequency of a pattern is computed along a time-period and not on a static dataset. The proposed method proceeds in two steps: 1) identification of changes through the discovery of emerging patterns; 2) composition of evolution chains by joining emerging patterns. We test the effectiveness of the method on both real and synthetic data.
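The two steps can be sketched as follows, under several assumptions that are not taken from the paper: pattern supports per time window are already computed, a pattern is "emerging" when its support grows by at least a factor `min_growth` between consecutive windows (the classical growth-rate definition), and two emerging patterns are joined when the first ends at the window where the second starts. Pattern names and supports are hypothetical.

```python
def emerging_patterns(supports, min_growth=2.0):
    """Step 1: find (pattern, t, t + 1) triples whose support grows by at
    least `min_growth` between consecutive time windows."""
    found = []
    for pattern, series in supports.items():
        for t in range(len(series) - 1):
            before, after = series[t], series[t + 1]
            # A pattern appearing from zero support counts as emerging.
            if after > 0 and (before == 0 or after / before >= min_growth):
                found.append((pattern, t, t + 1))
    return found


def build_chains(eps):
    """Step 2: join emerging patterns end-to-start into maximal chains."""
    # Chains start at emerging patterns that no other one leads into.
    pending = [[ep] for ep in eps if not any(other[2] == ep[1] for other in eps)]
    complete = []
    while pending:
        chain = pending.pop()
        successors = [ep for ep in eps if ep[1] == chain[-1][2]]
        if not successors:
            complete.append(chain)
        for ep in successors:
            pending.append(chain + [ep])
    return complete


# Hypothetical supports of three patterns over three time windows.
supports = {
    "A-B": [0.1, 0.4, 0.4],
    "B-C": [0.3, 0.3, 0.9],
    "C-D": [0.5, 0.5, 0.5],
}

eps = emerging_patterns(supports)
chains = build_chains(eps)
```

Here "A-B" emerges between windows 0 and 1 and "B-C" between windows 1 and 2, so the sketch joins them into a single chain, while the stable pattern "C-D" contributes nothing.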
Bisociations represent interesting relationships between seemingly unconnected concepts from two or more contexts. Most of the existing approaches that permit the discovery of bisociations from data rely on the assumption that contexts are static or can be considered as unchangeable domains. In fact, several real-world domains are intrinsically dynamic and can change over time. The same domain can change and become completely different from what it was before: a dynamic domain observed at different time-points can present different representations and can reasonably be treated as a series of distinct static domains. In this work, we investigate the task of linking concepts from a dynamic domain through the discovery of bisociations which link concepts over time. This provides us with a means to unearth linkages which are not discovered when observing the domain as static, but which may have developed over time, when considering its dynamic nature. We propose a computational solution which, assuming a time interval-based discretization of the domain, explores the spaces of association rules mined in the intervals and chains the rules on the basis of concept generalization and information theory criteria. The application to literature-based discovery shows how the method can re-discover known connections in biomedical terminology. Experiments and comparisons using alternative techniques highlight the additional peculiarities of this work.
Background: Many studies report on the detection and functional characterization of cis-regulatory motifs in untranslated regions (UTRs) of mRNAs, but little is known about the nature and functional role of their distribution. To address this issue we have developed a computational approach based on the use of data mining techniques. The idea is to mine frequent combinations of translation regulatory motifs, since their significant co-occurrences could reveal functional relationships important for the post-transcriptional control of gene expression. The experimentation has been focused on targeted mitochondrial transcripts to elucidate the role of translational control in mitochondrial biogenesis and function.
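Mining frequent combinations of motifs can be illustrated with a brute-force sketch. It is not the authors' method: it simply enumerates every motif combination of size 2 to `max_size` per transcript and keeps those whose support (fraction of transcripts containing the whole combination) reaches `min_support`. A real miner would prune candidates, for example Apriori-style. The function name, transcript identifiers and motif names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_motif_sets(transcripts, min_support=0.5, max_size=3):
    """Return {motif combination: support} for combinations that occur in
    at least `min_support` of the transcripts."""
    counts = Counter()
    for motifs in transcripts.values():
        for size in range(2, max_size + 1):
            # Sorting gives each combination a single canonical form.
            for combo in combinations(sorted(motifs), size):
                counts[combo] += 1
    total = len(transcripts)
    return {
        combo: count / total
        for combo, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical UTR annotations: transcript -> set of motifs found in it.
transcripts = {
    "t1": {"m1", "m2", "m3"},
    "t2": {"m1", "m2"},
    "t3": {"m2", "m3"},
    "t4": {"m1", "m2", "m4"},
}

frequent = frequent_motif_sets(transcripts, min_support=0.5)
```

With these toy annotations only the pairs ("m1", "m2") and ("m2", "m3") reach the threshold; whether such co-occurrences are statistically significant would need a separate test that this sketch does not attempt.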