The identification of early and stage-specific biomarkers for Alzheimer's disease (AD) is critical, as the development of disease-modifying therapies may depend on the discovery and validation of such markers. Identifying reliable early biomarkers in turn depends on new diagnostic algorithms that can computationally exploit the information in large biological datasets. To identify potential biomarkers from mRNA expression profile data, we used the Logic Mining method for the unbiased analysis of a large microarray expression dataset from the anti-NGF AD11 transgenic mouse model. The gene expression profile of AD11 brain regions was investigated at different neurodegeneration stages by whole-genome microarrays. A new implementation of the Logic Mining method, coupled with standard statistical methods, was applied to both early-stage (1-3 months) and late-stage (6-15 months) expression data. A small number of "fingerprinting" formulas were isolated, encompassing mRNAs whose expression levels were able to discriminate between diseased and control mice. We selected three differential "signature" genes specific to the early stage (Nudt19, Arl16, Aph1b), five common to both groups (Slc15a2, Agpat5, Sox2ot, 2210015D19Rik, Wdfy1), and seven specific to the late stage (D14Ertd449, Tia1, Txnl4, 1810014B01Rik, Snhg3, Actl6a, Rnf25). We suggest these genes as potential biomarkers for the early and late stages of AD-like neurodegeneration in this model and conclude that Logic Mining is a powerful and reliable approach for large-scale expression data analysis. Its application to large expression datasets from brain or peripheral human samples may facilitate the discovery of early and stage-specific AD biomarkers.
The Web Graph is a large-scale graph that does not fit in main memory, so lossless compression methods have been proposed for it. This paper introduces a compression scheme that combines efficient storage with fast retrieval of the information in a node. The scheme exploits the properties of the Web Graph without assuming an ordering of the URLs, so it may be applied to more general graphs. Tests on some datasets in use achieve space savings of about 10% over existing methods.
Background: Differences in genomic sequences are crucial for the classification of viruses into different species. In this work, viral DNA sequences belonging to the human polyomaviruses BKPyV, JCPyV, KIPyV, WUPyV, and MCPyV are analyzed using a logic data mining method in order to identify the nucleotides that distinguish the five human polyomaviruses. Results: The approach presented in this work is successful, as it discovers several logic rules that effectively characterize the five polyomaviruses studied. The identified logic rules are able to separate each viral type precisely from the others and to assign an unknown DNA sequence to one of the five analyzed polyomaviruses. Conclusions: The data mining analysis is performed by considering the complete sequences of the viruses and the sequences of the different gene regions separately, obtaining in both cases extremely high correct recognition rates.
The network verification problem is that of establishing the accuracy of a high-level description of a network's physical topology by making as few measurements as possible on its nodes. This task can be formalized as an optimization problem that, given a graph and a query model specifying the information returned by a query at a node, asks for a minimum-size subset of nodes to be queried so as to uniquely identify the graph. This problem has been studied with respect to different query models, assuming that a node has some global knowledge about the network. Here, we propose a new query model based on the local knowledge that a node usually has instead. Quite naturally, we assume that a query at a given node returns the associated routing table, i.e., a set of entries which provides, for each destination node, a corresponding (set of) first-hop node(s) along an underlying shortest path.
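As an illustration of the routing-table query model described above, the following is a minimal sketch, not the authors' formulation. It assumes an undirected, unweighted graph given as an adjacency dictionary, and it returns every first-hop neighbor that lies on some shortest path to each destination. The function names `bfs_distances` and `routing_table` and the example graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def routing_table(adj, v):
    """Simulate a query at node `v` (assumed model: all shortest-path first hops).

    For each destination t != v, return the set of neighbors u of v such that
    dist(u, t) == dist(v, t) - 1, i.e. u is a first hop on a shortest v-t path.
    """
    dist_v = bfs_distances(adj, v)
    dist_from_neighbor = {u: bfs_distances(adj, u) for u in adj[v]}
    table = {}
    for t, d in dist_v.items():
        if t == v:
            continue
        table[t] = {u for u in adj[v] if dist_from_neighbor[u].get(t) == d - 1}
    return table


# Hypothetical example: the 4-cycle a-b-c-d-a.
cycle = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "a"],
}
table = routing_table(cycle, "a")
```

In this example, node a reaches c through either b or d, so the entry for c holds both first hops, while the entries for b and d each hold a single neighbor.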
The visualization of clustered graphs is a classical algorithmic topic that has several practical applications and is attracting increasing research interest. In this paper we deal with the visualization of clustered trees, a problem that is somewhat foundational with respect to that of visualizing a general clustered graph. We show many results that are, in our opinion, surprising, and that highlight how drawing clustered trees differs sharply from drawing "plain" trees. We study a wide class of drawing standards, giving both negative and positive results. Namely, we show that there are clustered trees that do not have any drawing in certain standards and others that require exponential area. In contrast, for many drawing conventions there are efficient algorithms that draw clustered trees with polynomial, asymptotically optimal area.