Lung cancer has the highest mortality rate of all cancers worldwide, and asbestos-related lung cancer is one of the leading occupational cancers. The identification of asbestos-related molecular changes has long been a topic of increasing research interest. The aim of this study was to identify novel asbestos-related molecular correlates by integrating miRNA expression profiling with previously obtained profiling data (aCGH and mRNA expression) from the same patient material. miRNA profiling was performed on 26 tumor and corresponding normal lung tissue samples from highly asbestos-exposed and non-exposed patients, and on eight control lung tissue samples. Data analyses on miRNA expression, and integration of miRNA and previously obtained mRNA data, were performed using Chipster. A separate analysis was used to integrate miRNA and previously obtained aCGH data. Both known and new lung cancer-associated miRNAs and target genes with inverse correlation were discovered. Furthermore, DNA copy number alterations (e.g., gain at 12p13.31) were correlated with the deregulated miRNAs. Specifically, thirteen novel asbestos-related miRNAs (over-expressed: miR-148b, miR-374a, miR-24-1*, Let-7d, Let-7e, miR-199b-5p, miR-331-3p, and miR-96; under-expressed: miR-939, miR-671-5p, miR-605, miR-1224-5p, and miR-202) and inversely correlated target genes (e.g., GADD45A, LTBP1, FOSB, NCALD, CACNA2D2, MTSS1, EPB41L3) were identified. In addition, over-expression of the well-known squamous cell carcinoma-associated miR-205 was linked to down-regulation of the DOK4 gene. The miRNAs/genes presented here may represent interesting targets for further investigation and could eventually have potential diagnostic implications.
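The idea of an inversely correlated miRNA and target gene pair can be illustrated with a minimal sketch. It is not the Chipster workflow used in the study: the function names, the correlation threshold of -0.7, and the toy expression values below are all assumptions chosen for illustration. The sketch computes the Pearson correlation between each miRNA profile and each gene profile across the same samples and keeps the strongly negative pairs.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sd_x * sd_y)


def inverse_pairs(mirna, mrna, threshold=-0.7):
    """Return (miRNA, gene, r) tuples with r <= threshold, most negative first.

    The threshold is a hypothetical cut-off, not a value from the study.
    """
    pairs = []
    for mirna_name, mirna_values in mirna.items():
        for gene_name, gene_values in mrna.items():
            r = pearson(mirna_values, gene_values)
            if r <= threshold:
                pairs.append((mirna_name, gene_name, r))
    return sorted(pairs, key=lambda p: p[2])


# Toy values for five samples (made up, not measured data).
toy_mirna = {"mirna_A": [1.0, 2.0, 3.0, 4.0, 5.0]}
toy_mrna = {
    "gene_1": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as mirna_A rises
    "gene_2": [1.0, 2.0, 3.0, 4.0, 5.0],  # rises with mirna_A
}

for mirna_name, gene_name, r in inverse_pairs(toy_mirna, toy_mrna):
    print(f"{mirna_name} vs {gene_name}: r = {r:.2f}")
```

A real analysis would also need multiple-testing correction and a target-prediction filter, both of which this sketch leaves out.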
The complexity of ecosystems is staggering, with hundreds or thousands of species interacting in a number of ways from competition and predation to facilitation and mutualism. Understanding the networks that form the systems is of growing importance, e.g. to understand how species will respond to climate change, or to predict potential knock-on effects of a biological control agent. In recent years, a variety of summary statistics for characterising the global and local properties of such networks have been derived, which provide a measure for gauging the accuracy of a mathematical model for network formation processes. However, the critical underlying assumption is that the true network is known. This is not a straightforward task to accomplish, and typically requires minute observations and detailed field work. More importantly, knowledge about species interactions is restricted to specific kinds of interactions. For instance, while the interactions between pollinators and their host plants are amenable to direct observation, other types of species interactions, like those mentioned above, are not, and might not even be clearly defined from the outset. To discover information about complex ecological systems efficiently, new tools for inferring the structure of networks from field data are needed. In the present study, we investigate the viability of various statistical and machine learning methods recently applied in molecular systems biology: graphical Gaussian models, L1-regularised regression with least absolute shrinkage and selection operator (LASSO), sparse Bayesian regression and Bayesian networks.
We have assessed the performance of these methods on data simulated from food webs of known structure, where we combined a niche model with a stochastic population model in a 2-dimensional lattice. We assessed the network reconstruction accuracy in terms of the area under the receiver operating characteristic (ROC) curve, which was typically in the range between 0.75 and 0.9, corresponding to the recovery of about 60% of the true species interactions at a false prediction rate of 5%. We also applied the models to presence/absence data for 39 European warblers, and found that the inferred species interactions showed a weak yet significant correlation with phylogenetic similarity scores, which tended to weakly increase when including bio-climate covariates and allowing for spatial autocorrelation. Our findings demonstrate that relevant patterns in ecological networks can be identified from large-scale spatial data sets with machine learning methods, and that these methods have the potential to contribute novel important tools for gaining deeper insight into the structure and stability of ecosystems.
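The evaluation described above, scoring a reconstructed network by the area under the ROC curve and by the fraction of true interactions recovered at a fixed false prediction rate, can be sketched as follows. This is an illustrative assumption about how such scores are computed in general, not the authors' code. The function names and the toy edge scores are hypothetical, and "false prediction rate" is read here as the false positive rate.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen true edge receives a
    higher score than a randomly chosen non-edge (ties count one half).
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tpr_at_fpr(scores, labels, max_fpr=0.05):
    """Highest true positive rate reachable with false positive rate <= max_fpr."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true edge and one non-edge")
    best = 0.0
    # Try every observed score as a threshold: predict an edge if score >= t.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best


# Toy example: confidence scores for six candidate species interactions
# and whether each is present (1) or absent (0) in the known food web.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]

print(f"AUROC: {auroc(toy_scores, toy_labels):.3f}")
print(f"TPR at FPR <= 0.05: {tpr_at_fpr(toy_scores, toy_labels):.3f}")
```

The pairwise loop is quadratic in the number of candidate edges, which is fine for a small illustration. A rank-based formula would be preferable for large networks.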
Motivation: As ArrayExpress and other repositories of genome-wide experiments are reaching a mature size, it is becoming more meaningful to search for related experiments, given a particular study. We introduce methods that allow for the search to be based upon measurement data, instead of the more customary annotation data. The goal is to retrieve experiments in which the same biological processes are activated. This can be due either to experiments targeting the same biological question, or to as yet unknown relationships. Results: We use a combination of existing and new probabilistic machine learning techniques to extract information about the biological processes differentially activated in each experiment, to retrieve earlier experiments where the same processes are activated and to visualize and interpret the retrieval results. Case studies on a subset of ArrayExpress show that, with a sufficient amount of data, our method indeed finds experiments relevant to particular biological questions. Results can be interpreted in terms of biological processes using the visualization techniques. Availability: The code is available from http://www.cis.hut.fi/projects/mi/software/ismb09. Contact: jose.caldas@tkk.fi
Background: Detailed and systematic understanding of the biological effects of millions of available compounds on living cells is a significant challenge. As most compounds impact multiple targets and pathways, traditional methods for analyzing structure-function relationships are not comprehensive enough. Therefore more advanced integrative models are needed for predicting biological effects elicited by specific chemical features. As a step towards creating such computational links we developed a data-driven chemical systems biology approach to comprehensively study the relationship of 76 structural 3D-descriptors (VolSurf, chemical space) of 1159 drugs with the microarray gene expression responses (biological space) they elicited in three cancer cell lines. The analysis covering 11350 genes was based on data from the Connectivity Map. We decomposed the biological response profiles into components, each linked to a characteristic chemical descriptor profile. Results: Integrated analysis of both the chemical and biological space was more informative than either dataset alone in predicting drug similarity as measured by shared protein targets. We identified ten major components that link distinct VolSurf chemical features across multiple compounds to specific cellular responses. For example, component 2 (hydrophobic properties) strongly linked to DNA damage response, while component 3 (hydrogen bonding) was associated with metabolic stress. Individual structural and biological features were often linked to one cell line only, such as leukemia cells (HL-60) specifically responding to cardiac glycosides. Conclusions: In summary, our approach identified several novel links between specific chemical structure properties and distinct biological responses in cells incubated with these drugs. Importantly, the analysis focused on chemical-biological properties that emerge across multiple drugs.
The decoding of such systematic relationships is necessary to build better models of drug effects, including unanticipated types of molecular properties having strong biological effects.
Modern theories of semantics posit that the meaning of words can be decomposed into a unique combination of semantic features (e.g., “dog” would include “barks”). Here, we demonstrate using functional MRI (fMRI) that the brain combines bits of information into meaningful object representations. Participants receive clues of individual objects in the form of three isolated semantic features, given as verbal descriptions. We use machine-learning-based neural decoding to learn a mapping between individual semantic features and BOLD activation patterns. The recorded brain patterns are best decoded using a combination of not only the three semantic features that were in fact presented as clues, but a far richer set of semantic features typically linked to the target object. We conclude that our experimental protocol allowed us to demonstrate that fragmented information is combined into a complete semantic representation of an object and to identify brain regions associated with object meaning.
Background: Repositories of genome-wide expression studies such as ArrayExpress [1] have been growing rapidly over the last few years and continue to do so. The more experimental data are deposited into these repositories, the more likely it becomes that some of them can provide a meaningful biological context to aid in the planning and analysis of new studies. Retrieval of experiments based on their textual description and experimental design has several shortcomings. First of all, textual description of an experiment or its results is not as information-rich as the actual data itself. Secondly, information about the experimental design alone is only of limited use in retrieving biologically relevant data because it does not reflect the results, which contain the bulk of the information and may reveal unexpected relationships. We introduce novel retrieval methods that incorporate the actual gene expression measurements into the search process, along with visualization tools for interpreting and exploring the results [2]. Methods: We developed a two-stage procedure, first identifying differentially active gene sets in each experiment using a recent nonparametric statistical method [3], and then combining gene set activation patterns into higher-level structures, so-called biological topics, using a state-of-the-art probabilistic model [4]. The probabilistic formulation enables the use of a natural and rigorous metric for assessing the similarity between two experiments. For interpreting and exploring retrieval results, we have developed visualization methods that also provide insight into the model used to perform the retrieval. Results: We show that gene sets corresponding to each biological topic form highly coherent and holistic components.
Several case studies performed on a subset of ArrayExpress show that our method can retrieve experiments relevant to a biological question, as long as sufficient amounts of data are available, and highlight relations between experiments, either because the same biological questions were targeted, or because of unexpected relationships that were confirmed in the literature. The visualization methods allow us to both efficiently interpret the model and put retrieval results in the context of the whole set of experiments (see Figure 1 for an example). Conclusion: Using a combination of existing and novel methods for modeling and visualizing a heterogeneous collection of gene expression experiments, we were able to
Motivation: Genome-wide measurement of transcript levels is a ubiquitous tool in biomedical research. As experimental data continues to be deposited in public databases, it is becoming important to develop search engines that enable the retrieval of relevant studies given a query study. While retrieval systems based on meta-data already exist, data-driven approaches that retrieve studies based on similarities in the expression data itself have a greater potential of uncovering novel biological insights. Results: We propose an information retrieval method based on differential expression. Our method deals with arbitrary experimental designs and performs competitively with alternative approaches, while making the search results interpretable in terms of differential expression patterns. We show that our model yields meaningful connections between biological conditions from different studies. Finally, we validate a previously unknown connection between malignant pleural mesothelioma and SIM2s suggested by our method, via real-time polymerase chain reaction in an independent set of mesothelioma samples. Availability: Supplementary data and source code are available from http://www.ebi.ac.uk/fg/research/rex. Contact: samuel.kaski@aalto.fi Supplementary Information: Supplementary data are available at Bioinformatics online.
We can easily identify a dog merely by the sound of barking or an orange by its citrus scent. In this work, we study the neural underpinnings of how the brain combines bits of information into meaningful object representations. Modern theories of semantics posit that the meaning of words can be decomposed into a unique combination of individual semantic features (e.g., "barks", "has citrus scent"). Here, participants received clues of individual objects in the form of three isolated semantic features, given as verbal descriptions. We used machine-learning-based neural decoding to learn a mapping between individual semantic features and BOLD activation patterns. We discovered that the recorded brain patterns were best decoded using a combination of not only the three semantic features that were presented as clues, but a far richer set of semantic features typically linked to the target object. We conclude that our experimental protocol allowed us to observe how fragmented information is combined into a complete semantic representation of an object and suggest neuroanatomical underpinnings for this process.