Background: The increasing amount of published literature in biomedicine represents an immense source of knowledge, which can be accessed efficiently only by a new generation of automated information extraction tools. Named entity recognition of well-defined objects, such as genes or proteins, has reached a level of maturity sufficient to form the basis for the next step: the extraction of relations that exist between the recognized entities. Whereas most early work focused on the mere detection of relations, the classification of the type of relation is also of great importance and is the focus of this work. In this paper we describe an approach that extracts both the existence of a relation and its type. Our work is based on Conditional Random Fields, which have been applied with much success to the task of named entity recognition. Results: We benchmark our approach on two different tasks. The first task is the identification of semantic relations between diseases and treatments. The available data set consists of manually annotated PubMed abstracts. The second task is the identification of relations between genes and diseases from a set of concise phrases, so-called GeneRIF (Gene Reference Into Function) phrases. In our experimental setting, we do not assume that the entities are given, as is often the case in previous relation extraction work. Rather, the extraction of the entities is solved as a subproblem. Compared with other state-of-the-art approaches, we achieve very competitive results on both data sets. To demonstrate the scalability of our solution, we apply our approach to the complete human GeneRIF database. The resulting gene-disease network contains 34758 semantic associations between 4939 genes and 1745 diseases. The gene-disease network is publicly available as a machine-readable RDF graph. Conclusion: We extend the framework of Conditional Random Fields towards the annotation of semantic relations from text and apply it to the biomedical domain. Our approach is based on a rich set of textual features and achieves a performance that is competitive with leading approaches. The model is quite general and can be extended to handle arbitrary biological entities and relation types. The resulting gene-disease network shows that the GeneRIF database provides a rich knowledge source for text mining. Current work is focused on improving the accuracy of detection of entities as well as entity boundaries, which will also greatly improve the relation extraction performance.
We report an analysis of orientation and ocular dominance maps that were recorded optically from area 17 of cats and ferrets. Similar to a recent study performed in primates (Obermayer & Blasdel, 1997), we find that 80% (for cats and ferrets) of orientation singularities that are nearest neighbors have opposite sign and that the spatial distribution of singularities deviates from a random distribution of points, because the average distances between nearest neighbors are significantly larger than expected for a random distribution. Orientation maps of normally raised cats and ferrets show approximately the same typical wavelength; however, the density of singularities is higher in ferrets than in cats. Also, we find the well-known overrepresentation of cardinal versus oblique orientations in young ferrets (Chapman & Bonhoeffer, 1998; Coppola, White, Fitzpatrick, & Purves, 1998) but only a weak, not quite significant overrepresentation of cardinal orientations in cats, as has been reported previously (Bonhoeffer & Grinvald, 1993). Orientation and ocular dominance slabs in cats exhibit a tendency of being orthogonal to each other (Hubener, Shoham, Grinvald, & Bonhoeffer, 1997), albeit less pronounced, as has been reported for primates (Obermayer & Blasdel, 1993). In chronic recordings from single animals, a decrease of the singularity density and an increase of the ocular dominance wavelength with age but no change of the orientation wavelengths were found. Orientation maps are compared with two pattern models for orientation preference maps: bandpass-filtered white noise and the field analogy model. Bandpass-filtered white noise predicts sign correlations between orientation singularities, but the correlations are significantly stronger (87% opposite sign pairs) than what we have found in the data. Also, bandpass-filtered noise predicts a deviation of the spatial distribution of singularities from a random dot pattern. 
The field analogy model can account for the structure of certain local patches but not for the whole orientation map. Differences between the predictions of the field analogy model and experimental data are smaller than what has been reported for primates (Obermayer & Blasdel, 1997), which can be explained by the smaller size of the imaged areas in cats and ferrets.
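The nearest-neighbor sign statistic quoted above can be illustrated with a minimal sketch. It assumes each singularity is given as a hypothetical `(x, y, sign)` tuple and counts, per singularity, whether its nearest neighbor carries the opposite sign; the function name, the per-singularity counting convention, and the toy coordinates are assumptions for illustration, not the procedure or data of the study.

```python
import math


def opposite_sign_fraction(singularities):
    """Fraction of singularities whose nearest neighbor has the opposite sign.

    singularities: list of (x, y, sign) tuples, where sign is +1 or -1.
    """
    if len(singularities) < 2:
        raise ValueError("need at least two singularities")
    opposite = 0
    for i, (xi, yi, si) in enumerate(singularities):
        best_dist = math.inf
        best_sign = None
        # Brute-force search for the nearest other singularity.
        for j, (xj, yj, sj) in enumerate(singularities):
            if i == j:
                continue
            dist = math.hypot(xi - xj, yi - yj)
            if dist < best_dist:
                best_dist, best_sign = dist, sj
        if best_sign != si:
            opposite += 1
    return opposite / len(singularities)


# Hypothetical toy coordinates, not data from the study.
toy = [(0.0, 0.0, +1), (1.0, 0.0, -1), (5.0, 0.0, +1), (5.5, 0.0, +1)]
fraction = opposite_sign_fraction(toy)
print(fraction)  # 0.5: the first two points pair with opposite signs, the last two do not
```

The same function could be applied to singularities drawn from a model map (for example bandpass-filtered noise) to compare the predicted and measured fractions.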
Coulomb energy losses by 3-MeV protons in a capillary discharge channel are used as a diagnostic tool to measure the plasma density. By combining the proton energy loss data with the electron temperature measurements, we have been able to diagnose the free electron density n_fe = 6.4 × 10¹⁹ cm⁻³ in a 3.3-eV CH₂ plasma to an accuracy of ±17%. A considerably better accuracy can be expected for higher values of the electron temperature. [S1063-651X(98)08101-X] PACS number(s): 52.40.Mj, 34.50.Bw
Motivation: Modern machine learning methods based on matrix decomposition techniques, like independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient analysis tools that are currently being explored for the analysis of gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF). These extracted features are considered indicative of underlying regulatory processes. They can also be applied to the classification of gene expression datasets, either by grouping samples into different categories for diagnostic purposes or by grouping genes into functional categories for further investigation of related metabolic pathways and regulatory networks. Results: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets. These datasets monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. The latter correspond to either monocytes versus macrophages or healthy versus Niemann-Pick C disease patients.
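To make the idea of metagenes concrete, the following is a minimal sketch of non-negative matrix factorization using the standard Lee-Seung multiplicative updates in pure Python. It is plain NMF, not the sparse variant applied in the study, and the helper names, the iteration count, and the toy expression matrix are all assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    initial_error = frobenius_error(v, w, h)
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h, initial_error


# Hypothetical toy matrix (rows: genes, columns: samples), not data from the study.
expression = [
    [5.0, 5.0, 0.0, 0.0],
    [4.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 6.0, 6.0],
]
w, h, initial_error = nmf(expression, rank=2)
final_error = frobenius_error(expression, w, h)
print(initial_error, final_error)
```

In this sketch the rows of `h` play the role of metagene profiles across samples and the columns of `w` give each gene's loading on a metagene; genes with large loadings on a metagene that separates the sample groups would be candidate marker genes.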
2+ dynamics in space and time can be reliably described as a superposition of only two spatiotemporally separable patterns based on the fast and slow components. However, the distributions of both components over space turn out to differ from each other, and more work has to be done in order to specify their relationship with neuronal activity.